By trade I solve business problems with code. This often entails hours of snorkeling through stack traces and code I’m trying to understand, not to mention several Google tabs open, all about the current exception I just cannot wrap my head around.
Recently I was tasked with a simple case: update the spelling of an area called Whanganui. Seems simple enough, right?
CTRL+F, find the mistake, update it,
CTRL+S … boom, ship it, case closed. Well, it’s not always that simple…
The problem I faced was a mixture of database de-normalization, years of legacy code, business rules that differ by business unit (or vertical) and, to be honest, a lack of understanding of all the processes that these business units run daily. Big companies have many working parts. This is how they make money. This is how they can pay their bills.
So sweet, I found tables to change. There were 3 with the area data and 2 with the business-unit-specific data. I figured out the SQL statements, had them reviewed and began testing. (In the end there were more tables I changed.)
This was where I made my first mistake: I hyper-fixated on only my business unit’s logic, the code caching mechanisms and the exceptions I managed to get the code base to throw while testing. This is exciting stuff for me; if I break code I generally get a better understanding of it. I eventually got the branch to behave in a manner that I deemed ready for test, so we tested and shipped the case.
Well, then all hell broke loose, with exceptions for days in classes that I knew I had tested. Man, it was frustrating, but we had to roll the change back as it was impacting some production traffic. Thankfully my employer has a zero-blame culture. We did our post-system-bork rituals to understand why things got unhappy and to learn from it so we can do better next time.
I realized that my hyper-fixation on the KNOWN was not the correct approach for this problem; we needed to extend our focus to the UNKNOWN. Sounds weird, right? The unknown is, well, unknown… It is, however, reasonable to assume code will break if you give it something weird. It expects an Int32 but you give it something random, like a lizard.
In World War 2, Abraham Wald was smart enough not to focus on the known but to use the KNOWN to figure out the UNKNOWN. The Navy was trying to figure out how to keep their planes in the air after being shot. It was a war; people shot at planes. They looked at planes that had been shot and determined that the areas with bullet holes needed to be beefed up with armor.
According to him, the statisticians were looking at the planes that came back, meaning that the damage was not critical. Wald pointed out that they should do the exact opposite of what the Navy was planning to do. According to him, they should understand that the undamaged areas on the diagram were the reason that the aircraft was able to make it back. - boredpanda.com
In the same way, I needed to focus my changes not only on the database but also on the code base, ethically hacking the code so that it logs a WARNING rather than throwing an exception when it processes the database data and receives something UNKNOWN. This is beefing up the important parts, the equivalent of the aeroplane's engines, cockpit, etc., which do not have any bullet holes in the picture above, while also understanding that the areas that got shot can still function.
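To make the "log a WARNING rather than throw" idea concrete, here is a minimal sketch in Python. It is not the code from the case. The `KNOWN_AREAS` mapping, the `resolve_area_id` function and the `area_lookup` logger name are all made up for illustration, and I am assuming the caller can carry on with a default value when the data is not what it expected.

```python
import logging

logger = logging.getLogger("area_lookup")  # hypothetical logger name

# Hypothetical mapping of the area names the code knows about to their IDs.
KNOWN_AREAS = {"Whanganui": 1, "Wellington": 2}


def resolve_area_id(raw_name, default=None):
    """Return the ID for a known area; log a WARNING and return `default` otherwise."""
    # The "lizard" case: the value is not even the type we expect.
    if not isinstance(raw_name, str):
        logger.warning(
            "Unexpected area value %r of type %s; using default %r",
            raw_name, type(raw_name).__name__, default,
        )
        return default

    name = raw_name.strip()
    area_id = KNOWN_AREAS.get(name)
    if area_id is None:
        # The UNKNOWN case: a spelling or area this code has never seen.
        logger.warning("Unknown area name %r; using default %r", name, default)
        return default
    return area_id
```

With this in place, an old spelling such as `resolve_area_id("Wanganui")` leaves a warning in the logs and returns the default instead of taking down the request. Whether a default is actually safe depends on the business rule behind each caller, so this is a judgment call per code path rather than a blanket fix.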
As software developers, we fail in two ways: we build the thing wrong, or we build the wrong things.
Not only did I build the thing wrong, I hyper-focused on only one area of the business and its survivorship bias. In the end we shipped the case and there was no BORK in production, but that's only because I started to focus on the UNKNOWN and stopped hyper-fixating on the