October 3rd, 2016
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: email@example.com
Checkout/Ordering team: “Your Payment Processing release broke Checkout in production.”
Payment Processing team: “Didn’t know this would affect Checkout. Had no time to look into it because we were too busy working on the new payment functionality in the iOS app.”
Mobile App Product Visionary: “I want my own dedicated team. This happened because Payment Processing is spread too thin to work on the app and the backend at the same time.”
Development Manager: “Org chart change – let’s have all server-side developers on the same team so these disconnects no longer happen.”
IT Manager: “The risk of letting dev teams own the infrastructure is too great. From now on, IT approval is required for any database and server changes.”
QA Manager: “Our new policy is to only sign off on production releases after thorough regression testing of all system components.”
From a safety point of view, this doesn’t sound unreasonable: the system is too large and complex, so let’s put checks and balances in place to manage the inherent risks. As long as everybody plays by the rules, all should be well. However, from a getting-things-done perspective there is a problem: none of the parties involved owns enough of the stack to deliver something independently of the others. Building new features requires a lot of coordination and juggling of priorities and resources. The resulting need for more management and formal process slows things down. Communication is often secondhand, belated and incomplete, causing disconnects. In an industry where the ability to execute well under conditions of rapid change is essential, this is sub-optimal. At the same time, the dynamics in play are extremely difficult to change because they are driven by deep human needs: the desire to avoid blame and to feel safe.
Introducing “Agile” processes is an attempt to address this, but in practice the results are often spotty because only the organizational part of the problem is addressed. This ignores the fact that how people are able to collaborate and the architecture of the underlying system are tied together like partners in a three-legged race. Agile’s impact is blunted if the code base is too large, too hard to understand and too prone to side effects to be changed safely.
The conundrum presented by increasing system size has been addressed very successfully in more mature industries, by finding ways of building complex systems from simpler, easier-to-handle components. As a result, rotating the tires on a car requires no knowledge of how the carburetor works, seat belts and fan belts are not located in a shared belt sub-assembly, and opening the trunk won’t switch on the wipers.