Infrastructure failure is not an exceptional condition to be managed until normal service resumes. For many of the world's most consequential operations, it is the permanent operating environment.
What Infrastructure Failure Actually Means
Working with broken infrastructure requires a fundamental reorientation of operational thinking. Where infrastructure functions well, it is a given: the power is on, the internet is connected, the road is passable, the supply chain delivers on schedule. The operational challenge is to use that infrastructure efficiently. Where infrastructure is broken, it is itself a variable: unreliable, intermittently available, and occasionally catastrophically absent. The operational challenge becomes designing systems that produce adequate outcomes across the full range of infrastructure states the environment produces.
The operational principles that work in functioning infrastructure environments often fail in broken ones. The just-in-time supply chain that assumes reliable transportation breaks down when transportation is unreliable. The digital system that assumes stable power fails when power is intermittent. The communication protocol that assumes network availability stops working when the network is down. Designing for broken infrastructure means recognizing explicitly that the infrastructure state is a variable rather than a constant, and designing systems that function adequately across the range of states that variable takes.
The Redundancy Investment
Designing for broken infrastructure requires investing in redundancy: the backup systems, alternative routes, buffer stocks, and manual fallback procedures that maintain adequate function when the primary infrastructure fails. Redundancy is expensive in functioning infrastructure environments because it provides insurance against low-probability events. It is cost-effective in broken infrastructure environments because the events it insures against are high-probability. The key analytical move is to assess the actual reliability of the infrastructure in the operating environment, rather than assuming the reliability of the environment for which the system was originally designed.
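The cost argument above can be made concrete with a rough expected-cost comparison. The sketch below is illustrative only and rests on a deliberately simple model that is not from the text: a fixed per-period cost of redundancy, a fixed loss per outage, and a single per-period probability that the primary infrastructure fails. The function names, parameters, and figures are all hypothetical.

```python
def expected_cost_without_redundancy(p_fail: float, outage_loss: float) -> float:
    """Expected per-period cost when an outage is absorbed in full."""
    return p_fail * outage_loss


def expected_cost_with_redundancy(
    p_fail: float, redundancy_cost: float, residual_loss: float = 0.0
) -> float:
    """Expected per-period cost when a backup is paid for every period.

    residual_loss is whatever loss the backup fails to prevent during an
    outage (assumed to be 0 if the backup covers the outage completely).
    """
    return redundancy_cost + p_fail * residual_loss


def break_even_probability(
    redundancy_cost: float, outage_loss: float, residual_loss: float = 0.0
) -> float:
    """Failure probability above which redundancy is the cheaper option."""
    avoided_loss = outage_loss - residual_loss
    if avoided_loss <= 0:
        raise ValueError("redundancy must avoid some loss to ever pay off")
    return redundancy_cost / avoided_loss


if __name__ == "__main__":
    # Hypothetical figures: backup costs 100 per period, an outage costs 2000.
    threshold = break_even_probability(redundancy_cost=100, outage_loss=2000)
    print(f"break-even failure probability: {threshold:.2%}")

    # Assumed probabilities for a reliable and an unreliable environment.
    for label, p_fail in [("reliable grid", 0.01), ("unreliable grid", 0.30)]:
        without = expected_cost_without_redundancy(p_fail, 2000)
        with_backup = expected_cost_with_redundancy(p_fail, 100)
        choice = "redundancy" if with_backup < without else "no redundancy"
        print(f"{label}: without={without:.0f}, with={with_backup:.0f} -> {choice}")
```

Under these assumed numbers the same backup is a poor purchase at a 1 percent failure rate and a clear saving at 30 percent. The design and the price are unchanged; only the measured reliability of the environment differs, which is why that measurement has to come first.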
Broken infrastructure is the permanent operating environment for a large fraction of the world's most consequential operations. The systems designed for it — redundant, adaptive, locally self-sufficient — are not inferior to systems designed for functioning infrastructure. They are appropriate to the environment they were designed for, which is a form of superiority that matters more than the elegance of systems that only work when conditions are ideal.