Why the Internet Keeps Breaking and What Comes Next
If your apps suddenly stopped working today, you weren’t alone: the internet itself had a minor meltdown. From blank social feeds to frozen banking apps, millions of people were left staring at spinning wheels and error screens, wondering what had just happened.
Today, for nearly four hours, a significant portion of the internet simply stopped working.
At 11:20 UTC, a non-malicious traffic anomaly inside Cloudflare’s internal systems triggered a cascade of HTTP 500 errors across its global network. The result was immediate and far-reaching: X (formerly Twitter) feeds went blank, ChatGPT and the OpenAI API became unreachable, Spotify stopped playing, Discord went dark, major banking apps failed, and even real-time public transport information vanished in several cities. Millions of people and thousands of businesses were affected simultaneously because one company, Cloudflare, now sits in front of roughly 20–25% of all web traffic.
Cloudflare is far more than a hosting provider. It is a reverse proxy, content delivery network (CDN), DNS resolver (1.1.1.1), DDoS mitigation platform, web application firewall, and zero-trust gateway rolled into one. For many organisations, from global enterprises to solo developers, it has become the default way to make websites fast, secure, and resilient. That efficiency, however, comes at a price: when Cloudflare falters, the blast radius is planetary.
This is not the first time we have seen an outage of this magnitude, and it will not be the last.
- June 2021, Fastly: one customer's valid configuration change triggered a latent software bug and took down the BBC, Reddit, Amazon, and much of GOV.UK for about an hour.
- July 2024, CrowdStrike: a single defective update grounded airlines, disrupted hospitals, and cost Fortune 500 companies alone an estimated $5.4 billion in direct losses.
- 2025, AWS: multiple US-EAST-1 incidents, including one in October 2025, disrupted Netflix, Slack, Robinhood, and Disney+.
Each incident shares the same root pattern: centralisation converts a local software bug, traffic spike, or certificate error into a global crisis.
The economics are compelling. Building your own global CDN, DDoS protection, and zero-trust platform is prohibitively expensive, so organisations understandably consolidate onto a handful of superbly engineered providers. Those providers deliver extraordinary performance and security at a fraction of the cost of doing it yourself. The unintended consequence is that the internet’s failure domains have quietly consolidated into a small number of companies. When any one of them has a bad day, the world feels it.
True resilience now requires deliberate architectural choices that most organisations currently avoid because they add complexity and cost:
- Multi-CDN routing with automatic failover
- Diversified authoritative DNS across unrelated providers
- Rigorous chaos engineering and regular “game day” exercises at global scale
- Progressive adoption of alternative and decentralised edge platforms (IPFS, Fleek, Fly.io, etc.) for non-critical workloads
Large enterprises and hyperscalers already implement many of these measures; most mid-sized and smaller organisations do not, often because the short-term savings of single-vendor reliance are simply too attractive.
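To make the first measure on that list concrete, here is a minimal Python sketch of multi-CDN failover: probe each provider in priority order and route to the first one that answers without a server error. It is an illustration under stated assumptions, not a production design. The function names (`http_probe`, `pick_provider`) and the `example.com` health-check URLs are hypothetical, and the sketch assumes each provider exposes an HTTP endpoint that can be polled.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Sequence


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout.

    urllib raises HTTPError (a subclass of URLError) for 4xx/5xx responses,
    so an HTTP 500 from a struggling provider is reported as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False


def pick_provider(
    providers: Sequence[str],
    is_healthy: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first provider, in priority order, that passes its health check.

    Returns None when every provider fails, so the caller can fall back
    to something else (for example, a static error page served from origin).
    """
    for provider in providers:
        try:
            if is_healthy(provider):
                return provider
        except Exception:
            # A crashing health check counts as a failed one; try the next provider.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical health-check endpoints, listed in order of preference.
    candidates = [
        "https://cdn-a.example.com/health",
        "https://cdn-b.example.com/health",
    ]
    print("Routing traffic via:", pick_provider(candidates))
```

In practice this decision is usually made at the DNS or load-balancer layer rather than in application code, but the logic is the same: health checks that treat 5xx responses as failures, an ordered list of unrelated providers, and an explicit answer to "what happens when all of them are down?".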
At Larkspur International we have worked with partners to help design systems that remain available even when a major third-party provider fails. That work has never felt more relevant than it does today.
The internet was originally designed to route around damage. Somewhere along the way we rebuilt it to route through a handful of indispensable gatekeepers. Outages like today’s are not aberrations; they are the predictable consequence of the architecture we have collectively chosen.
The good news is that the fixes are known, tested, and increasingly affordable. The difficult part is making the deliberate decision to invest in resilience rather than continuing to enjoy the convenience and cost advantages of centralisation, until the next global outage forces the issue.
If today affected your organisation and you would like to explore practical ways to reduce single-provider risk without sacrificing performance or breaking the budget, please feel free to send us a message. We are always happy to share what we have learned.
Stay safe online.
