Surfing Complexity
I am dreading our LLM-written incident report future
Dear researchers column
I can’t bear to read AI-generated prose
The demon of the gaps
Form may follow function, but use doesn’t follow design
The coming coordination calamity
Reliability as a game of improving the odds
Flipping the bozo bit on flips the learning off
How incidents can teach us about what’s already working well
My SREcon talk
Life comes at you fast
The normal work of creating reliability
Thoughts on the Bluesky public incident write-up
References from my SREcon talk on stories
Quick thoughts on GitHub CTO’s post on availability
Grow fast and overload things
Saturation
Quick takes on Feb 20 Cloudflare outage
Poor Deming never stood a chance
Lots of AI SRE, no AI incident management
Nobody knows how the whole system works
On variability
Ashby taught us we have to fight fire with fire
Because coordination is expensive
Amdahl, Gustafson, coding agents, and you
From Rasmussen to Moylan
Telling the wrong story
Verizon outage report predictions
On intuition and anxiety
The dangers of SSL certificates
Saturation: Waymo edition
Another way to rate incidents
Quick takes on the Triple Zero Outage at Optus – the Schott Review
Why I don’t like “Correction of Error”
AWS re:Invent talk on their Oct ’25 incident
Quick takes on the Dec 5 Cloudflare outage
Incidents: the exceptional as routine
Fun with incident data and statistical process control
Brief thoughts on the recent Cloudflare outage
You’ll never see attrition referenced in an RCA
Quick thoughts on the recent AWS outage
This is fine!
Caveat promptor
The illegible nature of software development talent
Two thought experiments
A statistic is as a statistic does
Fixation: the ever-present risk during incident handling
The hidden trade-offs of fine-grained progressive rollouts
Nothing fails like a history of success
My favorite developer productivity research method that nobody uses
The problems that accountability can’t fix
Easy will always trump simple
The trap of tech that’s great in the small but not in the large
Formal specs as sets of behaviors
Cloudflare and the infinite sadness of migrations