Surfing Complexity

Posts

I am dreading our LLM-written incident report future

Dear researchers column

I can’t bear to read AI-generated prose

The demon of the gaps

Form may follow function, but use doesn’t follow design

The coming coordination calamity

Reliability as a game of improving the odds

Flipping the bozo bit on flips the learning off

How incidents can teach us about what’s already working well

My SREcon talk

Life comes at you fast

The normal work of creating reliability

Thoughts on the Bluesky public incident write-up

References from my SREcon talk on stories

Quick thoughts on GitHub CTO’s post on availability

Grow fast and overload things

Saturation

Quick takes on Feb 20 Cloudflare outage

Poor Deming never stood a chance

Lots of AI SRE, no AI incident management

Nobody knows how the whole system works

On variability

Ashby taught us we have to fight fire with fire

Because coordination is expensive

Amdahl, Gustafson, coding agents, and you

From Rasmussen to Moylan

Telling the wrong story

Verizon outage report predictions

On intuition and anxiety

The dangers of SSL certificates

Saturation: Waymo edition

Another way to rate incidents

Quick takes on the Triple Zero Outage at Optus – the Schott Review

Why I don’t like “Correction of Error”

AWS re:Invent talk on their Oct ’25 incident

Quick takes on the Dec 5 Cloudflare outage

Incidents: the exceptional as routine

Fun with incident data and statistical process control

Brief thoughts on the recent Cloudflare outage

You’ll never see attrition referenced in an RCA

Quick thoughts on the recent AWS outage

This is fine!

Caveat promptor

The illegible nature of software development talent

Two thought experiments

A statistic is as a statistic does

Fixation: the ever-present risk during incident handling

The hidden trade-offs of fine-grained progressive rollouts

Nothing fails like a history of success

My favorite developer productivity research method that nobody uses

The problems that accountability can’t fix

Easy will always trump simple

The trap of tech that’s great in the small but not in the large

Formal specs as sets of behaviors

Cloudflare and the infinite sadness of migrations