Why I Think AI + Automation Is About to Reshape Operations Teams

Author: Ryan Todd

First post. The point isn't to be right about everything — it's to be in the room.

Intro

I've spent most of my career in operations-adjacent engineering: the systems that make a business actually run, not the ones in the demo. Vendor feeds. Inventory sync. Pricing logic. The quiet 3 a.m. cron jobs that no one notices until they stop firing.

For a long time the conversation around "AI" felt unrelated to that world. It was about chatbots, image generators, and clever demos. None of that touched the part of operations that hurts: tickets, exceptions, drift between systems, the steady tax of human-in-the-loop decisions that should have been automated five years ago.

That's changing fast. And I don't think most operations teams have updated their mental model yet.

The misconception about AI in operations

The common pitch, "AI will replace your team," is a category error. The interesting question isn't replacement; it's which of the small, irritating decisions inside an operations workflow have become cheap enough to automate that they finally will be.

Until recently, automating a decision meant writing a rule. Rules are brittle, expensive to maintain, and fall over on the edge cases that make up half the work. So the rule never gets written, and a person keeps doing it. AI changes that cost ratio. Not because models are smart in some general sense, but because "good enough" on the long tail is suddenly within reach for work that used to require a team hand-tuning heuristics.

What operational automation actually looks like now

The shape I keep seeing — both in my own work and in patterns I'd bet are going to be everywhere within 18 months:

  • Classification at the edge. Inbound tickets, vendor emails, product attributes, inventory exceptions. Things that used to need a human to triage now route themselves.
  • Decisions with a memory. Models that don't just answer the prompt but maintain state across days or weeks of a workflow — closer to a coworker than a function.
  • Exception handling that learns. The 10% of cases that broke the rule-based system are now the input to the next iteration, not the reason the whole thing stalls.
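
To make "route themselves" concrete, here is a minimal sketch of confidence-gated triage. It assumes a classifier that returns a label plus a confidence score. The queue names, the 0.8 threshold, and the keyword_classifier stand-in are all hypothetical; in a real system a model call would sit where the stand-in is.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Prediction:
    label: str         # e.g. "billing" or "inventory_exception"
    confidence: float  # 0.0 to 1.0, as reported by the classifier


# Hypothetical queue names, for illustration only.
KNOWN_QUEUES = {"billing", "inventory_exception", "vendor_feed"}
HUMAN_QUEUE = "human_triage"


def route(text: str,
          classify: Callable[[str], Prediction],
          threshold: float = 0.8) -> str:
    """Return the queue an inbound item should go to."""
    prediction = classify(text)
    # Auto-route only when the label is one we know AND the score clears the bar.
    if prediction.label in KNOWN_QUEUES and prediction.confidence >= threshold:
        return prediction.label
    # Everything else falls back to a person.
    return HUMAN_QUEUE


def keyword_classifier(text: str) -> Prediction:
    """Stand-in for a model call (assumption: a real system would call a model here)."""
    lowered = text.lower()
    if "invoice" in lowered:
        return Prediction("billing", 0.95)
    if "stock" in lowered:
        return Prediction("inventory_exception", 0.6)
    return Prediction("unknown", 0.1)


if __name__ == "__main__":
    for ticket in ["Invoice 123 is wrong", "Stock count mismatch", "hello"]:
        print(ticket, "->", route(ticket, keyword_classifier))
```

The design choice worth noticing is the fallback: anything the classifier is unsure about still lands with a person, and those items are exactly the cases worth feeding into the next iteration.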

The thing that makes this hard isn't the model. It's the plumbing — data freshness, idempotency, observability, knowing what to do when the model is confidently wrong. That's just systems engineering. Which is the part most operations teams already have.
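
One way to handle "confidently wrong" is to stop trusting the model's own confidence and check its output against invariants the business already knows. A rough sketch, using a pricing change as the example: the PriceProposal type, the check_proposal helper, and the 20% limit are assumptions made for illustration, not a recommendation for any particular threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceProposal:
    sku: str
    current_price: float
    proposed_price: float  # the value a model suggested


def check_proposal(p: PriceProposal, max_change_ratio: float = 0.2) -> list[str]:
    """Return the list of violated invariants; an empty list means safe to apply."""
    problems = []
    if p.proposed_price <= 0:
        problems.append("price must be positive")
    if p.current_price > 0:
        change = abs(p.proposed_price - p.current_price) / p.current_price
        if change > max_change_ratio:
            problems.append(
                f"change of {change:.0%} exceeds limit of {max_change_ratio:.0%}"
            )
    return problems


if __name__ == "__main__":
    ok = PriceProposal("SKU-1", current_price=10.0, proposed_price=11.0)
    bad = PriceProposal("SKU-1", current_price=10.0, proposed_price=100.0)
    print(check_proposal(ok))   # no violations
    print(check_proposal(bad))  # one violation: change too large
```

The checks are deliberately dumb. They don't know why the model proposed a number; they only know which numbers should never reach production without a person looking first.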

A note on platform engineering

A lot of this only works if the surface area underneath it is sane. Lambda + a queue + a state store will take you further than you think, but only if you've actually internalized the boring stuff: small functions, clear contracts, idempotent workers, alarms that tell you something useful.
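
For what "idempotent workers" means in practice, here is a small sketch, assuming every message carries a unique id and the state store can answer "have I already handled this?". InMemoryStateStore and handle_message are hypothetical names, and the dict stands in for whatever store you actually run. A production store would need an atomic conditional write to stay safe under concurrent deliveries, which this in-memory version does not attempt.

```python
from typing import Callable, Dict, Optional


class InMemoryStateStore:
    """Stand-in for a real state store (hypothetical interface, not thread-safe)."""

    def __init__(self) -> None:
        self._results: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._results.get(key)

    def put_if_absent(self, key: str, value: str) -> bool:
        """Store value only if key is new; return True if it was written."""
        if key in self._results:
            return False
        self._results[key] = value
        return True


def handle_message(message: dict,
                   store: InMemoryStateStore,
                   process: Callable[[dict], str]) -> str:
    """Process a message at most once per id; redeliveries return the saved result."""
    key = message["id"]
    cached = store.get(key)
    if cached is not None:
        return cached  # already handled: skip the side effects
    result = process(message)
    store.put_if_absent(key, result)
    return result


if __name__ == "__main__":
    calls = []

    def sync_inventory(msg: dict) -> str:
        calls.append(msg["id"])  # records each real execution
        return "synced:" + msg["sku"]

    store = InMemoryStateStore()
    msg = {"id": "evt-1", "sku": "A-100"}
    print(handle_message(msg, store, sync_inventory))
    print(handle_message(msg, store, sync_inventory))  # redelivery, no second run
    print(len(calls))
```

Queues generally redeliver, so the worker has to be the thing that makes a second delivery harmless.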

The reason I think the next wave of operational automation will live on platform-style infrastructure isn't ideology — it's that an LLM in a loop is just another worker, and platforms are how we already run workers at scale. The interesting work isn't picking the model. It's deciding what the model is allowed to do without asking.
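
"What the model is allowed to do without asking" can start as something as plain as an explicit policy table. The sketch below is one possible shape, not a prescription: the action names and the three-way allow / needs-approval / deny split are assumptions I'm making for illustration.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"                    # model may act on its own
    NEEDS_APPROVAL = "needs_approval"  # a person must confirm first
    DENY = "deny"                      # never executed on the model's request


# Hypothetical action names; a real system would define its own.
AUTO_ALLOWED = {"add_ticket_tag", "draft_vendor_reply"}
APPROVAL_REQUIRED = {"update_price", "issue_refund"}


def authorize(action: str) -> Decision:
    """Decide how a model-requested action should be handled."""
    if action in AUTO_ALLOWED:
        return Decision.ALLOW
    if action in APPROVAL_REQUIRED:
        return Decision.NEEDS_APPROVAL
    # Default-deny: anything not explicitly listed is refused.
    return Decision.DENY


if __name__ == "__main__":
    for action in ["add_ticket_tag", "update_price", "drop_table"]:
        print(action, "->", authorize(action).value)
```

The default-deny branch is the part I'd keep no matter what else changes: a new action the model invents should have to be added to a list by a person before it can run.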

Closing

I don't think there's a "before and after" moment here. There's a slow re-platforming of operations work onto a stack where AI is a primitive, not a feature. The teams that figure out how to fold that into their existing systems thinking are going to look very different in three years from the teams that don't.

I'll write more about specifics — patterns, anti-patterns, the things I've shipped that worked and the ones that didn't — as I go.

If you're working on any of this and want to compare notes, I'd love to hear from you.