Agentic systems are starting to operate the very platforms that host them. We share what we've learned running these systems in production.
For two decades, cloud infrastructure has been managed by a thin layer of humans wielding a thick layer of tooling. That balance is shifting.
Across our managed estate, we now run agentic systems that triage incidents, file change requests, and even author Terraform configurations, all under policy guardrails that we and our customers control. The result is not the replacement of operations engineers; it is a sharp increase in what each one can safely supervise.
This piece walks through the platform investments that make agentic operations safe at scale: deterministic policy, structured tool access, observable reasoning, and the human approval surfaces that turn an autonomous agent into a trustworthy colleague.