Agent authority in security operations
The interesting question isn't what an agent can do in security operations. It's what it should be permitted to do, and how you prove the permission held.
Most writing about AI in the SOC is about capability. Can the model triage an alert, write a detection, summarize an incident? The answer to all three is yes, and has been for a while, so the question is no longer very interesting. The interesting question is one of authority. When you let a model act inside the systems that protect a company, what exactly is it allowed to do, and how do you prove that the boundary held when something goes wrong at 3am?
I’ve spent the last couple of years building governed agentic systems for security work, and the framing I keep coming back to is one I’ve written about elsewhere as the Determinism Ladder: the goal isn’t to make AI behavior more probabilistic and clever, it’s to move each piece of behavior down into the most deterministic, auditable layer that can still do the job. In a SOC, that ladder turns into a set of permission tiers. Here’s how I’ve come to think about them, using Claude as the worked example because it’s the model I actually build on, though the pattern is model-agnostic.
Authoring detections: allowed, behind a gate. Letting a model draft a detection rule is one of the highest-leverage things you can do, and one of the safest, because a detection is inert until it ships. The agent drafts an ATT&CK-mapped rule from prior art, but the rule then has to clear a deterministic test harness (does it fire on the known-true samples, stay quiet on the known-false ones) before a human ever sees it. The model proposes; the harness disposes. The permission here is broad precisely because the verification is cheap and the blast radius before deployment is zero.
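A minimal sketch of that gate, assuming a detection can be modeled as a Python predicate over an event dictionary. The names `run_gate`, `GateResult`, `draft_rule`, and the sample events are hypothetical and only illustrate the shape of the check; a real harness would run the rule in its native query language against recorded telemetry.

```python
from dataclasses import dataclass
from typing import Callable

Event = dict
Rule = Callable[[Event], bool]


@dataclass(frozen=True)
class GateResult:
    passed: bool
    missed: list        # known-true samples the rule failed to fire on
    false_alarms: list  # known-false samples the rule fired on


def run_gate(rule: Rule, known_true: list, known_false: list) -> GateResult:
    """Deterministic gate: pass only if the rule fires on every known-true
    sample and stays quiet on every known-false one."""
    missed = [e for e in known_true if not rule(e)]
    false_alarms = [e for e in known_false if rule(e)]
    return GateResult(not missed and not false_alarms, missed, false_alarms)


def draft_rule(event: Event) -> bool:
    # Hypothetical model-drafted rule: encoded PowerShell (ATT&CK T1059.001).
    cmd = event.get("command_line", "").lower()
    return event.get("process") == "powershell.exe" and "-enc" in cmd


# Illustrative samples only, not real telemetry.
KNOWN_TRUE = [
    {"process": "powershell.exe", "command_line": "powershell.exe -enc SQBFAFgA"},
]
KNOWN_FALSE = [
    {"process": "powershell.exe", "command_line": "powershell.exe -File backup.ps1"},
    {"process": "cmd.exe", "command_line": "cmd /c dir"},
]

result = run_gate(draft_rule, KNOWN_TRUE, KNOWN_FALSE)
print("send to human review" if result.passed else f"rejected: {result}")
```

The point of the design is that the pass/fail decision contains no model call: the same rule and the same samples always produce the same verdict, so a rejected draft can be returned to the agent with the exact samples it got wrong.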
Investigation and enrichment: allowed, if every claim is cited. When an analyst asks “what do we have for this technique” or “what’s the context on this host,” a model answering from memory is worse than useless, because a confident wrong answer in an investigation costs more than no answer. So the permission comes with a hard condition: every claim has to carry a citation back to a real source, the knowledge graph or the log, or it doesn’t render. The model is allowed to reason over your evidence; it is not allowed to invent any. That single rule, citation-or-silence, is the difference between an assistant and a liability.
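Citation-or-silence can be sketched as a filter that sits between the model and the analyst's screen. This is an illustration under stated assumptions: the `Claim` type, the `render` function, and the evidence IDs such as `log:4821` are hypothetical, and the evidence store is a plain dictionary standing in for the knowledge graph and log index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    source_ids: tuple = ()  # IDs the model says support this claim


def render(claims: list, evidence_store: dict) -> list:
    """Return only the claims whose citations all resolve to real evidence.
    Uncited claims, and claims citing unknown IDs, are dropped silently."""
    rendered = []
    for claim in claims:
        if claim.source_ids and all(s in evidence_store for s in claim.source_ids):
            refs = ", ".join(claim.source_ids)
            rendered.append(f"{claim.text} [{refs}]")
    return rendered


# Hypothetical evidence store; keys are assumed to be stable record IDs.
EVIDENCE = {
    "log:4821": "powershell.exe -enc ... on web-01",
    "kg:host/web-01": "web-01, production web tier",
}

claims = [
    Claim("web-01 ran an encoded PowerShell command", ("log:4821",)),
    Claim("web-01 was probably compromised last week"),           # no citation
    Claim("the same user logged in from a new ASN", ("log:9999",)),  # ID does not exist
]

shown = render(claims, EVIDENCE)
print(shown)
```

This check only proves that each citation resolves to a real record. Whether the record actually supports the claim is a separate verification step, which the sketch does not attempt.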
Response and containment: allowed only on rails. This is where capability and authority diverge most sharply. A model that can suggest isolating a host is helpful. A model that can isolate a host on its own is a new and exciting way to take down production. So write-paths are the most tightly governed tier: the agent can execute response actions only along pre-approved, audited playbook rails, with the irreversible steps gated behind a human. The agent’s job is to do the toil-heavy enrichment and evidence capture instantly so the human decision is fast and well-informed, not to make the irreversible call itself.
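One way to picture the rails is an allow-list of playbook steps, each flagged as reversible or not, with a single execution function that refuses anything else. The sketch below is hypothetical throughout: the playbook name, the step names, and which steps are marked irreversible are assumptions made for illustration, and the function records the decision rather than calling any real EDR or SOAR API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    action: str
    irreversible: bool


# Hypothetical pre-approved playbook; the irreversible flags are assumptions.
PLAYBOOKS = {
    "suspected-malware-host": (
        Step("capture_process_list", irreversible=False),
        Step("snapshot_memory", irreversible=False),
        Step("isolate_host", irreversible=True),
    ),
}


class NotOnRails(Exception):
    """The requested action is not a step of the named playbook."""


class ApprovalRequired(Exception):
    """The step is irreversible and no human approver was supplied."""


def execute(playbook: str, action: str, audit: list,
            approved_by: Optional[str] = None) -> dict:
    steps = {s.action: s for s in PLAYBOOKS.get(playbook, ())}
    step = steps.get(action)
    if step is None:
        raise NotOnRails(f"{action!r} is not a step of {playbook!r}")
    if step.irreversible and approved_by is None:
        raise ApprovalRequired(f"{action!r} needs a human approver")
    record = {"playbook": playbook, "action": action, "approved_by": approved_by}
    audit.append(record)  # every executed step leaves an audit record
    return record


audit_trail = []
# Evidence capture runs without a human in the loop.
execute("suspected-malware-host", "capture_process_list", audit_trail)
try:
    # The agent may request isolation, but cannot perform it alone.
    execute("suspected-malware-host", "isolate_host", audit_trail)
except ApprovalRequired as err:
    print(f"waiting on analyst: {err}")
```

Because the agent's only write-path is `execute`, the question "could it have isolated a host on its own" is answered by reading a few lines of code, not by trusting a prompt.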
Underneath all three tiers sits the part that makes the whole thing improvable rather than merely safe: a deterministic runtime that logs every agent action as a cited, replayable event. This is the same idea I build into my agentic systems generally, where a failed verification feeds its own explanation back into the next attempt (the engine is public). In a SOC that audit trail does double duty. It’s your forensic record when you need to reconstruct what the agent did, and it’s your training signal: the actions that went wrong become the corrective examples that fix the next version, which is exactly the Discipline Patch idea applied to operations instead of a single model.
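The logging and feedback loop described here can be sketched in a few lines. This is not the public engine mentioned above, only an illustration under assumptions: `AuditLog`, `AgentEvent`, `run_with_feedback`, and the toy `attempt` and `verify` functions are all hypothetical names, and the log lives in memory instead of durable storage.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentEvent:
    seq: int
    action: str
    citations: tuple
    ok: bool
    explanation: str = ""


class AuditLog:
    """Append-only, in-memory log of agent actions."""

    def __init__(self):
        self._events = []

    def record(self, action, citations, ok, explanation=""):
        event = AgentEvent(len(self._events), action, tuple(citations), ok, explanation)
        self._events.append(event)
        return event

    def replay(self):
        # Stable JSON lines, in order, for forensic reconstruction.
        return [json.dumps(asdict(e), sort_keys=True) for e in self._events]

    def corrective_examples(self):
        # Failed actions, kept with the reason they failed.
        return [e for e in self._events if not e.ok]


def run_with_feedback(attempt, verify, log, max_attempts=3):
    """Retry loop: a failed verification's explanation feeds the next attempt."""
    feedback = ""
    for _ in range(max_attempts):
        action, citations = attempt(feedback)
        ok, explanation = verify(action)
        log.record(action, citations, ok, explanation)
        if ok:
            return action
        feedback = explanation
    return None


# Toy stand-ins for the agent and the deterministic verifier.
def attempt(feedback):
    action = "tag_host web-01"
    if "ticket" in feedback:
        action += " ticket=INC-1"
    return action, ("log:4821",)


def verify(action):
    if "ticket=" in action:
        return True, ""
    return False, "missing ticket reference"


log = AuditLog()
final = run_with_feedback(attempt, verify, log)
print(final)
print(len(log.corrective_examples()), "corrective example(s)")
```

The same list serves both purposes named above: `replay()` is the forensic record, and `corrective_examples()` is the raw material for fixing the next version.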
None of this slows the SOC down. It does the opposite. The reason teams hesitate to put agents on real security work isn’t that the models aren’t capable, it’s that unbounded autonomy is correctly seen as unsafe. Every model is two creatures under one name, the diligent clerk and the confident fabulist, and you never get to choose which one shows up to a given prompt. So the agents stay stuck in a sandbox doing demos. Permission tiers are how you build the house so the second creature can’t reach the door-handle, and they’re what gets agents out of the sandbox: you can hand one real, high-volume work the moment you can prove, deterministically, that it can’t cross the line that matters. Capability gets you the demo. Authority, made auditable, gets you production.
Axioms applied in this essay
This article tested 5 of the StoneyTECH engineering axioms. Each verdict is the result of applying that axiom in this specific argument.
- #2 Push work down toward determinism: held
The whole piece wears this axiom as a thesis: move each agent behavior down to the most deterministic, auditable layer still able to do the job.
- #7 Every escalation in code, not in backlogs: held
Response rails are escalation paths written as code and playbooks, not as intentions in a backlog.
- #11 Cite or be silent: held
The investigation tier is citation-or-silence made into a hard rendering condition.
- #13 Ship with the failure mode named: held
The close names the failure mode: the confident fabulist shows up to a prompt unannounced; the tiers assume it.
- #17 Threat-model the surface (assume adversarial input): held
The piece marks write-paths as the attack surface: most tightly governed tier, irreversible steps gated behind a human.
