Governing an Autonomous Detection & Response Platform

A reference architecture for putting AI agents to work in a security operations center — without surrendering determinism, auditability, or human control.

Case study

The constraint isn't skill — it's capacity at volume.

A capable detection team can research, author, test, and ship a detection for a critical zero-day within hours. That responsiveness is a strength. The problem is the steady-state volume underneath it: new TTPs, log sources, and compliance requirements arrive faster than any fixed-size team can work through them sequentially. Most detection work is not an emergency and can wait, but "can wait" accumulates into a widening gap between what a team could detect and what it actually detects. Frontier models make a new option viable: let governed agents handle the routine volume so that human experts spend their judgment where it matters most.

What I built.

A six-part agentic detection & response platform on Google Cloud, layered on top of a commercial security stack (SIEM / EDR / threat-intel) rather than replacing it:

A unified, queryable security knowledge graph — detections, threat intel, ATT&CK mappings, audit history, and playbooks in one substrate, every fact carrying provenance.
An automated threat-intelligence ingestion & enrichment pipeline — continuously normalizes feeds, advisories, and bulletins into the graph with severity, technique mapping, and IOC extraction.
An automated detection-authoring pipeline — agents draft MITRE ATT&CK-mapped rules from prior art, run an SPL / YARA-L test harness, and close coverage gaps.
A deterministic agent-orchestration runtime — every agent action passes a workflow state machine with gates, branch-scoped tests, and a cited audit trail.
A natural-language analyst interface — analysts query the knowledge graph in plain English; every query is authenticated, scope-enforced, and logged.
Governed response automation — incident enrichment and playbook execution along audited, permissioned write-paths.

The governance model — the Determinism Ladder in practice.

This is not "let an LLM try things." Every agent action is governed by a state machine with pre-commit verification, branch-scoped tests, and a cited audit trail that traces each decision to a source. Specialized agents generate, challenge, and verify one another's output in a generate → verify → refine loop. A judge model scores each pass, and the loop halts deterministically on convergence, which prevents both premature shipping and infinite rework. This is the principle behind the Determinism Ladder: move AI behavior out of probabilistic layers and into auditable authority layers. The outcome is agentic productivity without unbounded-autonomy risk, with a human in the loop by design.
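The generate → verify → refine loop can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the platform's implementation: the function name `refine_loop`, the halting thresholds, and the stand-in agents are all hypothetical, and the judge here is a plain deterministic function where the real system uses a judge model.

```python
from typing import Callable, List, Tuple

REQUIRED = ("technique", "query", "test_case")  # hypothetical rule fields


def refine_loop(
    generate: Callable[[], dict],
    verify: Callable[[dict], List[str]],
    refine: Callable[[dict, List[str]], dict],
    judge: Callable[[dict, List[str]], float],
    threshold: float = 0.9,   # assumed: score at which a draft is good enough
    min_gain: float = 0.01,   # assumed: smallest improvement that counts as progress
    max_passes: int = 5,      # assumed: hard budget that rules out infinite rework
) -> Tuple[dict, List[float], str]:
    """Run generate -> verify -> refine until one of three explicit halting rules fires."""
    draft = generate()
    scores: List[float] = []
    for pass_no in range(1, max_passes + 1):
        findings = verify(draft)
        score = judge(draft, findings)
        scores.append(score)
        if score >= threshold:
            return draft, scores, "converged"
        if len(scores) >= 2 and score - scores[-2] < min_gain:
            return draft, scores, "stalled"
        if pass_no < max_passes:
            draft = refine(draft, findings)
    return draft, scores, "budget_exhausted"


# Deterministic stand-ins for the agents, for illustration only.
def generate() -> dict:
    return {"query": "index=auth action=failure"}


def verify(draft: dict) -> List[str]:
    return [name for name in REQUIRED if name not in draft]


def refine(draft: dict, findings: List[str]) -> dict:
    # Fill in one missing field per pass.
    return {**draft, findings[0]: "filled"} if findings else draft


def judge(draft: dict, findings: List[str]) -> float:
    return 1 - len(findings) / len(REQUIRED)


final, scores, reason = refine_loop(generate, verify, refine, judge)
print(reason, len(scores))  # converged 3
```

Because every exit path returns a named reason, the halt decision itself can be written to the audit trail alongside the scores that produced it.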

Figure: Determinism Ladder

Measured results

Intel to draft detection: days → minutes
Cross-format rule translation: hours → ~90 seconds
ATT&CK technique lookup: ~45 minutes → under 10 seconds
Coverage-gap analysis: multi-day, manual → continuous / on-demand
Prototype build velocity: ~250K LOC · 1,186 CI-green PRs · 98 days