Chapter 24 — Security: Defense in Depth for Agent Systems #

"Security is always excessive until it's not enough." -- Robbie Sinclair


📖 20 min read | 👤 David (VP Data), Sarah (MLOps), Dr. Chen (Researcher) | 🏷️ Part VI: Orchestration

What you'll learn:

- Why overprivileged agents are the security equivalent of microservices running as root
- The six-layer defense-in-depth model: authentication, authorization, data classification, guards, audit trail, and budget enforcement
- How RACI assignments gate agent permissions by phase
- Why zero-trust communication between agents prevents lateral movement


The Problem: The Overprivileged Agent #

David, the VP of Data, is reviewing the architecture for a new customer analytics platform. The proposal looks solid: five agents coordinated by a DIO, producing churn predictions from customer data.

Then he asks the question that nobody wants to hear: "What stops the Data Scientist agent from reading PII columns it does not need? What stops the MLOps agent from deploying a model without quality gate approval? What stops any agent from spending the entire budget on a single LLM call?"

The room goes quiet. In most agent frameworks, the answer is: nothing. Agents have whatever permissions the LLM provider grants them. There is no access control, no privilege separation, no budget isolation.

This is the overprivileged agent problem, and it is the security equivalent of running every microservice as root.


The 6-Layer Security Model #

The Neam agent stack implements defense in depth through six security layers. Each layer is independent -- a breach at one layer does not compromise the others.

6-Layer Security Model

| Layer | Name | Components |
| --- | --- | --- |
| Layer 6 | Budget Enforcement | Per-agent cost limits, per-agent token limits, DIO-level aggregate budget |
| Layer 5 | Audit Trail | Every agent action logged, RACI assignment recorded, tamper-evident (append-only) |
| Layer 4 | Guards & Guardchains | Input validation (before LLM call), output validation (after LLM response), content filtering (PII, prompt injection) |
| Layer 3 | Data Classification | Column-level PII tagging, schema-level access policies, row-level filtering by agent role |
| Layer 2 | Authorization | Agent capability declarations, least-privilege access to data sources, phase-gated permissions (R/C/I only) |
| Layer 1 | Authentication | Agent identity verification, provider API key management, infrastructure credential isolation |

Layer 1: Authentication #

Every agent in the system has a verified identity. No anonymous agents.

FLOW Agent Identity Chain
flowchart LR
  A["Agent Declaration\nname: ChurnDS\ntype: datascientist"] --> B["Provider Credentials\nOPENAI_API_KEY\n(env var, never in code)"]
  B --> C["Infrastructure Credentials\nSIMSHOP_PG_URL\n(env var, never in code)"]

Security properties:

- Every agent declares a verified identity (name and type); anonymous agents are rejected
- Provider API keys and infrastructure credentials are loaded from environment variables, never from source code
- Provider credentials and infrastructure credentials are isolated from each other, so compromising one does not expose the other

NEAM
// Credential isolation in Neam
sql_connection SimShopDB {
    platform: "postgres",
    connection: env("SIMSHOP_PG_URL"),  // from environment, not source code
    database: "simshop"
}

agent ChurnDS {
    provider: "openai",
    model: "gpt-4o",
    budget: AgentBudget  // isolated budget, not shared
}

⚠️ Never commit API keys or database credentials to source control. Neam's env() function reads from environment variables at runtime. The DataSims setup uses Docker environment variables for all credentials -- see docker/docker-compose.yml.
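Neam's `env()` resolves credentials at runtime. The same fail-fast pattern can be sketched in Python -- this is an illustration of the principle, not Neam's actual implementation:

```python
import os

def env(name: str) -> str:
    """Read a credential from the environment; fail fast if it is missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"required credential {name} is not set")
    return value

# Simulate the Docker-provided environment for illustration.
os.environ["SIMSHOP_PG_URL"] = "postgres://simshop:secret@db:5432/simshop"
connection_url = env("SIMSHOP_PG_URL")  # resolved at runtime, never hard-coded
```

Failing fast on a missing variable surfaces misconfiguration at startup instead of mid-run, when an agent may already have spent budget.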


Layer 2: Authorization #

Authorization in the Neam agent stack follows the principle of least privilege: each agent can only access the data and operations required for its role.

| Capability | DIO | BA | DS | Causal | Test | MLOps |
| --- | --- | --- | --- | --- | --- | --- |
| Read OLTP tables | - | R | R | R | R | - |
| Read warehouse tables | - | R | R | R | R | R |
| Write feature tables | - | - | W | - | - | - |
| Write prediction tables | - | - | - | - | - | W |
| Read PII columns | - | R | - | - | - | - |
| Register models | - | - | W | - | - | W |
| Deploy to production | - | - | - | - | - | W |
| Manage budgets | W | - | - | - | - | - |
| Dispatch agents | W | - | - | - | - | - |

Key enforcement rules:

- Only the DIO can manage budgets and dispatch agents
- Only the BA can read PII columns
- Only the DS can write feature tables; only MLOps can write prediction tables or deploy to production
- Anything not explicitly granted is denied
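A minimal sketch of how the capability matrix could be enforced in code -- the role names and capability strings here are illustrative, not Neam's actual identifiers:

```python
# Deny-by-default capability sets, transcribed from the matrix above.
CAPABILITIES = {
    "DIO":    {"manage_budgets", "dispatch_agents"},
    "BA":     {"read_oltp", "read_warehouse", "read_pii"},
    "DS":     {"read_oltp", "read_warehouse", "write_features", "register_models"},
    "Causal": {"read_oltp", "read_warehouse"},
    "Test":   {"read_oltp", "read_warehouse"},
    "MLOps":  {"read_warehouse", "write_predictions", "register_models",
               "deploy_production"},
}

def authorize(role: str, capability: str) -> bool:
    """Least privilege: anything not explicitly granted is denied."""
    return capability in CAPABILITIES.get(role, set())
```

Note that an unknown role gets the empty set -- the deny-by-default posture extends to misconfigured or unregistered agents.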

Phase-Gated Permissions #

Permissions are further scoped by the RACI assignment. An agent's permissions are active only during phases where it is R or C:

| Phase | RACI Role | ChurnDS Permissions |
| --- | --- | --- |
| Phase 1 (Requirements) | C | Read-only access to BRD |
| Phase 2 (Features) | R | Read warehouse, write features |
| Phase 3 (Model) | R | Read features, write model |
| Phase 4 (Causal) | C | Read-only access to model outputs |
| Phase 5 (Testing) | C | Read-only access to test results |
| Phase 6 (Deployment) | I | No data access (informed only) |
| Phase 7 (Monitoring) | I | No data access (informed only) |
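The phase-gating rule reduces to a small lookup: R grants read-write, C grants read-only, I grants nothing. A sketch, with phase names abbreviated for illustration:

```python
# RACI schedule for ChurnDS, from the table above (phase names abbreviated).
CHURNDS_RACI = {
    "requirements": "C", "features": "R", "model": "R",
    "causal": "C", "testing": "C", "deployment": "I", "monitoring": "I",
}

def data_access(phase: str) -> str:
    """Permissions are active only in phases where the agent is R or C."""
    role = CHURNDS_RACI.get(phase, "I")  # unknown phase -> informed only
    if role == "R":
        return "read_write"
    if role == "C":
        return "read_only"
    return "none"
```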

Layer 3: Data Classification #

Data classification tags every column in the data catalog with a sensitivity level:

Classification Levels

| Level | Description |
| --- | --- |
| PUBLIC | Aggregated metrics, product catalog |
| INTERNAL | Transaction summaries, model scores |
| CONFIDENTIAL | Customer names, email addresses |
| RESTRICTED | Financial data, health records (if applicable) |

In the SimShop environment, PII columns are explicitly declared:

NEAM
infrastructure_profile SimShopInfra {
    governance: {
        regulations: ["GDPR"],
        pii_columns: [
            "email",
            "phone",
            "date_of_birth",
            "first_name",
            "last_name"
        ]
    }
}

When an agent attempts to access a PII column:

PII Access Attempt Example

Agent: ChurnDS

Query: SELECT email, days_since_last_order FROM churn_features

Classification: "email" is CONFIDENTIAL (PII)

ChurnDS permission: No PII access

Result: Query BLOCKED

Audit log: "ChurnDS attempted to access PII column 'email'. Access denied per governance policy. Agent has no PII grant."

Alternative: Agent receives hashed/anonymized version
SELECT hash(email) as customer_hash, days_since_last_order FROM churn_features
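The blocking behavior above can be sketched as a query-layer check. This is a simplified illustration (columns are assumed already parsed out of the query), not SimShop's actual enforcement code:

```python
# PII columns as declared in the SimShopInfra governance block.
PII_COLUMNS = {"email", "phone", "date_of_birth", "first_name", "last_name"}

def check_columns(agent: str, columns: list[str], pii_grant: bool = False):
    """Classification check at the query layer, before execution."""
    pii_requested = [c for c in columns if c in PII_COLUMNS]
    if pii_requested and not pii_grant:
        audit = (f"{agent} attempted to access PII column(s) "
                 f"{pii_requested}. Access denied per governance policy.")
        return "BLOCKED", audit
    return "ALLOWED", None
```

Because the check runs on the parsed query rather than on the agent's stated intent, a prompt-injected agent gains nothing by lying about which columns it wants.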

💡 Data classification is enforced at the query layer, not the application layer. Even if an agent constructs a SQL query that references a PII column, the query execution layer blocks access. This prevents prompt injection attacks where an adversarial prompt tricks an agent into requesting unauthorized data.


Layer 4: Guards and Guardchains #

Guards are content-level security filters that inspect LLM inputs and outputs. Guardchains compose multiple guards into an ordered pipeline.

FLOW Guardchain Pipeline
flowchart TD
  A["User Input"] --> B["Guard 1: Injection Detection\nBlock prompt injection attempts\n('ignore previous instructions...')"]
  B --> C["Guard 2: PII Scrubber\nRedact PII from prompts before\nsending to LLM provider"]
  C --> D["Guard 3: Output Validator\nVerify output doesn't contain\nhallucinated data or PII leakage"]
  D --> E["Validated Output"]

In Neam, guards are declared as part of the agent definition:

NEAM
guard PIIFilter {
    on: "output",
    action: "redact",
    patterns: ["\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b",
               "\\b\\d{3}[-.]?\\d{3}[-.]?\\d{4}\\b"],
    replacement: "[REDACTED]"
}

guard InjectionDetector {
    on: "input",
    action: "block",
    patterns: ["ignore previous", "disregard instructions",
               "system prompt", "you are now"]
}

agent ChurnDS {
    provider: "openai", model: "gpt-4o",
    guardchains: [InjectionDetector, PIIFilter],
    budget: AgentBudget
}
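The guard semantics -- regex matching with a block or redact action, composed in declared order -- can be sketched in Python. Class and function names here are illustrative, not Neam's runtime API:

```python
import re

class Guard:
    """A single content filter: block or redact on regex match."""
    def __init__(self, name, on, action, patterns, replacement="[REDACTED]"):
        self.name, self.on, self.action = name, on, action
        self.patterns = [re.compile(p, re.IGNORECASE) for p in patterns]
        self.replacement = replacement

    def apply(self, text: str) -> str:
        for pattern in self.patterns:
            if pattern.search(text):
                if self.action == "block":
                    raise PermissionError(f"{self.name}: content blocked")
                text = pattern.sub(self.replacement, text)
        return text

def run_guardchain(guards, direction: str, text: str) -> str:
    """Apply guards in declared order; a 'block' aborts the whole call."""
    for guard in guards:
        if guard.on == direction:
            text = guard.apply(text)
    return text

injection = Guard("InjectionDetector", "input", "block",
                  ["ignore previous", "disregard instructions"])
pii = Guard("PIIFilter", "output", "redact",
            [r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"])
```

Raising on a block (rather than returning a sentinel) guarantees that a blocked input can never silently continue down the pipeline.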

Guard Types #

| Guard Type | Direction | Purpose |
| --- | --- | --- |
| Injection detection | Input | Block prompt injection attempts |
| PII scrubber | Input + Output | Remove PII before sending to LLM |
| Output validator | Output | Verify response format and content |
| Token limiter | Input | Prevent excessively large prompts |
| Content classifier | Output | Flag potentially harmful content |
| Schema validator | Output | Ensure structured output matches expected schema |

Layer 5: Audit Trail #

Every action in the system is logged in an append-only audit trail. This is the RACI audit trail from Chapter 21, extended with security events:

JSON
{
  "timestamp": "2026-03-15T06:12:33Z",
  "event_type": "agent_action",
  "agent": "ChurnDS",
  "action": "query_execution",
  "target": "ml_features.churn_features",
  "columns_accessed": ["days_since_last_order", "support_tickets_30d",
                        "login_trend_30d", "spend_trend_30d"],
  "pii_columns_requested": [],
  "authorization": "granted",
  "raci_role": "R",
  "phase": "feature_engineering",
  "budget_before": 48.70,
  "budget_after": 47.30,
  "guard_results": {
    "InjectionDetector": "pass",
    "PIIFilter": "pass (no PII in output)"
  }
}
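One way to make an append-only log tamper-evident is a hash chain: each entry embeds the hash of the previous entry, so any retroactive edit breaks verification. This is an illustrative design under that assumption, not Neam's actual storage format:

```python
import hashlib
import json

class AuditTrail:
    """Append-only, tamper-evident log: each entry chains to the previous."""
    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []
        self._last_hash = self.GENESIS

    def record(self, **event) -> None:
        event["prev_hash"] = self._last_hash
        line = json.dumps(event, sort_keys=True)
        self._last_hash = hashlib.sha256(line.encode()).hexdigest()
        self._entries.append(line)

    def verify(self) -> bool:
        """Recompute the chain; any edited or deleted entry breaks it."""
        prev = self.GENESIS
        for line in self._entries:
            if json.loads(line)["prev_hash"] != prev:
                return False
            prev = hashlib.sha256(line.encode()).hexdigest()
        return True
```

The log exposes `record` and `verify` but no update or delete operation -- append-only by construction, not by convention.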

Security-relevant audit events:

| Event Type | Logged Information |
| --- | --- |
| agent_action | Who, what, when, authorization |
| access_denied | Agent, resource, reason for denial |
| guard_triggered | Guard name, action taken, content |
| budget_event | Agent, amount, remaining balance |
| escalation | Error details, recommended actions |
| provider_failover | From provider, to provider, reason |
| quality_gate | Phase, result, checks performed |
| pii_access_attempt | Agent, column, authorization result |

🎯 The audit trail is the foundation for compliance. GDPR Article 30 requires records of processing activities. SOC 2 requires access logs. The Neam audit trail satisfies both -- every data access by every agent is recorded with full context.


Layer 6: Budget Enforcement #

Budget enforcement is the final security layer. It prevents runaway costs and serves as a resource isolation mechanism:

Budget Enforcement

DIO Budget: $500.00

| Agent | Allocation |
| --- | --- |
| ChurnBA | $50.00 (cost ceiling) |
| ChurnDS | $50.00 (cost ceiling) |
| ChurnCausal | $50.00 (cost ceiling) |
| ChurnTester | $50.00 (cost ceiling) |
| ChurnMLOps | $50.00 (cost ceiling) |
| DIO reserve | $250.00 (for retries, reallocation) |

Enforcement rules:

  1. Agent CANNOT exceed its allocated budget
  2. Agent CANNOT borrow from another agent's budget
  3. Only DIO can reallocate budget between agents
  4. Total spend CANNOT exceed DIO budget ($500.00)
  5. Budget check happens BEFORE every LLM call
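The enforcement rules above can be sketched as a per-agent ledger that rejects a call before it happens rather than overdrawing after. Class names are illustrative:

```python
class BudgetExceeded(Exception):
    pass

class AgentBudget:
    """Per-agent cost ceiling; no borrowing between agents."""
    def __init__(self, ceiling: float):
        self.ceiling = ceiling
        self.spent = 0.0

    def charge(self, estimated_cost: float) -> None:
        """Checked BEFORE the LLM call: reject, don't overdraw."""
        if self.spent + estimated_cost > self.ceiling:
            raise BudgetExceeded(
                f"${estimated_cost:.2f} call would exceed "
                f"${self.ceiling:.2f} ceiling")
        self.spent += estimated_cost

    @property
    def remaining(self) -> float:
        return self.ceiling - self.spent

churn_ds = AgentBudget(50.00)   # allocation from the table above
churn_ds.charge(1.40)           # cost of a single LLM call
```

Because the check precedes the call, a rejected charge leaves the balance untouched -- there is no partial spend to reconcile.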

Budget enforcement prevents several attack vectors:

| Attack Vector | How Budget Stops It |
| --- | --- |
| Infinite loop agent | Budget exhaustion triggers graceful degradation |
| Denial of service | Agent cannot consume other agents' budget |
| Runaway cost | Hard ceiling on total spend |
| Resource hogging | Per-agent limits prevent monopolization |

In the DataSims experiments, the full system completed the churn prediction lifecycle for $23.50 against a $500.00 budget. The budget ceiling prevented any single experiment from consuming excessive resources.


Zero-Trust Between Agents #

A critical design principle: agents do not trust each other. Every inter-agent communication is validated:

FLOW Traditional vs Zero-Trust Model
flowchart TD
  subgraph Traditional["Traditional Trust Model"]
    direction LR
    T1["Agent A"] -- "trusted" --> T2["Agent B"] -- "trusted" --> T3["Agent C"]
  end

  subgraph ZeroTrust["Zero-Trust Model"]
    direction TB
    Z1["Agent A"] -- "validated" --> Z_DIO["DIO"]
    Z_DIO -- "validated" --> Z2["Agent B"]
    Z1 -. "auth check\ncontent guard\nbudget check" .-> Z1
    Z_DIO -. "auth check\ncontent guard\nbudget check" .-> Z_DIO
  end

In the zero-trust model:

- All inter-agent communication is routed through the DIO; agents never talk to each other directly
- Every message passes an authentication check, a content guard, and a budget check
- No agent inherits trust from a prior interaction; each request is validated independently

This prevents lateral movement -- if an agent is compromised (e.g., by prompt injection), it cannot escalate privileges or access resources beyond its authorization.


The Security Stack in Practice #

Here is how all six layers work together when the ChurnDS agent processes a feature engineering task:

Security Stack in Action: ChurnDS Feature Engineering
  1. AUTHENTICATION
    ChurnDS identity verified, API key loaded from env
  2. AUTHORIZATION
    Phase: feature_engineering, RACI role: R
    Permissions: Read warehouse tables, write feature tables
    PII access: DENIED
  3. DATA CLASSIFICATION
    Query: SELECT ... FROM simshop_dw.fact_orders
    Classification check: No PII columns in query → ALLOWED
  4. GUARDS
    Input guard: No injection patterns detected → PASS
    Output guard: No PII in LLM response → PASS
  5. AUDIT TRAIL
    Action logged: agent=ChurnDS, action=query, target=fact_orders, authorization=granted, budget_before=$48.70, budget_after=$47.30
  6. BUDGET ENFORCEMENT
    LLM call cost: $1.40, remaining budget: $47.30 → ALLOWED

All six layers execute in milliseconds. The security overhead is negligible compared to the LLM call latency.
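The walkthrough above can be sketched as a fail-fast pipeline: each layer is a predicate, and the first failing layer denies the call. The context field names are illustrative, not Neam's runtime schema:

```python
def secure_call(context: dict):
    """Fail-fast pass through the layers; the first failure denies the call.
    (Layer 5, the audit trail, records the outcome in either case.)"""
    checks = [
        ("authentication", context["identity_verified"]),
        ("authorization", context["raci_role"] in {"R", "C"}),
        ("classification", not context["pii_columns_requested"]),
        ("guards", context["guards_passed"]),
        ("budget", context["budget_remaining"] >= context["estimated_cost"]),
    ]
    for layer, ok in checks:
        if not ok:
            return "DENIED", layer
    return "ALLOWED", None
```

Returning the name of the failing layer is what lets the audit trail record *why* a call was denied, not just that it was.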


Key Takeaways #

- Most agent frameworks run every agent overprivileged -- the equivalent of running every microservice as root
- Defense in depth means six independent layers: authentication, authorization, data classification, guards, audit trail, and budget enforcement
- Permissions are phase-gated by RACI role: an agent's access is active only while it is Responsible or Consulted
- Data classification is enforced at the query layer, so even a prompt-injected agent cannot read PII it was never granted
- The append-only audit trail doubles as compliance evidence for GDPR Article 30 and SOC 2
- Budget enforcement is a security control, not just cost control: it stops runaway loops, denial of service, and resource hogging
- Zero-trust routing through the DIO prevents lateral movement between compromised agents

For Further Exploration #