What Two Decades in Security Taught Me About Securing AI Agents
I've been doing security for over twenty years. Started in the Marines, where you learn fast that systems either work under pressure or they don't. There's no "mostly secure." After the Corps, I spent the next two decades building and breaking enterprise infrastructure, cloud architectures, and distributed systems. Firewalls, IAM policies, network segmentation, incident response. The full stack of keeping things from falling apart.
Now I run 60+ AI agents in production across hybrid infrastructure. And here's what I keep telling people: almost nothing about securing agents is actually new. The principles are the same ones I've been applying since my first SOC rotation. Identity. Least privilege. Blast radius. Zero trust. The actors changed. The playbook didn't.
Identity Management Didn't Change. The Actor Did.
Twenty years ago, I was writing policies about which users could access which systems. Ten years ago, I was writing policies about which services could call which APIs. Today, I'm writing policies about which agents can use which tools.
It's the same problem. Who is this entity? What should it be allowed to do? How do we verify it's actually who it claims to be? How do we revoke access when something goes wrong?
The difference is that agents are a new kind of identity. They're not users with passwords. They're not services with static API keys and predictable behavior. They're non-deterministic actors that decide at runtime what to do based on context. That makes identity management harder, but it doesn't make it different in kind.
Every agent in my systems has an identity. Not just a name, but a defined scope, a set of permitted tools, and a trust level. When Agent A talks to Agent B, there's an identity assertion involved. This isn't theoretical. It's implemented. Because if you can't answer "who did this and were they allowed to?" you don't have security. You have hope.
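As a rough sketch of what an identity assertion between agents can look like, here is a minimal HMAC-signed request check. All names here (the agent IDs, the secret store, the field layout) are hypothetical illustrations, not a description of any particular production system; a real deployment would use a secrets manager and short-lived credentials rather than static keys.

```python
import hashlib
import hmac
import json

# Hypothetical identity registry: each agent has a scope, a set of
# permitted tools, and a trust level -- not just a name.
IDENTITIES = {
    "research-agent": {
        "scope": "research",
        "tools": {"web_search", "summarize"},
        "trust_level": "low",
    },
}

# In practice these live in a secrets manager, never in code.
SECRETS = {"research-agent": b"demo-secret"}

def sign_request(agent_id: str, payload: dict) -> dict:
    """Agent A attaches a verifiable identity assertion to its request."""
    body = json.dumps(payload, sort_keys=True).encode()
    sig = hmac.new(SECRETS[agent_id], body, hashlib.sha256).hexdigest()
    return {"agent_id": agent_id, "payload": payload, "sig": sig}

def verify_request(msg: dict) -> bool:
    """Agent B answers: is this actually who it claims to be?"""
    body = json.dumps(msg["payload"], sort_keys=True).encode()
    expected = hmac.new(SECRETS[msg["agent_id"]], body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, msg["sig"])

msg = sign_request("research-agent", {"tool": "web_search", "query": "CVE triage"})
assert verify_request(msg)            # authentic request passes
msg["payload"]["tool"] = "delete_db"  # tampered request fails
assert not verify_request(msg)
```

The point of the sketch is the shape of the answer, not the crypto: every inter-agent message carries an assertion you can verify and revoke, so "who did this and were they allowed to?" always has an answer.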
Least Privilege Is Least Privilege
The principle of least privilege is older than most people reading this. Give an entity only the permissions it needs to do its job. Nothing more. Revoke them when they're no longer needed.
This applies to agents exactly the way it applies to IAM roles, service accounts, and user permissions. Probably more so, because an agent with excess permissions is an agent that can cause excess damage when (not if) something goes sideways.
I see teams giving their agents admin-level tool access because "it needs flexibility." That's the same argument developers made about running services as root in 2008. We know how that turned out. If your agent needs to read a database, give it read access to the specific tables it needs. Not a connection string with write privileges to the whole cluster.
Where this gets tricky with agents is that tool usage is contextual. A traditional service either calls an endpoint or it doesn't. It's in the code. An agent might call any of its available tools depending on the input. You can't predict which tools it will use on any given run. So you need to design the permission boundary for the worst case, not the common case.
That's a new wrinkle on an old principle. The principle is the same. The implementation requires more thought.
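One way to implement that worst-case boundary is to enforce the allowlist at call time rather than trying to predict which tools the agent will pick. The agent and tool names below are hypothetical; the pattern is the point.

```python
# Hypothetical allowlist keyed by agent identity. Because the agent may
# attempt any of its tools on any run, permission is checked on every
# call -- the boundary holds regardless of what the model decides.
PERMITTED_TOOLS = {
    "report-agent": {"read_orders_table", "render_pdf"},  # read-only, no cluster-wide access
}

class ToolPermissionError(Exception):
    pass

def call_tool(agent_id: str, tool: str) -> str:
    allowed = PERMITTED_TOOLS.get(agent_id, set())  # default deny
    if tool not in allowed:
        raise ToolPermissionError(f"{agent_id} may not call {tool}")
    return f"{tool} executed"  # dispatch to the real tool implementation here

print(call_tool("report-agent", "read_orders_table"))
try:
    call_tool("report-agent", "drop_table")
except ToolPermissionError as e:
    print("denied:", e)
```

Note the default: an agent with no entry gets an empty set, not implicit access. That is the "worst case, not the common case" design in one line.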
Blast Radius Containment: Same Game, Different Board
In traditional infrastructure security, we think about blast radius constantly. If this server gets compromised, what can the attacker reach? If this service account leaks, what's the worst-case impact? You segment networks. You isolate workloads. You design so that a breach in one zone doesn't cascade into another.
Agent architectures need the same discipline. If Agent A gets compromised through prompt injection, what's the blast radius? Can it reach Agent B? Can it access Agent C's tools? Can it exfiltrate data through Agent D's email capability?
In my systems, every agent operates in a defined blast radius. The research agents can't touch production data. The production agents can't initiate external communications. The agents that handle sensitive data run on local infrastructure, isolated from the cloud-deployed agents. A compromised research agent is a contained incident, not a cascading failure.
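The zoning described above can be expressed declaratively, the same way you would write segmentation rules for a network. This is an illustrative sketch with made-up agent and zone names, not a description of any specific deployment.

```python
# Hypothetical zone map: every agent belongs to exactly one zone, and
# only explicitly listed zone-to-zone edges are permitted. Everything
# else is denied by default, so a compromised agent stays contained.
AGENT_ZONE = {
    "collector": "research",
    "analyst": "research",
    "deployer": "production",
    "pii-handler": "local-isolated",
}

ALLOWED_EDGES = {
    ("research", "research"),      # research agents may talk to each other
    ("production", "production"),  # production agents likewise
    # no edge from research to production, none at all into local-isolated
}

def may_communicate(src: str, dst: str) -> bool:
    return (AGENT_ZONE[src], AGENT_ZONE[dst]) in ALLOWED_EDGES

assert may_communicate("collector", "analyst")
assert not may_communicate("collector", "deployer")   # research can't touch production
assert not may_communicate("deployer", "pii-handler") # nothing reaches the isolated zone
```

If the collector is compromised through prompt injection, the blast radius is the research zone, full stop. Same discipline as a DMZ, different board.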
This isn't innovative. It's network segmentation applied to agent architecture. The same principle that kept a compromised DMZ server from reaching your internal database in 2005 keeps a compromised collection agent from reaching your analytical outputs in 2026.
Zero Trust Isn't a Buzzword Here. It's a Survival Requirement.
Zero trust started as a network architecture concept. Don't trust anything inside or outside the perimeter. Verify everything. Assume breach.
For agent systems, zero trust is more relevant than it ever was for traditional networks. Because agents are non-deterministic. You literally cannot predict with certainty what an agent will do next. The model might hallucinate. The context might be poisoned. The upstream agent might have been compromised. You have to assume that any agent, at any time, could produce output that's wrong or malicious.
That means every inter-agent communication gets validated. Every tool call gets checked against the agent's permissions. Every output gets verified before it triggers a downstream action. No implicit trust. No "well, Agent A is our agent, so its output must be fine."
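A minimal sketch of that "no implicit trust" checkpoint might look like the following, where every output is structurally validated and the sender's authorization is re-checked before anything downstream fires. The action names and schema are hypothetical.

```python
# Hypothetical zero-trust gate: downstream actions only fire if the
# sender's output passes both a structural check and an authorization
# check -- even though the sender is "our" agent.
ALLOWED_ACTIONS = {"research-agent": {"enqueue_summary"}}

def validate_output(sender: str, output: dict) -> bool:
    # 1. Structural check: only the expected keys, with expected types.
    #    A hallucinated or poisoned payload fails closed here.
    if set(output) != {"action", "data"} or not isinstance(output["data"], str):
        return False
    # 2. Authorization check: is this sender allowed to request this action?
    return output["action"] in ALLOWED_ACTIONS.get(sender, set())

assert validate_output("research-agent", {"action": "enqueue_summary", "data": "ok"})
# A compromised upstream agent trying to pivot into email fails:
assert not validate_output("research-agent", {"action": "send_email", "data": "x"})
```

Two checks, both boring, both mandatory. "Agent A is our agent, so its output must be fine" never appears in the code path.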
I've watched organizations spend millions on zero trust network architecture and then deploy agent systems where every agent blindly trusts every other agent's output. The cognitive dissonance is staggering.
Where the Analogy Breaks
I'd be dishonest if I said traditional security maps perfectly to agent security. It doesn't. There are real differences that require new thinking.
Non-determinism. Traditional services are deterministic. Same input, same output. You can test them exhaustively. Agents are probabilistic. The same input might produce different outputs on different runs. This means your security controls can't rely on behavioral prediction. They have to work regardless of what the agent decides to do.
Context-dependent behavior. An agent's behavior changes based on its context window. The same agent with different context might make completely different tool calls. This means security controls need to account for context manipulation as an attack vector. An attacker doesn't need to compromise the agent. They just need to change what it sees.
Natural language as an attack surface. Traditional systems have well-defined input formats. SQL. HTTP. JSON. You can validate inputs against schemas. Agent inputs are natural language. The boundary between "legitimate instruction" and "injected instruction" is fuzzy by nature. This is genuinely new, and it requires new defensive patterns.
Emergent behavior in multi-agent systems. When you have dozens of agents interacting, you get emergent behaviors that nobody designed. Agent A's output, combined with Agent B's context, triggers an action in Agent C that none of them were individually programmed to take. Testing individual agents isn't enough. You have to test the system.
Why Security Veterans Are Better Positioned for This
There's a common assumption that AI agent security is an ML problem, so ML researchers should lead it. I disagree.
ML researchers understand the model layer. They can tell you about gradient attacks, adversarial examples, and training data poisoning. That's valuable. But agent security is 80% infrastructure security and 20% ML security. The big risks aren't in the weights. They're in the permissions, the network paths, the trust boundaries, the data flows.
An ML researcher can tell you that a prompt injection is possible. A security architect can tell you that the prompt injection gives the attacker access to a tool that can query your customer database, exfiltrate the results through an email tool, and cover its tracks by modifying the audit log through a logging agent that trusts the compromised agent's output.
That's an attack chain. Security people think in attack chains. We've been doing it for decades. The fact that one link in the chain is a language model instead of a buffer overflow doesn't change how you analyze it.
STRIDE still works. Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege. Apply it to every agent interaction, every tool permission, every data flow. The methodology maps cleanly because agent systems are, at their core, distributed systems with non-human actors.
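To make that concrete, the STRIDE pass over an agent interaction can be written down as a reusable checklist. The questions below are one possible agent-flavored phrasing, not a canonical list.

```python
# STRIDE as a per-interaction checklist. Walking every agent interaction,
# tool permission, and data flow through these six questions is the same
# exercise as threat-modeling a microservice call.
STRIDE = {
    "Spoofing": "Can a caller impersonate this agent's identity?",
    "Tampering": "Can context or tool output be modified in transit?",
    "Repudiation": "Is every tool call attributable in an audit log?",
    "Information Disclosure": "Can output leak data beyond the agent's scope?",
    "Denial of Service": "Can crafted input exhaust the agent's tool budget?",
    "Elevation of Privilege": "Can injected instructions reach tools outside the allowlist?",
}

def review(interaction: str) -> list:
    """Emit the checklist for one agent interaction, e.g. an agent-to-tool edge."""
    return [f"[{interaction}] {threat}: {q}" for threat, q in STRIDE.items()]

for line in review("research-agent -> summarizer tool"):
    print(line)
```

Nothing here is novel, which is the argument: the methodology maps because the system under review is a distributed system.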
The Implementation Is New. The Thinking Isn't.
If I could tell every CISO one thing about AI agent security, it would be this: you already know how to do this. The principles your team has been applying for years are the right principles. Identity management. Least privilege. Blast radius containment. Zero trust. Defense in depth. Assume breach.
You don't need to hire an ML team to secure your agents. You need your security architects to understand how agents work. You need them to map the agent architecture the same way they'd map a microservices architecture. Identify the trust boundaries. Model the threats. Design the controls.
The tools are slightly different. The actors are slightly different. But the game is the same game we've been playing since someone first connected a computer to a network and another person tried to break in.
Twenty years taught me that security fundamentals don't change when the technology does. They just get applied in new places. AI agents are the newest place. The fundamentals still hold.
Ready to secure your agent architecture?
Start with a 30-minute discovery session. We'll map your agent environment, identify your biggest risks, and outline a path forward.
Book a Discovery Session