Threat Modeling for AI Agents
Build a threat model for any AI agent using OWASP's Top 10 for Agentic Applications, the AWS Scoping Matrix, and the Rule of Two framework. Systematic thinking about agent risks.
From Chaos to Framework
🔄 Quick Recall: In the last lesson, you saw the damage: 1-click RCE, 341 malicious skills, credential leaks, and 135,000 exposed instances. That’s a lot of threats. Without a framework, it’s overwhelming. With one, it becomes a checklist.
Threat modeling is how security professionals turn “everything is scary” into “here’s what matters, here’s what to do.” For AI agents, three frameworks have emerged in 2025-2026 that finally give us structured ways to think about these risks.
By the end of this lesson, you’ll be able to:
- Use the OWASP Top 10 for Agentic Applications to classify agent threats
- Apply the AWS Scoping Matrix to determine what security controls your agent needs
- Assess any agent's risk level by counting trifecta properties with the Rule of Two
Framework 1: OWASP Top 10 for Agentic Applications
In 2026, the OWASP GenAI Security Project published the first framework designed specifically for autonomous AI agents. Peer-reviewed by 100+ security experts, it identifies the ten most critical risks for agents that plan, persist, and delegate.
Here’s the list, mapped to what you already know from Lesson 1:
| # | Risk | What It Means | Real Example |
|---|---|---|---|
| A1 | Tool Misuse | Agent uses tools in unintended ways | Malicious skill tells agent to run shell commands |
| A2 | Excessive Agency | Agent has more permissions than needed | OpenClaw accessing all files when it only needs one folder |
| A3 | Insecure Output Handling | Agent output triggers unintended actions | Agent generates SQL that gets executed without sanitization |
| A4 | Data Leakage | Sensitive data escapes through agent actions | 283 skills leaking API keys through context window |
| A5 | Memory Poisoning | Agent’s persistent memory gets corrupted | Zenity’s SOUL.md backdoor persisting after skill removal |
| A6 | Multi-Agent Delegation | Sub-agents inherit excessive permissions | Sub-agent spawned with full system access instead of read-only |
| A7 | Insecure Plugins | Third-party integrations introduce vulnerabilities | ClawHavoc’s 341 malicious ClawHub skills |
| A8 | Cascading Hallucinations | Hallucinated output feeds into real actions | Agent invents a CLI command and executes it |
| A9 | Identity Spoofing | Agents impersonate users or other agents | Agent sends emails “from” the user without explicit approval |
| A10 | Supply Chain Risks | Dependencies and skills from untrusted sources | Compromised GitHub accounts publishing backdoored skills |
Notice how Lesson 1’s categories map directly: RCE falls under A1/A2, malicious skills under A7/A10, credential exposure under A4, and exposed infrastructure under A2.
✅ Quick Check: An agent reads a Google Doc, finds hidden instructions, and sends your SSH key to an external server. Which OWASP risks are involved? (Answer: A1 (tool misuse — agent used network tool to exfiltrate), A4 (data leakage — SSH key escaped), and potentially A5 (if the instructions persist in memory). The trigger is indirect prompt injection, which cuts across multiple categories.)
Framework 2: AWS Agentic AI Security Scoping Matrix
AWS Security published a framework that answers a different question: not “what can go wrong?” but “how much security does my agent need?”
The matrix has four scope levels:
| Level | Description | Example | Security Needs |
|---|---|---|---|
| No Agency | Static responses, no tool use | Basic chatbot | Standard LLM security |
| Prescribed | Fixed workflows, pre-defined tools | “Summarize this file” with specific file path | Input validation, output checking |
| Supervised | Dynamic tool selection, human approval | Coding assistant that proposes changes, you approve | Permission boundaries, audit logs |
| Full Agency | Autonomous planning and execution | OpenClaw running overnight workflows | Complete security stack |
For each scope level, AWS defines six security dimensions:
- Identity & Access — How does the agent authenticate? What can it access?
- Data Protection — How are credentials and sensitive data handled?
- Network Security — What can the agent reach over the network?
- Monitoring & Logging — What actions are recorded and alertable?
- Incident Response — How do you detect and stop a compromised agent?
- Compliance — What regulatory requirements apply?
Personal AI agents like OpenClaw operate at Full Agency — they plan multi-step tasks, select tools dynamically, and execute with minimal human oversight. That means they need controls across all six dimensions.
The practical insight: If you’re using an AI coding assistant with approval prompts (Supervised), you need less security infrastructure than if you’re running an autonomous agent overnight (Full Agency). Match your security investment to the scope.
Framework 3: The Rule of Two
Meta AI’s research, popularized by Simon Willison, identified three properties that make an agent dangerous when combined:
- Access to private data — files, credentials, messages, databases
- Exposure to untrusted content — emails, web pages, shared docs, user input
- Ability to communicate externally — send messages, make API calls, write files
The Rule of Two: An agent with any two of these three is manageable. An agent with all three is a security risk by design — because untrusted content can manipulate the agent into exfiltrating private data through external communication.
Let’s apply it:
| Agent Configuration | Properties | Risk Level |
|---|---|---|
| Chatbot (no tools) | None of the three (0/3) | Low |
| File organizer (local only) | Private data only (1/3) | Low |
| Email assistant (read-only) | Private data + untrusted content (2/3) | Medium |
| OpenClaw (default) | All three (3/3) | High |
| OpenClaw (hardened, no outbound) | Private data + untrusted content (2/3) | Medium |
The Rule of Two gives you a quick risk assessment for any agent: count the properties. Three means you need every security control in this course.
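The count is simple enough to sketch in a few lines. The three fields mirror the trifecta properties, and the thresholds match the risk levels in the table above; the class and function names are hypothetical, chosen for this illustration.

```python
from dataclasses import dataclass

@dataclass
class AgentProfile:
    # The three trifecta properties from the Rule of Two
    private_data: bool       # files, credentials, messages, databases
    untrusted_content: bool  # emails, web pages, shared docs, user input
    external_comms: bool     # send messages, make API calls, write files

def risk_level(agent: AgentProfile) -> str:
    """Rule of Two: any two properties are manageable; all three
    combined are high-risk by design."""
    count = sum([agent.private_data, agent.untrusted_content,
                 agent.external_comms])
    return {3: "High", 2: "Medium"}.get(count, "Low")

# A default OpenClaw install has all three properties
assert risk_level(AgentProfile(True, True, True)) == "High"
# Hardened with outbound network blocked, one leg of the trifecta is gone
assert risk_level(AgentProfile(True, True, False)) == "Medium"
```

Disabling any one property drops the agent out of the High bracket, which is exactly why the hardened OpenClaw row in the table lands at Medium.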
✅ Quick Check: You configure an AI agent to read your local files and generate reports but disable all network access. According to the Rule of Two, what’s the risk level? (Answer: Medium at most. The agent has access to private data but cannot communicate externally. Even if it processes untrusted content from a file, it can’t exfiltrate anything because outbound communication is blocked. You’ve eliminated one leg of the trifecta.)
Building Your Threat Model
Here’s a practical 4-step process combining all three frameworks:
Step 1: Classify scope (AWS Matrix). What level of agency does your agent have: No Agency, Prescribed, Supervised, or Full Agency?
Step 2: Count trifecta properties (Rule of Two). Does your agent have access to private data? Process untrusted content? Communicate externally? Count them.
Step 3: Identify relevant OWASP risks. Based on your agent's capabilities, which of the Top 10 apply? An agent with no tools doesn't face A1. An agent with no memory doesn't face A5.
Step 4: Prioritize by likelihood and impact. Not all risks are equal. A7 (insecure plugins) matters most if you install third-party skills. A4 (data leakage) matters most if you work with sensitive data.
Example: Personal OpenClaw instance
- Scope: Full Agency (autonomous, multi-step, tool selection)
- Trifecta: 3/3 (files + email/web + API calls) — high risk
- OWASP: A1, A2, A4, A5, A7, A10 are primary concerns
- Priority: A7 (skills) and A4 (credentials) based on Lesson 1 data
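The four steps applied to this example can also be written down as data. The capability-to-risk mapping below is an assumption distilled from the OWASP table earlier in this lesson, and the function and field names are illustrative, not a standard tool.

```python
# Step 3 helper: which OWASP agentic risks a capability brings into scope.
# This mapping is an illustrative assumption based on the Top 10 table above.
RISKS_BY_CAPABILITY = {
    "tools": ["A1", "A2"],
    "persistent_memory": ["A5"],
    "third_party_skills": ["A7", "A10"],
    "sensitive_data": ["A4"],
}

def build_threat_model(scope, trifecta, capabilities, priorities):
    """Combine the three frameworks into one threat-model record."""
    risks = sorted({r for cap in capabilities
                      for r in RISKS_BY_CAPABILITY.get(cap, [])},
                   key=lambda r: int(r[1:]))  # numeric sort: A2 before A10
    return {
        "scope": scope,            # Step 1: AWS Scoping Matrix level
        "trifecta": trifecta,      # Step 2: Rule of Two property count
        "owasp_risks": risks,      # Step 3: applicable Top 10 risks
        "priorities": priorities,  # Step 4: ranked by likelihood and impact
    }

model = build_threat_model(
    scope="Full Agency",
    trifecta=3,
    capabilities=["tools", "persistent_memory",
                  "third_party_skills", "sensitive_data"],
    priorities=["A7", "A4"],  # skills and credentials, per Lesson 1 data
)
# model["owasp_risks"] -> ["A1", "A2", "A4", "A5", "A7", "A10"]
```

Writing the model down this way makes the mapping to later lessons explicit: each entry in `owasp_risks` should have at least one control assigned to it.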
This gives you a clear map for the rest of the course: Docker isolation (Lesson 3) addresses A2; permission boundaries (Lesson 4) address A1/A2; skill vetting (Lesson 5) addresses A7/A10; monitoring (Lesson 6) addresses A4; prompt injection defense (Lesson 7) cuts across all of them.
Key Takeaways
- OWASP Top 10 for Agentic Applications is the first peer-reviewed framework specifically for autonomous AI agents — covering tool misuse, memory poisoning, and cascading failures
- AWS Scoping Matrix classifies agents into 4 levels (No Agency → Full Agency) with 6 security dimensions per level
- The Rule of Two says agents with all 3 properties (private data + untrusted content + external communication) are high-risk by design
- Your threat model combines all three: classify scope, count trifecta properties, identify OWASP risks, then prioritize
- Not all agents need the same security — match controls to your agent’s actual scope and capabilities
Up Next
Your threat model tells you what to protect against. The next lesson covers the first and most impactful defense: Docker isolation — the specific container hardening flags that block the most common agent exploits.