What Can Go Wrong (and Already Has)
Real attacks on real AI agents: 1-click RCE, 341 malicious skills, 135,000 exposed instances, and credential leaks affecting thousands. The data that proves why agent security matters.
Your AI Agent Has Enemies
In January 2026, security researchers at Koi Security scanned 2,857 skills on ClawHub — the marketplace for OpenClaw AI agent extensions. They found 341 that were malicious. Not “potentially risky.” Not “poorly coded.” Malicious. Designed to steal your data.
335 of those 341 came from a single coordinated campaign they named ClawHavoc. The attack was elegant in its simplicity: publish skills that look useful, include a README that says “install these prerequisites first,” and the prerequisite install command downloads the Atomic macOS Stealer trojan. Crypto keys, SSH credentials, browser passwords — gone.
This isn’t a theoretical exercise. This is what’s happening right now.
By the end of this lesson, you’ll be able to:
- Describe the 4 categories of documented AI agent attacks using real data
- Explain why AI agents create fundamentally different security risks than traditional software
What You’ll Learn
This course covers the security landscape for AI agents — with OpenClaw as our primary case study (it has the most published security research) and principles that apply to every AI agent you use. Over 8 lessons, you’ll build practical security skills from threat modeling to writing your own security policy.
How This Course Works
Each lesson teaches one security domain with real data, practical techniques, and a quiz. The final lesson is your capstone: creating a personal security policy you’ll actually use. No fluff, no fear-mongering — just evidence and action.
Category 1: Remote Code Execution
CVE-2026-25253 was published by NIST with a CVSS score of 8.8 (High). The vulnerability: OpenClaw’s WebSocket endpoint didn’t validate the origin of incoming connections. Any website you visited could send commands to your local OpenClaw instance.
One click on a malicious link. Full code execution on your machine.
The fix was released in version 2026.1.29, but SecurityScorecard’s STRIKE team found that 12,812 instances were still running vulnerable versions weeks later. That number jumped to over 50,000 as scanning continued.
The takeaway isn’t “OpenClaw is bad.” It’s that AI agents run with significant system access, and a single vulnerability in the communication layer gives attackers everything.
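To make the class of bug concrete, here is a minimal sketch of the missing control. This is not OpenClaw’s actual code; the port, the allowlist, and the handler are illustrative. It uses Node’s `ws` package, where a `verifyClient` check rejects cross-origin handshakes and a loopback bind keeps the socket off the network entirely.

```typescript
import { WebSocketServer } from "ws";

// Illustrative allowlist: only the agent's own local UI may open a socket.
const ALLOWED_ORIGINS = new Set([
  "http://127.0.0.1:18789",
  "http://localhost:18789",
]);

const wss = new WebSocketServer({
  host: "127.0.0.1", // loopback only; 0.0.0.0 would expose the agent to the network
  port: 18789,
  // Reject the handshake when the Origin header isn't ours. Without this check,
  // any web page you visit can script a cross-origin WebSocket to localhost.
  // (Non-browser clients that send no Origin are rejected too; relax as needed.)
  verifyClient: ({ origin }) => ALLOWED_ORIGINS.has(origin),
});

wss.on("connection", (socket) => {
  socket.on("message", (data) => {
    // Only origin-verified clients ever reach command dispatch.
    console.log("command received:", data.toString());
  });
});
```

Both controls matter: origin validation stops the drive-by web page, and the loopback bind stops everyone else.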
✅ Quick Check: Why is a WebSocket RCE in an AI agent more dangerous than in a typical web application? (Answer: Because the AI agent already has permissions to read files, execute commands, access APIs, and modify your system. The attacker inherits all of those capabilities instantly.)
Category 2: Malicious Skills (Supply Chain)
ClawHavoc was the headline, but it wasn’t alone. Across multiple security vendors:
| Researcher | Finding | Scale |
|---|---|---|
| Koi Security | ClawHavoc campaign deploying AMOS trojan | 341 malicious skills |
| Snyk (ToxicSkills) | Vulnerabilities or malicious patterns | 1,467 skills (36.82%) |
| Cisco | Skills with at least one vulnerability | 26% of 31,000 skills |
| Bitdefender | Malicious skills from compromised accounts | 800+ identified |
The supply chain attack pattern mirrors what happened with npm and PyPI in their early years — except AI agent skills are worse because they execute with the agent’s full permissions. A malicious npm package can only do what the process that loads it can do; a malicious OpenClaw skill inherits an agent that is already wired into your files, your terminal, your API keys, and your browser.
Bitdefender found evidence of compromised GitHub accounts being used to give malicious skills the appearance of legitimacy. The attacker doesn’t just create a fake account — they hijack a real one with history and stars.
Category 3: Credential Exposure
Snyk’s research found that 283 of 3,984 ClawHub skills (7.1%) were leaking credentials through the LLM context window. Some were malicious. Others were accidents — developers treating AI agents like local scripts and forgetting that everything passes through the model.
Popular skills like moltyverse-email and youtube-data were passing API keys in plaintext as part of prompts. The agent never stores the key — but the model sees it, and a poisoned instruction in any other active skill can tell the model to exfiltrate it.
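Here’s the difference in miniature. The sketch below is hypothetical (the `callModel` stub and the YouTube query are stand-ins, not any real skill’s code): the first function interpolates the key into the prompt, so it lands in the context window; the second keeps the key inside the tool call, and the model only ever sees the result.

```typescript
// Hypothetical stand-in for whatever LLM call the agent makes.
async function callModel(prompt: string): Promise<string> {
  console.log("[prompt sent to model]\n" + prompt);
  return "(model response)";
}

const API_KEY = process.env.YOUTUBE_API_KEY ?? ""; // illustrative variable name

// Anti-pattern: the key is interpolated into the prompt, so it sits in the
// LLM context window where any other active skill's instructions can read it.
async function leakyLookup(videoId: string): Promise<string> {
  return callModel(
    `Fetch stats for video ${videoId} using this API key: ${API_KEY}`
  );
}

// Safer pattern: the skill calls the API itself; the model only ever sees
// the result, never the credential.
async function fetchVideoStats(videoId: string): Promise<string> {
  const res = await fetch(
    `https://www.googleapis.com/youtube/v3/videos?part=statistics&id=${videoId}&key=${API_KEY}`
  );
  const body = await res.json();
  return callModel(`Summarize these video statistics: ${JSON.stringify(body)}`);
}
```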
Meanwhile, Clawhatch’s audit of 90+ public GitHub repos found that approximately 40% contained working API keys in plaintext config files. OpenClaw’s update, doctor, and configure commands resolve environment variables and write actual values to configuration files.
✅ Quick Check: A developer uses an environment variable for their API key (good practice). But OpenClaw’s configure command writes the resolved value to a config file. What’s the security problem? (Answer: The environment variable was meant to keep the key out of files on disk. Once the configure command writes the resolved value, the key exists in plaintext on the filesystem — where it can be committed to git, read by other processes, or exfiltrated by a malicious skill.)
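A hypothetical sketch of that failure mode (the `OPENAI_API_KEY` name and file paths are illustrative, not OpenClaw’s actual configure logic): resolving the reference at write time bakes the secret onto disk, while persisting the reference and resolving it only in memory keeps the key where it was meant to stay.

```typescript
import { writeFileSync } from "node:fs";

// What the user wrote in their config: a reference, not the secret itself.
const raw = { apiKey: "${OPENAI_API_KEY}" };

// What a naive configure step does: resolve the reference and persist the secret.
// config.resolved.json now holds the plaintext key, one `git add .` away from a leak.
const resolved = { apiKey: process.env.OPENAI_API_KEY ?? "" };
writeFileSync("config.resolved.json", JSON.stringify(resolved, null, 2));

// What it should do: persist the reference, resolve it only in memory at runtime.
writeFileSync("config.json", JSON.stringify(raw, null, 2));
const varName = raw.apiKey.replace(/^\$\{|\}$/g, "");
const apiKeyAtRuntime = process.env[varName] ?? "";
console.log(`resolved ${varName} in memory only (length ${apiKeyAtRuntime.length})`);
```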
Category 4: Exposed Infrastructure
SecurityScorecard’s STRIKE team discovered over 135,000 OpenClaw instances exposed to the internet. The default configuration binds to 0.0.0.0:18789 — all network interfaces — meaning anyone on the internet can connect.
Of those exposed instances, 53,000+ were linked to IP addresses associated with previously reported breaches and known threat actor infrastructure. Whether these were compromised instances or threat actors running their own is unclear — but neither answer is reassuring.
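The hardening is small. Below is a hypothetical sketch against a generic local control server (the `AGENT_AUTH_TOKEN` name and the handler are illustrative, not an OpenClaw setting): bind to loopback so the instance never faces the internet, and reject any request that doesn’t carry a shared secret.

```typescript
import { createServer } from "node:http";
import { timingSafeEqual } from "node:crypto";

// Hypothetical shared secret; a real deployment would generate and rotate this.
const TOKEN = process.env.AGENT_AUTH_TOKEN ?? "";

function authorized(header: string | undefined): boolean {
  if (!TOKEN || !header) return false;
  const expected = Buffer.from(`Bearer ${TOKEN}`);
  const received = Buffer.from(header);
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

const server = createServer((req, res) => {
  if (!authorized(req.headers.authorization)) {
    res.writeHead(401).end("unauthorized\n");
    return;
  }
  res.writeHead(200).end("ok\n");
});

// Loopback only. The insecure default described above (0.0.0.0) listens on
// every interface, which is how instances end up internet-facing.
server.listen(18789, "127.0.0.1");
```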
Gartner’s assessment was blunt: “Block OpenClaw downloads immediately.” They called it “insecure by default” — no authentication enforced, no vendor security team, no SLA, no bug bounty program.
Why AI Agents Are Different
Traditional software has well-understood attack surfaces. AI agents introduce something new: an entity that can reason about instructions, access tools, and take autonomous actions based on content it processes.
Trend Micro identified the core problem as a “lethal trifecta” — when an agent simultaneously has:
- Access to private data (files, credentials, messages)
- Exposure to untrusted content (emails, web pages, shared documents)
- Ability to communicate externally (send messages, make API calls, execute commands)
Any two of those three are manageable. All three together? That’s when agents become attack vectors, not just tools.
Simon Willison, the security researcher who coined the term, put it simply: if your agent can read your email AND execute commands AND access the internet, then a single poisoned email becomes a remote code execution vector.
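The trifecta is also something you can audit mechanically. Here’s a hypothetical sketch of that check; the capability names and the `Skill` shape are illustrative, not a real manifest format. Given the capabilities a skill (or a set of co-active skills) holds, flag any configuration that grants all three at once.

```typescript
type Capability = "private_data" | "untrusted_content" | "external_comms";

interface Skill {
  name: string;
  capabilities: Capability[];
}

// True when a capability set holds all three legs of the lethal trifecta.
function hasLethalTrifecta(capabilities: Capability[]): boolean {
  const granted = new Set(capabilities);
  return (
    granted.has("private_data") &&
    granted.has("untrusted_content") &&
    granted.has("external_comms")
  );
}

// Illustrative skill: reads mail (private, untrusted) and can call out.
const emailSummarizer: Skill = {
  name: "email-summarizer",
  capabilities: ["private_data", "untrusted_content", "external_comms"],
};

if (hasLethalTrifecta(emailSummarizer.capabilities)) {
  console.warn(
    `${emailSummarizer.name}: private data + untrusted content + external comms - ` +
      "one poisoned email away from exfiltration"
  );
}
```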
Key Takeaways
- CVE-2026-25253 gave attackers 1-click RCE on any unpatched OpenClaw instance via WebSocket origin bypass
- 341 malicious skills deployed the Atomic macOS Stealer through fake prerequisite installations
- 7.1% of ClawHub skills leak credentials through the LLM context window
- 135,000+ instances were exposed to the internet with default configurations
- The lethal trifecta (private data + untrusted content + external communication) makes agents fundamentally different from traditional software
- These aren’t OpenClaw-specific problems — every AI agent with system access faces these same categories of risk
Up Next
Now you know what’s gone wrong. In the next lesson, you’ll learn how to think about these threats systematically using threat modeling frameworks designed specifically for AI agents — including OWASP’s brand-new Top 10 for Agentic Applications.