Monitoring: Catching Threats in Real Time
Build a monitoring setup that detects credential leaks, unauthorized tool calls, and anomalous agent behavior. From log analysis to kill switches — your agent observability guide.
You Can’t Protect What You Can’t See
🔄 Quick Recall: You’ve built three layers of defense: Docker isolation (Lesson 3), permission boundaries (Lesson 4), and skill vetting (Lesson 5). But defenses can fail. Monitoring catches what prevention misses.
Clawhatch audited 90+ public GitHub repositories containing OpenClaw configurations. They discovered that OpenClaw’s update, doctor, and configure commands resolve environment variables and write the actual values to config files. Developers who carefully stored API keys in environment variables found those keys silently written to plaintext files on disk.
Without monitoring, you’d never know.
By the end of this lesson, you’ll be able to:
- Set up monitoring that detects the three most critical agent security events
- Configure a kill switch for instant agent shutdown
What to Monitor
Not everything the agent does needs monitoring. Focus on three categories that cover the most dangerous attack patterns:
Category 1: Credential Exposure
What to watch: Any agent action that could expose credentials.
| Signal | What It Means | Alert Priority |
|---|---|---|
| API keys appearing in log files | Credential leakage through logging | Critical |
| Secrets in config files (after running agent management commands) | Resolve-and-write exposure | Critical |
| Outbound requests containing credential patterns | Potential exfiltration | Critical |
| New environment variable access | Skill requesting unexpected credentials | High |
How to detect:
# Monitor for credential patterns in agent output files
# Common patterns: sk-*, AKIA*, token=*, Bearer *, api_key=*
watch -n 60 'grep -rn "sk-\|AKIA\|token=\|Bearer \|api_key=" /app/agent-data/logs/'
This is a basic approach. For production setups, tools like Gitleaks or TruffleHog (the same tools Argus used to find OpenClaw’s 512 vulnerabilities) can scan continuously.
The Clawhatch finding applies here: after running any agent management command, check whether secrets were written to disk. A monitoring script that hashes config files before and after the command will flag every file that changed, which tells you exactly where to look for newly written secrets.
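The before-and-after hashing idea can be sketched in a few lines of shell. This is a minimal sketch, not a finished tool: the function names `snapshot_hashes` and `changed_files` are hypothetical, the config directory in the usage comment is an assumption, and it assumes GNU `sha256sum` is installed.

```shell
# Hypothetical helpers; assumes sha256sum (GNU coreutils) is available.

# Print one "hash  path" line per regular file under a directory, sorted
# so that two snapshots can be compared line by line.
snapshot_hashes() {
    find "$1" -type f -exec sha256sum {} + | sort
}

# Print the paths of files that are new or modified in the second snapshot.
# Deleted files are not reported by this sketch.
changed_files() {
    # comm -13 keeps lines that appear only in the second (newer) snapshot.
    comm -13 "$1" "$2" | sed 's/^[0-9a-f]*  //'
}

# Usage (the directory is an assumption; substitute your own config path):
#   snapshot_hashes /app/agent-data/config > /tmp/before.sha
#   <run the agent management command here>
#   snapshot_hashes /app/agent-data/config > /tmp/after.sha
#   changed_files /tmp/before.sha /tmp/after.sha
```

A changed hash only says that the file was rewritten, not that it contains a secret, so any file this reports should then go through the credential pattern scan.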
Category 2: Unauthorized Tool Usage
What to watch: Agent actions that exceed your permission boundaries.
| Signal | What It Means | Alert Priority |
|---|---|---|
| Shell command execution (if blocked) | Possible compromise or skill abuse | Critical |
| File access outside allowed directories | Lateral movement attempt | High |
| Network requests to unexpected domains | Data exfiltration or C2 communication | Critical |
| Memory/config file modifications | Persistence attempt | High |
If you set up action allowlists in Lesson 4, monitoring for actions outside that list is straightforward: any blocked action that was attempted should trigger an alert.
Network monitoring is particularly valuable. The ClawHavoc campaign used a C2 server at a specific IP. If your agent suddenly starts talking to an IP address that isn’t on your allowlist, that’s an immediate red flag.
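As a rough illustration of the allowlist comparison, the sketch below reads observed destinations (one host or IP per line) on standard input and prints the ones missing from an allowlist file. The function name `unexpected_destinations` and the input format are assumptions; how you extract destinations from your own proxy or network logs depends on your setup.

```shell
# Print every destination read from stdin that is not in the allowlist file
# (one host or IP per line in both). Returns 1 if anything unexpected was
# seen, 0 if every destination was on the allowlist.
unexpected_destinations() {
    allowlist_file=$1
    # -v invert match, -x whole line, -F fixed strings, -f patterns from file
    if unexpected=$(grep -vxFf "$allowlist_file"); then
        printf '%s\n' "$unexpected"
        return 1
    fi
    return 0
}

# Usage (file names and the sample IP are placeholders):
#   printf 'api.openai.com\napi.github.com\n' > allowlist.txt
#   printf 'api.openai.com\n203.0.113.7\n' | unexpected_destinations allowlist.txt
```

Matching whole lines as fixed strings is deliberate here: a pattern or substring match would let a lookalike such as `api.github.com.example.net` slip through.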
✅ Quick Check: Your agent normally makes API calls to api.openai.com and api.github.com. Your monitoring detects a request to 91.92.242.30. What should you do? (Answer: Immediately shut down the agent (kill switch). That IP was the C2 server used in the ClawHavoc campaign. Even if this is a different context, unexpected outbound connections to unknown IPs warrant immediate investigation.)
Category 3: Behavioral Anomalies
What to watch: Changes in the agent’s normal behavior pattern.
| Signal | What It Means | Alert Priority |
|---|---|---|
| Sudden increase in file reads | Possible reconnaissance or data collection | Medium |
| Agent modifying its own memory | Could be legitimate or persistence attack | High |
| Unusually long execution times | Possible crypto-mining or resource abuse | Medium |
| Accessing files it’s never accessed before | Possible lateral movement | Medium |
Behavioral monitoring requires a baseline. Observe what your agent does normally for a week: what files it accesses, what tools it uses, what APIs it calls. Anything outside that baseline is an anomaly worth investigating.
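One simple way to turn a baseline into an alert is to compare today's count of some activity, such as file reads, against the average from the observation week. The sketch below assumes you have already collected one daily count per line in a file. The function name `is_spike` and the default threshold of three times the mean are illustrative assumptions, not recommended values.

```shell
# Succeed (exit 0) when today's count exceeds factor x the mean of the
# baseline counts (one daily count per line in baseline_file).
# Exit 1 when the count is within the threshold, 2 when there is no baseline.
is_spike() {
    baseline_file=$1
    today=$2
    factor=${3:-3}   # assumed default threshold; tune for your agent
    awk -v today="$today" -v factor="$factor" '
        { sum += $1; n++ }
        END {
            if (n == 0) exit 2
            exit !(today > factor * sum / n)
        }' "$baseline_file"
}

# Usage (the file name is a placeholder):
#   if is_spike file-reads-baseline.txt 400; then echo "read spike"; fi
```

A fixed multiple of the mean is crude, but it is easy to reason about and is often enough to flag a sudden jump for a single personal agent.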
Setting Up Your Kill Switch
CyberArk’s key insight: agents act faster than humans. A compromised agent can exfiltrate data, modify files, and make API calls in seconds. Your response needs to match that speed.
Your kill switch has three components:
1. Process termination (immediate)
# Stop the agent container immediately
docker stop agent-container --time=0
# Or force-kill if stop doesn't work
docker kill agent-container
2. Token revocation (prevent further access)
- Revoke all API keys the agent had access to
- Rotate any secrets that may have been exposed
- Invalidate session tokens
3. Evidence preservation (for investigation)
- Copy container logs before destroying the container
- Snapshot the container filesystem
- Save network logs showing what the agent communicated
The order matters: Stop first (prevent further damage), revoke second (close access), preserve third (enable investigation).
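The stop, revoke, preserve sequence can be wrapped in one function so that it runs in the right order under pressure. This is a sketch under stated assumptions: the container name and evidence directory are placeholders, and `revoke_tokens` is a hypothetical hook, because revocation differs for every API provider. Network logs are not covered, since where they live depends on your proxy or capture setup.

```shell
# Hypothetical hook: replace the body with your providers' revocation steps.
revoke_tokens() {
    echo "TODO: revoke API keys, rotate secrets, invalidate sessions" >&2
}

kill_switch() {
    container=${1:-agent-container}
    evidence_dir=${2:-./evidence}

    # 1. Stop: zero-second grace period, force-kill if the stop fails.
    docker stop -t 0 "$container" || docker kill "$container"

    # 2. Revoke: close off whatever access the agent held.
    revoke_tokens

    # 3. Preserve: a stopped (not removed) container still has its logs
    #    and filesystem available.
    mkdir -p "$evidence_dir"
    docker logs "$container" > "$evidence_dir/container.log" 2>&1
    docker export --output "$evidence_dir/filesystem.tar" "$container"
}

# Usage:
#   kill_switch agent-container ./incident-evidence
```

The function deliberately never runs `docker rm`: removing the container would destroy the evidence that step 3 is meant to preserve.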
✅ Quick Check: You detect a credential leak from your agent. You revoke the compromised API key but forget to stop the agent container. What’s the risk? (Answer: The agent may have cached credentials, stored them in memory, or already exfiltrated them. The compromised agent could also use other, still-valid credentials to continue malicious operations. Always stop the agent FIRST, then revoke tokens.)
Practical Monitoring Setup
Here’s a minimal monitoring configuration that covers the three categories:
Step 1: Log everything
Configure your agent to log all tool calls, file accesses, and network requests. The format should include timestamp, action, target, and result.
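How you enable logging depends on your agent, but the four-field format itself is easy to illustrate. The sketch below assumes a tab-separated line with a UTC timestamp. The function name `log_action` and the log path in the usage comment are hypothetical.

```shell
# Emit one tab-separated log line: timestamp, action, target, result.
log_action() {
    printf '%s\t%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$3"
}

# Usage (the log path is an assumption):
#   log_action file_read /app/agent-data/notes.txt ok >> /app/agent-data/logs/actions.log
#   log_action net_request api.github.com blocked >> /app/agent-data/logs/actions.log
```

Keeping one event per line with fixed fields means the later steps can filter the log with ordinary tools such as `cut`, `grep`, and `awk`.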
Step 2: Watch for credential patterns
Set up a cron job or background process that scans agent output for credential patterns every few minutes. Alert immediately on any match.
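A scheduled scan might look like the sketch below. The function name `scan_for_credentials`, the regular expressions, and the script path in the cron comment are assumptions. The patterns are rough approximations of common key shapes and will produce both false positives and misses, so a dedicated scanner such as Gitleaks or TruffleHog remains the better choice for production.

```shell
# Scan a log directory for credential-shaped strings.
# Prints "file:line" for each hit and returns 0; returns 1 if nothing matched.
scan_for_credentials() {
    log_dir=${1:-/app/agent-data/logs}
    # Approximate shapes only: sk- keys, AWS access key IDs, bearer tokens,
    # and token= / api_key= assignments.
    pattern='sk-[A-Za-z0-9_-]{20,}|AKIA[0-9A-Z]{16}|Bearer [A-Za-z0-9._-]{20,}|(token|api_key)=[^[:space:]]+'
    matches=$(grep -rnE "$pattern" "$log_dir" || true)
    if [ -n "$matches" ]; then
        # Report location only, so the secret is not copied into another log.
        printf '%s\n' "$matches" | cut -d: -f1,2
        return 0
    fi
    return 1
}

# Example crontab entry (hypothetical script path), running every 5 minutes:
#   */5 * * * * /usr/local/bin/scan-agent-logs.sh
```

Printing only the file and line number is a deliberate choice: an alert that quotes the matched secret leaks it a second time.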
Step 3: Monitor network connections
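Record all outbound connections, for example by routing the container's traffic through a simple logging proxy or by capturing traffic on the container's Docker network. Compare the destinations against your allowlist.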
Step 4: Set up alerting
When a critical signal is detected, you need to know immediately — not during your next log review session. Use a notification mechanism (email, Slack, system notification) for critical alerts.
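One possible shape for this routing is sketched below: critical alerts go out immediately over a webhook, and everything else is appended to a digest for the weekly review. The function name `send_alert`, the `ALERT_WEBHOOK_URL` and `ALERT_DIGEST_FILE` variables, and the digest format are all assumptions. The JSON body uses the `text` field that Slack incoming webhooks accept, and the sketch assumes single-line messages.

```shell
# Route an alert by priority: critical goes to a webhook right away,
# anything else is appended to a digest file for later review.
send_alert() {
    priority=$1
    message=$2
    digest_file=${ALERT_DIGEST_FILE:-./alerts-digest.log}

    if [ "$priority" = "critical" ]; then
        # Escape backslashes and double quotes so the message is valid JSON.
        escaped=$(printf '%s' "$message" | sed 's/\\/\\\\/g; s/"/\\"/g')
        curl -sS -X POST -H 'Content-Type: application/json' \
            --data "{\"text\": \"[CRITICAL] $escaped\"}" \
            "$ALERT_WEBHOOK_URL"
    else
        printf '%s\t%s\n' "$priority" "$message" >> "$digest_file"
    fi
}

# Usage (the webhook URL is a placeholder you would supply):
#   ALERT_WEBHOOK_URL=https://hooks.slack.com/services/...
#   send_alert critical "credential pattern found in agent logs"
#   send_alert medium "file reads above baseline"
```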
Step 5: Test your kill switch
Practice the kill switch procedure before you need it. Verify you can stop the agent, revoke tokens, and preserve evidence in under 5 minutes. In a real incident, you won’t have time to figure it out.
The Observability Mindset
Microsoft’s 2026 report found that 80%+ of Fortune 500 companies use active AI agents, and they’re converging on a shared conclusion: security requires observability across the agent’s entire decision chain — reasoning steps, tool calls, memory references, and actions.
For personal use, you don’t need enterprise tooling. You need:
- Logs of what the agent does (actions and targets)
- Alerts when those actions exceed boundaries
- A practiced kill switch for when alerts fire
That’s it. Three components. The rest is discipline — reviewing logs weekly and keeping your monitoring running.
Key Takeaways
- Monitor three categories: credential exposure, unauthorized tool usage, and behavioral anomalies
- Routine commands can expose secrets — OpenClaw’s configure/update write resolved env vars to disk
- The kill switch has three steps: stop agent (immediate), revoke tokens (close access), preserve evidence (investigate)
- Order matters: stop first, revoke second, preserve third
- Behavioral baselines require observation — know what’s normal before you can detect what’s abnormal
- Test your kill switch before you need it — practice makes the difference between a 5-minute response and a 5-hour one
Up Next
Your monitoring is in place. But one threat deserves its own lesson because it’s fundamentally different from everything else: prompt injection. In the next lesson, you’ll learn why 85% of current defenses fail against it and what layered mitigations actually work.