Capstone: Design Your Agent System
Apply everything from the course: design a complete agent system with architecture, tools, memory, guardrails, and evaluation for a real-world use case.
You’ve learned the components, patterns, tools, and practices of AI agents. Now design a complete agent system from scratch — applying everything from the course.
🔄 Quick Recall: Across this course you’ve covered: why agents matter (Lesson 1), the four components (Lesson 2), design patterns (Lesson 3), tool use (Lesson 4), multi-agent systems (Lesson 5), memory (Lesson 6), and production practices (Lesson 7). This capstone integrates all of them.
Capstone Exercise: The Research Assistant Agent
Design an agent system that helps knowledge workers research topics, synthesize findings, and produce reports. Walk through each design decision using the frameworks from this course.
Step 1: Define the Agent’s Purpose
Task: Research a topic, gather information from multiple sources, synthesize findings, and produce a structured report.
Users: Analysts, consultants, product managers — people who research topics and write reports as part of their jobs.
Success criteria: The report is accurate, well-sourced, covers the key aspects of the topic, and follows the user’s preferred format.
Step 2: Choose the Architecture
Decision: Single agent or multi-agent?
Evaluate using the framework from Lesson 5:
| Factor | Assessment |
|---|---|
| Task scope | Multiple skills: search, read, analyze, write |
| Tool count | 5-8 tools — manageable for one agent |
| Context needs | Fits in one context window for most topics |
| Parallelism | Research steps could run in parallel, but sequential is simpler |
Decision: Start with a single agent. The task is complex but fits within one agent’s capabilities. If research topics regularly exceed the context window, split into a Research Agent + Writing Agent later.
Step 3: Select Design Patterns
Primary: Planning + ReAct
The agent first plans the research (which subtopics to investigate, what sources to check), then executes each step using ReAct (Thought → Action → Observation).
Secondary: Reflection
After drafting the report, the agent reflects: Are all claims sourced? Does the structure match the user’s format? Are there gaps in coverage?
```
[Plan]    Break topic into 4-5 subtopics
[ReAct]   Research subtopic 1: search → read → synthesize
[ReAct]   Research subtopic 2: search → read → synthesize
...
[Draft]   Write report from synthesized findings
[Reflect] Check accuracy, completeness, format
[Revise]  Fix issues identified in reflection
[Deliver] Return final report
```
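The loop above can be sketched as an orchestration function. This is a minimal sketch, not a real framework API: the `plan`, `research`, `write`, `reflect`, and `revise` callables are hypothetical stand-ins for LLM-backed steps.

```python
def run_research_agent(topic, plan, research, write, reflect, revise,
                       max_revisions=2):
    """Orchestrate Plan -> ReAct -> Reflect for one report.

    Each callable is a stand-in for an LLM-backed step; the
    signatures here are illustrative, not a specific library's API.
    """
    subtopics = plan(topic)                          # [Plan]
    findings = {s: research(s) for s in subtopics}   # [ReAct] per subtopic
    draft = write(topic, findings)                   # [Draft]
    for _ in range(max_revisions):                   # [Reflect] / [Revise]
        issues = reflect(draft)
        if not issues:
            break
        draft = revise(draft, issues)
    return draft                                     # [Deliver]
```

Bounding the reflect/revise loop (`max_revisions`) matters: reflection without a budget can cycle indefinitely on subjective issues.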
✅ Quick Check: The agent’s plan includes “Research subtopic: quantum computing applications in healthcare.” After searching, it finds very little information — only 2 sources, both speculative blog posts. What should the adaptive planning agent do? (Answer: Replan. The agent should note that this subtopic has insufficient authoritative sources, inform the user that coverage will be limited in this area, and potentially redirect research effort to better-documented subtopics. It should NOT pad the section with speculation from weak sources. The plan adapts to what the research actually finds.)
Step 4: Define Tools
| Tool | Purpose | When Used |
|---|---|---|
| web_search | Find current information | Research phase |
| read_url | Extract content from web pages | After finding relevant URLs |
| file_read | Read user-provided documents | When user uploads reference material |
| file_write | Save the final report | Delivery phase |
| calculate | Verify numbers and statistics | Fact-checking during reflection |
Each tool gets a clear description explaining when to use it and when not to.
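As one illustration, here is what the web_search description might look like in the common JSON Schema function-calling convention. The exact schema keys vary by provider; this sketch follows the widely used shape (name, description, parameters).

```python
# Hypothetical tool definition for web_search. Note the description
# says both when to use it and when NOT to (deferring to other tools).
web_search_tool = {
    "name": "web_search",
    "description": (
        "Search the web for current information on a topic. "
        "Use during the research phase to find sources. "
        "Do NOT use for arithmetic (use calculate), for reading a "
        "known URL (use read_url), or for documents the user already "
        "uploaded (use file_read)."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query"},
            "max_results": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}
```

The when/when-not guidance lives in the description itself, because that text is all the model sees when deciding which tool to call.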
Step 5: Design Memory
| Memory Type | What It Stores | Pattern |
|---|---|---|
| Short-term | Current research context, sources found | Buffer memory |
| Working state | Research plan, progress per subtopic | Task state with checkpoints |
| Long-term | User’s format preferences, past topics | Entity memory (user profile) |
The agent checkpoints after each subtopic is researched. If interrupted, it resumes from the last checkpoint rather than restarting.
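The working-state checkpoint can be as simple as a serializable dataclass. This is a sketch with illustrative field names, not a prescribed schema; a production system would also version the checkpoint format.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ResearchState:
    """Working state, checkpointed after each subtopic is researched."""
    topic: str
    plan: list                                      # ordered subtopics
    completed: dict = field(default_factory=dict)   # subtopic -> findings

    def checkpoint(self, path):
        # Persist after each subtopic so an interrupted run can resume.
        with open(path, "w") as f:
            json.dump(asdict(self), f)

    @classmethod
    def resume(cls, path):
        with open(path) as f:
            return cls(**json.load(f))
```

On restart, the agent loads the state and researches only the subtopics in `plan` that are missing from `completed`, rather than starting over.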
Step 6: Add Guardrails
```
Input Guardrails:
├── Topic scope check: Is this a research topic we can help with?
├── Harmful content filter: Block requests for harmful information
└── Length estimate: Warn if the topic scope is too broad

Output Guardrails:
├── Source verification: Every claim must cite a source
├── Plagiarism check: No large verbatim passages without quotes
├── Format compliance: Report structure matches requested format
└── Confidence flagging: Mark sections with limited sourcing

Tool Guardrails:
├── URL filtering: Don't access blocked domains
└── Rate limiting: Max 20 web searches per report
```
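The two tool guardrails are mechanical enough to show concretely. A minimal sketch, assuming a simple blocklist and a per-report counter (the domain list and class name are placeholders):

```python
from urllib.parse import urlparse

BLOCKED_DOMAINS = {"blocked.example"}   # placeholder blocklist
MAX_SEARCHES = 20                       # per-report budget

class ToolGuard:
    """Tool-level guardrails: URL filtering plus search rate limiting."""

    def __init__(self):
        self.searches_used = 0

    def check_url(self, url):
        # Reject before the tool call, not after the fetch.
        host = urlparse(url).hostname or ""
        if host in BLOCKED_DOMAINS:
            raise PermissionError(f"Blocked domain: {host}")

    def check_search_budget(self):
        if self.searches_used >= MAX_SEARCHES:
            raise RuntimeError("Search budget exhausted (20 per report)")
        self.searches_used += 1
```

The guard sits between the agent's tool call and the tool itself, so a misbehaving plan cannot bypass it through clever prompting.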
Step 7: Plan Evaluation
Test suite (30 cases):
- 15 normal topics (varied domains: technology, business, science)
- 8 edge cases (very niche topics, very broad topics, recent events)
- 4 adversarial (injection attempts, out-of-scope requests)
- 3 regression (previously failed cases)
Metrics:
- Task completion rate target: > 90%
- Source accuracy (verified by human): > 95%
- Format compliance: > 98%
- Average latency: < 5 minutes per report
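The automatable metrics can be scored with a small harness. This is a sketch: human-verified metrics like source accuracy are omitted, and each test case is assumed to carry its own format check.

```python
def evaluate(test_cases, run_agent):
    """Run the suite and compute completion and format-compliance rates.

    Each case is a dict with a "topic" and a "format_check" callable;
    run_agent returns a report string, or None on failure.
    """
    results = {"completed": 0, "format_ok": 0, "total": len(test_cases)}
    for case in test_cases:
        report = run_agent(case["topic"])
        if report is not None:
            results["completed"] += 1
            if case["format_check"](report):
                results["format_ok"] += 1
    results["completion_rate"] = results["completed"] / results["total"]
    results["format_compliance"] = results["format_ok"] / results["total"]
    return results
```

Run the same harness on every change; the 3 regression cases in the suite exist precisely so this script catches reintroduced failures.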
Course Recap
| Lesson | Core Concept | Key Takeaway |
|---|---|---|
| 1. Welcome | Agents vs chatbots | Agents perceive, plan, act, and adapt in a loop |
| 2. Anatomy | Four components | LLM brain + tools + memory + planning |
| 3. Patterns | ReAct, Reflection, Planning | Choose by task; combine for complex work |
| 4. Tool use | Function calling, MCP, structured outputs | Clear descriptions = correct tool selection |
| 5. Multi-agent | Supervisor, pipeline, peer-to-peer | Start simple; split only when needed |
| 6. Memory | Buffer, summary, vector, entity | Different information needs different storage |
| 7. Production | Guardrails, evaluation, observability | Measure everything; guard every boundary |
| 8. Capstone | Complete system design | Fundamentals before complexity |
Design Checklist
Use this when designing any agent system:
Architecture:
□ Single agent or multi-agent? (Justified by actual need)
□ Design pattern selected (ReAct/Reflection/Planning)
□ Pattern combination defined for complex tasks
Tools:
□ Each tool has a clear description with when/when-not
□ Structured outputs for tool inputs and outputs
□ Fallback tools for critical capabilities
Memory:
□ Short-term strategy (buffer/sliding window)
□ Long-term strategy (vector store/entity memory)
□ State management with checkpointing
Production:
□ Input, output, and tool guardrails defined
□ Test suite covering all four categories
□ Observability with distributed tracing
□ Failure recovery with retries and circuit breakers
Key Takeaways
- Agent system design follows a clear sequence: purpose → architecture → patterns → tools → memory → guardrails → evaluation
- Start with the simplest architecture that could work — a single well-designed agent with good tools beats a complex multi-agent system without fundamentals
- Every design decision should be justified by actual need, not theoretical elegance
- Adaptive planning, clear tool descriptions, and multi-layer memory are the highest-impact design choices
- Production readiness requires four pillars: safety (guardrails), reliability (evaluation), visibility (observability), and resilience (failure recovery)
- The meta-principle: fundamentals before complexity — get the basics right, then add sophistication