Build a Production Agent System
Apply everything you've learned to design, build, and deploy a complete AI agent system — from goal definition to production-ready deployment with guardrails.
🔄 Quick Recall: Throughout this course, you’ve learned agent architecture (Lesson 2), built your first agent (Lesson 3), added tools (Lesson 4), planning strategies (Lesson 5), guardrails (Lesson 6), and multi-agent orchestration (Lesson 7). Now you’ll combine all of it into a single, production-ready system.
The Capstone: A Business Intelligence Agent
You’ll design a complete agent system that takes a company name and produces a business intelligence brief. This task touches every skill you’ve learned:
- Goal design — Clear objective with defined inputs and outputs
- Tool use — Web search, data processing, document creation
- Planning — Task decomposition with parallel and sequential steps
- Guardrails — Scope constraints, cost limits, quality checks
- Evaluation — Self-assessment before delivering results
The output: a structured brief covering company overview, recent news, financial highlights, competitive position, and strategic insights.
Step 1: Define the Goal
Start with a precise goal statement:
GOAL: Given a company name, produce a Business Intelligence Brief containing:
1. Company overview (what they do, size, market position)
2. Recent developments (last 6 months of news)
3. Financial highlights (revenue, growth, key metrics if public)
4. Competitive landscape (top 3 competitors, positioning)
5. Strategic insights (trends, risks, opportunities)
INPUT: Company name (string)
OUTPUT: Structured markdown report (2,000-3,000 words)
SUCCESS CRITERIA: All 5 sections complete, sources cited, delivered within 15 minutes
Notice how this goal is specific, measurable, and bounded. The agent knows exactly what “done” looks like.
✅ Quick Check: Why does the goal include “delivered within 15 minutes” as a success criterion?
Time constraints prevent the agent from over-researching. Without a time limit, an agent could spend hours finding marginally better data. The constraint forces efficiency: gather the most important information within the budget, then synthesize. It also protects against infinite loops.
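A goal this precise can also be made machine-checkable. Here's a minimal sketch of the goal as a small spec object — the class and field names are illustrative assumptions, not part of any framework:

```python
from dataclasses import dataclass, field

@dataclass
class GoalSpec:
    """Illustrative goal spec for the BI agent (names are assumptions, not a framework API)."""
    required_sections: list = field(default_factory=lambda: [
        "Company overview", "Recent developments", "Financial highlights",
        "Competitive landscape", "Strategic insights",
    ])
    min_words: int = 2000
    max_words: int = 3000
    max_minutes: int = 15

    def is_done(self, sections_present: list, word_count: int, minutes_elapsed: float) -> bool:
        """'Done' = all 5 sections present, length in range, delivered within the time budget."""
        return (all(s in sections_present for s in self.required_sections)
                and self.min_words <= word_count <= self.max_words
                and minutes_elapsed <= self.max_minutes)
```

Encoding "done" as a predicate means the agent (or a wrapper around it) can check completion mechanically instead of judging it by feel.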
Step 2: Design the System Prompt
Combine the patterns from Lessons 3-6 into a complete system prompt:
You are a Business Intelligence Agent. Your job is to research a company and produce a comprehensive intelligence brief.
CAPABILITIES:
- Web search for current company information
- Document reading for financial reports and press releases
- Data analysis for financial metrics
PLANNING PROTOCOL:
1. Break the research into 5 sections matching the output format
2. Research each section, starting with company overview
3. After each section, assess: Is the information sufficient? Current? From reliable sources?
4. If a section is weak, do one additional targeted search
QUALITY CRITERIA:
- Every factual claim must have a source
- Financial data must be from the last 12 months
- At least 3 different sources per section
- Flag any information you're uncertain about
SCOPE CONSTRAINTS:
- Maximum 20 web searches total
- Maximum 15 minutes per brief
- Only access publicly available information
- Do not speculate about non-public financials
HUMAN CHECKPOINTS:
- Pause if the company appears to be private with very limited public information
- Pause if you find significant contradictions between sources
- Pause if the task requires accessing paid databases
FAILURE HANDLING:
- If web search returns no results for a section, note "Limited public information available" and move on
- If financial data is unavailable, state the gap clearly rather than guessing
- If you exhaust your search budget, deliver what you have with a note about gaps
This prompt integrates five lessons' worth of patterns: identity and capabilities (Lesson 3), tool use guidance (Lesson 4), planning protocol (Lesson 5), guardrails (Lesson 6), and structured output.
Step 3: Define the Tool Set
Apply the minimum viable tools principle:
| Tool | Purpose | Constraints |
|---|---|---|
| web_search | Find company info, news, financials | Max 20 queries |
| read_document | Extract data from found articles | Only public URLs |
| write_report | Assemble the final brief | Markdown format |
Three tools. The agent doesn’t need email, database access, or code execution for this task. Fewer tools mean fewer wrong choices.
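The table above translates directly into tool declarations. Here's a sketch in the JSON-schema style most LLM tool-use APIs accept — the names and constraints come from the table, but the exact wire format varies by provider, so treat the shape as illustrative:

```python
# Tool schemas in the JSON-schema style common to LLM tool-use APIs.
# Names and constraints mirror the table above; exact format varies by provider.
TOOLS = [
    {
        "name": "web_search",
        "description": "Find company info, news, and financials. Max 20 queries per brief.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "read_document",
        "description": "Extract data from a found article. Public URLs only.",
        "input_schema": {
            "type": "object",
            "properties": {"url": {"type": "string"}},
            "required": ["url"],
        },
    },
    {
        "name": "write_report",
        "description": "Assemble the final brief in markdown format.",
        "input_schema": {
            "type": "object",
            "properties": {"markdown": {"type": "string"}},
            "required": ["markdown"],
        },
    },
]
```

Note that the constraints column lives in the descriptions here; hard enforcement (the 20-query cap) should also be implemented in code, as shown in the guardrail patterns later in this lesson.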
Step 4: Map the Workflow
Design the agent’s execution plan:
PHASE 1 — RESEARCH (parallel where possible)
├── Search: Company overview + Wikipedia/Crunchbase
├── Search: Recent news (last 6 months)
├── Search: Financial data (if public company)
├── Search: Top competitors
└── Read: Most relevant articles in full
PHASE 2 — ANALYSIS (sequential)
├── Synthesize findings per section
├── Identify patterns across sections
├── Flag gaps and uncertainties
└── Generate strategic insights
PHASE 3 — DELIVERY
├── Assemble report in structured format
├── Add source citations
├── Self-evaluate against quality criteria
└── Deliver or flag issues for human review
Phase 1 uses parallel research (sections don't depend on each other). Phase 2 must be sequential because analysis requires all data. Phase 3 depends on Phase 2.
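The phase structure maps naturally onto async code: Phase 1 fans out concurrently, then later phases run in order once all findings are in. A minimal sketch, where `search` is a stand-in for a real web-search tool call:

```python
import asyncio

async def search(topic: str) -> str:
    """Placeholder for a real web_search tool call."""
    return f"findings for {topic}"

async def run_brief(company: str) -> list:
    # PHASE 1 — research tasks are independent, so run them concurrently
    topics = ["overview", "recent news", "financials", "competitors"]
    findings = await asyncio.gather(*(search(f"{company} {t}") for t in topics))
    # PHASE 2 — analysis needs all findings, so it runs only after gather() completes
    synthesis = [f.upper() for f in findings]  # stand-in for per-section synthesis
    # PHASE 3 — assembly happens last (stand-in: just return the synthesized sections)
    return synthesis

results = asyncio.run(run_brief("ExampleCo"))
```

The `await asyncio.gather(...)` line is the whole point: it encodes "parallel where possible, then sequential" directly in the control flow rather than relying on the model to sequence itself.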
Step 5: Add Evaluation Logic
The agent needs to judge its own output before delivering:
SELF-EVALUATION CHECKLIST (run before delivering):
□ All 5 sections present and substantive (not just placeholders)
□ At least 8 unique sources cited across the brief
□ Financial data is dated (includes the time period it covers)
□ Competitors section names specific companies, not generic categories
□ Strategic insights connect to evidence from earlier sections
□ No unresolved contradictions between sources
□ Total length between 2,000 and 3,000 words
If any check fails → identify the gap → do one targeted search → reassess
If still failing after remediation → deliver with explicit gap notes
This is the evaluation component from Lesson 2 in action. The agent doesn’t just produce output — it verifies the output meets the standard before considering the task complete.
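The checklist-then-remediate loop can be sketched in a few lines. This is an illustration under assumptions: the `brief` dict shape and check names are invented for the example, and only three of the seven checks are shown:

```python
def self_evaluate(brief: dict) -> list:
    """Return names of failed checks (a subset of the lesson's 7-item checklist)."""
    checks = {
        "all_sections_substantive": (len(brief["sections"]) == 5
                                     and all(len(v) > 0 for v in brief["sections"].values())),
        "enough_sources": len(set(brief["sources"])) >= 8,
        "length_in_range": 2000 <= brief["word_count"] <= 3000,
    }
    return [name for name, passed in checks.items() if not passed]

def deliver(brief: dict, remediate) -> dict:
    """One remediation attempt per evaluation pass, then deliver with explicit gap notes."""
    failed = self_evaluate(brief)
    if failed:
        brief = remediate(brief, failed)   # e.g. one targeted search per gap
        failed = self_evaluate(brief)
    brief["gap_notes"] = failed            # empty list means every check passed
    return brief
```

The key design choice matches the lesson's rule: exactly one remediation attempt, then deliver with the gaps named explicitly rather than looping forever.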
Step 6: Test Progressively
Don’t go straight to production. Test in stages:
Test 1 — Smoke test: Run the agent on a well-known public company (Apple, Google). Verify: Does it find accurate, current information? Does it structure the output correctly? Does it stay within the 20-search limit?
Test 2 — Edge case: Run it on a smaller, less-known company. Does it handle limited information gracefully? Does it acknowledge gaps rather than fabricating data?
Test 3 — Failure mode: Intentionally give it a fictional company name. Does the agent recognize there’s no real data? Does it report the problem rather than generating a plausible-sounding but false report?
Test 4 — Guardrail test: Monitor the search count. Does it respect the 20-search limit? Does it stay within the time constraint? Does it pause at human checkpoints when appropriate?
Each test validates a different layer of the system.
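The four stages can be wired into a tiny harness so they run on every change to the agent, not just once. A sketch, assuming `run_agent` is replaced with your real agent and returns run metadata:

```python
# Minimal progressive test harness. `run_agent` is a stub standing in for the
# real agent; each case exercises one of the four test layers above.
def run_agent(company: str) -> dict:
    """Stub: a real implementation would return the brief plus run metadata."""
    return {"sections": 5, "searches_used": 12, "fabricated": False}

TEST_CASES = [
    ("smoke",     "Apple",            lambda r: r["sections"] == 5),
    ("edge",      "Small Local Firm", lambda r: not r["fabricated"]),
    ("failure",   "Zyxqon Fictional", lambda r: not r["fabricated"]),
    ("guardrail", "Apple",            lambda r: r["searches_used"] <= 20),
]

def run_suite() -> dict:
    """Run every scenario and report pass/fail per layer."""
    return {name: check(run_agent(company)) for name, company, check in TEST_CASES}
```

With the stub all four cases trivially pass; the value comes once `run_agent` is real, because the fictional-company and guardrail cases then catch fabrication and budget overruns automatically.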
Putting It All Together: The Complete Blueprint
Here’s your full agent specification — a template you can adapt for any production agent:
AGENT SPECIFICATION: Business Intelligence Agent
VERSION: 1.0
1. GOAL
Input: Company name
Output: Structured intelligence brief (5 sections, 2-3K words)
Success: All sections complete, sourced, delivered in <15 min
2. SYSTEM PROMPT
[Include the full system prompt from Step 2]
3. TOOLS
web_search (max 20 calls), read_document, write_report
4. WORKFLOW
Phase 1: Parallel research (5 searches)
Phase 2: Sequential analysis
Phase 3: Assembly + evaluation + delivery
5. GUARDRAILS
- Scope: public info only, no paid databases
- Limits: 20 searches, 15 minutes, 3K words max
- Checkpoints: limited info, contradictions, paid access needed
- Failure: acknowledge gaps, never fabricate
6. EVALUATION
Pre-delivery checklist with 7 quality criteria
One remediation attempt per failed check
7. MONITORING
Log: every search query and result quality
Track: searches used, time elapsed, sections completed
Alert: if search budget hits 80%, if any section has 0 sources
This one-page specification is everything someone needs to build, test, and maintain the agent. Whether you implement it on Claude, ChatGPT, LangGraph, or CrewAI, the design is platform-independent.
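Section 7 of the spec (monitoring) is the piece most often left as prose. Here's a sketch of it as a small tracker — the thresholds come from the spec above, while the class shape and logging target are assumptions:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("bi-agent")

class RunMonitor:
    """Tracks the metrics from section 7 of the spec and records alerts."""
    SEARCH_BUDGET = 20  # from the spec: max 20 searches per brief

    def __init__(self):
        self.searches_used = 0
        self.section_sources = {}
        self.alerts = []

    def record_search(self, query: str):
        """Log every search query and alert when the budget hits 80%."""
        self.searches_used += 1
        log.info("search %d/%d: %s", self.searches_used, self.SEARCH_BUDGET, query)
        if self.searches_used >= 0.8 * self.SEARCH_BUDGET:
            self.alerts.append("search budget at 80%")

    def record_section(self, name: str, sources: list):
        """Alert if any completed section has zero sources."""
        self.section_sources[name] = sources
        if not sources:
            self.alerts.append(f"section '{name}' has 0 sources")
```

In production the `alerts` list would feed whatever alerting channel you already use; the point is that both alert conditions from the spec become one-line checks.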
Exercise: Build Your Agent
Choose your path:
Non-technical: Open Claude or ChatGPT. Paste the system prompt from Step 2 as your first message (or add it as Custom Instructions/Project instructions). Then ask it to research a company. Evaluate the output against the checklist in Step 5.
Technical: Use a framework like LangGraph or CrewAI. Implement the agent with actual tool functions (web search API, file writer). Add the guardrails as code constraints (search counter, time limit). Log every tool call for debugging.
Either way, test all four scenarios: well-known company, obscure company, fictional company, and guardrail stress test.
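For the technical path, the guardrails mentioned above (search counter, time limit) work best as hard constraints in code rather than prompt text — the prompt asks, the wrapper guarantees. A sketch, with the limits taken from the spec:

```python
import time

class BudgetExceeded(Exception):
    """Raised when a hard limit is hit, forcing the agent loop to stop."""

class Guardrails:
    """Hard limits enforced in code, independent of what the model chooses to do."""
    def __init__(self, max_searches: int = 20, max_seconds: float = 15 * 60):
        self.max_searches = max_searches
        self.deadline = time.monotonic() + max_seconds
        self.searches = 0

    def check_search(self):
        """Call before every web_search tool call."""
        if time.monotonic() > self.deadline:
            raise BudgetExceeded("15-minute time limit reached")
        self.searches += 1
        if self.searches > self.max_searches:
            raise BudgetExceeded("20-search budget exhausted")
```

The agent loop calls `check_search()` before each search and catches `BudgetExceeded` to trigger the failure-handling path from Step 2: deliver what you have, with a note about gaps.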
Course Review: What You’ve Learned
| Lesson | Core Skill | Key Pattern |
|---|---|---|
| 1. What Are Agents | Recognize agent opportunities | Goal-directed, multi-step, tool-using |
| 2. Architecture | Design agent systems | Goal → Reasoning → Tools → Memory → Evaluation |
| 3. First Agent | Build from scratch | System prompt with 7 essential sections |
| 4. Tool Use | Extend capabilities | Minimum viable tools + error handling |
| 5. Planning | Make agents reliable | Plan-then-execute + adaptive replanning |
| 6. Safety | Prevent failures | Three layers: scope, checkpoints, monitoring |
| 7. Orchestration | Scale to multi-agent | Hub-and-spoke, pipeline, debate architectures |
| 8. Production | Ship it | Specification → Build → Test → Deploy |
Key Takeaways
- A production agent needs seven components: goal, system prompt, tools, workflow, guardrails, evaluation, and monitoring
- Start with a precise, measurable goal — everything else flows from it
- Design the system prompt by combining patterns from planning, tool use, guardrails, and evaluation
- Test progressively: smoke test → edge cases → failure modes → guardrail stress tests
- Start with a single agent; only add complexity (multi-agent) when evidence shows it improves results
- The agent specification template works across platforms — design is platform-independent
- You now have the complete toolkit to build agents that are reliable enough for real work