Multi-Step Reasoning and Planning
Master the planning strategies that make agents reliable — task decomposition, plan-then-execute, adaptive replanning, and chain-of-thought reasoning.
🔄 Quick Recall: In the last lesson, you gave agents tools — web search, code execution, file access. But tools without a plan are like power tools without a blueprint. This lesson teaches agents to plan strategically and adapt when reality doesn’t match expectations.
The Planning Problem
Give an agent a complex task without planning and it will thrash. It searches randomly, follows tangents, doubles back on work it’s already done, and produces scattered results.
Give the same agent a planning strategy and it works methodically — breaking the task into steps, executing each one, checking progress, and adjusting when needed.
In practice, planning can be the difference between an agent that wastes, say, 20 tool calls going in circles and one that finishes in 8 focused steps.
Task Decomposition
The first planning skill: break big goals into small, concrete sub-tasks.
I need an agent to accomplish this goal:
"Analyze the competitive landscape for AI-powered writing tools and produce a strategy recommendation."
Decompose this into sub-tasks. For each sub-task:
1. What specifically needs to be done
2. What tool(s) the agent would use
3. What the expected output is
4. How to verify the output is complete and correct
5. Which sub-tasks depend on others (must happen after)
6. Which sub-tasks are independent (can happen in parallel)
Good decomposition produces sub-tasks that are:
- Specific — “Find pricing for Jasper AI” not “look at competitors”
- Verifiable — You can check if the output is correct
- Independent where possible — Researching Company A doesn’t block researching Company B
- Ordered where necessary — Analysis comes after data gathering
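One way to make these properties concrete is to store each sub-task as a small record and let a dependency sort decide the order. The sketch below is only an illustration: the `SubTask` fields, the `has_rows` check, and the task names are hypothetical, and the tool names simply echo the tools mentioned in the previous lesson. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable


@dataclass
class SubTask:
    name: str                         # specific: what needs to be done
    tool: str                         # which tool the agent would use
    expected_output: str              # what a finished result looks like
    verify: Callable[[object], bool]  # check that the output is complete
    depends_on: list[str] = field(default_factory=list)


def execution_order(tasks: list[SubTask]) -> list[str]:
    """Return sub-task names ordered so that dependencies come first."""
    graph = {task.name: set(task.depends_on) for task in tasks}
    return list(TopologicalSorter(graph).static_order())


def has_rows(output: object) -> bool:
    """Hypothetical check: the output is a non-empty list of rows."""
    return isinstance(output, list) and len(output) > 0


# Illustrative sub-tasks; names and tools are assumptions for this sketch.
tasks = [
    SubTask("pricing_tool_a", "web_search", "table of pricing tiers", has_rows),
    SubTask("pricing_tool_b", "web_search", "table of pricing tiers", has_rows),
    SubTask("compare_pricing", "code_execution", "comparison table", has_rows,
            depends_on=["pricing_tool_a", "pricing_tool_b"]),
]

order = execution_order(tasks)
print(order)  # the two pricing tasks come first, then compare_pricing
```

The two pricing tasks have no dependencies, so they can appear in either order; the comparison always comes last because it depends on both.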
✅ Quick Check: Why should sub-tasks be independently verifiable?
Because the agent needs to evaluate its own progress. If a sub-task has a clear expected output (“a table of 5 competitor pricing tiers”), the agent can check whether it achieved that result before moving on. Vague sub-tasks (“understand the market”) can’t be verified, so the agent doesn’t know when to stop.
Plan-Then-Execute Pattern
One of the most reliable planning strategies for agents has three phases:
Phase 1: Plan. The agent analyzes the goal, breaks it into sub-tasks, orders them by dependency, and writes a numbered plan.
Phase 2: Execute. The agent works through the plan step by step. After each step, it checks: “Did this step produce the expected result? Should I adjust the remaining plan?”
Phase 3: Synthesize. Once all steps are complete, the agent combines the results into the final deliverable.
Add this to your system prompt:
PLANNING PROTOCOL:
Before taking any action, create a numbered plan:
1. Break the goal into 5-8 specific sub-tasks
2. Note which tools each sub-task needs
3. Identify dependencies (what must complete before what)
4. Estimate the number of tool calls per sub-task
Then execute the plan:
- After each sub-task, evaluate the result
- If the result is unsatisfactory, retry with a different approach
- If new information changes the plan, update it before continuing
- Track which steps are complete, in progress, and remaining
After execution:
- Combine all sub-task results into the final deliverable
- Verify the deliverable addresses the original goal
- Note any gaps or areas of uncertainty
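If you are wiring the protocol into your own code rather than relying on the prompt alone, the three phases can be sketched as a small loop. This is a minimal sketch under stated assumptions: `plan_then_execute`, `synthesize`, and the status labels are hypothetical names, each step is a plain Python callable standing in for a tool call, and the acceptance check is supplied by the caller.

```python
from typing import Callable

Plan = list[tuple[str, Callable[[], str]]]  # (step name, action) pairs


def plan_then_execute(plan: Plan, is_acceptable: Callable[[str, str], bool]):
    """Run each planned step in order, evaluating and tracking its status."""
    status = {name: "remaining" for name, _ in plan}
    results: dict[str, str] = {}
    for name, action in plan:
        status[name] = "in progress"
        output = action()
        if is_acceptable(name, output):
            results[name] = output
            status[name] = "complete"
        else:
            # A real agent would retry or replan here; this sketch only flags it.
            status[name] = "needs retry"
    return results, status


def synthesize(results: dict[str, str], status: dict[str, str]) -> str:
    """Combine step results into one deliverable and note any gaps."""
    lines = [f"{name}: {output}" for name, output in results.items()]
    gaps = [name for name, state in status.items() if state != "complete"]
    if gaps:
        lines.append("Gaps: " + ", ".join(gaps))
    return "\n".join(lines)


# Phase 1 (the plan) is written by hand here; an agent would generate it.
plan = [
    ("find pricing", lambda: "3 tiers found"),
    ("find reviews", lambda: ""),  # simulates an unsatisfactory result
]
results, status = plan_then_execute(plan, lambda name, out: bool(out))
report = synthesize(results, status)
print(report)
```

Keeping the status table separate from the results makes it easy to report which steps are complete and which still need attention.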
Adaptive Replanning
Plans rarely survive contact with reality. Agents need to adapt:
Trigger 1: New information. While researching, the agent discovers the market has a key player that wasn’t in the original plan. It should add research on that player.
Trigger 2: Failed step. The API for getting pricing data is down. The agent should switch to web scraping or manual search.
Trigger 3: Better approach discovered. Midway through individual competitor research, the agent finds a comprehensive industry report that covers all competitors. It should pivot to using that report instead of individual searches.
Trigger 4: Scope realization. The agent realizes the task is larger than expected. It should flag this to the user rather than producing shallow results.
REPLANNING RULES:
- If you discover information that changes the plan, update the plan before continuing
- If a step fails twice with the same approach, try a fundamentally different approach
- If you discover the task is significantly larger than anticipated, inform the user with an updated scope estimate
- Never silently skip a step — either complete it, find an alternative, or explain why it's unnecessary
- After replanning, briefly explain what changed and why
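The "fails twice, then switch" rule and the "never silently skip" rule can also be enforced in code around the agent. The following is a rough sketch, not a prescribed implementation: `run_with_fallbacks` and the two example approaches are hypothetical, and failures are modeled as ordinary Python exceptions.

```python
from typing import Callable, Optional


def run_with_fallbacks(
    step: str,
    approaches: list[tuple[str, Callable[[], str]]],
    attempts_per_approach: int = 2,
) -> tuple[Optional[str], list[str]]:
    """Try each approach up to N times, logging every change of plan."""
    log: list[str] = []
    for name, approach in approaches:
        for attempt in range(1, attempts_per_approach + 1):
            try:
                result = approach()
            except Exception as exc:
                log.append(f"{step}: {name} attempt {attempt} failed ({exc})")
                continue
            log.append(f"{step}: {name} succeeded on attempt {attempt}")
            return result, log
        log.append(
            f"{step}: giving up on {name} after {attempts_per_approach} failed attempts"
        )
    # Never skip silently: record that the step could not be completed.
    log.append(f"{step}: all approaches failed, reporting to the user")
    return None, log


def pricing_api() -> str:
    raise ConnectionError("pricing API is down")  # simulated failure


def web_search() -> str:
    return "pricing found via search"  # simulated fallback result


result, log = run_with_fallbacks(
    "get pricing", [("pricing_api", pricing_api), ("web_search", web_search)]
)
for entry in log:
    print(entry)
```

The log doubles as the short explanation of what changed and why, which the last replanning rule asks for.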
Chain-of-Thought Reasoning
Agents often reason better when they think out loud. Chain-of-thought prompting makes the agent’s reasoning visible and tends to improve accuracy on multi-step problems:
REASONING PROTOCOL:
For each decision, think through it explicitly:
THINKING: [What am I trying to figure out? What are my options? What are the tradeoffs?]
DECISION: [What I'll do and why]
ACTION: [Execute the decision]
RESULT: [What happened]
ASSESSMENT: [Did it work? What did I learn?]
This explicit reasoning helps you make better decisions and makes it possible for the user to understand and audit your work.
Visible reasoning has two benefits: the agent makes better decisions (articulating reasoning catches flawed logic), and you can audit the agent’s work (if the reasoning is wrong, you can see exactly where).
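Because the reasoning protocol uses fixed labels, an audit can start with a simple parser that splits a trace into its five fields and reports any that are missing. This is a sketch that assumes the agent emits each label at the start of a line exactly as written in the protocol; `parse_trace`, `missing_fields`, and the sample trace are hypothetical.

```python
import re

FIELDS = ("THINKING", "DECISION", "ACTION", "RESULT", "ASSESSMENT")
LABEL = re.compile(r"^(" + "|".join(FIELDS) + r"):\s*(.*)$")


def parse_trace(text: str) -> dict[str, str]:
    """Split one reasoning trace into its labeled fields."""
    entry: dict[str, str] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = LABEL.match(line)
        if match:
            current = match.group(1)
            entry[current] = match.group(2)
        elif current and line:
            # Continuation line: append it to the field being read.
            entry[current] += " " + line
    return entry


def missing_fields(entry: dict[str, str]) -> list[str]:
    """List protocol fields that are absent or empty."""
    return [name for name in FIELDS if not entry.get(name)]


# Made-up trace for illustration; the ASSESSMENT field is deliberately absent.
trace = """THINKING: I need pricing data. Options are the API or a web search.
The API is faster but may be rate limited.
DECISION: Use the API first because it returns structured data.
ACTION: Call the pricing API.
RESULT: Received 5 pricing tiers."""

entry = parse_trace(trace)
print(missing_fields(entry))
```

A check like this catches traces where the agent acted without assessing the result, which is exactly where an audit would want to look first.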
Planning for Parallel Execution
When sub-tasks are independent, planning for parallel execution saves time:
Given this plan:
1. Research Competitor A pricing
2. Research Competitor B pricing
3. Research Competitor C pricing
4. Compare all three (depends on 1, 2, 3)
5. Write recommendation (depends on 4)
Identify:
- Which steps can run in parallel? (1, 2, 3)
- Which steps must be sequential? (4 after 1, 2, and 3; 5 after 4)
- What's the critical path? (The longest chain of dependent steps)
- How to handle the case where one parallel step finishes before others
Multi-agent systems (Lesson 7) take this further by assigning parallel steps to different agents.
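The five-step plan above can be analyzed mechanically. The sketch below groups steps into batches that could run in parallel and computes the length of the critical path, using `graphlib.TopologicalSorter` from the Python standard library (3.9+). It assumes every step takes the same amount of time, which a real plan would not guarantee; the function names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the set of steps it depends on (from the plan above).
plan = {1: set(), 2: set(), 3: set(), 4: {1, 2, 3}, 5: {4}}


def parallel_batches(deps: dict[int, set[int]]) -> list[list[int]]:
    """Group steps into batches; steps within a batch are independent."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        batches.append(ready)
        sorter.done(*ready)                 # mark the whole batch finished
    return batches


def critical_path_length(deps: dict[int, set[int]]) -> int:
    """Length of the longest dependency chain, assuming unit-time steps."""
    depth: dict[int, int] = {}
    for step in TopologicalSorter(deps).static_order():
        depth[step] = 1 + max((depth[d] for d in deps[step]), default=0)
    return max(depth.values())


batches = parallel_batches(plan)
print(batches)                     # [[1, 2, 3], [4], [5]]
print(critical_path_length(plan))  # 3
```

Waiting for a whole batch before starting the next is the simplest way to handle a parallel step that finishes early. Calling `done` on each step as soon as it completes would instead release dependents the moment their own dependencies are met.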
Exercise: Design a Planning Agent
Build an agent with explicit planning capabilities:
- Write a system prompt that includes the planning protocol, replanning rules, and chain-of-thought reasoning sections above
- Give the agent this complex task: “Research the top 5 AI coding assistants, compare their features and pricing, identify which is best for a 10-person startup team, and produce a recommendation report”
- Run the agent and observe its plan
- Note where the agent replans (if it does)
- Evaluate: Did the plan improve the result quality compared to an agent without planning?
Key Takeaways
- Task decomposition breaks complex goals into specific, verifiable, ordered sub-tasks
- The plan-then-execute pattern separates thinking from doing, preventing aimless tool use
- Adaptive replanning lets agents adjust when new information, failures, or better approaches emerge
- Chain-of-thought reasoning makes the agent’s decisions visible and auditable, improving both quality and trust
- Parallel planning identifies independent sub-tasks that can run simultaneously
- Planning transforms agents from scattered to methodical, reducing tool calls and improving output quality
Up Next: In the next lesson, we’ll tackle the critical topic of safety — guardrails, human-in-the-loop checkpoints, and how to prevent agents from causing harm.