Multi-Step Reasoning and Planning

🔄 Quick Recall: In the last lesson, you gave agents tools — web search, code execution, file access. But tools without a plan are like power tools without a blueprint. This lesson teaches agents to plan strategically and adapt when reality doesn’t match expectations.

The Planning Problem

Give an agent a complex task without planning and it will thrash. It searches randomly, follows tangents, doubles back on work it’s already done, and produces scattered results.

Give the same agent a planning strategy and it works methodically — breaking the task into steps, executing each one, checking progress, and adjusting when needed.

Planning can be the difference between an agent that burns 20 tool calls going in circles and one that finishes in 8 focused steps.

Task Decomposition

The first planning skill: break big goals into small, concrete sub-tasks.

I need an agent to accomplish this goal:
"Analyze the competitive landscape for AI-powered writing tools and produce a strategy recommendation."

Decompose this into sub-tasks. For each sub-task:
1. What specifically needs to be done
2. What tool(s) the agent would use
3. What the expected output is
4. How to verify the output is complete and correct
5. Which sub-tasks depend on others (must happen after)
6. Which sub-tasks are independent (can happen in parallel)

Good decomposition produces sub-tasks that are:

- Specific: “Find pricing for Jasper AI” not “look at competitors”
- Verifiable: You can check if the output is correct
- Independent where possible: Researching Company A doesn’t block researching Company B
- Ordered where necessary: Analysis comes after data gathering
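
The properties above map naturally onto a small data structure. The following is a minimal sketch, not a prescribed format: the SubTask class, its field names, and the example tasks are all hypothetical, and it assumes Python 3.9+ for the standard-library graphlib module.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class SubTask:
    """One unit of work in a decomposed plan (field names are illustrative)."""
    name: str
    action: str            # what specifically needs to be done
    tools: list[str]       # tools the agent would use
    expected_output: str   # a concrete, checkable result
    depends_on: list[str] = field(default_factory=list)


def execution_order(tasks: list[SubTask]) -> list[str]:
    """Return task names in an order that respects every dependency."""
    graph = {t.name: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical example tasks; the two pricing tasks are independent.
tasks = [
    SubTask("pricing_a", "Find pricing for Competitor A", ["web_search"],
            "table of pricing tiers"),
    SubTask("pricing_b", "Find pricing for Competitor B", ["web_search"],
            "table of pricing tiers"),
    SubTask("compare", "Compare pricing across competitors", ["code_execution"],
            "comparison table", depends_on=["pricing_a", "pricing_b"]),
]

order = execution_order(tasks)
print(order)
```

The two pricing tasks may come out in either order because neither blocks the other, while the comparison always comes last.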

✅ Quick Check: Why should sub-tasks be independently verifiable?

Because the agent needs to evaluate its own progress. If a sub-task has a clear expected output (“a table of 5 competitor pricing tiers”), the agent can check whether it achieved that result before moving on. Vague sub-tasks (“understand the market”) can’t be verified, so the agent doesn’t know when to stop.

Plan-Then-Execute Pattern

The most reliable planning strategy for agents:

Phase 1: Plan. The agent analyzes the goal, breaks it into sub-tasks, orders them by dependency, and writes a numbered plan.

Phase 2: Execute. The agent works through the plan step by step. After each step, it checks: “Did this step produce the expected result? Should I adjust the remaining plan?”

Phase 3: Synthesize. Once all steps are complete, the agent combines the results into the final deliverable.

Add this to your system prompt:

PLANNING PROTOCOL:
Before taking any action, create a numbered plan:
1. Break the goal into 5-8 specific sub-tasks
2. Note which tools each sub-task needs
3. Identify dependencies (what must complete before what)
4. Estimate the number of tool calls per sub-task

Then execute the plan:
- After each sub-task, evaluate the result
- If the result is unsatisfactory, retry with a different approach
- If new information changes the plan, update it before continuing
- Track which steps are complete, in progress, and remaining

After execution:
- Combine all sub-task results into the final deliverable
- Verify the deliverable addresses the original goal
- Note any gaps or areas of uncertainty
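
If you orchestrate the agent from code rather than relying on the prompt alone, the three phases can be sketched roughly as follows. This is an illustration under stated assumptions: each step's run and check callables are hypothetical stand-ins for real model and tool calls, and the lambdas return canned data.

```python
from typing import Callable

# Each step is (description, run, check). All three are hypothetical
# stand-ins for real LLM and tool calls.
Step = tuple[str, Callable[[], str], Callable[[str], bool]]


def execute_plan(steps: list[Step]) -> dict:
    """Run steps in order, verifying each result before moving on."""
    status = {desc: "remaining" for desc, _, _ in steps}
    results: dict[str, str] = {}
    for desc, run, check in steps:
        status[desc] = "in progress"
        output = run()
        if check(output):
            results[desc] = output
            status[desc] = "complete"
        else:
            status[desc] = "needs replanning"
            break  # stop and revise the plan rather than build on a bad result
    return {"status": status, "results": results}


def synthesize(results: dict[str, str]) -> str:
    """Combine sub-task results into one deliverable (here, a simple join)."""
    return "\n".join(f"{desc}: {out}" for desc, out in results.items())


plan: list[Step] = [
    ("List competitors", lambda: "A, B, C", lambda out: len(out.split(",")) >= 3),
    ("Gather pricing", lambda: "A=$10, B=$20, C=$15", lambda out: "$" in out),
]

outcome = execute_plan(plan)
report = synthesize(outcome["results"])
print(outcome["status"])
print(report)
```

Stopping at the first failed check is one design choice among several; it keeps later steps from building on a result the agent has already judged unsatisfactory.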

Adaptive Replanning

Plans rarely survive contact with reality. Agents need to adapt:

Trigger 1: New information. While researching, the agent discovers the market has a key player that wasn’t in the original plan. It should add research on that player.

Trigger 2: Failed step. The API for getting pricing data is down. The agent should switch to web scraping or manual search.

Trigger 3: Better approach discovered. Midway through individual competitor research, the agent finds a comprehensive industry report that covers all competitors. It should pivot to using that report instead of individual searches.

Trigger 4: Scope realization. The agent realizes the task is larger than expected. It should flag this to the user rather than producing shallow results.

REPLANNING RULES:
- If you discover information that changes the plan, update the plan before continuing
- If a step fails twice with the same approach, try a fundamentally different approach
- If you discover the task is significantly larger than anticipated, inform the user with an updated scope estimate
- Never silently skip a step — either complete it, find an alternative, or explain why it's unnecessary
- After replanning, briefly explain what changed and why
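
The "fails twice, then switch" rule can also be enforced in orchestration code. The sketch below is one possible shape, assuming each approach is a callable that returns None on failure; the function names and the simulated outage are hypothetical.

```python
from typing import Callable, Optional


def run_with_fallbacks(
    step: str,
    approaches: list[tuple[str, Callable[[], Optional[str]]]],
    max_attempts: int = 2,
) -> tuple[Optional[str], list[str]]:
    """Try each approach up to max_attempts times, then switch to the next.

    Every change of approach is logged, so no step is skipped silently.
    """
    log: list[str] = []
    for name, attempt in approaches:
        for n in range(1, max_attempts + 1):
            result = attempt()
            if result is not None:
                log.append(f"{step}: succeeded via {name} on attempt {n}")
                return result, log
        log.append(f"{step}: {name} failed {max_attempts} times, switching approach")
    log.append(f"{step}: all approaches failed, escalating to the user")
    return None, log


# Hypothetical approaches: the pricing API is down, the web search works.
def pricing_api() -> Optional[str]:
    return None  # simulates an outage


def web_search() -> Optional[str]:
    return "Pro plan: $20/month"


result, log = run_with_fallbacks(
    "Get pricing", [("pricing API", pricing_api), ("web search", web_search)]
)
print(result)
print(log)
```

The returned log doubles as the brief explanation of what changed and why.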

Chain-of-Thought Reasoning

Agents often reason better when they think out loud. Chain-of-thought prompting asks the agent to write out its reasoning before acting, which makes that reasoning visible and tends to improve accuracy:

REASONING PROTOCOL:
For each decision, think through it explicitly:

THINKING: [What am I trying to figure out? What are my options? What are the tradeoffs?]
DECISION: [What I'll do and why]
ACTION: [Execute the decision]
RESULT: [What happened]
ASSESSMENT: [Did it work? What did I learn?]

This explicit reasoning helps you make better decisions and makes it possible for the user to understand and audit your work.

Visible reasoning has two benefits: the agent makes better decisions (articulating reasoning catches flawed logic), and you can audit the agent’s work (if the reasoning is wrong, you can see exactly where).
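
Auditing is easier if the labeled fields are parsed into a structured record. Here is a minimal sketch, assuming the agent emits each label at the start of a line exactly as written in the protocol; the sample trace is invented for illustration.

```python
import re

FIELDS = ("THINKING", "DECISION", "ACTION", "RESULT", "ASSESSMENT")


def parse_trace(text: str) -> dict[str, str]:
    """Split one reasoning block into its labeled fields."""
    pattern = re.compile(rf"^({'|'.join(FIELDS)}):\s*", re.MULTILINE)
    parts = pattern.split(text)
    # parts looks like [preamble, label1, body1, label2, body2, ...]
    return {label: body.strip() for label, body in zip(parts[1::2], parts[2::2])}


def missing_fields(trace: dict[str, str]) -> list[str]:
    """Report which fields the agent left out or left empty."""
    return [name for name in FIELDS if not trace.get(name)]


# Invented sample output; note that the ASSESSMENT field is absent.
sample = """THINKING: The pricing API is down. I can retry or search the web.
DECISION: Search the web, since the API already failed twice.
ACTION: Run a web search for Competitor A pricing.
RESULT: Found the official pricing page.
"""

trace = parse_trace(sample)
gaps = missing_fields(trace)
print(trace["DECISION"])
print(gaps)
```

A check like missing_fields gives you a cheap signal that the agent skipped part of the protocol, such as acting without assessing the result.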

Planning for Parallel Execution

When sub-tasks are independent, planning for parallel execution saves time:

Given this plan:
1. Research Competitor A pricing
2. Research Competitor B pricing
3. Research Competitor C pricing
4. Compare all three (depends on 1, 2, 3)
5. Write recommendation (depends on 4)

Identify:
- Which steps can run in parallel? (1, 2, 3)
- Which steps must be sequential? (4 after 1-2-3, 5 after 4)
- What's the critical path? (The longest chain of dependent steps)
- How should the agent handle one parallel step finishing before the others?
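
The same questions can be answered mechanically once the dependencies are written down. The sketch below encodes the five-step plan as a dependency map and derives the parallel waves and one critical path. It assumes Python 3.9+ for graphlib, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter

# Step number -> the steps it depends on (mirrors the five-step plan above).
plan = {1: set(), 2: set(), 3: set(), 4: {1, 2, 3}, 5: {4}}


def parallel_batches(deps: dict[int, set[int]]) -> list[list[int]]:
    """Group steps into waves; every step in a wave can run at the same time."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


def critical_path(deps: dict[int, set[int]]) -> list[int]:
    """Return one longest chain of dependent steps (ties broken arbitrarily)."""
    longest: dict[int, list[int]] = {}
    for step in TopologicalSorter(deps).static_order():
        prefix = max((longest[d] for d in deps.get(step, ())), key=len, default=[])
        longest[step] = prefix + [step]
    return max(longest.values(), key=len)


batches = parallel_batches(plan)
path = critical_path(plan)
print(batches)
print(path)
```

Here the critical path counts steps only. If steps differ widely in duration, weighting each step by an estimated time would give a more useful answer.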

Multi-agent systems (Lesson 7) take this further by assigning parallel steps to different agents.

Exercise: Design a Planning Agent

Build an agent with explicit planning capabilities:

1. Write a system prompt that includes the planning protocol, replanning rules, and chain-of-thought reasoning sections above
2. Give the agent this complex task: “Research the top 5 AI coding assistants, compare their features and pricing, identify which is best for a 10-person startup team, and produce a recommendation report”
3. Run the agent and observe its plan
4. Note where the agent replans (if it does)
5. Evaluate: Did the plan improve the result quality compared to an agent without planning?

Key Takeaways

- Task decomposition breaks complex goals into specific, verifiable, ordered sub-tasks
- The plan-then-execute pattern separates thinking from doing, preventing aimless tool use
- Adaptive replanning lets agents adjust when new information, failures, or better approaches emerge
- Chain-of-thought reasoning makes the agent’s decisions visible and auditable, improving both quality and trust
- Parallel planning identifies independent sub-tasks that can run simultaneously
- Planning transforms agents from scattered to methodical, reducing tool calls and improving output quality

Up Next: In the next lesson, we’ll tackle the critical topic of safety — guardrails, human-in-the-loop checkpoints, and how to prevent agents from causing harm.