Imagine you are packing for a two-week trip. You could just start grabbing stuff and shoving it in a suitcase. Maybe you will get lucky and have everything you need. More likely, you will forget your toothbrush and pack three jackets you never wear.
A better approach: think about where you are going, what the weather will be like, what activities you have planned, and then make a packing list. Check items off as you pack them. If you find out halfway through that you do not have enough room, you re-prioritize - drop the “just in case” items and keep the essentials.
That is what planning does for an AI agent. Instead of reacting to each step blindly, the agent thinks ahead, makes a plan, and works through it systematically. And when the plan hits a snag, a good agent adjusts rather than plowing forward with a broken approach.
Simple tasks do not need planning. If someone asks “What time is it?”, the agent just checks the clock. No plan required.
But consider a request like: “Research the top 5 competitors in the cloud database market, compare their pricing and features, and write a recommendation for which one we should use for our new project.”
This task requires:

- Researching five separate competitors
- Gathering comparable pricing and feature data for each one
- Weighing the options against the needs of the new project
- Synthesizing everything into a written recommendation
Without planning, the agent might start researching one competitor in extreme detail, run out of context space, and forget to cover the other four. Or it might jump straight to a recommendation without doing thorough research.
Planning gives the agent a roadmap. It knows where it is going, what steps come next, and how to allocate its time and resources.
| Benefit | Without planning | With planning |
|---|---|---|
| Task completion | May miss steps or get stuck | Systematic coverage of all steps |
| Resource efficiency | Wastes tokens on tangents | Allocates effort where it matters |
| Transparency | Hard to see what the agent is doing | Clear progress tracking |
| Error recovery | Gets lost when things go wrong | Can identify where the plan broke and adjust |
| Parallelism | Everything is sequential | Independent steps can run concurrently |
At the heart of every planning agent is a problem-solving loop. Different frameworks describe this differently, but the core steps are consistent:
+-------------------------------------------------------------------+
| |
| +------------+ +------------+ +---------+ |
| | Get Mission| --> | Scan Scene | --> | Think | |
| +------------+ +------------+ +----+----+ |
| | |
| v |
| +---------+ +-----+-----+ |
| | Observe | <-- | Act | |
| +----+----+ +-----------+ |
| | |
| v |
| +------+------+ |
| | Goal met? | |
| | Yes -> Done | |
| | No -> Think | |
| +-------------+ |
| |
+-------------------------------------------------------------------+
1. Get Mission
The agent receives its task. This could be a user request, a triggered event, or a delegated subtask from another agent.
Mission: "Create a summary report of our team's GitHub activity this week."
2. Scan Scene
The agent gathers context about its current situation. What tools are available? What information does it already have? What constraints exist?
Available tools: github_api, document_writer, calendar_api
Known context: Team = "platform-eng", Current date = March 15
Constraints: Report should be under 2 pages
3. Think
The agent reasons about what to do next. This is where planning happens - the agent considers its options and decides on a course of action.
Thought: "I need to:
1. Get the list of team members
2. For each member, fetch their commits, PRs, and reviews this week
3. Aggregate the data
4. Write a summary highlighting key contributions
I will start by getting the team member list."
4. Act
The agent executes an action - calling a tool, generating text, or making a decision.
Action: github_api.get_team_members(team="platform-eng")
5. Observe
The agent examines the result of its action. Did it succeed? What information did it provide? Does the plan need to change?
Observation: Team has 8 members: [alice, bob, carol, dave, eve, frank, grace, hank]
The loop then returns to Think, where the agent decides the next step based on what it observed. This continues until the mission is complete.
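The loop above can be sketched in a few lines of Python. This is an illustrative skeleton, not a framework API: `think` and `act` stand in for the LLM call and tool execution, which you would supply.

```python
def run_mission(mission, tools, think, act, max_iterations=20):
    """Minimal problem-solving loop: Think -> Act -> Observe until done."""
    # Scan Scene: gather initial context for the mission.
    context = {"mission": mission, "tools": list(tools), "history": []}

    for _ in range(max_iterations):          # guard against infinite loops
        decision = think(context)            # Think: choose the next action
        if decision["done"]:                 # Goal met? -> Done
            return decision["result"]
        observation = act(decision["action"], tools)   # Act, then Observe
        context["history"].append((decision["action"], observation))

    raise RuntimeError("Mission not completed within iteration budget")
```

Note the iteration budget: even this toy loop defends against the infinite-loop failure mode covered later in this lesson.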
There are two fundamental philosophies for how agents tackle tasks. Most real agents blend both, but understanding the pure forms helps you make design decisions.
How it works: The agent creates a complete (or near-complete) plan before taking any action. Then it executes the plan step by step.
User: "Migrate our application from Python 2 to Python 3."
Planning phase:
Step 1: Audit codebase for Python 2-specific syntax
Step 2: Identify incompatible dependencies
Step 3: Update dependencies to Python 3-compatible versions
Step 4: Convert print statements to print functions
Step 5: Fix string/unicode handling
Step 6: Update integer division operators
Step 7: Run test suite and fix failures
Step 8: Update CI/CD pipeline to use Python 3
Execution phase:
[Execute steps 1-8 in order]
Strengths:
- The full scope is visible upfront, making progress easy to track and communicate
- Independent steps can be identified and run in parallel
- Efficient when the task is well understood and predictable
Weaknesses:
- Brittle when reality differs from assumptions; an early wrong assumption propagates through every later step
- Cannot incorporate what execution reveals without stopping to re-plan
- Planning effort is wasted if the plan must be discarded
How it works: The agent takes things one step at a time. It looks at the current situation, decides the best next action, takes it, and reassesses. This is essentially the ReAct pattern from Lesson 4.
User: "Migrate our application from Python 2 to Python 3."
Step 1:
Think: "First, I should understand the codebase. Let me scan for Python 2 syntax."
Act: Search for Python 2 patterns
Observe: Found 47 print statements, 12 unicode issues, 3 deprecated imports
Step 2:
Think: "The print statements are the most common issue. Let me start there."
Act: Convert print statements
Observe: 47 print statements converted
Step 3:
Think: "Now let me tackle the unicode issues..."
[continues step by step]
Strengths:
- Adapts immediately to whatever it discovers
- No upfront planning cost; makes progress from the first step
- Well suited to exploratory tasks with high uncertainty
Weaknesses:
- Can lose sight of the overall goal and drift onto tangents
- May do work in a suboptimal order or repeat effort
- Hard to parallelize, and hard to report progress against
Most production agents use a hybrid approach:
High-level plan (light):
1. Audit the codebase
2. Fix syntax issues
3. Update dependencies
4. Run tests and fix failures
Executing step 1 (reactive):
Think -> Act -> Observe -> Think -> Act -> Observe -> Step complete
Re-plan after step 1:
"The audit revealed more issues than expected. Adding step 2b for
database migration code that uses a deprecated library."
This gives you the structure of planning with the adaptability of reactive execution.
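The hybrid shape can be sketched as an outer loop over high-level plan steps, with reactive execution inside each step and a re-planning hook between steps. All three callbacks here are assumptions standing in for LLM-driven components.

```python
def run_hybrid(goal, make_plan, execute_step, revise_plan):
    """Hybrid orchestration: light upfront plan, reactive execution
    within each step, and a re-planning hook after each step."""
    plan = make_plan(goal)        # high-level plan: a list of step names
    completed = []

    while plan:
        step = plan.pop(0)
        result = execute_step(step)       # inner ReAct loop lives here
        completed.append((step, result))
        # Re-plan: the hook may insert, drop, or reorder remaining steps
        # based on what execution has revealed so far.
        plan = revise_plan(goal, completed, plan)

    return completed
```

In the migration example, `revise_plan` is where "the audit revealed more issues than expected" turns into an inserted database-migration step.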
| Situation | Favor planning | Favor reactive |
|---|---|---|
| Task is well-understood | Yes | |
| Task is exploratory | | Yes |
| Multiple independent subtasks | Yes | |
| High uncertainty about what you will find | | Yes |
| Need to communicate progress | Yes | |
| Speed matters more than thoroughness | | Yes |
| Failure is costly | Yes | |
| Task is simple (fewer than 3 steps) | | Yes |
Some tasks are too complex to plan in a single flat list. Hierarchical planning breaks a big goal into sub-goals, which are further broken into sub-sub-goals - like a work breakdown structure in project management.
Think about how a software project is organized:
Epic: Launch new payment system
|
+-- Story: Design payment API
| +-- Task: Define endpoints
| +-- Task: Design data models
| +-- Task: Write API specification
|
+-- Story: Implement payment processing
| +-- Task: Integrate with payment gateway
| +-- Task: Handle error cases
| +-- Task: Add retry logic
|
+-- Story: Add monitoring and alerts
+-- Task: Set up logging
+-- Task: Define SLOs
+-- Task: Configure alert rules
An agent uses the same structure. For example, “Write a blog post about Kubernetes networking” breaks into:
Sub-goal 1: Research
Task 1.1: Identify key networking concepts
Task 1.2: Find recent developments and best practices
Sub-goal 2: Outline and Write
Task 2.1: Create section structure
Task 2.2: Write each section with code examples
Sub-goal 3: Review
Task 3.1: Check technical accuracy
Task 3.2: Verify code examples work
| Level | What it describes | Example |
|---|---|---|
| Goal | What success looks like | “Deploy the application to production” |
| Sub-goal | Major phases of work | “Prepare the environment” |
| Task | Specific actions | “Create the Cloud Run service” |
| Step | Atomic operations | “Run gcloud run deploy” |
Agents typically plan at the sub-goal and task level. Steps are handled by ReAct-style execution within each task.
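A hierarchical plan is just a tree. A minimal sketch (the node type and traversal are illustrative, not a standard API): each node is a goal, sub-goal, or task, and finding the next thing to do is a depth-first search for the first pending leaf.

```python
from dataclasses import dataclass, field

@dataclass
class PlanNode:
    """One node in a hierarchical plan: goal, sub-goal, or task."""
    name: str
    status: str = "pending"        # pending | in_progress | done
    children: list = field(default_factory=list)

    def next_task(self):
        """Depth-first search for the first pending leaf (an executable task)."""
        if not self.children:
            return self if self.status == "pending" else None
        for child in self.children:
            found = child.next_task()
            if found:
                return found
        return None

# The blog-post example from above, as a plan tree:
blog_post = PlanNode("Write a blog post about Kubernetes networking", children=[
    PlanNode("Research", children=[
        PlanNode("Identify key networking concepts"),
        PlanNode("Find recent developments and best practices"),
    ]),
    PlanNode("Outline and Write", children=[
        PlanNode("Create section structure"),
        PlanNode("Write each section with code examples"),
    ]),
])
```

As tasks are marked `done`, `next_task()` naturally advances through sub-goals in order, which is exactly the behavior the work-breakdown structure implies.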
No plan survives first contact with reality. Good agents detect when a plan is failing and adjust.
| Trigger | Example | Response |
|---|---|---|
| Task failure | API returns an error | Try alternative approach or skip and revisit |
| New information | Discovered the database uses a different schema than expected | Update the plan to account for the actual schema |
| Changed requirements | User adds a new requirement mid-task | Incorporate the new requirement into the plan |
| Resource constraints | Running low on context space or API quota | Simplify remaining steps |
| Blocked dependency | A required service is down | Reorder tasks to work on unblocked items first |
1. Local adjustment
Fix the immediate problem without changing the overall plan.
Original plan step: "Query the users table for active accounts"
Failure: "Table 'users' does not exist"
Adjustment: "Query the 'accounts' table instead (it has the same data)"
2. Step insertion
Add new steps to handle an unexpected situation.
Original plan:
1. Fetch data
2. Process data
3. Generate report
After discovering data needs cleaning:
1. Fetch data
1b. Clean and validate data <-- inserted
2. Process data
3. Generate report
3. Plan revision
Significantly restructure the remaining plan based on what you have learned.
Original plan: "Migrate the monolith to microservices"
After analysis: "The codebase is too tightly coupled for a direct migration."
Revised plan: "First, introduce module boundaries within the monolith
(strangler fig pattern), then gradually extract services."
4. Goal modification
In extreme cases, the agent recognizes the original goal is not achievable and proposes a revised goal. For example, if the target was “reduce latency to under 50ms” but the database query alone takes 80ms, the agent might propose targeting 100ms with caching for frequently accessed endpoints.
The key principle: try a local fix first, and only restructure the remaining plan if the local fix is not possible.
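That escalation principle can be encoded directly: try recovery strategies in order from least to most disruptive, and stop at the first one that produces a workable plan. The strategy functions here are hypothetical placeholders.

```python
def recover(failure, plan, strategies):
    """Try recovery strategies from least to most disruptive:
    local adjustment -> step insertion -> plan revision -> goal change.
    Each strategy returns a new plan, or None if it cannot help."""
    for strategy in strategies:        # ordered least-disruptive first
        new_plan = strategy(failure, plan)
        if new_plan is not None:
            return new_plan
    raise RuntimeError(f"No recovery strategy could handle: {failure}")
```

The ordering matters: a local fix preserves all the work already encoded in the plan, while a full revision throws some of it away.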
Planning requires reasoning - the ability to think through problems logically. Several techniques help agents reason more effectively.
What it is: The agent works through a problem step by step, showing its reasoning at each stage. Instead of jumping to an answer, it “thinks out loud.”
Analogy: Showing your work on a math test. Instead of writing just the answer, you write each step of the calculation.
Without Chain-of-Thought:
Q: "If a server handles 100 requests/second and we need to handle 10,000
requests/second, how many servers do we need?"
A: "100 servers"
With Chain-of-Thought:
Q: [same question]
A: "Let me work through this:
- One server handles 100 requests/second
- We need 10,000 requests/second total
- 10,000 / 100 = 100 servers for the raw capacity
- But we should account for redundancy. With N+1 redundancy, we need 101.
- And servers do not run at 100% efficiency. At 80% target utilization:
10,000 / (100 * 0.8) = 125 servers
- With N+1 redundancy: 126 servers
Answer: 126 servers for production, assuming 80% target utilization
and N+1 redundancy."
The step-by-step reasoning catches nuances (redundancy, utilization) that a snap answer misses.
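The arithmetic in that reasoning chain is easy to check in code. A small helper that makes the two nuances (target utilization and N+1 redundancy) explicit parameters:

```python
import math

def servers_needed(required_rps, per_server_rps,
                   target_utilization=0.8, spare_servers=1):
    """Capacity estimate: divide required throughput by effective
    per-server throughput, round up, then add spare capacity (N+1)."""
    effective_rps = per_server_rps * target_utilization
    return math.ceil(required_rps / effective_rps) + spare_servers

servers_needed(10_000, 100)   # 126, matching the chain-of-thought answer
```

The naive answer falls out of the same function by disabling both nuances: `servers_needed(10_000, 100, target_utilization=1.0, spare_servers=0)` gives 100.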
When to use CoT:
- Multi-step calculations or quantitative estimates
- Debugging and root-cause analysis
- Any decision where the intermediate assumptions should be visible and checkable
What it is: Instead of following a single chain of reasoning, the agent explores multiple possible paths and evaluates which one is most promising. Think of it as brainstorming several approaches before committing to one.
Analogy: A chess player considering several possible moves, thinking a few steps ahead for each one, and then choosing the best path.
Problem: "Our API is timing out under load. How should we fix it?"
Branch 1: Add caching
-> Evaluate: "Would reduce database load by ~60%. Implementation
takes 2 days. Risk: cache invalidation complexity."
Score: 7/10
Branch 2: Optimize database queries
-> Evaluate: "Could improve query time by ~40%. Implementation
takes 3 days. Risk: low, well-understood approach."
Score: 6/10
Branch 3: Add horizontal scaling
-> Evaluate: "Handles any load level. Implementation takes 1 day
with Kubernetes. Risk: increases infrastructure cost."
Score: 8/10
Decision: Start with horizontal scaling (fastest win), then add
caching for long-term efficiency.
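The skeleton of Tree-of-Thoughts is "generate candidates, evaluate each, commit to the best." A minimal sketch, where `generate_branches` and `evaluate` are assumed to be LLM-backed in a real agent:

```python
def tree_of_thoughts(problem, generate_branches, evaluate, top_k=1):
    """Generate several candidate approaches, score each one,
    and commit to the most promising."""
    branches = generate_branches(problem)
    scored = [(evaluate(problem, branch), branch) for branch in branches]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [branch for _, branch in scored[:top_k]]
```

Setting `top_k=2` would reproduce the decision above, which commits to the best branch while keeping the runner-up (caching) as a follow-on.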
When to use ToT:
- Strategic decisions with several plausible approaches
- Problems where the first idea is often not the best one
- Expensive or hard-to-reverse choices that justify the extra reasoning cost
What it is: The agent solves the same problem multiple times using different reasoning paths, then checks if the answers agree. If most paths lead to the same conclusion, confidence is high. If they disagree, more investigation is needed.
Analogy: Asking three different mechanics about a car problem. If they all say “bad alternator,” you can be confident. If they each say something different, you need more diagnostics.
Problem: "Is this code change safe to deploy on a Friday?"
Reasoning path 1 (risk analysis):
"The change modifies the payment flow. Payment changes on Friday
mean weekend incidents. Verdict: No."
Reasoning path 2 (scope analysis):
"The change is 5 lines and has 95% test coverage. Small, well-tested
changes are low risk. Verdict: Yes, with monitoring."
Reasoning path 3 (historical analysis):
"Similar changes in the past have caused issues 15% of the time.
That is above our 10% threshold. Verdict: No."
Consensus: 2 out of 3 say No. Recommendation: Wait until Monday.
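Self-consistency reduces to a majority vote over independent reasoning paths. A sketch, with the agreement ratio doubling as a crude confidence signal (the reasoner functions stand in for separate LLM samples):

```python
from collections import Counter

def self_consistency(problem, reasoners):
    """Run several independent reasoning paths over the same problem
    and take the majority verdict. The agreement ratio serves as a
    rough confidence signal: low agreement means investigate further."""
    verdicts = [reason(problem) for reason in reasoners]
    verdict, count = Counter(verdicts).most_common(1)[0]
    return verdict, count / len(verdicts)
```

In the Friday-deploy example, the verdict is "No" with 2/3 agreement; a 1/3 three-way split would instead be the signal to run more diagnostics before deciding.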
When to use self-consistency:
- High-stakes decisions where a wrong answer is costly
- Validating a conclusion before taking an irreversible action
- Calibrating confidence: agreement signals reliability, disagreement signals the need for more investigation
| Technique | Approach | Strength | Cost | Best for |
|---|---|---|---|---|
| Chain-of-Thought | Step-by-step linear | Thoroughness | 1x (single pass) | Most reasoning tasks |
| Tree-of-Thoughts | Explore multiple paths | Finds best approach | 3-5x (multiple branches) | Strategic decisions |
| Self-consistency | Multiple independent attempts | Confidence calibration | 3-5x (multiple passes) | High-stakes validation |
The orchestration layer is the control system that manages how an agent plans, reasons, and executes. Think of it as the conductor of an orchestra - it does not play any instruments, but it decides who plays what and when.
+----------------------------------------------------------+
| ORCHESTRATION LAYER |
| |
| +----------+ +-----------+ +----------+ +--------+ |
| | Plan | | Execute | | Monitor | | Re-plan| |
| | Manager | | Engine | | & Eval | | Logic | |
| +----------+ +-----------+ +----------+ +--------+ |
| |
| Responsibilities: |
| - Break goals into executable steps |
| - Decide execution order and parallelism |
| - Route steps to the right tools or sub-agents |
| - Track progress and state |
| - Detect failures and trigger re-planning |
| - Manage context window usage |
| - Enforce guardrails and safety checks |
+----------------------------------------------------------+
In Google Cloud’s Vertex AI Agent Engine, much of this orchestration layer is managed for you: the service runs the execution loop, tracks session state, and scales the runtime.
The Agent Development Kit (ADK) gives you building blocks for customizing orchestration behavior. You define the agent’s tools, instructions, and behavior - the framework handles the execution loop.
| Decision | Options | Trade-off |
|---|---|---|
| How much to plan upfront | Full plan vs. next step only | Thoroughness vs. flexibility |
| When to re-plan | After every step vs. only on failure | Adaptability vs. overhead |
| How to handle failure | Retry, skip, abort, or re-plan | Resilience vs. cost |
| Sequential vs. parallel | One step at a time vs. concurrent | Simplicity vs. speed |
| How much reasoning to show | Internal only vs. exposed to user | Transparency vs. noise |
Even well-designed planning agents can fail in predictable ways. Knowing these failure modes helps you build defenses against them.
What happens: The agent gets stuck repeating the same action or re-planning endlessly without making progress.
Example:
Think: "I need to find the user's email. Let me search the database."
Act: search_database(query="user email")
Observe: No results found.
Think: "I need to find the user's email. Let me search the database."
Act: search_database(query="user email")
Observe: No results found.
[repeats forever]
Prevention:
- Set a maximum iteration count and stop when it is reached
- Detect repetition: if the same action keeps producing the same observation, force a different approach
- After repeated failures, escalate to the user instead of retrying
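Repetition detection is straightforward to implement: track how often each (action, observation) pair recurs and signal when a threshold is exceeded. A minimal sketch:

```python
class RepetitionGuard:
    """Detects when the agent repeats the same action and gets the same
    observation, which signals a stuck loop."""

    def __init__(self, max_repeats=2):
        self.max_repeats = max_repeats
        self.seen = {}

    def should_break(self, action, observation):
        """Record an (action, observation) pair; return True when the
        pair has recurred often enough that the agent should change
        strategy or escalate."""
        key = (repr(action), repr(observation))
        self.seen[key] = self.seen.get(key, 0) + 1
        return self.seen[key] > self.max_repeats
```

Wired into the problem-solving loop, a `True` result would trigger either a forced strategy change in the Think step or an escalation to the user.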
What happens: The agent gradually wanders away from the original goal, following interesting tangents instead of staying on track.
Example:
Original goal: "Write a summary of Q4 sales performance"
Step 1: Fetch Q4 sales data [on track]
Step 2: Notice an anomaly in November data [slightly off track]
Step 3: Deep dive into November anomaly [drifting]
Step 4: Research industry trends that might explain the anomaly [lost]
Step 5: Write a report about industry trends [completely off track]
Prevention:
- Keep the original goal visible in the agent's context at every step
- Periodically check each planned action against the goal: does this move me toward the deliverable?
- Time-box tangents: investigate anomalies only as far as the goal requires
What happens: The agent spends more time planning than it would take to just do the task.
Example:
User: "What is 2 + 2?"
Agent Plan:
Step 1: Identify the mathematical operation (addition)
Step 2: Identify the operands (2 and 2)
Step 3: Verify the operands are valid numbers
Step 4: Perform the addition
Step 5: Verify the result
Step 6: Format the response
[Just say "4"!]
Prevention:
- Estimate task complexity before planning, and skip planning entirely for trivial requests
- Reserve full planning for tasks with roughly three or more steps
- Cap planning effort relative to the expected execution effort
What happens: One failed step causes subsequent steps to fail in a chain reaction, and the agent does not detect the root cause.
Example:
Step 1: Fetch user profile -> Returns error (auth expired)
Step 2: Process user preferences -> Fails (no profile data)
Step 3: Generate recommendations -> Fails (no preferences)
Step 4: Format output -> Fails (no recommendations)
Agent: "I was unable to complete the task due to formatting errors."
[Wrong! The real problem was expired authentication in step 1]
Prevention:
- Fail fast: stop the chain as soon as a step fails rather than running its dependents
- Check each step's preconditions: did the inputs it needs actually arrive?
- On failure, trace back to the first failing step and report that as the root cause
What happens: The plan and its execution history fill up the context window, leaving no room for the agent to reason about the current step.
Prevention:
- Budget context: reserve space for reasoning about the current step, not just history
- Summarize completed steps instead of keeping full transcripts
- Store the plan compactly (step names and status flags, not full logs)
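The summarization idea can be sketched as a history-compaction helper: keep the most recent steps verbatim and collapse everything older into a single summary entry. The summarizer here is a placeholder; a real agent would typically use an LLM call.

```python
def compact_history(history, keep_recent=3, summarize=None):
    """Keep the most recent steps verbatim; collapse everything older
    into one summary entry to free up context space."""
    if len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    if summarize is None:
        # Placeholder summarizer; a real agent would call an LLM here.
        summarize = lambda steps: f"[{len(steps)} earlier steps completed]"
    return [summarize(older)] + recent
```

Running this after every few steps keeps the context footprint roughly constant no matter how long the mission runs.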
| Failure mode | Symptom | Key prevention |
|---|---|---|
| Infinite loops | Same action repeated | Max iteration count + repetition detection |
| Plan drift | Agent wanders off topic | Periodic goal alignment check |
| Over-planning | Excessive planning for simple tasks | Complexity estimation before planning |
| Cascading failures | Wrong root cause reported | Trace back on failure + fail fast |
| Context exhaustion | Agent loses coherence | Context budgeting + summarization |
Here is a condensed example showing how planning, reasoning, and re-planning combine:
User: "Analyze our error logs from the past week and create Jira tickets
for the top issues."
PLAN:
Phase 1: Query error logs (past 7 days, ERROR + FATAL severity)
Phase 2: Group by type, rank by frequency, identify top 5
Phase 3: For each issue, check for existing tickets, create if new
Phase 4: Summarize findings
EXECUTE (ReAct within each phase):
Act: log_query(severity=["ERROR","FATAL"], days=7)
Observe: 12,847 entries returned
Act: log_analyze(group_by="error_message", sort="count_desc")
Observe: Top issues identified (connection timeouts, null pointer, rate limits)
Act: jira_search(query="Connection timeout payment-service")
Observe: No existing ticket -> Create new ticket PLATFORM-1234
RE-PLAN (mid-execution):
Observe: PLATFORM-999 already exists for NullPointerException
Adjust: Update existing ticket with new occurrence count instead of creating duplicate
DELIVER:
"Analyzed 12,847 errors. Created 4 new tickets, updated 1 existing one."
Notice how the agent uses hierarchical planning upfront, ReAct-style execution within each phase, and re-planning when it discovers an existing ticket. All the patterns working together.
Planning transforms complex tasks into manageable steps. Without it, agents either miss important steps or waste effort on tangents.
The problem-solving loop (Mission, Scene, Think, Act, Observe) is the engine of every planning agent. Understanding this loop helps you design and debug agent behavior.
Plan-then-execute and reactive approaches are two ends of a spectrum. Most practical agents use a hybrid: light upfront planning with reactive execution.
Hierarchical planning handles complex tasks by breaking them into goals, sub-goals, tasks, and steps. This mirrors how project management works.
Re-planning is essential. Plans will break. Good agents detect failures early and adjust. The ability to adapt separates useful agents from fragile ones.
Reasoning techniques (CoT, ToT, self-consistency) make planning more effective. Chain-of-Thought for step-by-step logic, Tree-of-Thoughts for exploring alternatives, self-consistency for validating critical decisions.
Watch out for common failure modes. Infinite loops, plan drift, over-planning, cascading failures, and context exhaustion are predictable problems with known solutions.
The orchestration layer ties it all together. It manages the plan lifecycle, routes execution, handles errors, and keeps the agent on track.
Next lesson: Multi-Agent Systems