Plan Mode and Multi-Turn Strategies
Separating research from execution
The previous page covered scratchpads as working memory for active tasks. Plan mode goes further by separating entire phases of work (research and planning versus execution) into distinct modes with different capabilities.
Most agent failures trace back to premature execution. The agent reads a prompt, immediately starts writing code, and produces something that misses the actual requirement. Plan mode prevents this by restricting the agent to read-only operations until a plan exists and gets approved.
In Claude Code, plan mode allows reading files, searching with Glob and Grep, fetching web content, and analyzing architecture. It blocks creating, modifying, or deleting files. The agent cannot execute shell commands or make any changes to system state. This forces thoroughness before action.
Entering and exiting plan mode
Three methods activate plan mode:
```shell
# Start a new session in plan mode
claude --permission-mode plan

# In an active session, press Shift+Tab twice
# Normal → Auto-Accept → Plan Mode

# Or use the /plan command
/plan
```
Exit plan mode with another Shift+Tab press, cycling back to normal mode.
When plan mode delivers value
Plan mode pays off for:
- Complex features requiring edits across multiple files.
- Architectural decisions where the wrong choice causes significant rework.
- Unfamiliar codebases where understanding must precede action.
- High-stakes changes where mistakes carry real consequences.
Skip plan mode for single-file changes with obvious solutions: renaming a variable, adding a null check, fixing a typo. The overhead of planning exceeds the benefit when the path is clear.
The plan document
Plan mode produces a structured output, typically a plan.md file with task breakdown, dependencies, and execution order:
```markdown
## Plan: Add User Authentication

### Analysis
- Existing session handling in src/middleware/session.ts
- No current auth implementation
- Database schema supports users table

### Tasks (in order)
1. Create auth middleware (src/middleware/auth.ts)
2. Add login/logout routes (src/routes/auth.ts)
3. Integrate middleware into protected routes
4. Add session validation to existing endpoints

### Dependencies
- Task 2 depends on Task 1
- Tasks 3 and 4 can proceed in parallel after Task 2

### Risks
- Session middleware may need refactoring for JWT compatibility
```
Review this plan before execution. Edit it directly if the approach needs adjustment. Once approved, toggle back to normal mode and execute step by step.
The explore-plan-code-commit workflow
Boris Cherny, who led the development of Claude Code, describes his workflow: "If my goal is to write a Pull Request, I will use Plan mode, and go back and forth with Claude until I like its plan. From there, I switch into auto-accept edits mode and Claude can usually 1-shot it. A good plan is really important!"
This reflects a four-phase pattern that experienced practitioners converge on:
1. Explore: Understand the problem space before proposing solutions. Read relevant files, search for patterns, identify constraints. Ask the agent questions about the codebase.
2. Plan: Produce a concrete execution plan with explicit steps. Review and iterate on the plan until you are confident in the approach. This phase takes longer than you'd expect, and that investment pays off.
3. Code: Execute the plan with auto-accept enabled for speed. A good plan means fewer corrections and less back-and-forth during implementation.
4. Commit: Verify the work, run tests, and commit with a meaningful message. Each small chunk of work becomes its own commit.
Why explicit planning phases work
Telling an agent "don't code yet, just explore" prevents premature implementation. Without this explicit instruction, agents tend to start generating code immediately, even when asked for analysis. The "almost correct code" syndrome, where agents produce code that looks right but fails on edge cases, stems from insufficient analysis before generation.
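An explore-phase instruction can be as blunt as the following (the target file is borrowed from the running auth example and is illustrative):

```
Explore how session handling works in this codebase.
Read src/middleware/session.ts and anything it imports.
Do not write or modify any code. Summarize what you find,
and list any constraints a new auth middleware must respect.
```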
Small iteration strategy
Large tasks fail more often than small ones. Context accumulates, understanding drifts, and errors compound. The solution: scope work to what completes in 30 minutes or less.
The 30-minute scope rule
Before starting any agent interaction, ask: "Can this complete in 30 minutes?" If not, break it into pieces that can.
This constraint forces decomposition. "Implement user authentication" becomes:
- Create auth middleware structure
- Implement password hashing
- Add login endpoint
- Add logout endpoint
- Integrate with protected routes
- Write tests
Each piece fits in a short session with focused context. The agent stays accurate because it handles one thing at a time.
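A request for one of those pieces might look like this (the path and step number are hypothetical):

```
Implement password hashing for the auth module.
Touch only src/auth/hash.ts and its test file.
This is step 2 of the auth plan. Do not start on the endpoints.
```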
Incremental commits
Small iterations produce small commits. This creates several benefits:
- Reviewers see focused changes, not sprawling diffs.
- If something breaks, recent commits are easy to isolate.
- Git log tells a story of incremental progress.
- Tests run against each increment, catching issues early.
The pattern: work for 15-30 minutes, verify, commit, repeat.
Cherny notes he uses his /commit-push-pr command "dozens of times daily," each run representing a small, complete unit of work.
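A command like that maps naturally onto Claude Code's custom slash commands, which are markdown prompt files under .claude/commands/. A hypothetical sketch, not Cherny's actual command:

```markdown
<!-- .claude/commands/commit-push-pr.md (hypothetical sketch) -->
Run the test suite and stop if anything fails.
Stage the current changes and commit with a concise message
describing what changed and why.
Push the branch, then open a pull request with `gh pr create`,
summarizing the change and how it was verified.
```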
When scope creep happens
Agents expand scope when given room. A request to "add form validation" might come back with form styling improvements, accessibility enhancements, and refactored error handling, none of which was requested.
Counter this with explicit boundaries:
```
Add email validation to the signup form.
Scope: Only the email field.
Do not modify other fields or styles.
Do not refactor existing validation logic.
```
Explicit constraints prevent helpful overreach.
Test-driven development with agents
Test-driven development (TDD) and agent-assisted coding fit together surprisingly well. Everything that makes TDD tedious for humans (writing boilerplate tests, covering edge cases, maintaining test suites) becomes trivial for agents that generate code rapidly.
Why TDD fits agents
Agents thrive on clear, measurable goals. A binary test result, pass or fail, provides the clearest possible feedback signal. The agent knows exactly whether its code works without ambiguous interpretation.
Traditional TDD criticisms focus on the time cost of writing tests first. With agents, that cost approaches zero. Generating test files takes seconds, turning TDD's biggest weakness into an accelerator.
The red-green-refactor cycle with agents
Apply the classic TDD cycle:
1. Red: Write a failing test that specifies the desired behavior. Have the agent generate tests based on requirements. Run them to confirm they fail (no implementation exists yet).
2. Green: Implement the minimal code to make the tests pass. The agent writes only what the tests require, nothing more.
3. Refactor: Clean up while keeping tests green. The agent improves code structure with confidence because tests catch regressions.
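A minimal sketch of the red and green steps in TypeScript, using Vitest and a hypothetical slugify utility (neither appears in the running auth example):

```typescript
// slugify.test.ts -- Red: the tests specify behavior before any implementation
import { describe, expect, it } from "vitest";
import { slugify } from "./slugify";

describe("slugify", () => {
  it("lowercases and replaces spaces with hyphens", () => {
    expect(slugify("Hello World")).toBe("hello-world");
  });

  it("strips characters that are not alphanumeric, spaces, or hyphens", () => {
    expect(slugify("Rock & Roll!")).toBe("rock-roll");
  });
});

// slugify.ts -- Green: the minimal implementation the tests demand
export function slugify(input: string): string {
  return input
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation the tests forbid
    .trim()
    .replace(/\s+/g, "-");        // collapse whitespace runs into single hyphens
}
```

In the red step, the test file exists before slugify.ts does, so the suite fails on the missing import; the green step adds just enough code to satisfy both assertions.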
Verification loops
Cherny emphasizes verification as a quality multiplier: "Give Claude a way to verify its work. If Claude has that feedback loop, it will 2-3x the quality of the final result."
This means configuring agents with access to test runners that execute after changes, type checkers that catch errors immediately, linters that enforce style, and build processes that verify compilation. Each verification layer catches issues before they accumulate. An agent that runs tests after every file edit produces dramatically better code than one that implements blindly.
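In Claude Code specifically, one way to wire this up is a PostToolUse hook that runs verification after every file edit. A minimal sketch of .claude/settings.json, based on Claude Code's hooks feature; the verification commands themselves are project-specific assumptions:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "npm test --silent && npx tsc --noEmit"
          }
        ]
      }
    ]
  }
}
```

With a hook like this, every edit is immediately followed by tests and a type check, so regressions surface one change after they are introduced.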
Test-as-specification
Tests serve as executable specifications. When you write tests first, you define exactly what "done" means. The agent implements to that specification rather than interpreting vague requirements.
Research on test-driven prompts shows measurable improvement: the TGen framework improved code correctness from 82.3% to 90.8% on standard benchmarks by structuring prompts around test expectations.
The PDCA framework for agent sessions
Plan-Do-Check-Act (PDCA) provides a structured approach to agent collaboration.
The four phases
Plan (7-15 minutes): Use the agent's ability to analyze the codebase. Create a detailed execution strategy with explicit checkpoints. Define what "done" means before starting work.
Do (30 minutes to 2.5 hours): Generate code following the plan. Maintain active oversight, reviewing changes as they happen. Keep iterations small to prevent drift.
Check (2-5 minutes): Validate completeness against the definition of done. Run tests, verify type checks, confirm the implementation matches requirements. This step is not optional.
Act (5-10 minutes): Run a short retrospective on what worked and what didn't. Update CLAUDE.md with patterns worth repeating. Identify improvements for the next cycle.
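The Act phase's output can be as small as a few lines appended to CLAUDE.md. An illustrative excerpt, with entries invented for this example:

```markdown
## Patterns worth repeating
- Run the auth tests after any change under src/middleware/; the suite is fast.
- Session logic lives in src/middleware/session.ts. Extend it; don't duplicate it.
```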
Time boundaries matter
The PDCA cycle should complete in one to three hours. Longer cycles lose focus. Shorter cycles may not accomplish enough to be worth the overhead.
If work exceeds three hours, the scope was too large. Break it down and start a new cycle.
Structured versus unstructured
Comparative studies show stark differences. Unstructured agent sessions spend up to 80% of tokens on troubleshooting problems that arise from insufficient planning. Structured approaches produce better test coverage with fewer total tokens.
The discipline feels slower at first. Over multiple sessions, it proves faster because less time goes to correcting mistakes.
Multi-turn strategy patterns
Extended conversations require deliberate management to maintain coherence.
The checkpoint pattern
Insert deliberate checkpoints at task boundaries:
```
We've completed the auth middleware.
Before continuing: summarize what we built and confirm the approach for routes.
```
This forces the agent to consolidate understanding before proceeding. Checkpoints catch drift early rather than discovering problems ten turns later.
The validation gate
Before proceeding to the next task, require verification:
```
Run the auth tests before we add routes.
Do not proceed if any test fails.
```
Validation gates prevent broken implementations from propagating. Each task completes successfully before the next begins.
The scope anchor
Long conversations drift. Periodically restate the objective:
```
Remember: we're adding basic auth, not OAuth.
The current task is the logout endpoint.
Stay focused on that.
```
Explicit scope anchors prevent helpful expansion into unplanned territory.
Accepting session failure
Not every session succeeds. Cherny notes he throws away 10-20% of sessions that "end up nowhere." This is normal, not a failure of technique.
Signs a session should end:
- The agent repeats the same mistake despite corrections.
- Context pollution has accumulated beyond recovery.
- The approach turned out wrong and needs fundamental rethinking.
Starting fresh with lessons learned beats fighting a polluted context.
The /clear command or a new session provides a clean slate.
Document what you learned before abandoning the session; that knowledge persists even when the conversation doesn't.