Parallel Agent Execution
From sequential to parallel
Module 6 covered the mechanics of parallel execution: git worktrees, terminal multiplexers, orchestration tools. This section examines parallel execution as an automation pattern: when to spawn multiple agents, how context isolation enables it, and how subagent composition scales the approach.
Parallel agents work because they operate with isolated contexts. Each agent maintains its own 200K token window, independent of others. This isolation prevents the context pollution that would occur if one agent's exploration contaminated another's working memory.
Context isolation as architecture
Sequential agent work accumulates context. Every file read, every test run, every error message adds to the conversation history. Over a long session, useful signal drowns in noise.
Parallel agents sidestep this problem through architecture, not discipline. Instead of one agent carrying everything, multiple agents carry focused subsets. The auth refactoring agent knows nothing about the API migration agent's work. Their contexts stay clean because they never intersect.
Here's what this means in practice: three parallel agents each have 200K tokens of fresh context, not 200K shared among three tasks. The total effective context across parallel agents is N × 200K, where N is the agent count. Five parallel agents operate with roughly one million tokens of collective context.
The constraint is real: these contexts cannot communicate directly. Agent A cannot tell Agent B what it discovered. Coordination happens through artifacts—files, branches, and the developer who monitors them.
Each parallel Claude Code session has its own context, but API usage limits apply per account, not per session. Five parallel agents therefore draw down the same budget roughly five times as fast as a single session.
Subagent composition patterns
Claude Code's subagent system provides a different form of parallelism: orchestrated delegation within a single session.
The explore pattern:
Spawn read-only Explore subagents in parallel to gather information:
Main agent receives complex task
Spawn subagent 1: Search for authentication patterns
Spawn subagent 2: Examine database schema
Spawn subagent 3: Find relevant test files
Subagents return summaries, not full content
Main agent synthesizes and proceeds
Each subagent runs in its own context. The main agent receives condensed results, not the full investigation history. Three parallel investigations consume subagent context, not main context.
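The fan-out in the explore pattern can be sketched in Python. This is a minimal illustration, not Claude Code's actual subagent mechanism: `investigate` is a hypothetical stand-in for whatever launches a read-only subagent, and here it only returns a canned summary so the sketch runs.

```python
from concurrent.futures import ThreadPoolExecutor


def investigate(question: str) -> str:
    """Hypothetical stand-in for a read-only subagent.

    A real subagent would explore the codebase in its own context;
    this stub only returns a short summary string.
    """
    return f"summary for: {question}"


def explore(questions: list[str]) -> dict[str, str]:
    """Run all investigations concurrently and keep only their summaries."""
    with ThreadPoolExecutor(max_workers=max(1, len(questions))) as pool:
        # map() preserves input order, so summaries line up with questions.
        summaries = list(pool.map(investigate, questions))
    return dict(zip(questions, summaries))


if __name__ == "__main__":
    findings = explore([
        "Search for authentication patterns",
        "Examine database schema",
        "Find relevant test files",
    ])
    for question, summary in findings.items():
        print(f"{question} -> {summary}")
```

The caller only ever sees the returned summaries, which mirrors how the main agent receives condensed results instead of each investigation's full history.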
When to use subagents versus worktrees:
| Approach | Best for | Trade-off |
|---|---|---|
| Subagents | Read operations, information gathering, parallel research | Results return to single conversation; writes still serialize |
| Worktrees | Write operations, independent features, long-running tasks | No context sharing; manual coordination required |
| Hybrid | Large refactoring with research phases | Complexity; requires clear phase boundaries |
Subagents excel at parallel reads because they run simultaneously and return only relevant findings. Worktrees excel at parallel writes because they provide complete file isolation.
The map-reduce pattern:
For large-scale changes, split work across multiple subagents:
Main agent: Create todo list with 50 migration items
Batch 1 (items 1-10): Spawn subagent, execute, return results
Batch 2 (items 11-20): Spawn subagent, execute, return results
...
Batch 5 (items 41-50): Spawn subagent, execute, return results
Main agent: Aggregate results, handle failures
This pattern works within a single worktree. The main agent orchestrates while subagents perform bounded work. Each subagent completes its batch and returns a summary. The main conversation stays focused on coordination rather than execution details.
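The batching and aggregation steps can be sketched as follows. The names here are illustrative assumptions: `run_batch` is a hypothetical stand-in for a subagent, and its rule that items containing "broken" fail exists only so the reduce step has something to report. Batches run one after another in this sketch.

```python
from dataclasses import dataclass


@dataclass
class BatchResult:
    batch_number: int
    succeeded: list[str]
    failed: list[str]


def split_into_batches(items: list[str], size: int) -> list[list[str]]:
    """Split the todo list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_batch(batch_number: int, batch: list[str]) -> BatchResult:
    """Hypothetical stand-in for one subagent processing one batch.

    This stub marks any item containing "broken" as failed.
    """
    failed = [item for item in batch if "broken" in item]
    succeeded = [item for item in batch if "broken" not in item]
    return BatchResult(batch_number, succeeded, failed)


def migrate(items: list[str], batch_size: int = 10) -> tuple[int, list[str]]:
    """Map: one run_batch call per batch. Reduce: count successes, collect failures."""
    batches = split_into_batches(items, batch_size)
    results = [run_batch(n, batch) for n, batch in enumerate(batches, start=1)]
    succeeded = sum(len(r.succeeded) for r in results)
    failures = [item for r in results for item in r.failed]
    return succeeded, failures
```

Returning only counts and failed items keeps the coordinator's view small, in the same spirit as subagents returning summaries to the main conversation.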
Teams using this pattern for mechanical migrations—import updates, API version bumps, boilerplate generation—report 10× speedups or better.
Running multiple Claude Code sessions
The simplest form of parallel execution: multiple terminal windows, each running Claude Code in a different worktree.
Resource allocation:
Each Claude Code session requires:
- One terminal process
- Language server resources (if applicable)
- Memory for conversation history
- Network bandwidth for API calls
Practical limits emerge around 5-8 parallel sessions on a typical development machine. Beyond that, resource contention affects responsiveness. Teams scaling further use dedicated machines or cloud development environments.
Port and service management:
Parallel sessions running development servers conflict on ports. Configure each worktree with distinct ports:
# .trees/feature-auth/.env
PORT=3001
API_PORT=3011
# .trees/feature-dashboard/.env
PORT=3002
API_PORT=3012
Database connections require similar attention. Parallel agents running migrations against shared databases create race conditions. Either serialize database-touching work or provide isolated database instances per worktree.
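The port assignments shown for the two worktrees follow a simple offset scheme, which a small helper can generate. This is a sketch under assumptions: the base port of 3000 and the offset of 10 between `PORT` and `API_PORT` are inferred from the example values, and the function name is hypothetical.

```python
def env_for_worktree(index: int, base_port: int = 3000, api_offset: int = 10) -> str:
    """Return .env contents for the worktree at 1-based position `index`."""
    if not 1 <= index <= api_offset:
        # Past api_offset, a PORT value would equal another worktree's API_PORT.
        raise ValueError(f"index must be between 1 and {api_offset}")
    port = base_port + index
    api_port = base_port + api_offset + index
    return f"PORT={port}\nAPI_PORT={api_port}\n"


if __name__ == "__main__":
    for position, name in enumerate(["feature-auth", "feature-dashboard"], start=1):
        print(f"# .trees/{name}/.env")
        print(env_for_worktree(position), end="")
```

Deriving both ports from one index keeps the assignments collision-free without a central registry, as long as each worktree gets a unique index.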
Session lifecycle:
Parallel sessions do not synchronize automatically. One session may complete while others continue. Completed work sits on a branch until the developer merges it.
A workflow pattern:
- Create N worktrees for N independent tasks
- Start N Claude Code sessions
- Monitor progress, intervene as needed
- As sessions complete, review and merge their branches
- Remove worktrees after merging
- Remaining sessions continue unaffected
The developer acts as coordinator, not as a participant in each conversation.
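The create and remove steps of this workflow map onto two `git worktree` subcommands. The sketch below only builds the command lines. The `.trees/` directory matches the paths used earlier in this section, while using the task name as the branch name is an assumption made for illustration.

```python
import subprocess
from pathlib import Path


def add_worktree_command(task: str, root: Path = Path(".trees")) -> list[str]:
    """Build the git command that creates a worktree on a new branch named `task`."""
    return ["git", "worktree", "add", "-b", task, (root / task).as_posix()]


def remove_worktree_command(task: str, root: Path = Path(".trees")) -> list[str]:
    """Build the git command that removes the worktree after its branch is merged."""
    return ["git", "worktree", "remove", (root / task).as_posix()]


def run(command: list[str]) -> None:
    """Execute a command, raising CalledProcessError if git reports failure."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print rather than execute, so the sketch is safe to run anywhere.
    for task in ["feature-auth", "feature-dashboard"]:
        print(" ".join(add_worktree_command(task)))
        print(" ".join(remove_worktree_command(task)))
```

Keeping command construction separate from execution makes it easy to review what will run before any worktree is created or removed.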
Cursor's parallel agent model
Cursor 2.0 implements IDE-level parallel execution with up to eight agents. The implementation reveals design principles applicable to any parallel agent system.
Automatic worktree management:
Cursor creates worktrees automatically when launching parallel agents.
Worktrees appear in ~/.cursor/worktrees/<repo>/ with configuration-driven setup.
The IDE tracks which agent operates in which worktree.
Storage implications:
Each worktree duplicates working files. A repository with 500 MB of working files and eight parallel agents consumes roughly 4 GB of additional disk space (8 × 500 MB). Large monorepos require attention to storage limits.
Cursor implements automatic cleanup: worktrees inactive for 6 hours are removed, and a maximum of 20 worktrees per workspace prevents unbounded growth.
Best-of-N execution:
Run identical prompts across multiple agents simultaneously. Compare results rather than accepting a single attempt. Useful for implementations with multiple valid approaches.
This pattern trades resources for optionality: three implementations to choose from rather than one to accept or reject.
The best-of-N pattern works well for ambiguous specifications. When requirements allow multiple valid solutions, parallel attempts surface options a single agent might not explore.
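Comparing candidates is easier with an explicit ranking. The sketch below is one possible approach, not how Cursor compares results: the `score` function is an assumption the developer supplies (for example, tests passed or diff size), and the final choice may still be a human judgment.

```python
from typing import Callable


def best_of_n(
    candidates: list[str],
    score: Callable[[str], float],
) -> tuple[str, list[tuple[float, str]]]:
    """Rank candidates by `score` (higher is better); return the best and the ranking."""
    if not candidates:
        raise ValueError("need at least one candidate")
    ranked = sorted(
        ((score(candidate), candidate) for candidate in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return ranked[0][1], ranked
```

Returning the full ranking alongside the winner preserves the optionality the pattern is meant to provide: the developer can still inspect the runners-up.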
Coordination without communication
Parallel agents cannot talk to each other. Agent A does not know what Agent B is doing. The developer bridges this gap.
File-based coordination:
Shared files in the main worktree (not the agent worktrees) can coordinate work:
# coordination/status.md
| Agent | Task | Status | Last Update |
|-------|------|--------|-------------|
| Auth | JWT migration | Complete | 14:23 |
| Dashboard | Chart refactor | In progress | 14:30 |
| API | Endpoint versioning | Blocked | 14:15 |
Agents update their rows before starting and after completing major phases. The developer sees consolidated status without checking each terminal.
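A row update on the status table can be sketched as a pure text transformation. The function name is hypothetical, and the sketch assumes the four-column layout shown in the table. It does not handle two agents writing the file at the same moment, which a real setup would need to address.

```python
def update_status(table: str, agent: str, status: str, timestamp: str) -> str:
    """Return `table` with the Status and Last Update cells of `agent`'s row replaced."""
    updated_lines = []
    found = False
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Data rows have four cells: Agent, Task, Status, Last Update.
        if len(cells) == 4 and cells[0] == agent:
            cells[2], cells[3] = status, timestamp
            line = "| " + " | ".join(cells) + " |"
            found = True
        updated_lines.append(line)
    if not found:
        raise KeyError(f"no row for agent {agent!r}")
    return "\n".join(updated_lines)
```

Raising on a missing row, instead of silently appending one, surfaces typos in agent names before they fragment the status view.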
Dependency sequencing:
When Task B depends on Task A's output:
- Agent A completes Task A, commits to branch
- Developer merges Task A to main (or creates integration branch)
- Agent B pulls changes, begins Task B
This explicit sequencing prevents agents from working with stale assumptions. The developer controls when dependencies resolve.
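When more than two tasks are involved, the same sequencing can be computed from a dependency map. This sketch uses Python's standard `graphlib` module. The task names and the idea of grouping work into "stages" are illustrative assumptions, not part of any agent tool.

```python
from graphlib import TopologicalSorter


def execution_stages(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into stages; every task in a stage has all its dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        # Marking a task done corresponds to the developer merging its branch.
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    # Each key depends on the tasks in its value set.
    plan = {"B": {"A"}, "C": set(), "D": {"B", "C"}}
    for number, stage in enumerate(execution_stages(plan), start=1):
        print(f"stage {number}: {', '.join(stage)}")
```

Tasks within a stage are candidates for parallel agents, while the boundary between stages is where the developer merges and dependent agents pull.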
Conflict prevention:
Parallel agents should not modify the same files. Task decomposition must respect this constraint.
If two features both require changes to src/config.ts, options include:
- Serialize: one agent completes before the other starts
- Split: one agent handles config changes, others wait for completion
- Abstract: create separate config files that merge later
Module 6 established the heuristic: if tasks can merge without discussion, they parallelize well. The automation perspective adds: decompose tasks to avoid overlapping file modifications.
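Overlapping file modifications can be checked before any agent starts. The sketch below assumes each task comes with a planned set of files, which is itself an estimate the developer has to supply. Apart from `src/config.ts`, the task names and paths in the usage example are hypothetical.

```python
from itertools import combinations


def find_overlaps(planned_files: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of tasks to the files both plan to modify; clean pairs are omitted."""
    overlaps = {}
    for task_a, task_b in combinations(sorted(planned_files), 2):
        shared = planned_files[task_a] & planned_files[task_b]
        if shared:
            overlaps[(task_a, task_b)] = shared
    return overlaps


if __name__ == "__main__":
    plans = {
        "auth": {"src/config.ts", "src/auth.ts"},
        "dashboard": {"src/config.ts", "src/chart.ts"},
        "api": {"src/routes.ts"},
    }
    for (task_a, task_b), files in find_overlaps(plans).items():
        print(f"{task_a} and {task_b} both touch: {', '.join(sorted(files))}")
```

An empty result suggests the tasks can run in parallel. Any reported pair is a candidate for the serialize, split, or abstract options listed above.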
Scaling considerations
Parallel execution multiplies everything: throughput, resource consumption, review burden, error surface area.
Diminishing returns:
Most scenarios hit optimal throughput at 3-5 parallel agents. Beyond five:
- Merge complexity increases non-linearly
- Context switching between sessions degrades developer attention
- API consumption accelerates budget depletion
- Error surface area expands
Eight parallel agents rarely deliver twice the throughput of four. Coordination overhead consumes the theoretical gains.
Review bottleneck:
Parallel agents produce parallel outputs. If each feature takes about 30 minutes to review, three agents completing three features in 30 minutes create 90 minutes of review work. The bottleneck shifts from development time to review queue depth.
Match agent parallelism to review capacity. If one developer reviews all output, two or three parallel agents may saturate their attention. Teams with distributed review can sustain higher parallelism.
Error multiplication:
Each parallel agent can fail independently, so the probability that at least one session fails grows with the number of agents. Monitor all sessions, not just the one currently in focus.
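The growth in failure probability can be made concrete with a short calculation. It assumes failures are independent and equally likely for every agent, which is a simplification, and the 10% rate in the usage example is illustrative, not a measured figure.

```python
def probability_any_failure(per_agent_failure: float, agents: int) -> float:
    """Probability that at least one of `agents` independent sessions fails."""
    if not 0.0 <= per_agent_failure <= 1.0:
        raise ValueError("per_agent_failure must be between 0 and 1")
    if agents < 0:
        raise ValueError("agents must be non-negative")
    # Complement of "every agent succeeds".
    return 1.0 - (1.0 - per_agent_failure) ** agents


if __name__ == "__main__":
    for count in (1, 3, 5, 8):
        print(f"{count} agents: {probability_any_failure(0.10, count):.1%}")
```

Under these assumptions, a 10% per-agent failure rate means about a 41% chance that at least one of five agents fails.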
Establish intervention triggers:
- Extended silence (10+ minutes without progress)
- Repeated errors in logs
- Requests for approval
- Modifications outside expected file scope
Parallel execution demands parallel attention.
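The four triggers can be expressed as a single check over a per-session snapshot. Everything in this sketch is a hypothetical structure for illustration: how the snapshot gets populated (log scraping, terminal inspection, file watching) is left open, and "repeated errors" is interpreted here as the same message appearing more than once.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSnapshot:
    """Hypothetical summary of one session's state."""
    minutes_since_progress: float
    recent_errors: list[str] = field(default_factory=list)
    awaiting_approval: bool = False
    modified_files: set[str] = field(default_factory=set)
    expected_prefixes: tuple[str, ...] = ()


def intervention_reasons(s: SessionSnapshot, silence_limit: float = 10.0) -> list[str]:
    """Return the triggers this session currently trips; empty means no action needed."""
    reasons = []
    if s.minutes_since_progress >= silence_limit:
        reasons.append("extended silence")
    # Duplicate messages in the recent error list count as repeated errors.
    if len(s.recent_errors) != len(set(s.recent_errors)):
        reasons.append("repeated errors")
    if s.awaiting_approval:
        reasons.append("approval requested")
    # Scope is only checked when the expected paths have been declared.
    if s.expected_prefixes:
        out_of_scope = sorted(
            f for f in s.modified_files if not f.startswith(s.expected_prefixes)
        )
        if out_of_scope:
            reasons.append("out-of-scope changes: " + ", ".join(out_of_scope))
    return reasons
```

Running this check across all sessions on a timer gives the developer one list of sessions needing attention instead of several terminals to poll.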
Parallel execution in automation contexts
These patterns extend to automated workflows.
CI/CD parallel agents:
GitHub Actions or other CI systems can spawn multiple agent sessions. Each job runs in its own environment, providing natural isolation. The workflow file coordinates: spawn agents, wait for completion, aggregate results.
Scheduled parallel work:
Cron-triggered agents can parallelize across worktrees or containers. Overnight maintenance tasks might spawn five agents to address different technical debt categories.
Orchestration frameworks:
Frameworks like the Claude Agent SDK, CrewAI, and AutoGen provide structured approaches to multi-agent coordination. They implement patterns like:
- Manager-worker hierarchies
- Result aggregation protocols
- Error handling across agent pools
These frameworks suit complex automation where custom coordination logic is needed. For simpler parallel work, worktrees and manual coordination suffice.