Parallel Agent Execution

From sequential to parallel

Module 6 covered the mechanics of parallel execution: git worktrees, terminal multiplexers, orchestration tools. This section examines parallel execution as an automation pattern: when to spawn multiple agents, how context isolation enables it, and how subagent composition scales the approach.

Parallel agents work because they operate with isolated contexts. Each agent maintains its own 200K token window, independent of others. This isolation prevents the context pollution that would occur if one agent's exploration contaminated another's working memory.

Context isolation as architecture

Sequential agent work accumulates context. Every file read, every test run, every error message adds to the conversation history. Over a long session, useful signal drowns in noise.

Parallel agents sidestep this problem through architecture, not discipline. Instead of one agent carrying everything, multiple agents carry focused subsets. The auth refactoring agent knows nothing about the API migration agent's work. Their contexts stay clean because they never intersect.

Here's what this means in practice: three parallel agents each have 200K tokens of fresh context, not 200K shared among three tasks. The total effective context across parallel agents is N × 200K, where N is the agent count. Five parallel agents operate with roughly one million tokens of collective context.

The constraint is real: these contexts cannot communicate directly. Agent A cannot tell Agent B what it discovered. Coordination happens through artifacts—files, branches, and the developer who monitors them.

Each parallel Claude Code session has its own context, but API usage limits apply per account, not per session. Five parallel agents consume the budget roughly five times faster.

Subagent composition patterns

Claude Code's subagent system provides a different form of parallelism: orchestrated delegation within a single session.

The explore pattern:

Spawn read-only Explore subagents in parallel to gather information:

Main agent receives complex task

Spawn subagent 1: Search for authentication patterns
Spawn subagent 2: Examine database schema
Spawn subagent 3: Find relevant test files

Subagents return summaries, not full content

Main agent synthesizes and proceeds

Each subagent runs in its own context. The main agent receives condensed results, not the full investigation history. Three parallel investigations consume subagent context, not main context.
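The fan-out and synthesis steps can be sketched in Python. This is a minimal illustration, not Claude Code's subagent API: `run_explore_subagent` is a hypothetical stand-in for a read-only subagent, and a thread pool stands in for whatever mechanism actually launches the investigations.

```python
from concurrent.futures import ThreadPoolExecutor


def run_explore_subagent(question: str) -> str:
    """Hypothetical stand-in for a read-only subagent.

    A real subagent would investigate in its own context window and
    hand back only a condensed summary, never the full history.
    """
    return f"summary: {question}"


def explore_in_parallel(questions: list[str]) -> dict[str, str]:
    """Fan out one investigation per question and collect the summaries."""
    with ThreadPoolExecutor(max_workers=max(1, len(questions))) as pool:
        summaries = list(pool.map(run_explore_subagent, questions))
    return dict(zip(questions, summaries))


findings = explore_in_parallel([
    "Search for authentication patterns",
    "Examine database schema",
    "Find relevant test files",
])
```

Only `findings` reaches the main agent, which is the point of the pattern: the main context grows by three summaries rather than by three full investigations.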

When to use subagents versus worktrees:

Approach	Best for	Trade-off
Subagents	Read operations, information gathering, parallel research	Results return to single conversation; writes still serialize
Worktrees	Write operations, independent features, long-running tasks	No context sharing; manual coordination required
Hybrid	Large refactoring with research phases	Complexity; requires clear phase boundaries

Subagents excel at parallel reads because they run simultaneously and return only relevant findings. Worktrees excel at parallel writes because they provide complete file isolation.

The map-reduce pattern:

For large-scale changes, split work across multiple subagents:

Main agent: Create todo list with 50 migration items

Batch 1 (items 1-10): Spawn subagent, execute, return results
Batch 2 (items 11-20): Spawn subagent, execute, return results
...
Batch 5 (items 41-50): Spawn subagent, execute, return results

Main agent: Aggregate results, handle failures

This pattern works within a single worktree. The main agent orchestrates while subagents perform bounded work. Each subagent completes its batch and returns a summary. The main conversation stays focused on coordination rather than execution details.
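A minimal sketch of the batching and aggregation logic follows. The names `make_batches`, `run_batch`, and `migrate` are illustrative, and `run_batch` is a placeholder that pretends item 13 fails so the aggregation step has something to report; a real subagent would perform the migration and summarize it.

```python
from dataclasses import dataclass


@dataclass
class BatchResult:
    items: list[int]
    failed: list[int]


def make_batches(items: list[int], size: int) -> list[list[int]]:
    """Split the todo list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_batch(batch: list[int]) -> BatchResult:
    """Hypothetical subagent call; item 13 is an invented failure for the demo."""
    return BatchResult(items=batch, failed=[i for i in batch if i == 13])


def migrate(items: list[int], batch_size: int = 10) -> dict:
    """Map: one subagent per batch. Reduce: aggregate counts and failures."""
    results = [run_batch(batch) for batch in make_batches(items, batch_size)]
    failed = [i for result in results for i in result.failed]
    return {"batches": len(results), "done": len(items) - len(failed), "failed": failed}


report = migrate(list(range(1, 51)))
```

The main agent only ever sees `report`, so a failed item can be retried or escalated without the execution details of the other batches entering the main conversation.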

Teams using this pattern for mechanical migrations (import updates, API version bumps, boilerplate generation) report speedups of 10× or better, though the gain depends on how cleanly the work splits into independent batches.

Running multiple Claude Code sessions

The simplest form of parallel execution: multiple terminal windows, each running Claude Code in a different worktree.

Resource allocation:

Each Claude Code session requires:

One terminal process
Language server resources (if applicable)
Memory for conversation history
Network bandwidth for API calls

Practical limits emerge around 5-8 parallel sessions on a typical development machine. Beyond that, resource contention affects responsiveness. Teams scaling further use dedicated machines or cloud development environments.

Port and service management:

Parallel sessions running development servers conflict on ports. Configure each worktree with distinct ports:

# .trees/feature-auth/.env
PORT=3001
API_PORT=3011

# .trees/feature-dashboard/.env
PORT=3002
API_PORT=3012
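Port assignments like these can be generated rather than maintained by hand. The helper below is a sketch under assumptions: the base port of 3000 and the offset of 10 between `PORT` and `API_PORT` are inferred from the two files above, and `env_for_worktree` is a hypothetical name.

```python
def env_for_worktree(index: int, base_port: int = 3000, api_offset: int = 10) -> str:
    """Return .env contents for the index-th worktree (1-based).

    The index must stay below api_offset; otherwise one worktree's PORT
    would collide with another worktree's API_PORT.
    """
    if not 1 <= index < api_offset:
        raise ValueError(f"index must be between 1 and {api_offset - 1}")
    port = base_port + index
    return f"PORT={port}\nAPI_PORT={port + api_offset}\n"


auth_env = env_for_worktree(1)
dashboard_env = env_for_worktree(2)
```

With the defaults, indexes 1 and 2 reproduce the values shown for feature-auth and feature-dashboard.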

Database connections require similar attention. Parallel agents running migrations against shared databases create race conditions. Either serialize database-touching work or provide isolated database instances per worktree.

Session lifecycle:

Parallel sessions do not synchronize automatically. One session may complete while others continue. Completed work sits on a branch until the developer merges it.

A workflow pattern:

Create N worktrees for N independent tasks
Start N Claude Code sessions
Monitor progress, intervene as needed
As sessions complete, review and merge their branches
Remove worktrees after merging
Remaining sessions continue unaffected
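The git side of this workflow can be sketched as a function that builds the commands for each task. The `.trees` directory and the branch-per-task naming are assumptions carried over from the port example; the function only constructs argument lists and does not run anything.

```python
def worktree_commands(task: str, root: str = ".trees") -> dict[str, list[str]]:
    """Build the git commands for one task's worktree lifecycle."""
    path = f"{root}/{task}"
    return {
        # Create a new branch named after the task, checked out in its own directory.
        "create": ["git", "worktree", "add", "-b", task, path],
        # Run from the main worktree once the developer has reviewed the branch.
        "merge": ["git", "merge", task],
        # Remove the working directory after merging.
        "remove": ["git", "worktree", "remove", path],
    }


plan = {task: worktree_commands(task) for task in ["feature-auth", "feature-dashboard"]}
```

Each list can be passed to `subprocess.run(cmd, check=True)` when the corresponding step is due. Keeping creation, merge, and removal as separate steps preserves the developer's role as coordinator.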

The developer acts as coordinator, not as a participant in each conversation.

Cursor's parallel agent model

Cursor 2.0 implements IDE-level parallel execution with up to eight agents. The implementation reveals design principles applicable to any parallel agent system.

Automatic worktree management:

Cursor creates worktrees automatically when launching parallel agents. Worktrees appear in ~/.cursor/worktrees/<repo>/ with configuration-driven setup. The IDE tracks which agent operates in which worktree.

Storage implications:

Each worktree duplicates the working files, though Git history is shared with the main repository. If a repository's working files total 500 MB, eight parallel agents add roughly 4 GB of disk usage. Large monorepos require attention to storage limits.

Cursor implements automatic cleanup: worktrees inactive for 6 hours are removed. A maximum of 20 worktrees per workspace prevents unbounded growth.

Best-of-N execution:

Run identical prompts across multiple agents simultaneously. Compare results rather than accepting a single attempt. Useful for implementations with multiple valid approaches.

This pattern trades resources for optionality: three implementations to choose from rather than one to accept or reject.

The best-of-N pattern works well for ambiguous specifications. When requirements allow multiple valid solutions, parallel attempts surface options a single agent might not explore.
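As a rough sketch, best-of-N reduces to running the same prompt several times and ranking the candidates for review. Everything here is illustrative: `fake_agent` is a placeholder for a real agent run, the scoring function (shorter is better) is an arbitrary assumption, and the attempts run sequentially for simplicity where real runs would be concurrent.

```python
from typing import Callable


def best_of_n(
    prompt: str,
    run_agent: Callable[[str, int], str],
    score: Callable[[str], float],
    n: int = 3,
) -> list[str]:
    """Run the same prompt n times and return candidates, best score first."""
    candidates = [run_agent(prompt, attempt) for attempt in range(n)]
    return sorted(candidates, key=score, reverse=True)


def fake_agent(prompt: str, attempt: int) -> str:
    """Placeholder agent: later attempts produce longer output."""
    return f"{prompt} v{attempt}" + " extra" * attempt


ranked = best_of_n("add caching", fake_agent, score=lambda text: -len(text))
```

The ranking is an aid, not a verdict: the developer still compares the candidates and chooses among them.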

Coordination without communication

Parallel agents cannot talk to each other. Agent A does not know what Agent B is doing. The developer bridges this gap.

File-based coordination:

Shared files in the main worktree (not the agent worktrees) can coordinate work:

# coordination/status.md

| Agent | Task | Status | Last Update |
|-------|------|--------|-------------|
| Auth | JWT migration | Complete | 14:23 |
| Dashboard | Chart refactor | In progress | 14:30 |
| API | Endpoint versioning | Blocked | 14:15 |

Agents update their rows before starting and after completing major phases. The developer sees consolidated status without checking each terminal.
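A row update can be sketched as a pure function over the table text. This assumes the four-column layout shown above and does not address two agents writing the file at the same moment; `update_status` is a hypothetical helper name.

```python
def update_status(table: str, agent: str, status: str, time: str) -> str:
    """Return the table with the given agent's Status and Last Update replaced."""
    updated = []
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 4 and cells[0] == agent:
            cells[2], cells[3] = status, time
            line = "| " + " | ".join(cells) + " |"
        updated.append(line)
    return "\n".join(updated)


table = "\n".join([
    "| Agent | Task | Status | Last Update |",
    "|-------|------|--------|-------------|",
    "| Dashboard | Chart refactor | In progress | 14:30 |",
])
new_table = update_status(table, "Dashboard", "Complete", "14:45")
```

Header and separator rows pass through untouched because their first cell never matches an agent name.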

Dependency sequencing:

When Task B depends on Task A's output:

Agent A completes Task A, commits to branch
Developer merges Task A to main (or creates integration branch)
Agent B pulls changes, begins Task B

This explicit sequencing prevents agents from working with stale assumptions. The developer controls when dependencies resolve.
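When there are more than two tasks, the start order can be derived from the dependency map. The sketch below uses Python's standard `graphlib` module; the task names are invented, and each "wave" is a group of tasks that can run in parallel once the previous wave has been merged.

```python
from graphlib import TopologicalSorter


def start_order(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; a task appears only after all its dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # stands in for "developer merged these branches"
    return waves


waves = start_order({
    "task-a": set(),
    "task-b": {"task-a"},  # Task B depends on Task A's output
    "task-c": set(),
})
```

The call to `sorter.done` marks the point where the developer merges; nothing in the next wave starts before that happens.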

Conflict prevention:

Parallel agents should not modify the same files. Task decomposition must respect this constraint.

If two features both require changes to src/config.ts, options include:

Serialize: one agent completes before the other starts
Split: one agent handles config changes, others wait for completion
Abstract: create separate config files that merge later

Module 6 established the heuristic: if tasks can merge without discussion, they parallelize well. The automation perspective adds: decompose tasks to avoid overlapping file modifications.
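Overlap can be checked before any agent starts. The sketch below assumes each task comes with a planned set of files it expects to modify; the task names and paths are illustrative.

```python
from itertools import combinations


def overlapping_files(plan: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return the files shared by each pair of tasks that overlap."""
    conflicts = {}
    for first, second in combinations(sorted(plan), 2):
        shared = plan[first] & plan[second]
        if shared:
            conflicts[(first, second)] = shared
    return conflicts


conflicts = overlapping_files({
    "auth": {"src/auth/jwt.ts", "src/config.ts"},
    "dashboard": {"src/dashboard/chart.ts", "src/config.ts"},
    "api": {"src/api/routes.ts"},
})
```

An empty result means the decomposition respects the constraint; any entry points at a pair of tasks to serialize, split, or abstract.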

Scaling considerations

Parallel execution multiplies everything: throughput, resource consumption, review burden, error surface area.

Diminishing returns:

Most scenarios hit optimal throughput at 3-5 parallel agents. Beyond five:

Merge complexity increases non-linearly
Context switching between sessions degrades developer attention
API consumption accelerates budget depletion
Error surface area expands

Eight parallel agents rarely deliver twice the throughput of four. Coordination overhead consumes the theoretical gains.

Review bottleneck:

Parallel agents produce parallel outputs. Three agents completing three features in 30 minutes create 90 minutes of review work if each feature takes 30 minutes to review. The throughput improvement shifts from development time to review queue depth.

Match agent parallelism to review capacity. If one developer reviews all output, two or three parallel agents may saturate their attention. Teams with distributed review can sustain higher parallelism.

Error multiplication:

Each parallel agent can fail independently, so the probability that at least one session fails grows with the number of agents. Monitor all sessions, not just the one currently in focus.
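The effect can be quantified under the assumption that failures are independent and equally likely. The 10% per-agent failure rate below is purely illustrative, not a measured figure.

```python
def p_any_failure(p_single: float, agents: int) -> float:
    """Probability that at least one of `agents` independent sessions fails."""
    return 1 - (1 - p_single) ** agents


one_agent = p_any_failure(0.10, 1)    # about 0.10
five_agents = p_any_failure(0.10, 5)  # about 0.41
```

Under these assumptions, a failure rate that is tolerable for one session becomes a likely event somewhere in a pool of five.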

Establish intervention triggers:

Extended silence (10+ minutes without progress)
Repeated errors in logs
Requests for approval
Modifications outside expected file scope
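These triggers can be expressed as a check over a snapshot of each session. The `SessionSnapshot` fields and the repeat threshold of three identical errors are assumptions made for illustration; how the snapshot is collected is left open.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionSnapshot:
    minutes_since_progress: float
    recent_errors: list[str]
    awaiting_approval: bool
    modified_files: list[str]
    expected_prefixes: tuple[str, ...]


def intervention_reasons(
    s: SessionSnapshot, silence_limit: float = 10, repeat_limit: int = 3
) -> list[str]:
    """Return the triggers that currently apply to one session."""
    reasons = []
    if s.minutes_since_progress >= silence_limit:
        reasons.append("extended silence")
    if any(count >= repeat_limit for count in Counter(s.recent_errors).values()):
        reasons.append("repeated errors")
    if s.awaiting_approval:
        reasons.append("approval requested")
    if any(not path.startswith(s.expected_prefixes) for path in s.modified_files):
        reasons.append("out-of-scope modification")
    return reasons


snapshot = SessionSnapshot(
    minutes_since_progress=12,
    recent_errors=["E1", "E1", "E1"],
    awaiting_approval=False,
    modified_files=["src/auth/jwt.ts", "src/config.ts"],
    expected_prefixes=("src/auth/",),
)
reasons = intervention_reasons(snapshot)
```

Running the check across all sessions on a timer surfaces the ones that need attention without the developer polling each terminal.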

Parallel execution demands parallel attention.

Parallel execution in automation contexts

These patterns extend to automated workflows.

CI/CD parallel agents:

GitHub Actions or other CI systems can spawn multiple agent sessions. Each job runs in its own environment, providing natural isolation. The workflow file coordinates: spawn agents, wait for completion, aggregate results.

Scheduled parallel work:

Cron-triggered agents can parallelize across worktrees or containers. Overnight maintenance tasks might spawn five agents to address different technical debt categories.

Orchestration frameworks:

Frameworks such as the Claude Agent SDK, CrewAI, and AutoGen provide structured approaches to multi-agent coordination. They implement patterns like:

Manager-worker hierarchies
Result aggregation protocols
Error handling across agent pools

These frameworks suit complex automation where custom coordination logic is needed. For simpler parallel work, worktrees and manual coordination suffice.
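The manager-worker shape with aggregated results and per-worker error handling can be sketched with the standard library alone. This does not use or represent the API of any framework named above; `worker` is a placeholder that fails for one invented task name so the error path is visible.

```python
import asyncio


async def worker(task: str) -> str:
    """Placeholder worker; the task named 'flaky' fails on purpose."""
    await asyncio.sleep(0)  # yield control, as a real agent call would
    if task == "flaky":
        raise RuntimeError("worker failed")
    return f"done: {task}"


async def manager(tasks: list[str]) -> tuple[dict[str, str], dict[str, str]]:
    """Run all workers concurrently; separate results from failures."""
    outcomes = await asyncio.gather(
        *(worker(task) for task in tasks), return_exceptions=True
    )
    ok, failed = {}, {}
    for task, outcome in zip(tasks, outcomes):
        if isinstance(outcome, BaseException):
            failed[task] = str(outcome)
        else:
            ok[task] = outcome
    return ok, failed


ok, failed = asyncio.run(manager(["lint", "flaky", "docs"]))
```

Passing `return_exceptions=True` keeps one failed worker from cancelling the rest, which mirrors the requirement that errors be handled across the pool rather than aborting it.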
