Context Compression Techniques
When context must shrink
Context does not just degrade over time; it hits capacity. When that happens, something has to go.
Every AI coding agent compresses context somehow. Some trigger automatically. Others wait for manual intervention. Knowing what survives compression versus what disappears determines whether you lose an hour of work or keep rolling.
Auto-compaction mechanics
Claude Code documentation says auto-compaction triggers at 95% capacity. In practice, behavior varies by interface. The VS Code extension triggers closer to 75-78%, reserving headroom for the compaction process itself. The CLI operates nearer the documented 95%.
You can override the trigger threshold:
CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50 # Triggers at 50% instead of the default

Or in settings.json:
{
"env": {
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50"
}
}

Disabling auto-compaction entirely is possible but not recommended:
{
"autoCompact": false
}

Codex works differently.
Auto-compaction triggers based on model_auto_compact_token_limit, defaulting to 180,000-244,000 tokens depending on the model.
The Codex model was trained specifically for this: it writes a "handoff summary" for its future self, preserving enough to continue after context resets.
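Codex reads its configuration from config.toml, so the limit named above can be lowered to trigger handoff summaries earlier. A minimal sketch; the file location and the specific value shown are illustrative assumptions, not recommendations:

```toml
# ~/.codex/config.toml (illustrative; adjust path and value for your setup)
# Trigger auto-compaction well before the model's default limit
model_auto_compact_token_limit = 120000
```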
What happens during auto-compaction
When Claude Code auto-compacts:
- Workflow pauses before the next API call
- A summary request injects as a user message
- Claude generates a summary wrapped in <summary></summary> tags
- The entire conversation history clears, replaced with just the summary
- Processing resumes with compressed context
This happens without asking. One moment the full conversation exists; the next, a condensed version replaces it. Since version 2.0.64, compaction executes instantly with no wait time.
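The flow above can be modeled in a few lines. This is an illustrative sketch, not Claude Code's actual implementation; the threshold, token counter, and summarizer are all stand-ins:

```python
def maybe_compact(history, count_tokens, summarize, limit, threshold=0.95):
    """Sketch of the auto-compaction loop: before the next API call,
    if usage crosses the threshold, replace history with a summary."""
    if count_tokens(history) < threshold * limit:
        return history  # enough headroom; proceed unchanged
    # A summary request is injected as a user message
    request = history + [{"role": "user", "content": "Summarize this conversation."}]
    summary = summarize(request)
    # The entire history is replaced with the condensed version
    return [{"role": "user", "content": f"<summary>{summary}</summary>"}]
```

The key property is the last line: nothing of the original turns survives except what the summarizer chose to keep.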
Manual compression: /compact versus /clear
Two commands give direct control over context reduction.
/clear
The nuclear option.
/clear removes all conversation history.
Nothing survives except project-level context (CLAUDE.md, file system).
Use /clear when:
- Switching to completely unrelated work
- Context has been polluted with incorrect assumptions
- A handoff summary provides enough context to start fresh
Recent updates (v2.1.3) ensure /clear properly resets plan state.
Earlier versions left plan files intact after clearing, which caused confusion.
/compact
The surgical approach.
/compact reduces conversation size while preserving what matters.
Unlike /clear, it maintains continuity.
/compact # Default compaction
/compact Focus on authentication # Prioritize auth-related context

Focus instructions influence what survives:

/compact preserve: 1) WebSocket decision for real-time, 2) API contract details

This does not guarantee preservation; it guides the summarization process. Decisions with explicit preservation instructions survive more reliably than incidental context.
Choosing between them
| Situation | Command | Rationale |
|---|---|---|
| Finishing task, starting new one | /clear | Clean slate prevents cross-task confusion |
| Long session, same task | /compact | Preserve relevant context, reduce noise |
| Context poisoned by errors | /clear | Remove incorrect assumptions entirely |
| Complex multi-phase work | /compact with focus | Retain decisions while reducing detail |
| Session approaching limit mid-task | /compact | Continue work without starting over |
The 70% rule from earlier applies: proactive /compact at 70% capacity beats automatic compaction at 95%.
Manual intervention lets you direct what survives.
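The rule reduces to a one-line check. A minimal sketch; the 200,000-token window in the usage note is an assumption, not a measured limit:

```python
def should_compact(tokens_used: int, context_limit: int, threshold: float = 0.70) -> bool:
    """Recommend a proactive /compact once usage crosses the threshold,
    well before the ~95% auto-compaction point."""
    return tokens_used / context_limit >= threshold
```

With an assumed 200,000-token window, 150,000 tokens used (75%) crosses the line while 120,000 (60%) does not.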
What survives compression
Compaction summarizes rather than deletes. Certain categories survive more reliably than others.
High survival probability
Recent exchanges survive nearly intact; the last 2-3 turns typically make it through in raw or near-raw form.
Architecture decisions persist well. "We chose PostgreSQL over MongoDB for consistency guarantees" sticks around because it was stated explicitly with reasoning.
Current task objectives remain. Without them, the agent cannot continue.
Established patterns survive because they shape ongoing work. "All API endpoints follow REST conventions with snake_case naming" keeps influencing code generation.
Completion status persists. The fact that something finished matters more than how it finished.
Active errors survive because they need attention. Resolved errors often disappear.
Low survival probability
Intermediate reasoning compresses to conclusions. Step-by-step explanations for decisions become just the decisions; the "why" often disappears.
Full file contents become summaries. When files were read into context, compression retains the fact that files were read, possibly with highlights, but not complete contents. File paths may or may not survive depending on the tool.
Exploratory discussions compress aggressively. Back-and-forth ideation that did not produce code changes rarely makes it.
Nuanced conditional rules become vague. "If X and Y but not Z, then W" becomes "conditional logic exists."
Implicit connections disappear. Relationships understood but never stated explicitly do not survive.
Debugging sessions become outcomes. The specific error messages, attempted fixes, and iteration steps vanish.
The compression trade-off
Anthropic's research on context editing provides concrete numbers. In a 100-turn web search evaluation, context editing enabled workflows that would otherwise fail from context overflow. Token consumption dropped 84% while task performance improved 39%.
But those gains cost something.
Consider a session where you established this constraint:
"When modifying authentication, never change the JWT signing algorithm because downstream services validate signatures with hardcoded expectations."
After compaction, this might become:
"Authentication modification constraints established."
The constraint technically survives. The specific rationale and downstream implications do not. A future prompt asking "can I switch from HS256 to RS256?" may get a yes without the context about why that breaks everything.
Influencing compression outcomes
Several techniques improve the odds that critical information survives.
State constraints in CLAUDE.md
Project-level documentation survives all compaction because it loads fresh each interaction. Constraints critical enough to preserve should live in CLAUDE.md rather than relying on conversation memory.
# Authentication constraints
Never modify JWT signing algorithm (HS256).
Downstream services validate signatures with hardcoded algorithm expectations.
Breaking this constraint causes silent authentication failures in production.

Use explicit markers
During conversation, frame important decisions clearly:
DECISION: Using WebSocket for real-time updates.
RATIONALE: Polling created unacceptable latency for trading data.
CONSTRAINT: Must maintain backward compatibility with REST fallback.

Explicit structure signals importance. Unmarked statements compress harder than marked ones.
Compact proactively with focus
The 70% rule exists to preserve control over what survives. When compacting manually, you can specify preservation priorities:
/compact preserve the authentication constraint regarding JWT signing

Automatic compaction at 95% does not accept focus instructions. It compresses based on recency and implicit importance only.
Create handoff summaries before compaction
Before triggering /compact on a complex session, state what must survive:
"Before compacting: we decided on WebSocket for real-time, JWT auth is immutable, and the API contract at /v2/trades is frozen. All three must survive."
Stating requirements immediately before compaction places them in the high-attention end position and explicitly marks them for retention.
Tool-specific variations
Different tools handle compression differently.
Claude Code uses general-purpose summarization. The model does not receive special training for compaction; it uses standard summarization capabilities. Compression quality depends on how clearly the conversation structured its information.
Codex models receive native training for compaction. The model learns to write summaries specifically designed for future-self continuation. Handoff summaries from Codex preserve more actionable detail because the model understands what the summary is for.
Cursor takes a different approach: dynamic context discovery. Rather than compressing existing context, Cursor writes large outputs to files and retrieves them on demand. Full history saves to files, with only minimal summaries in active context. This reduced token usage by 46.9% while keeping access to full information.
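Cursor's internals are not public, but the write-to-file pattern itself is simple to sketch. An illustrative example under assumptions: the 2,000-character cutoff, the stub format, and the file naming are all invented here:

```python
import os
import tempfile

def offload_output(output: str, name: str, workdir: str = tempfile.gettempdir(),
                   max_chars: int = 2000) -> str:
    """Keep small tool outputs in active context; write large ones to a
    file and return a short stub pointing at the full content."""
    if len(output) <= max_chars:
        return output
    path = os.path.join(workdir, f"{name}.txt")
    with open(path, "w") as f:
        f.write(output)
    # Only this stub enters the conversation; the agent can re-read
    # the file on demand instead of carrying the full text every turn
    return f"[{len(output)} chars written to {path}; preview: {output[:120]}...]"
```

The trade is latency for capacity: the full text costs a file read when needed, rather than tokens on every turn.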
Amp (from Sourcegraph) rejected compaction entirely. OpenAI research showed that recursive summaries degraded performance: summaries of summaries distort earlier reasoning. Amp implemented handoffs instead: users create new conversation threads with explicit goals, preserving original context while enabling fresh focus.
The compression mindset
Long sessions compress. This is not a bug to avoid but a constraint to work within. The question is not whether information disappears, but which information.
Structuring sessions with compression in mind means:
- Stating decisions explicitly when made
- Moving critical constraints to project documentation
- Compacting proactively rather than reactively
- Accepting that some context will disappear
The next page compares how different tools handle context internally across Claude Code, Codex, Cursor, Copilot, and Aider.