Documentation as Context
Why human-oriented docs fail agents; modular documentation patterns; context-specific chunking; avoiding context overload
The previous page examined implicit context: the patterns agents infer from code structure, naming, and configuration. This page addresses explicit context: documentation written specifically to guide agent behavior.
Traditional documentation assumes human readers who can infer, adapt, and debug. Agents operate differently. They parse text literally, execute instructions sequentially, and lack the experiential knowledge that helps humans fill gaps. Documentation that serves humans well often fails agents entirely.
Why human-oriented documentation fails agents
Human-oriented documentation makes assumptions about its readers: they can browse, skim, infer meaning from context, and apply judgment about what applies to their situation. These assumptions break down when agents become the readers.
The inference gap
Consider a typical README instruction: "Set up your environment and run the usual tests." Human developers understand this shorthand. They know to check for dependency managers, look for configuration files, and run whatever test command the project uses.
Agents do not infer this way. The phrase "usual tests" provides no actionable instruction. Without explicit commands, the agent either confabulates a test command based on training patterns or reports that it cannot proceed. Neither outcome serves the developer.
Research shows that 75% of API specifications have not been updated in the past six months, and 25% of public APIs do not conform to their own documented specifications. Agents cannot compensate for documentation that contradicts actual behavior.
The browsing problem
Humans navigate documentation non-linearly. They scan headings, jump to relevant sections, and skip content that does not apply. Agents process documentation sequentially within their context window. Long documentation files consume tokens regardless of relevance.
A 10,000-word getting-started guide might contain exactly 200 words relevant to a specific task. Loading the entire document wastes 9,800 tokens of context that could hold code, conversation history, or other relevant information. The irrelevant content also creates context pollution: extraneous information that degrades response quality.
The staleness problem
Human readers recognize outdated documentation and compensate. A developer encountering instructions that reference a deprecated API version checks for updates or adapts the instructions. Agents treat documentation as authoritative. Outdated instructions executed literally produce broken code.
This problem compounds in agent-generated documentation. Embedded code snippets become stale faster than prose descriptions. Configuration examples that worked six months ago may reference changed defaults. The more specific the documentation, the faster it ages.
Modular documentation patterns
Effective agent documentation adopts modular patterns that load selectively rather than wholesale.
The pointer principle
Rather than embedding content directly in documentation files, effective patterns use pointers: references to canonical sources.
**Problematic approach:**

````markdown
## Authentication

Use the following pattern for authenticating API requests:

```javascript
const auth = new AuthClient({
  clientId: 'your-client-id',
  scopes: ['read', 'write']
});
```
````
**Improved approach:**

```markdown
## Authentication

For the current authentication pattern, see `src/auth/client.ts:15-30`.
The AuthClient class handles all API authentication.
```

The pointer approach provides several benefits:
- Documentation cannot become stale relative to the referenced code
- Agents can read the actual implementation when needed
- Token consumption decreases when agents do not need the referenced content
- The reference itself provides navigation context
The general guidance: "pointers over copies." Documentation that embeds code snippets requires constant maintenance. Documentation that references code locations stays current automatically.
Hierarchical loading
Documentation organized hierarchically enables selective loading. Rather than a single comprehensive file, effective projects use multiple files at different scopes.
```
project/
├── CLAUDE.md                 # Global conventions (always loaded)
├── docs/
│   └── architecture.md       # Loaded when architecture questions arise
├── src/
│   ├── auth/
│   │   └── README.md         # Auth-specific patterns
│   └── api/
│       └── README.md         # API-specific patterns
```

Agents load context progressively:
- Root-level files load automatically at session start
- Directory-level files load when working in that directory
- Specialized documentation loads on explicit request or relevance detection
This hierarchy matches how humans organize institutional knowledge: general principles at the top, specialized knowledge distributed closer to where it applies.
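The loading logic itself can be simple. The sketch below is a hypothetical helper, not part of any agent tooling: it walks from the repository root down to the current working directory and collects documentation files in order, so global conventions load first and directory-specific notes load last.

```typescript
// Hypothetical helper: progressively collect docs from the repo root down
// to the working directory. File names and paths are illustrative only.
import { existsSync, readFileSync } from "node:fs";
import { join, relative, sep } from "node:path";

function loadHierarchicalDocs(repoRoot: string, workingDir: string): string[] {
  const docs: string[] = [];
  // Directories between the root and the working directory, e.g. ["src", "auth"].
  const segments = relative(repoRoot, workingDir).split(sep).filter(Boolean);

  let current = repoRoot;
  for (const segment of ["", ...segments]) {
    if (segment) current = join(current, segment);
    for (const name of ["CLAUDE.md", "README.md"]) {
      const path = join(current, name);
      if (existsSync(path)) docs.push(readFileSync(path, "utf8"));
    }
  }
  return docs; // global conventions first, most specific guidance last
}

// Working in src/auth/ would load ./CLAUDE.md, then src/auth/README.md.
const context = loadHierarchicalDocs(".", "./src/auth");
```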
The metadata layer pattern
Advanced documentation systems implement three loading levels:
| Level | Contents | When Loaded |
|---|---|---|
| Metadata | Titles, descriptions, file locations | Always (minimal tokens) |
| Core instructions | Full content of relevant files | When relevance determined |
| Deep references | Supplementary resources, appendices | When specific scenarios require |
This pattern can reduce token consumption by 98% compared to loading all documentation upfront. Real-world deployments report cost reductions from $4.50 per session to $0.06 using progressive loading strategies.
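A minimal version of this pattern can be expressed as a small index that always sits in context, with full files loaded only when judged relevant. The types, file paths, and keyword-based relevance check below are assumptions for illustration; real systems typically use embedding-based retrieval.

```typescript
// Illustrative sketch of three-level loading. The index entries, paths, and
// the naive keyword relevance test are placeholders, not a real API.
import { readFileSync } from "node:fs";

interface DocMetadata {
  title: string;
  description: string;
  path: string;              // where the full "core instructions" live
  deepReferences?: string[]; // appendices loaded only for specific scenarios
}

// Level 1: metadata is always in context and costs a few tokens per entry.
const index: DocMetadata[] = [
  { title: "Authentication", description: "OAuth client setup, token refresh", path: "docs/auth.md" },
  { title: "API conventions", description: "Endpoint naming, pagination, errors", path: "docs/api.md" },
];

// Level 2: load full content only for entries relevant to the current task.
function loadRelevantDocs(task: string): string[] {
  const needle = task.toLowerCase();
  return index
    .filter((doc) => needle.includes(doc.title.toLowerCase().split(" ")[0]))
    .map((doc) => readFileSync(doc.path, "utf8"));
}

// Only docs/auth.md would be loaded for this task.
const docs = loadRelevantDocs("Fix the authentication token refresh bug");
```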
Context-specific chunking
When documentation must be loaded, how it divides into chunks affects agent performance significantly.
The chunking tradeoff
Small chunks preserve precision but lose context. Large chunks preserve context but consume tokens and risk including irrelevant content. Research indicates optimal chunk sizes depend on the task:
| Task Type | Optimal Chunk Size | Reasoning |
|---|---|---|
| Factual queries | 256-512 tokens | Precise retrieval matters more than context |
| Analytical tasks | 1,000-2,000 tokens | Broader context improves reasoning |
| General baseline | 512 tokens | Balances precision and context |
The overlap between chunks also matters. A 10-20% overlap (50-100 tokens at 512-token chunks) prevents information from being severed at boundaries.
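A fixed-size splitter with overlap takes only a few lines. The sketch below counts whitespace-separated words instead of model tokens to stay dependency-free; a production pipeline would use the model's tokenizer and the sizes from the table above.

```typescript
// Sketch: fixed-size chunking with overlap, measured in words rather than
// true model tokens. Defaults mirror the 512-token baseline with ~10% overlap.
function chunkText(text: string, chunkSize = 512, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // final chunk emitted
  }
  return chunks;
}
```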
Structural boundaries
The most effective chunking respects document structure rather than arbitrary token counts. Splitting at section headers, paragraph boundaries, and code block edges preserves semantic units.
**Problematic split:**

```
...the authentication flow requires three steps:
1. Initialize the client with credentials
---CHUNK BOUNDARY---
2. Request an access token
3. Attach the token to subsequent requests...
```

**Improved split:**

```
...the authentication flow requires three steps:
1. Initialize the client with credentials
2. Request an access token
3. Attach the token to subsequent requests
---CHUNK BOUNDARY---
The token refresh mechanism handles expiration...
```

Semantic chunking algorithms that detect topic shifts outperform fixed-size splitting, achieving up to 91.9% recall compared to 85-90% for recursive character splitting.
Avoid splitting documentation in ways that separate context from instructions. A code example severed from its explanation becomes difficult for agents to apply correctly.
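One way to respect structural boundaries is to split only at headings and never inside a fenced code block, so examples stay attached to the prose that explains them. The following is a simplified sketch, not a full semantic chunker.

```typescript
// Sketch: structure-aware splitting at markdown headings. Code fences are
// tracked so a chunk boundary never lands inside an example block.
function splitAtHeadings(markdown: string): string[] {
  const sections: string[] = [];
  let current: string[] = [];
  let insideFence = false;

  for (const line of markdown.split("\n")) {
    if (line.trimStart().startsWith("```")) insideFence = !insideFence;
    // Start a new section at a heading, but only outside code fences.
    if (!insideFence && /^#{1,6}\s/.test(line) && current.length > 0) {
      sections.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) sections.push(current.join("\n"));
  return sections;
}
```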
The "lost in the middle" effect
Regardless of chunking strategy, agents process long contexts unevenly. Research demonstrates a U-shaped attention pattern: information at the beginning and end of context receives more attention than middle sections. Performance on multi-document tasks drops more than 20% when relevant information appears in the middle.
This finding has practical implications for documentation structure:
- Place critical instructions early. Commands, configuration requirements, and essential patterns belong in the first section.
- Repeat key points in summaries. Restating critical information at the end of documents leverages recency bias.
- Keep individual documents focused. Shorter documents reduce the "middle" that suffers attention degradation.
Avoiding context overload
More documentation is not better documentation. Providing agents with excessive context actively degrades their performance.
The paradox of context abundance
Studies analyzing agent performance found that providing more context can reduce accuracy from 92% to 63%. This counterintuitive finding reflects the cognitive limits of context windows: not token limits, but attention limits.
Even within the technical context window, agents perform best when context remains focused. Practitioners recommend using no more than 80% of practical context limits, reserving space for reasoning and intermediate computation.
Signals of context overload
Context overload manifests through recognizable symptoms:
- Repetitive cycling: the agent returns to already-addressed topics
- Instruction amnesia: the agent forgets guidance from earlier in the conversation
- Confabulation spikes: increased invention of nonexistent APIs, files, or patterns
- Quality degradation: progressive decline in output coherence
When these symptoms appear, adding more documentation exacerbates rather than resolves the problem.
The subtraction discipline
Effective documentation requires continuous pruning:
Remove redundancy. If the same instruction appears in multiple locations, consolidate to one authoritative source. Redundancy consumes tokens and risks inconsistency.
Eliminate the automatable. Style rules enforceable by linters do not belong in documentation. "Never send an agent to do a linter's job." Automated tools check faster, more reliably, and without consuming context.
Deprecate the stale. Documentation that no longer reflects current practices should be removed, not merely marked outdated. Agents cannot reliably distinguish deprecated from current content.
Analysis of effective CLAUDE.md files finds they average 485-535 words across thousands of repositories. Concise, focused documentation outperforms comprehensive documentation that attempts to address every scenario.
The quality over quantity principle
Effective project documentation satisfies a clear test: a new developer joining the project should find exactly the context needed to begin productive work, no more and no less.
This test applies equally to agents. If documentation requires extensive reading before a simple task, it contains too much. If agents consistently require prompt-level explanation of basics, it contains too little.
Building effective documentation
Documentation development for agents follows a distinct process:
- Start minimal. Begin with only essential information: build commands, test commands, critical conventions. Add content only when repeated agent failures indicate gaps (see the sketch after this list).
- Treat confusion as signal. When agents misunderstand, update documentation rather than crafting better prompts. Each confusion resolved through documentation prevents recurrence.
- Audit regularly. Documentation accumulates faster than it gets pruned. Regular review ensures continued relevance and removes accumulated cruft.
- Test with fresh sessions. Effective documentation enables productive work from a fresh conversation. If agents require extensive setup instructions, documentation is insufficient or poorly structured.
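For a sense of scale, a minimal starting point might contain little more than commands and a couple of hard-won conventions. Every value below is a placeholder, not a recommendation of specific tools or paths.

```markdown
# Project notes for agents

## Commands
- Build: `npm run build`
- Test: `npm test`

## Conventions
- API handlers live in `src/api/`, one file per resource.
- Never edit generated files under `dist/`.
```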
Documentation and the context hierarchy
Documentation occupies a specific position in the context hierarchy established in earlier pages. It represents explicit project context: the layer that loads automatically and persists across conversations.
Strong project documentation reduces the burden on conversation and prompt contexts. Agents working with well-documented projects require shorter prompts, fewer clarifications, and less repetitive instruction. The investment in documentation pays compound returns through reduced token consumption and improved output quality.
The next page examines tool configurations (CLAUDE.md, AGENTS.md, and related files) as a specialized form of documentation that directly shapes agent behavior.