Documentation as Context
Why human-oriented docs fail agents; modular documentation patterns; context-specific chunking; avoiding context overload
The previous page examined implicit context: the patterns agents infer from code structure, naming, and configuration. This page addresses explicit context: documentation written specifically to guide agent behavior.
Traditional documentation assumes human readers who can infer, adapt, and debug. Agents operate differently. They parse text literally, execute instructions sequentially, and lack the experiential knowledge that helps humans fill gaps. Documentation that serves humans well often fails agents entirely.
Why human-oriented documentation fails agents
Human-oriented documentation makes assumptions about its readers: they can browse, skim, infer meaning from context, and apply judgment about what applies to their situation. These assumptions break down when agents become the readers.
The inference gap
Consider a typical README instruction: "Set up your environment and run the usual tests." Human developers understand this shorthand. They know to check for dependency managers, look for configuration files, and run whatever test command the project uses.
Agents do not infer this way. The phrase "usual tests" provides no actionable instruction. Without explicit commands, the agent either confabulates a test command based on training patterns or reports that it cannot proceed. Neither outcome serves the developer.
Research shows that 75% of API specifications have not been updated in the past six months, and 25% of public APIs do not conform to their own documented specifications. Agents cannot compensate for documentation that contradicts actual behavior.
The browsing problem
Humans navigate documentation non-linearly. They scan headings, jump to relevant sections, and skip content that does not apply. Agents process documentation sequentially within their context window. Long documentation files consume tokens regardless of relevance.
A 10,000-word getting-started guide might contain exactly 200 words relevant to a specific task. Loading the entire document wastes 9,800 tokens of context that could hold code, conversation history, or other relevant information. The irrelevant content also creates context pollution: extraneous information that degrades response quality.
The staleness problem
Human readers recognize outdated documentation and compensate. A developer encountering instructions that reference a deprecated API version checks for updates or adapts the instructions. Agents treat documentation as authoritative. Outdated instructions executed literally produce broken code.
This problem compounds in agent-generated documentation. Embedded code snippets become stale faster than prose descriptions. Configuration examples that worked six months ago may reference changed defaults. The more specific the documentation, the faster it ages.
Modular documentation patterns
Effective agent documentation adopts modular patterns that load selectively rather than wholesale.
The pointer principle
Rather than embedding content directly in documentation files, effective patterns use pointers: references to canonical sources.
**Problematic approach:**

````markdown
## Authentication

Use the following pattern for authenticating API requests:

```javascript
const auth = new AuthClient({
  clientId: 'your-client-id',
  scopes: ['read', 'write']
});
```
````
**Improved approach:**

```markdown
## Authentication

For the current authentication pattern, see `src/auth/client.ts:15-30`.
The AuthClient class handles all API authentication.
```

The pointer approach provides several benefits:
- Documentation cannot become stale relative to the referenced code
- Agents can read the actual implementation when needed
- Token consumption decreases when agents do not need the referenced content
- The reference itself provides navigation context
The general guidance: "pointers over copies." Documentation that embeds code snippets requires constant maintenance. Documentation that references code locations stays current automatically.
Hierarchical loading
Documentation organized hierarchically enables selective loading. Rather than a single comprehensive file, effective projects use multiple files at different scopes.
```
project/
├── CLAUDE.md                 # Global conventions (always loaded)
├── docs/
│   └── architecture.md       # Loaded when architecture questions arise
├── src/
│   ├── auth/
│   │   └── README.md         # Auth-specific patterns
│   └── api/
│       └── README.md         # API-specific patterns
```

Agents load context progressively:
- Root-level files load automatically at session start
- Directory-level files load when working in that directory
- Specialized documentation loads on explicit request or relevance detection
This hierarchy matches how humans organize institutional knowledge: general principles at the top, specialized knowledge distributed closer to where it applies.
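The loading logic itself can be simple. The sketch below is a hypothetical helper, not part of any agent tooling: it walks from the repository root down to the current working directory and collects documentation files in order, so global conventions load first and directory-specific notes load last.

```typescript
// Hypothetical helper: progressively collect docs from the repo root down
// to the working directory. File names and paths are illustrative only.
import { existsSync, readFileSync } from "node:fs";
import { join, relative, sep } from "node:path";

function loadHierarchicalDocs(repoRoot: string, workingDir: string): string[] {
  const docs: string[] = [];
  // Directories between the root and the working directory, e.g. ["src", "auth"].
  const segments = relative(repoRoot, workingDir).split(sep).filter(Boolean);

  let current = repoRoot;
  for (const segment of ["", ...segments]) {
    if (segment) current = join(current, segment);
    for (const name of ["CLAUDE.md", "README.md"]) {
      const path = join(current, name);
      if (existsSync(path)) docs.push(readFileSync(path, "utf8"));
    }
  }
  return docs; // global conventions first, most specific guidance last
}

// Working in src/auth/ would load ./CLAUDE.md, then src/auth/README.md.
const context = loadHierarchicalDocs(".", "./src/auth");
```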
The metadata layer pattern
Advanced documentation systems implement three loading levels:
| Level | Contents | When Loaded |
|---|---|---|
| Metadata | Titles, descriptions, file locations | Always (minimal tokens) |
| Core instructions | Full content of relevant files | When relevance determined |
| Deep references | Supplementary resources, appendices | When specific scenarios require |
This pattern can reduce token consumption by 98% compared to loading all documentation upfront. Real-world deployments report cost reductions from $4.50 per session to $0.06 using progressive loading strategies.
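A minimal version of this pattern can be expressed as a small index that always sits in context, with full files loaded only when judged relevant. The types, file paths, and keyword-based relevance check below are assumptions for illustration; real systems typically use embedding-based retrieval.

```typescript
// Illustrative sketch of three-level loading. The index entries, paths, and
// the naive keyword relevance test are placeholders, not a real API.
import { readFileSync } from "node:fs";

interface DocMetadata {
  title: string;
  description: string;
  path: string;              // where the full "core instructions" live
  deepReferences?: string[]; // appendices loaded only for specific scenarios
}

// Level 1: metadata is always in context and costs a few tokens per entry.
const index: DocMetadata[] = [
  { title: "Authentication", description: "OAuth client setup, token refresh", path: "docs/auth.md" },
  { title: "API conventions", description: "Endpoint naming, pagination, errors", path: "docs/api.md" },
];

// Level 2: load full content only for entries relevant to the current task.
function loadRelevantDocs(task: string): string[] {
  const needle = task.toLowerCase();
  return index
    .filter((doc) => needle.includes(doc.title.toLowerCase().split(" ")[0]))
    .map((doc) => readFileSync(doc.path, "utf8"));
}

// Only docs/auth.md would be loaded for this task.
const docs = loadRelevantDocs("Fix the authentication token refresh bug");
```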
Context-specific chunking
When documentation must be loaded, how it divides into chunks affects agent performance significantly.
The chunking tradeoff
Small chunks preserve precision but lose context. Large chunks preserve context but consume tokens and risk including irrelevant content. Research indicates optimal chunk sizes depend on the task:
| Task Type | Optimal Chunk Size | Reasoning |
|---|---|---|
| Factual queries | 256-512 tokens | Precise retrieval matters more than context |
| Analytical tasks | 1,000-2,000 tokens | Broader context improves reasoning |
| General baseline | 512 tokens | Balances precision and context |
The overlap between chunks also matters. A 10-20% overlap (50-100 tokens at 512-token chunks) prevents information from being severed at boundaries.
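A fixed-size splitter with overlap takes only a few lines. The sketch below counts whitespace-separated words instead of model tokens to stay dependency-free; a production pipeline would use the model's tokenizer and the sizes from the table above.

```typescript
// Sketch: fixed-size chunking with overlap, measured in words rather than
// true model tokens. Defaults mirror the 512-token baseline with ~10% overlap.
function chunkText(text: string, chunkSize = 512, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // final chunk emitted
  }
  return chunks;
}
```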
Structural boundaries
The most effective chunking respects document structure rather than arbitrary token counts. Splitting at section headers, paragraph boundaries, and code block edges preserves semantic units.
**Problematic split:**

```
...the authentication flow requires three steps:
1. Initialize the client with credentials
---CHUNK BOUNDARY---
2. Request an access token
3. Attach the token to subsequent requests...
```

**Improved split:**

```
...the authentication flow requires three steps:
1. Initialize the client with credentials
2. Request an access token
3. Attach the token to subsequent requests
---CHUNK BOUNDARY---
The token refresh mechanism handles expiration...
```

Semantic chunking algorithms that detect topic shifts outperform fixed-size splitting, achieving up to 91.9% recall compared to 85-90% for recursive character splitting.
Avoid splitting documentation in ways that separate context from instructions. A code example severed from its explanation becomes difficult for agents to apply correctly.
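One way to respect structural boundaries is to split only at headings and never inside a fenced code block, so examples stay attached to the prose that explains them. The following is a simplified sketch, not a full semantic chunker.

```typescript
// Sketch: structure-aware splitting at markdown headings. Code fences are
// tracked so a chunk boundary never lands inside an example block.
function splitAtHeadings(markdown: string): string[] {
  const sections: string[] = [];
  let current: string[] = [];
  let insideFence = false;

  for (const line of markdown.split("\n")) {
    if (line.trimStart().startsWith("```")) insideFence = !insideFence;
    // Start a new section at a heading, but only outside code fences.
    if (!insideFence && /^#{1,6}\s/.test(line) && current.length > 0) {
      sections.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) sections.push(current.join("\n"));
  return sections;
}
```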
The "lost in the middle" effect
Regardless of chunking strategy, agents process long contexts unevenly. Research demonstrates a U-shaped attention pattern: information at the beginning and end of context receives more attention than middle sections. Performance on multi-document tasks drops more than 20% when relevant information appears in the middle.
This finding has practical implications for documentation structure:
- Place critical instructions early. Commands, configuration requirements, and essential patterns belong in the first section.
- Repeat key points in summaries. Restating critical information at the end of documents leverages recency bias.
- Keep individual documents focused. Shorter documents reduce the "middle" that suffers attention degradation.
Avoiding context overload
More documentation is not better documentation. Providing agents with excessive context actively degrades their performance.
The paradox of context abundance
Studies analyzing agent performance found that providing more context can reduce accuracy from 92% to 63%. This counterintuitive finding reflects the cognitive limits of context windows: not token limits, but attention limits.
Even within the technical context window, agents perform best when context remains focused. Practitioners recommend using no more than 80% of practical context limits, reserving space for reasoning and intermediate computation.
Signals of context overload
Context overload manifests through recognizable symptoms:
- Repetitive cycling: the agent returns to already-addressed topics
- Instruction amnesia: the agent forgets guidance from earlier in the conversation
- Confabulation spikes: increased invention of nonexistent APIs, files, or patterns
- Quality degradation: progressive decline in output coherence
When these symptoms appear, adding more documentation exacerbates rather than resolves the problem.
The subtraction discipline
Effective documentation requires continuous pruning:
Remove redundancy. If the same instruction appears in multiple locations, consolidate to one authoritative source. Redundancy consumes tokens and risks inconsistency.
Eliminate the automatable. Style rules enforceable by linters do not belong in documentation. "Never send an agent to do a linter's job." Automated tools check faster, more reliably, and without consuming context.
Deprecate the stale. Documentation that no longer reflects current practices should be removed, not merely marked outdated. Agents cannot reliably distinguish deprecated from current content.
Analysis of effective CLAUDE.md files finds they average 485-535 words across thousands of repositories. Concise, focused documentation outperforms comprehensive documentation that attempts to address every scenario.
The quality over quantity principle
Effective project documentation satisfies a clear test: a new developer joining the project should find exactly the context needed to begin productive work, no more and no less.
This test applies equally to agents. If documentation requires extensive reading before a simple task, it contains too much. If agents consistently require prompt-level explanation of basics, it contains too little.
Building effective documentation
Documentation development for agents follows a distinct process:
- Start minimal. Begin with only essential information: build commands, test commands, critical conventions. Add content only when repeated agent failures indicate gaps (see the sketch after this list).
- Treat confusion as signal. When agents misunderstand, update documentation rather than crafting better prompts. Each confusion resolved through documentation prevents recurrence.
- Audit regularly. Documentation accumulates faster than it gets pruned. Regular review ensures continued relevance and removes accumulated cruft.
- Test with fresh sessions. Effective documentation enables productive work from a fresh conversation. If agents require extensive setup instructions, documentation is insufficient or poorly structured.
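For a sense of scale, a minimal starting point might contain little more than commands and a couple of hard-won conventions. Every value below is a placeholder, not a recommendation of specific tools or paths.

```markdown
# Project notes for agents

## Commands
- Build: `npm run build`
- Test: `npm test`

## Conventions
- API handlers live in `src/api/`, one file per resource.
- Never edit generated files under `dist/`.
```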
Documentation and the context hierarchy
Documentation occupies a specific position in the context hierarchy established in earlier pages. It represents explicit project context: the layer that loads automatically and persists across conversations.
Strong project documentation reduces the burden on conversation and prompt contexts. Agents working with well-documented projects require shorter prompts, fewer clarifications, and less repetitive instruction. The investment in documentation pays compound returns through reduced token consumption and improved output quality.
The next page examines tool configurations (CLAUDE.md, AGENTS.md, and related files) as a specialized form of documentation that directly shapes agent behavior.