Creating Agent-Friendly Documentation

The empty CLAUDE.md problem

Legacy codebases have no agent documentation. They predate the tooling. Adding a CLAUDE.md file to an unfamiliar project raises an immediate question: what goes in it?

The temptation is to dump everything discovered during archaeology. Every convention, every gotcha, every design decision. A 2000-line CLAUDE.md file consumes roughly 1.8% of the context window before any work begins. Worse, as instruction volume increases, the model follows all instructions less reliably not just the new ones.

Frontier models follow approximately 150–200 instructions with reasonable consistency. Claude Code's system prompt uses about 50 of those. That leaves 100–150 instructions for project-specific guidance.

CLAUDE.md for legacy projects must be selective. Not everything important belongs there. Only things that shape every interaction.

The WHY/WHAT/HOW framework

Effective CLAUDE.md files cover three areas.

WHY: Project purpose. What does this system do? Why does it exist? This helps the agent judge whether proposed changes align with system goals.

## Purpose
Order fulfillment service for warehouse operations.
Optimizes for throughput over latency batch processing is correct behavior.

WHAT: Stack and architecture. What technologies does the project use? What are the main components? A parts list, not an encyclopedia.

## Stack
- Python 3.11, FastAPI, SQLAlchemy
- PostgreSQL (primary), Redis (caching, job queue)
- Deployed on Kubernetes via Helm charts in deploy/

HOW: Commands. How to run tests, build, and verify changes. Exact commands, not generic advice.

## Development
- Tests: `pytest -x -q` (stop on first failure, quiet output)
- Format: `ruff format . && ruff check --fix .`
- Migrations: `alembic upgrade head` (never edit existing migrations)

Anthropic's guidance emphasizes this operational focus: document "bash commands Claude can't guess, core files, code style guidelines, testing instructions." Skip anything Claude can figure out by reading code.

Progressive disclosure: tell where, not what

Progressive disclosure means revealing complexity gradually. For CLAUDE.md: provide pointers to detailed documentation, not the documentation itself.

CLAUDE.md should be an index, not an encyclopedia.

Instead of:

## Code conventions

### Naming
- Classes use PascalCase
- Functions use snake_case
- Constants use SCREAMING_SNAKE_CASE
- Private methods prefix with underscore
- Test files prefix with test_
[...40 more lines...]

Write:

## Conventions
See `docs/code-standards.md` for naming, formatting, and style rules.
Linting with `ruff` enforces most standards automatically.

The agent reads the detailed file when working on style-sensitive tasks. For other tasks, those 40 lines never load.

This works across documentation types:

Topic	CLAUDE.md entry	Detailed location
Architecture	One-paragraph overview	`docs/architecture.md`
API contracts	"REST API, see OpenAPI spec"	`openapi.yaml`
Database schema	"PostgreSQL, 47 tables"	`docs/database-erd.md`
Deployment	"Kubernetes, Helm charts"	`deploy/README.md`
Business rules	"Order validation in OrderValidator"	`src/validators/order.py`

What belongs in root CLAUDE.md

Given the instruction budget, include only content that applies to most interactions.

Include:

Project purpose (one sentence)
Tech stack summary
Essential commands (test, lint, build)
Gotchas that cause repeated mistakes
Directory pointers for major subsystems

Exclude:

Detailed code conventions (use linters)
API endpoint documentation (link to specs)
Database schema details (point to files)
Historical context about design decisions
Information that changes frequently

The test: would removing this cause Claude to make mistakes in typical interactions? If not, move it elsewhere.

HumanLayer's production CLAUDE.md runs under 60 lines. Anthropic's teams "occasionally run CLAUDE.md files through the prompt improver and often tune instructions." They iterate based on observed behavior.

Subdirectory CLAUDE.md files

Large projects benefit from nested documentation. Claude Code loads subdirectory CLAUDE.md files on demand when working in those directories.

project/
├── CLAUDE.md                     # Global: purpose, stack, commands
├── services/
│   ├── CLAUDE.md                 # Service-level patterns
│   ├── payments/
│   │   └── CLAUDE.md             # Payment-specific: PCI rules, test cards
│   └── inventory/
│       └── CLAUDE.md             # Inventory-specific: batch processing
├── packages/
│   └── ui/
│       └── CLAUDE.md             # UI conventions: components, styling

Each file contains only what's relevant to its subtree. The payments CLAUDE.md includes PCI compliance rules. The UI CLAUDE.md includes component patterns. Neither pollutes the other.

Don't repeat root-level content in subdirectory files. Claude Code loads all applicable files from root to current directory. Duplication wastes tokens and creates contradictions when copies drift.

Legacy-specific concerns

Legacy projects have patterns that greenfield projects don't.

Coexisting conventions. When old code uses one pattern and new code uses another:

## Code patterns
Legacy modules (src/legacy/) use callbacks. Don't convert to async.
New modules (src/v2/) use async/await. Match surrounding code.

Technical debt. Known problems the agent shouldn't try to fix:

## Known issues (don't fix)
- User service has circular import workaround in __init__.py
- Rate limiter uses deprecated Redis API (migration planned Q2)

Protected areas. Code requiring careful handling:

## Critical paths
Never modify without explicit approval:
- src/billing/charge.py (payment processing)
- src/compliance/audit_log.py (regulatory requirement)

Missing tests. When coverage is incomplete:

## Testing gaps
Legacy modules have partial coverage. For changes to src/legacy/:
- Add tests for any code you modify
- Don't refactor untested code without adding tests first

From archaeology to documentation

The previous section covered extracting understanding from legacy code. That understanding now becomes CLAUDE.md content selectively.

After archaeology, you have:

Architecture overview
Design rationale
Conventions
Gotchas

For CLAUDE.md, extract only:

Architecture pointers (not the full overview)
Gotchas causing repeated mistakes
Conventions that linters can't enforce

The rest goes into separate files that CLAUDE.md points to.

Self-correcting documentation

CLAUDE.md should evolve based on what actually happens. When the agent makes the same mistake twice, the documentation needs work.

Add mistakes to CLAUDE.md. When an agent repeatedly makes an error, add an explicit correction:

## Common mistakes
- DO NOT use `slug` in page queries use `stem` instead
- DO NOT import from src/internal use the public API in src/api

Keep these minimal. If the list grows long, consider whether a linter rule or pre-commit hook could enforce the constraint instead.

Prune ineffective rules. Rules the agent ignores aren't worth the tokens. If an instruction consistently fails to change behavior, either strengthen it ("IMPORTANT:", "YOU MUST") or remove it and enforce through tooling.

Anthropic recommends treating CLAUDE.md like code: "review it when things go wrong, prune it regularly, and test changes by observing whether Claude's behavior actually shifts."

Example: minimal legacy CLAUDE.md

After archaeology and progressive disclosure:

# Order Fulfillment Service

Warehouse order processing. Optimizes for throughput over latency.

## Stack
Python 3.11, FastAPI, SQLAlchemy, PostgreSQL, Redis

## Commands
- Test: `pytest -x -q`
- Lint: `ruff check --fix . && ruff format .`
- Run: `docker compose up`
- Migrate: `alembic upgrade head`

## Structure
- src/api/   REST endpoints
- src/services/   business logic
- src/models/   SQLAlchemy models
- src/legacy/   old code, callbacks not async

## Conventions
See docs/standards.md. Linting enforces most rules.

## Gotchas
- Never edit existing Alembic migrations
- src/legacy/ uses callbacks don't convert to async
- Rate limiter uses deprecated Redis API (migration planned)

## Critical paths
Changes to src/billing/ require review approval.

Under 40 lines. Covers why, what, and how. Points to detailed docs. Acknowledges legacy realities. Leaves context capacity for actual work.

Creating Agent-Friendly Documentation

On this page