Handling Outdated or Missing Documentation
The documentation gap problem
Documentation and code start together. Then they drift.
A function gets a new parameter. An API endpoint changes its response format. A workflow gains three additional steps. The code reflects reality; the documentation reflects history.
A 2023 SmartBear survey found that 75% of APIs do not conform to their specifications. The gap between documented behavior and actual behavior creates integration failures, incorrect assumptions, and wasted debugging time. Developers learn to distrust documentation. They read the code instead, which defeats the purpose of having documentation at all.
Legacy codebases make this worse. The documentation that exists describes the system as it was five years ago. Or three owners ago. Or before the major refactoring that nobody thought to update the docs for. Often, the documentation simply doesn't exist. The "why" behind design decisions lives in the heads of developers who have moved on.
AI can help with both problems: finding where documentation is stale and filling where documentation is missing. Neither task is fully automatic, and both require verification. But the time savings are real.
Identifying stale documentation
Documentation drift has symptoms. Agents can detect them.
Parameter mismatches. Documentation says a function takes three arguments; the function signature shows four. Documentation describes a return value that the function no longer returns. Agents compare documentation claims to actual code and flag discrepancies.
Compare the docstrings in src/api/ to the actual function signatures.
Identify any mismatches:
- Parameters documented but not present
- Parameters present but not documented
- Return types that don't match
- Exceptions mentioned that aren't thrown (or vice versa)

Outdated examples. Code samples in documentation that no longer work. Import paths that changed. API calls using deprecated methods. Agents trace through each step of an example to check whether it still works.
Review the code examples in docs/quickstart.md.
For each example:
- Does the import path still exist?
- Does the function being called still exist with that signature?
- Would the example produce the described output today?
Flag any examples that appear outdated.

Terminology drift. The codebase renamed User to Account. The documentation still says User everywhere. Search for documentation terms that no longer appear in the code.
Search the documentation for technical terms (class names, function names,
variable names). Verify each term still exists in the codebase.
List any terms that appear in docs but not in code.

Git history can also reveal drift age. Ask the agent to compare when documentation was last modified against when the code it describes was last modified. Large gaps suggest staleness.
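That last check can also be purely mechanical. A minimal sketch that compares last-commit dates with git, assuming documentation lives under docs/ and the doc-to-code pairings are maintained by hand (both are assumptions to adapt to your repository):

```python
import subprocess
from datetime import datetime

def last_commit_date(path: str) -> datetime:
    """Date of the last commit that touched `path`."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%cI", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return datetime.fromisoformat(out)

# Hypothetical doc-to-code pairings; in practice these come from your repo layout.
PAIRS = [
    ("docs/payments.md", "src/payments/"),
    ("docs/auth.md", "src/auth/"),
]

for doc, code in PAIRS:
    gap = last_commit_date(code) - last_commit_date(doc)
    if gap.days > 90:  # threshold is arbitrary; tune it for your project
        print(f"{doc} is {gap.days} days behind {code}: possible drift")
```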
Systematic gap detection
Beyond individual mismatches, agents can audit documentation coverage.
Coverage analysis. Which modules have documentation? Which have none?
List all directories under src/.
For each directory, check if corresponding documentation exists.
Report:
- Directories with README or dedicated doc files
- Directories with only inline comments
- Directories with no documentation at all

Depth assessment. Documentation exists, but is it sufficient? A one-line comment on a complex module is not useful documentation.
For src/payments/, assess documentation depth:
- Is there an architecture overview?
- Are the main functions documented?
- Are edge cases and error conditions described?
- Are there usage examples?
Rate coverage as: comprehensive, partial, or minimal.

Critical path audit. Not all code needs equal documentation. Core business logic, security-sensitive code, and frequently modified modules deserve more attention.
Identify the most critical modules in this codebase based on:
- Dependencies (what depends on them)
- Modification frequency (git history)
- Business sensitivity (payments, auth, data handling)
For each critical module, assess whether documentation matches its importance.

This creates a prioritized list. Documentation effort goes where it matters most.
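Parts of this audit can be scripted before the agent gets involved. A rough sketch, assuming modules are the top-level directories under src/ and treating recent commit count as a proxy for modification frequency (both assumptions, not features of any particular tool):

```python
import subprocess
from pathlib import Path

def commit_count(path: Path, since: str = "1 year ago") -> int:
    """Number of commits touching `path` since the given date."""
    out = subprocess.run(
        ["git", "rev-list", "--count", f"--since={since}", "HEAD", "--", str(path)],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return int(out or 0)

def has_docs(path: Path) -> bool:
    """Crude coverage check: any README or Markdown file anywhere in the module."""
    return any(path.rglob("README*")) or any(path.rglob("*.md"))

modules = [p for p in Path("src").iterdir() if p.is_dir()]
report = sorted(((commit_count(m), m) for m in modules), reverse=True)

# Most-churned modules first; missing docs on a hot module is the priority.
for churn, module in report:
    status = "documented" if has_docs(module) else "NO DOCS"
    print(f"{churn:4d} commits  {status:10s}  {module}")
```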
Filling documentation gaps with AI
Once gaps are identified, agents can draft documentation. "Draft" is the operative word here. AI-generated documentation requires verification before it becomes trusted reference material.
Module documentation. For undocumented modules, request structured output:
This module has no documentation.
Generate a README that covers:
1. Purpose: What problem does this module solve?
2. Main components: What are the key classes/functions?
3. Dependencies: What does this module depend on and what depends on it?
4. Usage: How would another module use this correctly?
Base your documentation only on what you can determine from the code.
Flag anything you're uncertain about.

The instruction to "flag uncertainty" matters. Without it, agents fill gaps with confident-sounding guesses.
Function documentation. For functions lacking docstrings:
Generate docstrings for the public functions in src/utils/validation.py.
Follow Google style docstrings.
Include:
- Brief description
- Args with types and descriptions
- Returns with type and description
- Raises for any exceptions
- Example usage where helpful

Architecture documentation. For systems lacking a high-level explanation:
Based on your analysis of this codebase, draft an architecture overview.
Include:
- System purpose and scope
- Main components and their responsibilities
- Data flow between components
- External dependencies and integrations
- Key design decisions visible in the code structure
Mark any inferences with [INFERRED] so reviewers know to verify.

AI-generated documentation inherits AI limitations. Agents cannot know business context that is not in the code. They cannot explain decisions that left no trace. They sometimes confabulate plausible-sounding explanations. Every piece of generated documentation needs verification.
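For a sense of what this kind of prompt should produce, here is a hypothetical Google-style docstring for a validate_email() helper; the implementation, the behavior, and the flagged inference are all invented for illustration:

```python
import re

_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
_MAX_LENGTH = 254

def validate_email(address: str | None) -> bool:
    """Check whether a string looks like a syntactically valid email address.

    Args:
        address: The candidate address. May be None.

    Returns:
        True if the address matches the expected format, False otherwise.
        Returns False, rather than raising, when address is None.

    Raises:
        ValueError: If the address exceeds 254 characters.
            [INFERRED] The limit appears to come from _MAX_LENGTH; verify
            whether callers actually rely on this exception.
    """
    if address is None:
        return False
    if len(address) > _MAX_LENGTH:
        raise ValueError("address too long")
    return bool(_EMAIL_RE.match(address))
```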
The self-review technique
The most effective verification approach: ask the agent to verify its own output.
Chain-of-Verification (CoVe), documented in a 2023 Meta AI paper, improves accuracy by 40% in technical writing tasks. The process:
- Generate initial documentation
- Create verification questions about the documentation
- Answer those questions by examining the code independently
- Revise the documentation based on verification findings
Step 1: Here is the documentation you generated for PaymentProcessor:
[paste generated docs]
Step 2: Generate verification questions:
- Does PaymentProcessor actually have a process() method as described?
- Does it actually throw InvalidPaymentException as documented?
- Is the return type actually PaymentResult as stated?
Step 3: Answer each question by examining src/payments/processor.py.
Do not refer back to the documentation; look only at the code.
Step 4: Based on your verification, what corrections are needed?

The critical detail: verification must happen independently. If the agent sees its own documentation while verifying, it tends to confirm what it wrote rather than checking what the code actually does.
OpenAI Codex includes a code review step that validates generated code against stated intent. The same principle applies to documentation: validate output against source.
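The mechanical claims can also be verified without an agent at all. A rough sketch, assuming the documented claims have already been reduced to a class name, a method name, and an exception name (the claim format, module path, and names are invented for the example):

```python
import ast
import importlib
import inspect

def verify_claims(module_name: str, class_name: str, method: str, exception: str) -> dict:
    """Check simple documentation claims directly against the source code."""
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)

    # Claim 1: the documented method actually exists on the class.
    has_method = callable(getattr(cls, method, None))

    # Claim 2: the documented exception is actually raised somewhere in the source.
    source = inspect.getsource(module)
    raised = {
        node.exc.func.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Raise)
        and isinstance(node.exc, ast.Call)
        and isinstance(node.exc.func, ast.Name)
    }
    return {"method_exists": has_method, "exception_raised": exception in raised}

# Hypothetical usage against the PaymentProcessor documentation above:
# verify_claims("payments.processor", "PaymentProcessor", "process", "InvalidPaymentException")
```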
Test-based verification
For code documentation, tests provide verification.
Verify documented behavior. If documentation claims a function handles null input gracefully, write a test:
The documentation says validate_email() returns false for null input
rather than throwing. Write a test that verifies this claim.

A passing test confirms the documentation. A failing test reveals a documentation error.
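A minimal sketch of such a test, assuming the validate_email() helper lives in the src/utils/validation.py module mentioned earlier (the import path is a guess about project layout):

```python
# test_validation.py -- a pytest-style check of a documented claim
from utils.validation import validate_email  # hypothetical import path

def test_validate_email_returns_false_for_none():
    """Docs claim: None input returns False rather than raising."""
    assert validate_email(None) is False
```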
Exercise documented examples. Documentation often includes usage examples. Convert examples to executable tests:
The README shows this usage example:
[paste example]
Convert this into a test that verifies the example produces
the described output.

If documentation examples pass as tests, they are accurate. If they fail, either the example is outdated or the documentation misrepresents behavior.
Coverage verification. Use test results to validate documentation claims about error handling, edge cases, and expected behavior. Documentation says "handles empty input"? Does a test confirm this? Documentation says "returns sorted results"? Does a test verify ordering?
AI models consistently preserve inline doctests. If documentation lives in docstrings with embedded test examples (Python doctests, Rust doc tests), AI maintains these when modifying code. External documentation files get no such protection.
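A doctest keeps the example physically next to the code it documents, so any agent editing the function sees both in the same context. A small illustration with an invented helper:

```python
def normalize_tag(tag: str) -> str:
    """Lowercase a tag and collapse internal whitespace to single dashes.

    >>> normalize_tag("  Machine   Learning ")
    'machine-learning'
    >>> normalize_tag("AI")
    'ai'
    """
    return "-".join(tag.lower().split())

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # fails loudly if the documented examples drift
```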
Cross-referencing for consistency
AI-generated documentation should be checked against any existing documentation.
Compare with old docs. Even outdated documentation contains useful information. Cross-reference AI-generated content with legacy docs:
Compare this generated architecture overview with the existing
docs/architecture.md from 2021.
Identify:
- Claims that match (likely still true)
- Claims that conflict (one is wrong)
- New information not in old docs
- Old information not in new docs

Conflicts reveal either evolved code or AI error. Both need human judgment.
Verify against multiple sources. Code comments, commit messages, PR descriptions, and inline documentation create overlapping evidence. AI-generated docs should be consistent with all of them.
The generated module documentation claims this service handles
user authentication.
Verify this claim against:
- The service class name and structure
- Related commit messages
- Any existing comments in the code
- Import statements that suggest purpose
Does the evidence support the documentation?

Human review workflow
AI assists. Humans verify.
Review focus for AI-generated docs:
- Accuracy: Does the documentation match the code?
- Completeness: Are important details missing?
- Context: Does it explain the why, not just the what?
- Audience: Is it useful for the intended readers?
AI handles mechanical accuracy well. Parameters, types, and return values can be verified programmatically.
AI handles context poorly. Why was this design chosen? What constraints shaped the implementation? What is the business significance? These require human knowledge.
Effective review process:
Review this AI-generated documentation:
[generated content]
For each section:
- ACCURATE: The code confirms this
- INACCURATE: The code shows something different: [correction]
- MISSING CONTEXT: This is technically correct but needs: [addition]
- IRRELEVANT: Remove this, it's not useful for readers

The human review adds the context AI cannot provide while validating the accuracy AI established.
Maintaining documentation currency
Fixing gaps once is not enough. Documentation drifts continuously.
Detect changes requiring doc updates. When code changes, flag corresponding documentation:
This PR modifies src/auth/session.py.
Check if any documentation references session handling:
- README sections
- API documentation
- Code comments in other files
- CLAUDE.md entries
List what may need updates based on these code changes.

Automate freshness checking. Tools like DeepDocs and Mintlify Autopilot monitor repositories for documentation drift. When code changes, they identify affected documentation and propose updates. The agent watches; the human approves.
Include documentation in code review. Documentation changes should accompany code changes. If a PR modifies function behavior, the docstring should update in the same PR. Treating documentation as separate from code guarantees drift.
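One way to enforce this is a lightweight CI check or pre-commit hook that flags code changes with no accompanying documentation change. A minimal sketch, assuming code lives under src/ and documentation under docs/ (adjust the paths, and decide whether the check warns or fails the build):

```python
import subprocess
import sys

def changed_files(base: str = "origin/main") -> list[str]:
    """List files changed on this branch relative to the base branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]

files = changed_files()
code_changed = any(f.startswith("src/") for f in files)
docs_changed = any(f.startswith("docs/") or f.endswith(".md") for f in files)

if code_changed and not docs_changed:
    print("Code changed under src/ but no documentation was touched; check for drift.")
    sys.exit(1)  # or exit(0) to warn without blocking the merge
```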
The payoff
Documentation that developers actually trust changes how a team works.
New team members ramp up faster with accurate references. Code review goes more smoothly when reviewers can check against reliable documentation. Integration with other systems has fewer surprises.
The effort is not trivial. AI accelerates the mechanical work: finding gaps, drafting content, comparing claims to code. Human effort remains for verification, context, and judgment. But the total effort is less than manual documentation from scratch.
Atlassian reported that AI-assisted documentation cut the documentation-understanding phase of one legacy codebase migration from months to weeks. The time saved compounds. Documentation that gets used gets maintained. Documentation that gets ignored decays.