Exercise: Legacy Codebase Exploration and Modification
This exercise applies the full legacy codebase workflow: exploring unfamiliar code with AI agents, creating agent-friendly documentation, writing characterization tests, and implementing a feature using safe modification strategies.
Overview
Koanf is a Go library for configuration management. It provides a clean alternative to larger configuration libraries like Viper, with a modular architecture that separates configuration providers (files, environment variables, flags) from parsers (JSON, YAML, TOML). The codebase is approximately 5,000 lines of core code—large enough to require systematic exploration but small enough to understand in a single session.
The scenario: you've been assigned to add a feature to koanf, a library you've never touched. Instead of reading code for hours, you'll use AI agents to accelerate understanding and implement the feature safely.
You will:
- Explore the codebase using the Explore-Plan-Code framework
- Map the architecture and identify conventions
- Create CLAUDE.md documentation for future agent work
- Write characterization tests to capture current behavior
- Implement a small feature using safe modification strategies
The task
Add a configuration validation feature to koanf.
The feature should:
- Allow users to register validation rules for configuration keys
- Validate configuration after loading (check required keys, value types, value ranges)
- Return structured errors when validation fails
- Follow existing patterns in the codebase
Koanf doesn't currently have built-in validation support, and configuration libraries often need it.
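To make these requirements concrete, here is one possible shape for the finished feature. Everything validation-related in this sketch is hypothetical: `RegisterValidator` and `ValidationError` are invented names for this exercise, and the real design is worked out in the planning phase below. Only `New`, `Load`, `file.Provider`, and `json.Parser` are real koanf APIs (v1 import paths shown; v2 lives under `github.com/knadh/koanf/v2`).

```go
// Hypothetical usage sketch: RegisterValidator and ValidationError do not
// exist in koanf today. They only illustrate one way the feature could feel.
package main

import (
	"errors"
	"fmt"
	"log"

	"github.com/knadh/koanf"
	"github.com/knadh/koanf/parsers/json"
	"github.com/knadh/koanf/providers/file"
)

func main() {
	k := koanf.New(".")

	// Hypothetical: attach a rule to a key before loading.
	k.RegisterValidator("server.port", func(v interface{}) error {
		port, ok := v.(int)
		if !ok || port < 1 || port > 65535 {
			return fmt.Errorf("must be an int between 1 and 65535, got %v", v)
		}
		return nil
	})

	// Hypothetical: Load returns a structured error when a rule fails.
	if err := k.Load(file.Provider("config.json"), json.Parser()); err != nil {
		var verr *koanf.ValidationError
		if errors.As(err, &verr) {
			log.Fatalf("invalid value for %q: %v", verr.Key, verr.Err)
		}
		log.Fatal(err)
	}
}
```

Treat this only as a way to picture the requirements; the plan you approve in Phase 4 may propose a different API entirely.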
Setup
Clone the repository:
git clone https://github.com/knadh/koanf.git
cd koanf

Install dependencies and verify tests pass:
go mod download
go test ./...

The test suite should pass completely. If tests fail, check your Go version (1.18+ required) and network connectivity for module downloads.
Explore the project structure:
ls -la
ls -la providers/
ls -la parsers/
cat README.md | head -100

Initial observations:
- koanf.go is the main library file
- providers/ contains configuration source implementations
- parsers/ contains format parsers
- Architecture separates concerns cleanly
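A quick way to ground these observations is a minimal usage example. Based on koanf's README, basic usage looks roughly like the sketch below; verify the import paths against your checkout, since they differ between koanf v1 and v2 (v2 splits providers and parsers into separate modules).

```go
// Minimal koanf usage sketch (v1 import paths), showing the Load-then-Get flow
// you will trace during exploration.
package main

import (
	"fmt"
	"log"

	"github.com/knadh/koanf"
	"github.com/knadh/koanf/parsers/json"
	"github.com/knadh/koanf/providers/file"
)

func main() {
	// "." is the key-path delimiter, e.g. "server.port".
	k := koanf.New(".")

	// Load pairs a Provider (where bytes come from) with a Parser (how they decode).
	if err := k.Load(file.Provider("config.json"), json.Parser()); err != nil {
		log.Fatalf("error loading config: %v", err)
	}

	// Values are read with typed getters on dotted key paths.
	fmt.Println(k.String("server.host"), k.Int("server.port"))
}
```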
Phase 1: Codebase exploration
Apply the Explore-Plan-Code framework from section 9.1.
Start a Claude Code session
claude

Explore: Understand the architecture
Begin with broad questions to build a mental model:
I'm new to this codebase. Give me a high-level overview:
1. What is this library's purpose?
2. What are the main components and how do they interact?
3. Where is the core logic located?
4. What patterns does this codebase use consistently?

Document the agent's response. Verify claims by examining the files mentioned.
Explore: Map dependencies
Ask about the internal structure:
How does configuration loading work end-to-end?
Trace the flow from a user calling Load() to data being accessible via Get().
Which functions are involved and in what order?

Draw a simple diagram based on the response:
User calls Load()
→ ???
→ ???
→ Data accessible via Get()

Fill in the middle steps. This mental model guides your implementation.
Explore: Find conventions
Ask about implicit patterns:
What naming conventions does this codebase follow?
How are errors handled consistently?
What patterns are used for extensibility?
Look at multiple files and identify recurring structures.

Record conventions you discover:
| Convention | Example | Where observed |
|---|---|---|
Verify understanding
Before proceeding, verify your mental model is correct:
Based on our exploration, I understand:
1. [Your understanding of architecture]
2. [Your understanding of data flow]
3. [Your understanding of extension patterns]
Is this accurate? What am I missing?

Correct any misunderstandings before proceeding.
Don't proceed to implementation until you can describe the architecture without looking at the code. Unclear mental models produce implementations that don't fit.
Phase 2: Create CLAUDE.md
Document your understanding in agent-friendly format. This captures knowledge for future sessions and provides context for the implementation phase.
Create the documentation
Create a CLAUDE.md file in the repository root:
# Koanf Development Guide
## Purpose
Koanf is a configuration management library for Go.
It loads configuration from multiple sources (files, environment, flags) and provides unified access.
## Architecture
### Core Components
- `koanf.go`: Main Koanf struct, Load/Get/Set methods
- `providers/`: Configuration sources (file, env, confmap, etc.)
- `parsers/`: Format parsers (json, yaml, toml, etc.)
### Data Flow
1. User creates Koanf instance with New()
2. User calls Load() with a Provider and Parser
3. Provider fetches raw data from source
4. Parser converts bytes to map[string]interface{}
5. Koanf merges data into internal store
6. User accesses values via Get(), Unmarshal()
## Conventions
### Naming
- Providers named by source: `file`, `env`, `confmap`
- Parsers named by format: `json`, `yaml`, `toml`
- Options passed via functional options pattern
### Error Handling
- Errors returned, not panicked
- Wrap errors with context using fmt.Errorf
- nil error indicates success
### Extension Pattern
- Providers implement Provider interface (Read, ReadBytes)
- Parsers implement Parser interface (Unmarshal, Marshal)
- New providers/parsers go in separate packages
## Testing
- Run all tests: `go test ./...`
- Test specific package: `go test ./providers/file`
- Tests use testify for assertions
## Common Tasks
- Add new provider: Create package in providers/, implement Provider interface
- Add new parser: Create package in parsers/, implement Parser interface
- Modify core: Changes to koanf.go, run full test suite

Your documentation should reflect what you actually discovered during exploration, not this template verbatim.
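For reference, the two extension points the template mentions look roughly like this in koanf. This is a sketch from memory; confirm the exact method sets against the interface definitions in the repository rather than trusting it.

```go
// Approximate shape of koanf's extension interfaces (verify in the source).
package koanf

// Provider is implemented by configuration sources such as file and env.
type Provider interface {
	// ReadBytes returns raw bytes that a Parser will decode.
	ReadBytes() ([]byte, error)
	// Read returns an already-parsed nested map (used by providers like confmap).
	Read() (map[string]interface{}, error)
}

// Parser is implemented by format packages such as json, yaml, and toml.
type Parser interface {
	Unmarshal([]byte) (map[string]interface{}, error)
	Marshal(map[string]interface{}) ([]byte, error)
}
```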
Verify documentation accuracy
Ask the agent to validate your documentation:
Review the CLAUDE.md file I created.
Check each claim against the actual codebase.
Are there errors or missing critical information?

Correct any inaccuracies.
Phase 3: Write characterization tests
Before modifying code, capture current behavior with characterization tests. These tests protect against unintended changes.
Identify test targets
The validation feature will likely interact with:
- The Koanf struct (to register validators)
- The Load method (to run validation after loading)
- The Get methods (possibly, if validation affects retrieval)
Create characterization tests
Ask the agent to generate tests that capture current behavior:
Write characterization tests for koanf's Load and Get behavior.
The tests should capture:
1. What happens when Load is called with valid config
2. What happens when Load is called with invalid provider
3. How Get behaves with existing and non-existing keys
4. How type-specific getters (Int(), String()) handle type mismatches
Create tests that document current behavior exactly as it is.
Put them in koanf_characterization_test.go.

Run and verify
go test -v -run Characterization

All characterization tests should pass. If any fail, the test is wrong: you're characterizing existing behavior, not desired behavior.
Characterization tests are your safety net. If they break after your implementation, you changed something you didn't intend to change.
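A characterization test the agent produces might look something like the sketch below. It assumes the confmap provider (which loads from an in-memory map) and that the test file sits in the root package; adjust the package name and import paths to whatever version you cloned.

```go
package koanf

import (
	"testing"

	"github.com/knadh/koanf/providers/confmap"
)

// Characterization test: documents what koanf does today, not what we wish it did.
func TestCharacterizationMissingKeys(t *testing.T) {
	k := New(".")

	// Load a small config from an in-memory map; confmap needs no Parser.
	err := k.Load(confmap.Provider(map[string]interface{}{
		"server.port": 8080,
		"server.host": "localhost",
	}, "."), nil)
	if err != nil {
		t.Fatalf("Load returned an unexpected error: %v", err)
	}

	// Current behavior: Get on a missing key returns nil rather than an error.
	if v := k.Get("does.not.exist"); v != nil {
		t.Errorf("expected nil for missing key, got %v", v)
	}

	// Current behavior: typed getters return zero values for missing keys.
	if s := k.String("does.not.exist"); s != "" {
		t.Errorf("expected empty string for missing key, got %q", s)
	}
}
```

The repository's own tests use testify assertions, so match that style if you want the characterization tests to blend in.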
Phase 4: Plan the implementation
With understanding and tests in place, plan the implementation.
Request a plan
Plan an implementation for configuration validation in koanf.
Requirements:
- Users can register validation rules for configuration keys
- Validation runs after Load() completes
- Validation errors are returned (not panicked)
- The API should feel natural alongside existing koanf patterns
Consider:
- Where should validation logic live?
- What interface should validators implement?
- How does this integrate with existing Load() flow?
- What existing patterns should this follow?
Provide a plan, not code. I want to review the approach before implementation.

Review the plan
Evaluate the plan against what you know:
- Does it follow the conventions you documented?
- Does it match the extension patterns you observed?
- Will it require changes to existing code or only additions?
- Does the API feel consistent with Get(), Load(), etc.?
Request modifications if needed:
I have concerns about [specific aspect].
In the existing codebase, [observed pattern] is used instead.
Revise the plan to follow that pattern.

Approve the plan
Only proceed to implementation when the plan aligns with codebase conventions. Write down the approved plan for reference.
Phase 5: Implement the feature
With plan approved, implement the validation feature.
Implementation prompt
Implement the validation feature according to our plan.
Requirements reminder:
- Register validation rules for config keys
- Validate after Load() completes
- Return structured errors
- Follow existing patterns
Create the implementation in appropriate files.
Add tests for the new functionality.
Ensure existing tests still pass.

Monitor implementation
As the agent works, verify:
- New files follow naming conventions
- Code style matches existing code
- Tests follow existing test patterns
- No changes to unrelated files
Interrupt if the agent deviates from the plan:
Stop. That approach differs from our plan.
The plan specified [X], but you're doing [Y].
Return to the planned approach.

Verify implementation
Run the full test suite:
go test ./...

Check that characterization tests still pass:
go test -v -run Characterization

If characterization tests fail, the implementation changed existing behavior unintentionally. Roll back and investigate.
Phase 6: Document the changes
Update CLAUDE.md to include the new feature:
Update CLAUDE.md to document the validation feature we added.
Include:
- What the feature does
- How to use it
- Where the code lives
- Any gotchas or limitations

This ensures future sessions have context about the validation feature.
Analysis
Record observations from each phase:
Exploration phase
| Question | Your answer |
|---|---|
| How long did AI exploration take? | |
| How long would manual exploration have taken? | |
| Did the agent's overview match what you found? | |
| What inaccuracies did you catch? | |
Documentation phase
| Question | Your answer |
|---|---|
| Did writing CLAUDE.md clarify your understanding? | |
| What did the agent catch that you missed? | |
| Would this documentation help a new developer? | |
Testing phase
| Question | Your answer |
|---|---|
| How many characterization tests did you write? | |
| Did any fail initially because the test was wrong? | |
| Did any break after implementation? | |
Implementation phase
| Question | Your answer |
|---|---|
| Did the plan need revision? | |
| Did the agent follow the plan? | |
| How many interventions did you make? | |
| Does the implementation feel native to the codebase? | |
Reflection questions
How well did the agent's initial overview match reality? What verification steps caught inaccuracies?
Did creating CLAUDE.md before implementation improve the result? Would you create documentation first on your next unfamiliar codebase?
Did the characterization tests catch any unintended changes? How would the implementation have gone without them?
Did reviewing the plan before implementation prevent rework? What would have happened if you had started coding immediately?
When did you need to correct the agent? Were there patterns in when intervention was needed?
Success criteria
- Codebase explored using Explore-Plan-Code framework
- Architecture and conventions documented
- CLAUDE.md created and validated
- Characterization tests written and passing
- Implementation plan reviewed and approved
- Validation feature implemented
- All existing tests still pass
- Characterization tests still pass
- CLAUDE.md updated with new feature
- Analysis section completed
Variations
Different repository: Run the exercise on a Python codebase instead. Try glom (https://github.com/mahmoud/glom), a nested data access library. Does the workflow differ between languages? Are characterization tests harder or easier in Python?
Larger scope: Instead of adding validation, attempt a larger feature: add a new Provider that loads configuration from a remote HTTP endpoint. This requires creating a new package following existing patterns (a starter skeleton appears after these variations). Does the Explore-Plan-Code framework scale to larger tasks?
Documentation only: Skip the implementation phase. Focus entirely on exploration and documentation. Create comprehensive CLAUDE.md, ARCHITECTURE.md, and CONVENTIONS.md files. Measure how long exploration takes when documentation is the only goal.
No documentation: Skip the CLAUDE.md creation phase. Go directly from exploration to characterization tests to implementation. Does the implementation quality differ without explicit documentation? How does the agent perform without CLAUDE.md context?
Your production codebase: Apply this workflow to an unfamiliar area of your own codebase. Create CLAUDE.md for a service you haven't worked with before. Implement a small feature using the full workflow. Compare to your usual onboarding process.
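If you attempt the larger-scope variation, remember that a new provider only has to satisfy the Provider interface. A hypothetical skeleton, modeled loosely on how the file provider is structured, might start like this; the package name, constructor, and error text are all placeholders.

```go
// Package httpremote is a hypothetical provider that fetches raw configuration
// bytes from an HTTP endpoint. Shown only as a starting point for the
// "larger scope" variation; it is not part of koanf.
package httpremote

import (
	"errors"
	"fmt"
	"io"
	"net/http"
)

// HTTPRemote reads configuration bytes from a URL.
type HTTPRemote struct {
	url string
}

// Provider returns a provider that reads from the given URL.
func Provider(url string) *HTTPRemote {
	return &HTTPRemote{url: url}
}

// ReadBytes fetches the raw configuration; a Parser decodes it afterwards.
func (h *HTTPRemote) ReadBytes() ([]byte, error) {
	resp, err := http.Get(h.url)
	if err != nil {
		return nil, fmt.Errorf("fetching %s: %w", h.url, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d from %s", resp.StatusCode, h.url)
	}
	return io.ReadAll(resp.Body)
}

// Read is unsupported: this provider always returns raw bytes for a Parser.
func (h *HTTPRemote) Read() (map[string]interface{}, error) {
	return nil, errors.New("httpremote provider does not support this method")
}
```

From there the exercise is unchanged: plan the package layout against the existing providers, add tests, and keep the characterization tests green.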
What this exercise teaches
The Explore-Plan-Code framework structures unfamiliar codebase work. Exploration builds mental models, planning validates approach, coding executes safely. Skipping exploration produces implementations that don't fit. Skipping planning produces implementations that require rework.
Writing CLAUDE.md forces synthesis. Vague understanding becomes apparent when you try to write it down. If you can't explain the architecture clearly, you don't understand it well enough to modify it safely.
Characterization tests are insurance against yourself. They catch the changes you didn't realize you were making. The test failure isn't the problem—it reveals the problem. Without characterization tests, unintended changes reach production.
The less you know about code, the more safety measures matter. Characterization tests, plan review, and incremental changes compound to reduce risk. As familiarity increases, you can relax these constraints.
The agent's overview might be 90% correct and 10% wrong. That 10% causes problems if trusted blindly. Verification is what transforms AI assistance from risky shortcut to reliable tool.
This workflow transfers to any unfamiliar codebase: legacy systems, new team assignments, open source contributions. The specific repository doesn't matter. The pattern applies: explore, document, protect, plan, implement.