Why Your Agents Are Hallucinating: The Case for Context Engineering
The most common complaint we hear from engineering leaders piloting AI agents is: 'It works for small scripts, but it gets lost in our actual codebase.' The agent hallucinates imports, invents function signatures that don't exist, or simply gives up and asks the user to copy-paste code.
The Problem: Fragmented Context
Human engineers succeed because they build a mental map of the system over months. They know that changing the `User` model in `schema.prisma` requires a migration, an update to the tRPC router, and a frontend type regeneration. An agent dropping into the repo for the first time has none of this context.
Naive RAG (Retrieval-Augmented Generation) often fails here. Splitting code into fixed 500-token chunks destroys the semantic linkage within and between files. If an agent retrieves the definition of a function but not the interface it implements, it cannot reason correctly about polymorphism.
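To make the failure mode concrete, here is a minimal sketch. It uses fixed line-count windows as a stand-in for 500-token chunks. The `Shape`/`Circle` module and the `chunk_by_lines` helper are hypothetical names invented for this illustration, not part of any real retrieval library.

```python
# Hypothetical module used only for illustration.
SOURCE = '''\
class Shape:
    def area(self) -> float:
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return 3.14159 * self.radius ** 2
'''


def chunk_by_lines(text: str, lines_per_chunk: int) -> list[str]:
    """Split text into consecutive windows of a fixed number of lines."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]


chunks = chunk_by_lines(SOURCE, 4)

# The chunk a retriever would return for a query about Circle.area.
hit = next(c for c in chunks if "return 3.14159" in c)

# Check whether the retrieved chunk still names its class and base class.
has_class_header = "class Circle(Shape)" in hit
print(has_class_header)  # False
```

The retrieved chunk contains the body of `area` but not the `class Circle(Shape)` header, so nothing in it tells the agent which interface the method implements.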
The Solution: Unified Context Architecture
To fix this, we need to treat context as an engineering discipline. This involves three layers:
- The AGENTS.md Standard: A human-written, machine-readable map of the repository's high-level architecture, conventions, and 'gotchas'.
- Tree-sitter Based Indexing: Instead of chunking by fixed line or token counts, index by AST nodes (Classes, Functions, Interfaces) to preserve semantic integrity.
- Active Context Management: Use techniques such as compaction, which summarizes earlier turns in the conversation, to keep the context window focused on the current task.
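The second and third layers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: Python's built-in `ast` module stands in for Tree-sitter, and `index_definitions`, `compact`, and the `summarize` callback are hypothetical names. In a real agent, `summarize` would likely be a model call.

```python
import ast
from typing import Callable

# Hypothetical module used only for illustration.
SOURCE = '''\
class Shape:
    def area(self) -> float:
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return 3.14159 * self.radius ** 2
'''

DEFINITIONS = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def index_definitions(source: str) -> dict[str, str]:
    """Map each qualified class/function name to its complete source text."""
    index: dict[str, str] = {}

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, DEFINITIONS):
                name = f"{prefix}{child.name}"
                # One chunk per AST node, so a definition is never cut in half.
                index[name] = ast.get_source_segment(source, child) or ""
                visit(child, f"{name}.")
            else:
                visit(child, prefix)

    visit(ast.parse(source), "")
    return index


def compact(
    history: list[str],
    keep_last: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace all but the last `keep_last` turns with one summary turn."""
    split = max(len(history) - keep_last, 0)
    older, recent = history[:split], history[split:]
    if not older:
        return list(history)
    return [summarize(older)] + recent


index = index_definitions(SOURCE)
print(sorted(index))

# Placeholder summarizer; an assumption standing in for a model call.
turns = ["t1", "t2", "t3", "t4"]
compacted = compact(turns, 2, lambda old: f"[summary of {len(old)} turns]")
print(compacted)
```

Here the `Circle` entry keeps its `class Circle(Shape):` header together with its methods, which is the linkage that fixed-size chunking loses. The compaction step keeps the most recent turns verbatim and collapses everything older into a single entry.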
Implementing AGENTS.md
Place an `AGENTS.md` file in the root of your repository. It should act as the 'onboarding buddy' for your AI. Here is a production example:
# Agent Guidelines for @/src
## Tech Stack
- Frontend: Next.js 15 (App Router), Tailwind, Framer Motion
- Backend: Hono.js on Cloudflare Workers
- Database: Drizzle ORM + Neon Postgres
## Conventions
1. **Never** create new UI components without checking @/components/ui first.
2. All database mutations must happen via the `mutations/` folder.
3. Use `zod` for all runtime validation.
## Known Issues
- The auth middleware is currently brittle around edge functions. Mock it out for local tests.

When we implemented this simple file at a Fortune 500 client, agent success rates on multi-file refactors jumped from 15% to 65% overnight. Context is not just helpful; it is the prerequisite for autonomy.