How Agents Navigate and Search Codebases

Agents cannot see your codebase the way developers do. They navigate through specific tools that read files, search for patterns, and traverse directories, each with distinct strengths and constraints.

Claude Code provides three primary tools for codebase exploration, plus directory listing through Bash:

Tool	Description
Glob	Fast file pattern matching. Returns paths sorted by modification time. Supports wildcards, character classes, and brace expansion.
Grep	Content search using ripgrep. Supports regex, file type filtering, and context display around matches.
Read	File retrieval with a 2,000-line default and 25,000-token maximum. Handles code, images, PDFs, and Jupyter notebooks.
Bash (ls)	Directory listing through controlled commands. The permission system can restrict which directories agents may access.

Example patterns

Glob patterns:

**/*.ts           # All TypeScript files
src/**/*.test.js  # Test files in src
*.{json,yaml}     # Config files in current directory
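
To make the Glob behaviour concrete, here is a minimal Python sketch of pattern matching with results ordered by modification time. It is not Claude Code's implementation: the function name glob_by_mtime is hypothetical, the newest-first ordering is an assumption (the text only says results are sorted by modification time), and Python's pathlib does not handle brace expansion such as *.{json,yaml}.

```python
from pathlib import Path


def glob_by_mtime(root: str, pattern: str) -> list[str]:
    """Return files under root that match pattern, newest first.

    Hypothetical helper for illustration only. Newest-first ordering is an
    assumption; brace expansion is not supported by pathlib.
    """
    base = Path(root)
    # "**" in pathlib matches the starting directory and all subdirectories.
    matches = [p for p in base.glob(pattern) if p.is_file()]
    matches.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.relative_to(base).as_posix() for p in matches]
```

Calling glob_by_mtime(".", "**/*.ts") would list every TypeScript file below the current directory, with recently edited files first, which is often where an agent wants to look first.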

Grep patterns:

pattern: "function authenticate"    # Find function definitions
pattern: "TODO|FIXME", -n: true    # Find markers with line numbers
pattern: "handleError", type: "ts"  # Scope to TypeScript files
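
The Grep options above (regex pattern, line numbers, file type filter) can be approximated in a few lines of Python. This is only a sketch of the idea, not ripgrep or the Grep tool itself: the function name grep and the suffix parameter are hypothetical stand-ins for the tool's pattern and type options.

```python
import re
from pathlib import Path
from typing import Optional


def grep(root: str, pattern: str, suffix: Optional[str] = None) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each matching line.

    Hypothetical helper: `suffix` (for example ".ts") plays the role of a
    file type filter.
    """
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if suffix is not None and path.suffix != suffix:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

For example, grep(".", r"TODO|FIXME") corresponds to the second pattern above, and grep(".", "handleError", suffix=".ts") to the third.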

Claude Code uses these dedicated tools rather than raw shell commands. Its internal configuration includes: "ALWAYS use Grep for search tasks; NEVER invoke grep or rg as a Bash command."

The Explore sub-agent

For complex codebase questions, Claude Code deploys a specialized Explore sub-agent. This read-only agent uses Glob, Grep, Read, and limited Bash commands to investigate without making changes.

The Explore agent operates with fresh context: it doesn't inherit the main conversation's history. After completing its investigation, it returns a summary to the main agent. This architecture prevents context pollution, because the main agent receives distilled findings rather than raw search results.
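
The isolation described above can be modelled in a few lines. The sketch below is purely illustrative and assumes nothing about Claude Code's internals: Context and run_explore are hypothetical names, and the "investigation" is reduced to summarizing a list of search hits that is passed in.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A conversation context, modelled as an ordered list of messages."""
    messages: list[str] = field(default_factory=list)


def run_explore(question: str, search_results: list[str], max_findings: int = 3) -> str:
    """Investigate in a fresh context and return only a short summary.

    Hypothetical model: the sub-context starts empty and is discarded on
    return, so raw results never reach the caller's context.
    """
    sub = Context()  # fresh: nothing is copied from the main conversation
    sub.messages.append(f"task: {question}")
    sub.messages.extend(search_results)  # raw hits stay inside the sub-context
    findings = search_results[:max_findings]
    return f"{question} -> {len(search_results)} hits; top: " + ", ".join(findings)
```

If the main context appends only the returned string, it grows by one message no matter how many raw hits the exploration produced.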

Developers can specify thoroughness levels:

Level	Use case
Quick	Straightforward file lookups
Medium	Moderate exploration
Very thorough	Comprehensive analysis across multiple locations

Codex's sandbox approach

Codex takes a fundamentally different approach, operating within OS-level sandboxes rather than application-level tool abstractions.

Mode	Read Access	Write Access	Network
read-only	All files	None	Blocked
workspace-write	All files	Current directory	Blocked
danger-full-access	All files	All directories	Allowed

These constraints are enforced at the operating system level: Seatbelt profiles on macOS, Landlock and seccomp on Linux.

The .git/ and .codex/ directories remain read-only even in workspace-write mode, which explains why git commit operations require explicit approval.
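
The mode table and the protected-directory rule can be expressed as a small decision function. The sketch below only models the policy as this section describes it. Real enforcement happens in the OS sandbox, not in application code, and the names MODES, PROTECTED, can_write, and network_allowed are hypothetical. It also assumes the protected directories sit at the top level of the working directory.

```python
from pathlib import PurePosixPath

# Mode table from the text: mode -> (write scope, network allowed).
MODES = {
    "read-only": ("none", False),
    "workspace-write": ("cwd", False),
    "danger-full-access": ("all", True),
}

# Directories that stay read-only inside the workspace (per the text).
PROTECTED = {".git", ".codex"}


def can_write(mode: str, target: str, cwd: str) -> bool:
    """Model whether a write to target would be permitted in the given mode."""
    scope, _ = MODES[mode]
    if scope == "none":
        return False
    if scope == "all":
        return True
    path, base = PurePosixPath(target), PurePosixPath(cwd)
    if ".." in path.parts:
        return False  # the sketch does not normalize paths, so refuse these
    try:
        rel = path.relative_to(base)
    except ValueError:
        return False  # outside the working directory
    return not (rel.parts and rel.parts[0] in PROTECTED)


def network_allowed(mode: str) -> bool:
    """Model whether the given mode permits network access."""
    return MODES[mode][1]
```

Under this model, can_write("workspace-write", "/repo/.git/index", "/repo") is False, which matches the behaviour that makes a commit need approval.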

Agentic exploration patterns

Both tools employ similar high-level strategies:

Pattern	Description
Just-in-time retrieval	Load files on demand rather than pre-indexing. Keeps context focused and current, trading speed for accuracy.
Iterative refinement	Start broad, then narrow progressively. Each search informs the next as naming conventions and patterns emerge.
Progressive disclosure	Reveal information in layers: CLAUDE.md first, directory structure second, file contents last and only when needed.
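
As a rough illustration of how these strategies combine, the sketch below narrows by file name first, then by content, and loads only the few files that survive both steps. It is an assumption-laden toy, not how either tool is implemented: the function name explore and the max_reads parameter are hypothetical.

```python
import re
from pathlib import Path


def explore(root: str, name_glob: str, content_regex: str, max_reads: int = 2) -> dict[str, str]:
    """Return {relative path: content} for the few files worth loading.

    Hypothetical helper combining iterative refinement with just-in-time
    retrieval.
    """
    base = Path(root)
    regex = re.compile(content_regex)

    # Step 1 (broad): pick candidates by name only.
    candidates = sorted(p for p in base.glob(name_glob) if p.is_file())

    # Step 2 (narrow): keep candidates whose text matches the pattern.
    matching = [
        p for p in candidates
        if regex.search(p.read_text(encoding="utf-8", errors="ignore"))
    ]

    # Step 3 (just in time): load only the first few survivors into context.
    return {
        p.relative_to(base).as_posix(): p.read_text(encoding="utf-8", errors="ignore")
        for p in matching[:max_reads]
    }
```

A call such as explore(".", "**/*.ts", "authenticate") would return at most two files, so the rest of the repository never enters the context.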

Practical implications

File naming matters. Agents infer meaning from names like AuthenticationService.ts or handlePaymentError.js before reading content. Explicit, descriptive naming reduces agent confusion.

Project organization affects efficiency. Agents navigating flat directories with hundreds of files spend more tokens than those working with logical hierarchies. Standard conventions help agents locate code faster.

Search-friendly code helps. Consistent function naming, clear error messages, and distinctive identifiers make grep-based exploration more effective.

These aren't style preferences. They directly impact how effectively agents can understand and modify your code.