Applied Intelligence
Module 3: Context Engineering

Anatomy of an effective prompt

The prompt layer in context

Previous lessons established the three-layer context hierarchy: project context provides the foundation, conversation context accumulates during a session, and prompt context delivers the immediate instruction. This lesson examines the anatomy of that third layer: the prompt itself.

Within the broader discipline of context engineering, the prompt represents the final opportunity to shape agent behavior before execution. A well-constructed prompt leverages all the context that precedes it while providing clear, actionable direction. A poorly constructed prompt squanders that context, producing results that miss the mark regardless of how well the environment was prepared.

Research confirms the critical importance of initial prompt quality: over 95% of successful code implementations occur on the first prompt attempt. This finding underscores why prompt construction deserves deliberate attention; iteration rarely rescues a fundamentally flawed initial request.

The context-instruction structure

Effective prompts follow a consistent structural pattern: context first, instruction second. This ordering aligns with how language models process information sequentially. Providing relevant background before the task ensures the model has the necessary foundation when it reaches the action directive.

The context-instruction pattern manifests at multiple scales:

At the sentence level:

Given a user authentication system using JWT tokens [context],
add rate limiting to the login endpoint [instruction].

At the paragraph level:

The checkout module handles payment processing through Stripe.
Current implementation lacks proper error handling for declined cards.
Users see a generic error message that doesn't help them understand
what went wrong or how to proceed.

Add specific error handling for common Stripe decline codes,
displaying user-friendly messages that suggest corrective actions
where appropriate.

At the full prompt level:

## Background
This is a React application using TypeScript and Zustand for state management.
The project follows a feature-based folder structure with shared components in /src/components.
All forms use react-hook-form with zod validation.

## Current State
The user profile form exists but lacks email change functionality.
Email changes require verification, which the backend supports via POST /api/verify-email.

## Task
Add email change capability to the profile form.
Include verification flow with a 6-digit code input.
Follow existing form patterns in the codebase.

The context-instruction pattern applies recursively. Each section of a complex prompt can itself follow the pattern, creating a hierarchy of context and instruction that guides the agent through multi-faceted tasks.

Components of prompt context

The context portion of a prompt serves specific functions, each contributing to agent understanding:

Situational context establishes what exists:

  • Current state of the system or file
  • Relevant constraints or dependencies
  • Technology stack and conventions in use

Goal context explains why the task matters:

  • The problem being solved
  • User needs or business requirements
  • Success criteria and expected outcomes

Reference context points to existing patterns:

  • Similar implementations in the codebase
  • Relevant documentation or specifications
  • Examples of desired output format

Not every prompt requires all three types. Simple tasks may need only situational context. Complex tasks benefit from comprehensive coverage. The key is matching context depth to task complexity, a principle explored further in the verbosity discussion below.
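
As an illustration, a short hypothetical prompt combining all three context types might read (the service and file names here are invented):

The notification service sends transactional email through SendGrid [situational].
Users currently get no confirmation after changing their password, which support
has flagged as a trust concern [goal]. Following the pattern used for the welcome
email in EmailTemplates [reference], add a password-change confirmation email
triggered after a successful update.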

Components of prompt instruction

The instruction portion tells the agent what to do. Effective instructions share common characteristics:

Explicit action verbs remove ambiguity:

  • "Add" not "consider adding"
  • "Refactor" not "maybe clean up"
  • "Remove" not "you might want to get rid of"

Bounded scope defines completion criteria:

  • "Add validation for the email field" rather than "improve the form"
  • "Handle the null case in processOrder" rather than "fix the edge cases"

Output specification clarifies deliverables:

  • "Return a TypeScript function" not "write some code"
  • "Update only the authentication middleware" not "make the necessary changes"

Vague instructions produce vague results. Research shows ambiguous task descriptions reduce code generation accuracy by 25-30%, and incomplete specifications cause a 20-25% reduction. The more precisely the instruction defines the expected outcome, the more reliably the agent delivers.

The specificity-flexibility tradeoff

Every prompt navigates a fundamental tension: specificity improves accuracy but can constrain beneficial solutions, while flexibility enables creative problem-solving but risks ambiguity and wrong assumptions.

When high specificity works

High specificity excels in well-defined scenarios:

Debugging with known symptoms:

The UserService.authenticate() method throws NullPointerException
when user.email is null. Add a null check before line 47 that returns
an AuthenticationError with message "Email is required".

Format-sensitive outputs:

Generate a JSON response with this exact structure:
{
  "status": "success" | "error",
  "data": { ... },
  "timestamp": "ISO 8601 format"
}
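
In a typed codebase, the same specification can be pinned down further. A minimal TypeScript sketch of the structure above, with the payload left generic since the prompt does not specify it, might be:

// Sketch of the required response envelope; the payload type stays generic
// because the prompt leaves it unspecified.
interface ApiResponse<T> {
  status: "success" | "error";
  data: T;
  timestamp: string; // ISO 8601
}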

Refactoring with explicit goals:

Extract the validation logic from processOrder() into a separate
validateOrder() function. Keep the same validation rules.
Return a ValidationResult object with isValid boolean and errors array.
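
A bounded instruction like this maps almost directly onto code. A minimal TypeScript sketch of the requested shape, where the Order fields and checks are placeholder assumptions standing in for the existing validation logic, might be:

interface ValidationResult {
  isValid: boolean;
  errors: string[];
}

// Extracted from processOrder(); the checks below are placeholders for
// whatever rules the existing implementation contains.
function validateOrder(order: { items: unknown[]; total: number }): ValidationResult {
  const errors: string[] = [];
  if (order.items.length === 0) errors.push("Order must contain at least one item");
  if (order.total <= 0) errors.push("Order total must be positive");
  return { isValid: errors.length === 0, errors };
}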

High specificity reduces back-and-forth refinement by up to 68% for straightforward tasks. The agent has clear success criteria and minimal room for interpretation.

When flexibility works

Flexibility proves valuable when the optimal solution isn't predetermined:

Architectural exploration:

The current caching implementation causes memory issues under load.
Suggest approaches to reduce memory footprint while maintaining
sub-100ms response times for cached queries.

Code improvement:

Review the PaymentProcessor class for potential improvements.
Focus on error handling and testability.

Problem investigation:

Users report intermittent failures in the export feature.
Investigate potential causes and propose solutions.

Flexible prompts allow agents to apply their training on vast codebases, potentially surfacing solutions the developer hadn't considered.

The right altitude

Anthropic's research identifies an optimal "altitude" for prompt specificity: specific enough to guide behavior effectively, yet flexible enough to provide strong heuristics rather than rigid rules.

The over-specification trap manifests as:

  • Brittle prompts that fail when conditions vary slightly
  • Excessive constraint that blocks superior solutions
  • Trial-and-error loops when attempting to specify everything

The under-specification trap produces:

  • Ambiguous results requiring multiple iterations
  • Wrong assumptions that waste effort
  • Code that runs but doesn't meet actual requirements

Research quantifies under-specification costs: models show 22.6% lower average accuracy when requirements go unspecified. Under-specified prompts are twice as likely to regress during model updates.

Matching specificity to task type

Task Type                     Recommended Specificity   Rationale
Bug fixes with known cause    High                      Precise context prevents wrong assumptions
New feature implementation    Medium-high               Clear requirements, flexible design
Architecture decisions        Medium                    Constraints without over-prescription
Exploratory refactoring       Medium-low                Allow agent expertise to surface
Code review                   Medium                    Set criteria without constraining findings

Task-dependent verbosity

Prompt length is not inherently good or bad; optimal verbosity depends on task characteristics.

When concise prompts excel

Simple, well-defined tasks often perform better with minimal prompts:

Add a loading spinner to the submit button while the form is processing.
Rename getUserById to findUserById across the codebase.

Research on code understanding tasks found a negative correlation between token count and performance: more words actually degraded results. For straightforward operations, concise prompts reduce noise and focus agent attention.

When detailed prompts improve outcomes

Complex, domain-specific tasks benefit from expanded context:

## Background
This financial reporting module generates quarterly statements for SEC filing.
All monetary values must use BigDecimal to avoid floating-point precision issues.
The existing codebase follows GAAP standards for revenue recognition.

## Requirements
- Calculate deferred revenue for subscription products
- Amortize setup fees over 12-month contract periods
- Handle mid-quarter contract modifications
- Generate audit trail for all calculations

## Constraints
- No rounding until final output
- All intermediate values preserved for audit
- Match existing calculation patterns in RevenueCalculator.java

## Expected Output
A DeferredRevenueReport class with methods for each calculation type,
following the existing pattern in QuarterlyReport.java

For domain-specific tasks, detailed prompts consistently outperform minimal ones. Research shows complex domains continue improving with prompts exceeding 200 words, while simpler tasks plateau around 100 words.

The verbosity spectrum

Task Complexity           Optimal Length    Content Focus
Single-line changes       20-50 words       Action and location only
Standard features         50-150 words      Context and clear instruction
Complex implementation    150-300 words     Background, requirements, constraints
Domain-specific work      300-500 words     Full context with examples

Research demonstrates diminishing returns beyond 500 words. Model comprehension drops approximately 12% for every 100 words added past this threshold. Prompts exceeding 3,000 tokens show degraded reasoning regardless of content quality. More is not always better; precision matters more than volume.

Structure over length

The format of a prompt often matters more than its word count. Studies show prompt structure can vary code translation performance by up to 40%.

Effective structural techniques:

Using clear section headers:

## Context
...

## Task
...

## Constraints
...

Employing XML tags for explicit delineation:

<background>Current authentication uses session cookies.</background>
<task>Migrate to JWT-based authentication.</task>
<constraints>Maintain backward compatibility for 30 days.</constraints>

Breaking complex requests into numbered steps:

1. Read the existing validation logic in UserValidator.java
2. Identify all validation rules currently implemented
3. Create equivalent validation using the new ValidationFramework
4. Ensure all existing tests continue to pass

The COSTAR framework provides a useful mental model: Context, Objective, Style, Tone, Audience, Response format. Not every prompt needs all elements, but considering each prevents common omissions.
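
As an illustration, a hypothetical prompt touching each COSTAR element might read (the feature and audience are invented):

Context: The admin dashboard is a React + TypeScript app used by non-technical support staff.
Objective: Add a CSV export button to the user activity table.
Style: Match the existing button components and keep the change minimal.
Tone: Keep code comments brief and factual.
Audience: Support staff consume the export, so column headers must be plain English.
Response format: A focused diff plus a one-paragraph summary of the change.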

The quality principle

Effective prompts optimize for signal density: maximum useful information with minimum noise. This principle connects to the broader context engineering insight that smaller, high-signal token sets outperform verbose, low-signal alternatives.

High signal density:

Add null checking to processPayment(). Return PaymentError
with code INVALID_INPUT when amount or currency is null.
Follow the existing pattern in processRefund().

Low signal density:

So I was thinking we should probably add some kind of null
checking to the processPayment function because sometimes
it crashes when we don't have all the data. Maybe you could
look at how we did it in other places and do something similar?
It would be nice to return a proper error instead of crashing.

Both prompts request the same change. The first provides clear, actionable direction. The second buries the request in hedging language and vague suggestions.
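
The high-signal version is concrete enough to sketch the expected result. Assuming a TypeScript codebase and hypothetical request and error types (only the names from the prompt are taken as given), the change might look like:

interface PaymentError {
  code: "INVALID_INPUT";
  message: string;
}

// Hypothetical request shape; the real fields live in the existing codebase.
interface PaymentRequest {
  amount?: number;
  currency?: string;
}

// Early-return guard as described in the prompt, assumed to mirror the
// pattern in processRefund().
function processPayment(request: PaymentRequest): PaymentError | { ok: true } {
  if (request.amount == null || request.currency == null) {
    return { code: "INVALID_INPUT", message: "Amount and currency are required" };
  }
  // ... existing payment logic continues here
  return { ok: true };
}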

Connecting to context hierarchy

Prompt construction doesn't happen in isolation. The most effective prompts leverage the context hierarchy established in earlier lessons:

Project context reduces prompt burden. When CLAUDE.md specifies coding conventions, error handling patterns, and architectural decisions, prompts need not repeat this information.
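
A brief, hypothetical CLAUDE.md excerpt along these lines (echoing the React example earlier; the specific conventions are invented) shows the kind of information prompts no longer need to carry:

## Conventions
- React + TypeScript; feature-based folders with shared components in /src/components
- All forms use react-hook-form with zod validation
- Prefer typed error objects over thrown exceptions at module boundaries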

Conversation context enables brevity. Within an ongoing session where the agent has read relevant files and discussed requirements, subsequent prompts can reference shared understanding: "Now implement the validation we discussed."

Prompt context fills remaining gaps. The immediate instruction provides what project and conversation context don't supply: the specific task, any unique constraints, and success criteria.

This layered approach produces prompts that are simultaneously concise and complete. The prompt itself stays focused on the immediate task while relying on surrounding context for background and conventions.

Summary

Effective prompts follow the context-instruction pattern, providing relevant background before actionable direction. The specificity-flexibility tradeoff requires judgment: match precision to task clarity, erring toward specificity for well-defined tasks and flexibility for exploratory work. Verbosity should scale with complexity, recognizing that simple tasks often perform better with concise prompts while complex domains benefit from detailed context. Structure matters as much as length: well-organized prompts outperform verbose, unstructured alternatives regardless of word count.

The next lesson explores specific prompt patterns that consistently produce good results across common development scenarios.