Applied Intelligence
Module 5: Output Validation and Iteration

The Fix Loop of Death

The circular failure pattern

Dependency confabulation stops code before execution. The fix loop of death stops developers during execution, burning time in circles while the codebase deteriorates.

The pattern becomes unmistakable once you've lived through it: an agent attempts to fix a bug. The fix introduces a new problem. The agent attempts to fix that problem. The second fix breaks something else. Each iteration leaves the code worse than before, yet the agent continues with undiminished confidence.

A developer on the Cursor forums documented the experience: "The AI removed essential logging, then broke data handling while attempting restoration, then corrupted core logic. Each intervention worsened conditions. This cycle occurred every 5-10 minutes."

The fix loop doesn't announce itself as obvious failure. Each individual change appears reasonable. Each commit message describes a sensible-sounding correction. Only the accumulating damage reveals that the session has derailed.

Why agents can't escape their own loops

The fix loop persists because agents lack the meta-awareness to recognize it. Several architectural limitations trap them.

No memory of failed approaches. Each reasoning step starts fresh. The agent doesn't recall that it already tried adding a lock to fix the race condition, that the lock caused a deadlock, or that removing the lock caused the original race condition to return. Without explicit memory of what didn't work, the agent cycles through the same failed solutions.

Single-approach fixation. When an agent latches onto a pattern from training data, it proposes the same fix repeatedly with minor variations. One developer observed an agent "insisting on fixing the same snippet repeatedly, ignoring the possibility that the real culprit lurked in a different function or file." The agent saw a pattern it recognized and couldn't consider alternatives.

Local optimization without global awareness. The agent optimizes for the immediate error message without understanding how the fix propagates through the system. Adding a null check here silences this exception. That the null check masks a deeper initialization bug, which will surface as data corruption three modules away, lies outside the agent's reasoning context.
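A minimal sketch of that pattern, with hypothetical names (loadProfile, profileCache): the agent's null check makes the immediate exception disappear, while the actual bug, a cache that is never initialized, survives and silently breaks every caller downstream.

```typescript
// Hypothetical example: the "fix" silences the symptom, not the cause.
interface Profile { id: string; name: string }

let profileCache: Map<string, Profile> | null = null; // real bug: never initialized

function loadProfile(id: string): Profile | null {
  // Agent's fix: null check added to stop "Cannot read properties of null".
  if (profileCache === null) {
    return null; // the exception is gone, but every lookup now silently fails
  }
  return profileCache.get(id) ?? null;
}

// The real fix would initialize the cache (or fail loudly at startup), e.g.:
// profileCache = new Map();
```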

Research on agent task completion confirms the scale of the problem: autonomous agent systems achieve roughly 50% task completion rates across benchmarks. The other half fails, often through exactly these loop patterns.

Recognizing a derailing session

The fix loop announces itself through behavioral signals that precede code collapse. Catching these signals early prevents wasted iterations.

Repetitive tool execution is the clearest sign. The agent runs the same command or makes the same edit multiple times. Cursor users report agents "stuck executing the same terminal command repeatedly" with "no recognition that the command has already been executed." When npm test runs three times in succession with the same failing output, the agent isn't debugging. It's looping.
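You can make this signal mechanical if your tooling lets you export the session's command history. A rough sketch, assuming the commands are available as an array of strings; the threshold of three consecutive repeats is a heuristic, not a documented limit of any tool:

```typescript
// Heuristic loop detector: flags a command repeated N times in a row.
function detectRepeatedCommand(commands: string[], threshold = 3): string | null {
  let run = 1;
  for (let i = 1; i < commands.length; i++) {
    run = commands[i] === commands[i - 1] ? run + 1 : 1;
    if (run >= threshold) return commands[i]; // likely looping on this command
  }
  return null;
}

// Example: three identical test runs in a row is a loop signal.
const looping = detectRepeatedCommand(["npm test", "npm test", "npm test"]);
if (looping) console.warn(`Possible fix loop: "${looping}" repeated`);
```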

Watch for the same approach with minor variations. Each fix attempt looks nearly identical to the last, with small tweaks to variable names, slightly different error handling, or reorganized logic that addresses the same symptom the same way. The agent generates try { ... } catch (e) { return null } variations without investigating why exceptions occur.

Explanations that repeat or contradict signal trouble. The agent describes its reasoning in increasingly circular terms. Or worse, explanations from one turn contradict the previous turn. "We need to add validation here" followed by "This validation is causing the issue, removing it" followed by "The lack of validation is the problem, adding it back." That whiplash reasoning means the agent has lost the thread.

Increasing response latency matters too. As context fills with failed attempts, agent performance degrades. Responses that took seconds start taking minutes. The agent processes more context while producing less useful output.

Conversation history degradation is perhaps the most insidious. Parts of earlier conversation disappear from the agent's responses. Questions get asked that were already answered. Constraints mentioned at session start get violated. The agent's effective memory shortens as the context window fills with noise.

The three-attempt threshold

Empirical guidance converges on a consistent number: three.

If the same bug persists after three fix attempts, the agent will not resolve it automatically. Continuing to press "fix" or request corrections yields circular results. Each additional attempt pollutes context further, making eventual resolution harder.

The three-attempt rule doesn't mean agents fail after three tries on every bug. Simple issues like syntax errors, typos, and missing imports resolve in one or two iterations. The rule identifies when a bug has exceeded the agent's autonomous resolution capacity.

When the threshold is reached, the response changes fundamentally. Stop requesting fixes. Switch to diagnostic mode. Treat the agent as a debugging partner rather than an autonomous fixer.

# After three failed fix attempts

Instead of: "Fix this error"
Try: "Explain what this error means and list three possible causes"

Instead of: "The test is still failing, try again"
Try: "What assumptions is this code making about the input data?"

Instead of: "Keep trying until it works"
Try: "Let's stop. What have we tried so far, and what hasn't worked?"

Diagnostic mode uses the agent's pattern recognition for analysis rather than generation. The agent may not fix the bug, but it can often identify what the bug is not, narrowing the search space for human investigation.
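If you drive the agent through your own harness, the threshold is easy to enforce mechanically. A sketch under the assumption that you track attempts per bug yourself; the prompt strings are illustrative, not from any particular tool:

```typescript
// Track fix attempts per bug and switch prompt style after the third failure.
const attempts = new Map<string, number>();

function nextPrompt(bugId: string, errorMessage: string): string {
  const count = (attempts.get(bugId) ?? 0) + 1;
  attempts.set(bugId, count);

  if (count <= 3) {
    return `Fix this error:\n${errorMessage}`;
  }
  // Past the threshold: stop asking for fixes, ask for analysis instead.
  return `Explain what this error means and list three possible causes:\n${errorMessage}`;
}
```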

Context loss as a loop accelerator

Long sessions amplify loop behavior through context degradation. The effective context window, where the model performs reliably, is often much smaller than advertised token limits.

Research quantifies the degradation. In needle-in-haystack tests at 32,000 tokens:

| Model | Short Context Accuracy | 32K Token Accuracy |
| --- | --- | --- |
| GPT-4o | 99% | 70% |
| Claude 3.5 Sonnet | 88% | 30% |
| Gemini 2.5 Flash | 94% | 48% |

Accuracy drops 20-60 percentage points as context fills, with no warning to the developer. The agent doesn't announce "my performance has degraded." It continues responding with the same confident tone while producing worse output.

Practical thresholds appear lower than benchmarks suggest. Developer reports indicate performance drops after 50-60% context consumption, not at 100%. Sessions lasting 8-12 exchanges on complex tasks often exhibit degradation symptoms.

The fix loop exploits this degradation. Each failed fix adds tokens to context. Error messages, stack traces, and attempted corrections accumulate. By the tenth fix attempt, the agent may have forgotten the original problem statement while drowning in the noise of failed solutions.
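A back-of-the-envelope way to apply the 50-60% guideline: estimate tokens from character count (roughly four characters per token for English text, an approximation) and flag the session once the running total crosses the threshold. The window size and ratio below are assumptions to tune per model, not measured values.

```typescript
// Rough context budget tracker. Token counts are estimated, not exact.
const CONTEXT_WINDOW_TOKENS = 128_000; // assumed model limit; adjust per model
const RESET_THRESHOLD = 0.6;           // degradation is reported well before 100%

let usedTokens = 0;

function recordExchange(prompt: string, response: string): void {
  usedTokens += Math.ceil((prompt.length + response.length) / 4);
  if (usedTokens / CONTEXT_WINDOW_TOKENS > RESET_THRESHOLD) {
    console.warn("Context likely degraded: summarize and start a fresh session.");
  }
}
```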

Breaking the loop

When a fix loop is detected, several interventions restore progress.

Start a fresh session. The most reliable escape is a new conversation with clean context. Copy the current error and minimal reproduction steps. Describe the problem without the history of failed attempts. The agent approaches the issue without the context pollution that trapped the previous session.

This feels wasteful, abandoning a long conversation with accumulated context. In practice, a fresh start with a clear problem statement often resolves in minutes what a polluted session couldn't resolve in hours.

Summarize and reset. If session continuity matters (established conventions, accumulated decisions), create a summary before resetting. Document what the agent understood correctly. Note what approaches failed. Feed this summary to the new session as structured context rather than conversational history.

## Context for new session

### What we're building
User authentication with JWT tokens for the Express API

### What works
- Token generation (auth/token.js)
- Middleware attachment (app.js lines 24-30)

### Current problem
Token validation fails with "invalid signature" error

### Approaches that didn't work
- Regenerating secrets (same error)
- Changing algorithm from RS256 to HS256 (broke generation)
- Adding debug logging (logs show correct secret)

### Likely cause
Secret encoding mismatch between generation and validation

This compressed context gives the fresh session what it needs without the noise that trapped the previous one.
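The "likely cause" in that summary is also easy to reproduce in isolation before handing it to a fresh session. A minimal sketch using the jsonwebtoken package, assuming the mismatch is that generation signs with the raw secret string while validation first decodes it from base64; the names and the exact mismatch are illustrative:

```typescript
import jwt from "jsonwebtoken";

const rawSecret = "s3cr3t-value";

// Generation side signs with the raw string...
const token = jwt.sign({ sub: "user-123" }, rawSecret, { algorithm: "HS256" });

// ...but the validation side decodes the "same" secret from base64 first,
// so the HMAC keys differ and verification throws "invalid signature".
const decodedSecret = Buffer.from(rawSecret, "base64");

try {
  jwt.verify(token, decodedSecret);
} catch (err) {
  console.error((err as Error).message); // invalid signature
}

// Fix: use the identical byte sequence on both sides.
jwt.verify(token, rawSecret); // succeeds
```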

Narrow the scope. Large tasks trigger large loops. When an agent attempts to fix "the authentication system," it operates across too many files with too many interdependencies. Failures compound as changes propagate.

Reduce scope to a single function or file. "Fix the token validation in auth/validate.js" constrains the search space. The agent can't break authentication across the codebase if it's only permitted to examine one file.

Provide hypothesis, not instruction. Instead of directing the agent to fix the bug, provide a hypothesis about what the bug might be. "I suspect the issue is timezone handling in the expiration check. Examine the exp claim parsing."

Hypothesis-driven requests guide the agent toward specific investigation rather than open-ended flailing. If the hypothesis is wrong, the agent's analysis often reveals why, pointing toward the actual cause.

When confident code is confidently wrong

The fix loop presents obvious failure through repetition. A subtler trap exists: the agent produces code that looks correct, tests that pass, and explanations that satisfy, while introducing latent bugs.

Research on advanced models confirms the risk: "More advanced LLMs are more confident, and thus more likely to confabulate when they don't know the answer." Less experienced developers are "especially likely to be misled by the AI tool's confidence."

Documented cases include an agent that "confabulated entire CRUD logic, fabricated users, and falsified internal test reports," all with clean syntax and no runtime errors. Another case involved a CLI assistant that "issued confabulated move commands, resulting in deletion of real user files" because the tool believed directories existed where none did.

Clean syntax provides no guarantee of correctness. The agent optimizes for plausibility, not truth. Tests pass because the agent wrote tests that match its own flawed implementation. Explanations satisfy because the agent is articulate about its confabulations.

Confidence calibration develops through experience. Early ASD practitioners often accept agent output too readily, then get burned by bugs that seemed correct. Mature practitioners maintain skepticism regardless of how confident the agent appears. The METR study finding that developers predicted 24% time savings but actually worked 19% slower captures the calibration gap between expectation and reality.
