Incremental Modernization with AI
The modernization problem
Legacy systems accumulate. Organizations rarely retire them because the systems work. They process transactions, enforce business rules, and run the operations that generate revenue. Replacing them is risky. Not replacing them is also risky.
The scale of legacy code still running in production defies intuition. Over 220 billion lines of COBOL remain in active use. COBOL processes an estimated $3 trillion in daily transactions globally. Banks, insurers, airlines, and government agencies run on systems written decades ago by developers who have since retired.
Traditional modernization projects often fail. Forrester and Rocket Software research from September 2024 found that 90% of first-attempt rewrites fail, and complete replatforming projects fail at rates exceeding 70%. The 2018 TSB Bank migration affected 5.2 million customers, generated over 225,000 complaints, and cost the bank more than $500 million in remediation and fines.
AI changes the economics. Not by making big-bang rewrites safe—they remain dangerous—but by making incremental modernization feasible at scale. What once required dedicated teams working for years can now happen in smaller chunks, distributed across normal development work.
AI-accelerated timelines
Multiple organizations report timeline reductions with AI-assisted modernization. The improvements are real, though they come with caveats.
Fujitsu's research distinguishes between generative AI and agentic AI approaches. Generative AI alone reduces modernization timelines by approximately 20%. Agentic AI, which plans and executes multi-step transformations autonomously, cuts timelines by up to 50%. Toyota Systems and Fujitsu demonstrated this in October 2024, achieving a 50% reduction in core system update time.
McKinsey's LegacyX platform reports similar acceleration. Organizations using the platform see 40-50% faster completion times compared to traditional approaches. A case study with a top 15 global insurer showed greater than 50% improvement in code modernization efficiency.
AWS Transform, launched in May 2025, has processed over 1.1 billion lines of mainframe code. The service saved an estimated 810,000 hours of manual effort across its customers. Thomson Reuters reported modernizing 1.5 million lines of code per month, reducing transformation time from months to a single two-week sprint. QAD reduced their modernization timeline from two weeks to three days while achieving 60-70% productivity gains.
These numbers represent achievable outcomes under favorable conditions, not guaranteed results. The organizations reporting these improvements had invested in preparation: clear documentation, comprehensive testing, and well-defined scope. Projects without these prerequisites will not see the same acceleration.
Timeline acceleration depends on existing code quality. Well-structured legacy code with clear separation of concerns modernizes faster than tangled systems with implicit dependencies. AI amplifies the difference between good and poor architecture.
Continuous debt management
Technical debt accumulates faster with AI assistance. GitClear's 2025 analysis of 211 million lines of code found that code duplication rose from 8.3% to 12.3% of changed lines between 2020 and 2024. Refactored and moved code dropped from 25% of changes to less than 10%. Code churn—code added and then quickly modified or deleted—is projected to reach 7% by 2025.
Avoiding AI tools is not the answer. Shifting from episodic cleanups to continuous debt management is.
Episodic cleanup treats technical debt as a separate concern. Teams accumulate debt during feature work, then schedule dedicated "debt sprints" or "cleanup weeks" to address it. This creates problems: debt compounds between cleanups, cleanup work lacks urgency compared to features, knowledge of why code was written that way fades over time, and large cleanup efforts disrupt normal development.
Continuous management integrates debt reduction into regular work. Every sprint includes debt work. Every modification leaves the code slightly better than before. Agents handle the mechanical aspects while developers focus on decisions.
A reasonable allocation is 15-20% of each sprint dedicated to technical debt. Some teams use a "pit stop" approach: after two feature sprints, run one sprint focused on refactoring. Organizations leading their industries dedicate 15% of their IT budgets specifically to technical debt reduction.
AI makes continuous management practical. Tasks that once required dedicated engineering time—updating deprecated APIs, removing dead code, improving test coverage—can be delegated to agents as background work. The agent handles the mechanical transformation while the developer reviews and approves. A typical delegation looks like this:
Identify uses of the deprecated fetch() wrapper in this module.
For each use:
1. Show me the current pattern
2. Propose the migration to the new http client
3. Generate updated tests
Do not make changes yet. Present the migration plan for review.

This prompt structure lets you batch debt reduction. The agent does the tedious work of finding and analyzing deprecated patterns. You review the proposed changes and approve or refine. Debt reduction happens incrementally without dedicated cleanup sprints.
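The fetch() wrapper in the prompt is illustrative; the same pattern applies in any ecosystem. In a Java codebase, for example, the agent's proposed migration might replace a deprecated wrapper around HttpURLConnection with java.net.http.HttpClient, standard since Java 11. A minimal sketch, with LegacyHttp and OrderApi as hypothetical names:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OrderApi {
    // Before (the deprecated wrapper the agent found):
    //   String body = LegacyHttp.get("https://api.example.com/orders/" + id);

    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // After: the standard client, with the request made explicit
    public static String fetchOrder(String id) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/orders/" + id))
                .GET()
                .build();
        HttpResponse<String> response =
                CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}

Each change of this size takes minutes to review, which is what makes batching dozens of them practical.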
Large-scale refactoring patterns
Individual function-level refactoring—the sprout and wrap techniques from the previous section—handles local improvements. Large-scale modernization requires coordinating changes across many files, modules, or services.
The strangler fig pattern remains the safest approach for major modernization. New functionality grows around the legacy system. Traffic gradually shifts from old to new. Eventually the legacy code can be removed entirely. AI accelerates this pattern by handling the migration mechanics:
We are strangling the legacy OrderService.
The new OrderProcessor is ready for order creation.
Task 1: Identify all call sites that create orders through OrderService.
Task 2: For each call site, determine if it can safely use OrderProcessor instead.
Task 3: Generate feature flags to control which service handles each call site.
Produce a migration plan, not code changes. Show me the analysis first.

Automated refactoring platforms like Moderne and OpenRewrite handle transformations at scale. These tools apply recipes—predefined transformation patterns—across thousands of files simultaneously. Migrating from JUnit 4 to JUnit 5, updating Spring Boot versions, and converting Java 8 streams to newer syntax can happen across an entire codebase in hours rather than months.
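Recipes handle the bulk edits, but one decision stays with the team: how traffic actually shifts from old to new. The feature flags from Task 3 usually live in a thin routing facade, so each call site can move independently and roll back instantly. A minimal sketch; the interfaces and flag names are hypothetical:

// Call sites depend on this facade instead of LegacyOrderService directly.
interface LegacyOrderService { String createOrder(String request); }
interface OrderProcessor { String create(String request); }
interface FeatureFlags { boolean isEnabled(String key); }

class OrderFacade {
    private final LegacyOrderService legacy;
    private final OrderProcessor modern;
    private final FeatureFlags flags;

    OrderFacade(LegacyOrderService legacy, OrderProcessor modern, FeatureFlags flags) {
        this.legacy = legacy;
        this.modern = modern;
        this.flags = flags;
    }

    String createOrder(String request, String callSite) {
        // One flag per call site: the migration proceeds site by site,
        // and flipping a flag back is the rollback plan.
        if (flags.isEnabled("orders.use-processor." + callSite)) {
            return modern.create(request);
        }
        return legacy.createOrder(request);
    }
}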
AI agents complement platforms like Moderne and OpenRewrite. The platform handles mechanical transformations that follow defined rules; agents handle the edge cases that require context-aware decisions:
The OpenRewrite migration updated 94% of our JUnit tests.
Review the 47 tests that failed the automated migration.
For each, determine:
1. Why the automated migration failed
2. What manual intervention is needed
3. Whether the test itself is obsolete and should be deleted
Present findings before making changes.

Reduction in cyclomatic complexity is one measurable indicator of refactoring effectiveness. Research comparing LLM-based refactoring against human developers found that StarCoder2 achieved a 17.4% cyclomatic complexity reduction compared to 14.6% for human developers. A major banking institution reported a 40% complexity reduction while improving processing speed by 25% through AI-assisted refactoring.
The pattern for complexity reduction:
- Identify high-complexity functions through static analysis
- Generate characterization tests for the target function
- Direct the agent to decompose into smaller functions
- Verify tests still pass
- Review the refactored code for correctness
Each step is independently reviewable. If decomposition introduces bugs, the characterization tests catch them before merge.
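What a characterization test looks like in practice: it asserts what the code does today, not what a specification says it should do. A minimal JUnit 5 sketch; LegacyPricing and its pinned values are hypothetical:

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class LegacyPricingCharacterizationTest {

    // Hypothetical legacy code under test. The expected values below were
    // captured by running the current implementation, not derived from
    // requirements: the tests pin existing behavior.
    static class LegacyPricing {
        double totalFor(int qty, String code) {
            double total = qty * 1.00;
            if ("BULK".equals(code) && qty >= 100) total *= 0.873;
            return total;
        }
    }

    @Test
    void bulkDiscountMatchesCurrentBehavior() {
        assertEquals(87.30, new LegacyPricing().totalFor(100, "BULK"), 0.001);
    }

    @Test
    void unknownCodeFallsBackToListPrice() {
        // Surprising, perhaps, but this is what production does today.
        assertEquals(100.00, new LegacyPricing().totalFor(100, "???"), 0.001);
    }
}

If the agent's decomposition changes any pinned value, the merge stops there.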
AI agents tend toward surface-level refactoring. Research found only 43% of agent refactorings are high-level operations, compared to 55% for human developers. Agents excel at extract-method and rename operations but struggle with deeper architectural improvements.
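That surface-level strength is still worth exploiting. A typical agent-safe decomposition, shown here on a hypothetical shipping-cost function:

class ShippingRules {
    record Order(double weightKg, boolean international, boolean express, double total) {}

    // Before: four unrelated rules tangled in one function.
    double shippingCostBefore(Order o) {
        double cost = o.weightKg() > 20 ? 15 : 5;
        if (o.international()) cost += 25;
        if (o.express()) cost *= 2;
        if (o.total() > 500) cost = 0;
        return cost;
    }

    // After: each rule extracted and named. Behavior is unchanged,
    // which the characterization tests above would verify.
    double shippingCost(Order o) {
        if (qualifiesForFreeShipping(o)) return 0;
        return applyExpressMultiplier(o, baseCost(o) + internationalSurcharge(o));
    }

    private boolean qualifiesForFreeShipping(Order o) { return o.total() > 500; }
    private double baseCost(Order o) { return o.weightKg() > 20 ? 15 : 5; }
    private double internationalSurcharge(Order o) { return o.international() ? 25 : 0; }
    private double applyExpressMultiplier(Order o, double cost) { return o.express() ? cost * 2 : cost; }
}

Deeper moves, such as splitting a service boundary or inverting a dependency, still need a human to decide they are worth making.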
When modernization fails
Not every modernization succeeds. Understanding failure patterns helps avoid them.
Exotic dependencies kill projects. Mainframe systems often use specialized technologies: Assembler, Easytrieve, Telon, PL/I. AI tools trained primarily on modern languages struggle with these. IMS and IDMS databases add complexity that automated translation cannot handle. The "last mile" of modernization—the 5-10% of code that resists automation—often consumes more time than the first 90%.
Insufficient testing creates false confidence. An AI can translate COBOL to Java syntax correctly while introducing subtle behavioral differences. Without comprehensive characterization tests, these differences go unnoticed until production. Regression testing can cost 5-10 times as much as the translation itself. Organizations that underinvest in testing regret it.
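A concrete example of the kind of difference that slips through: COBOL money fields (PIC 9(7)V99) are fixed-point decimal, and a line-by-line translation to Java's double changes results in the pennies. A hypothetical illustration:

import java.math.BigDecimal;

public class DecimalDrift {
    public static void main(String[] args) {
        // Naive translation: binary floating point accumulates rounding error.
        double naive = 0.0;
        for (int i = 0; i < 1000; i++) naive += 0.01;
        System.out.println(naive);    // roughly 9.9999999999998, not 10.0

        // Faithful translation: decimal arithmetic, like the original COBOL.
        BigDecimal faithful = BigDecimal.ZERO;
        BigDecimal cent = new BigDecimal("0.01");
        for (int i = 0; i < 1000; i++) faithful = faithful.add(cent);
        System.out.println(faithful); // exactly 10.00
    }
}

The translated code compiles, passes review, and looks correct; only output comparison against the running mainframe catches the drift.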
Token limitations cause coherence loss. Current AI models have finite context windows, and complex call chains that span many files can exceed them. Microsoft's COBOL Agentic Migration Factory (CAMF) project found that coherence degraded once call chains exceeded approximately three levels of depth. Large modernizations require chunking the work into pieces that fit within context constraints.
Skills gaps persist. Sixty percent of mainframe specialists are over age 50, and 47% of organizations cannot fill COBOL roles today. By 2027, an estimated 92% of remaining COBOL developers will have retired. AI can accelerate modernization, but someone still needs to verify the results. That verification requires understanding both the legacy system and the target architecture.
The pattern that works: incremental migration with continuous verification. Small pieces, tested thoroughly, deployed gradually, rolled back immediately if problems arise. Slower than a big-bang rewrite but far more likely to succeed.
Integrating modernization into normal work
The most effective modernization happens as a side effect of feature development. When adding a feature requires touching legacy code, take the opportunity to improve it. When fixing a bug exposes brittle patterns, refactor them. When onboarding a new team member, document what you learn.
AI agents make this practical by reducing the overhead of improvement work. Generating characterization tests for a function you are about to modify takes minutes with an agent. Extracting a messy method into smaller functions happens while you wait for CI. Updating deprecated APIs in files you touched anyway requires minimal additional effort.
The compound effect matters. A team making small improvements with every change transforms a codebase over time. No dedicated modernization project required. No big-bang risk. Just continuous, incremental improvement—automated by agents, verified by developers.
Thomson Reuters achieved 50-70% technical debt reduction through this approach. They did not run a dedicated debt project. They integrated improvement into their AI-assisted development workflow. Every change left the code better than before.
This is the sustainable path. Not a one-time modernization effort that finishes and is forgotten, but a permanent capability for continuous improvement. AI makes the mechanics cheap. Human judgment remains essential for deciding what to improve and verifying that improvements actually improve.