Using AI agents to refactor legacy pipelines
Thiago, Andreos · 2 min read
ai engineering refactoring
We’re excited to share our experience using AI agents to modernize a legacy data pipeline that had been running in production for years.
The Challenge
Legacy systems are everywhere. They work, but they’re hard to maintain, difficult to extend, and often lack proper documentation. Our pipeline was no exception:
- Written over 5 years by multiple authors
- Minimal documentation - most knowledge was tribal
- Tight coupling between components
- No tests to speak of
The AI-Assisted Approach
Rather than a complete rewrite (which rarely works), we took an incremental approach:
- Understanding the existing code - Used LLMs to explain complex sections
- Generating tests - Created test coverage for critical paths
- Refactoring in small steps - Made targeted improvements with AI assistance
- Validating changes - Ran tests and compared outputs
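As a minimal sketch of the "compared outputs" step, assuming a Python pipeline: run the legacy and refactored versions of a step on the same sample inputs and collect any differences. The names `legacy_normalize`, `refactored_normalize`, `find_mismatches`, and `SAMPLES` are hypothetical stand-ins for illustration, not code from the actual pipeline.

```python
from typing import Any, Callable, Iterable


def legacy_normalize(record: dict) -> dict:
    # Hypothetical legacy step: lowercases keys and strips string values.
    out = {}
    for k in record.keys():
        v = record[k]
        if type(v) == str:
            out[k.lower()] = v.strip()
        else:
            out[k.lower()] = v
    return out


def refactored_normalize(record: dict) -> dict:
    # Hypothetical refactored version that should behave identically.
    return {
        k.lower(): v.strip() if isinstance(v, str) else v
        for k, v in record.items()
    }


def find_mismatches(
    old: Callable[[dict], Any],
    new: Callable[[dict], Any],
    samples: Iterable[dict],
) -> list:
    """Return (input, old_output, new_output) for every sample that differs."""
    mismatches = []
    for sample in samples:
        # Pass copies so neither implementation can mutate the shared sample.
        expected = old(dict(sample))
        actual = new(dict(sample))
        if expected != actual:
            mismatches.append((sample, expected, actual))
    return mismatches


# Illustrative inputs; in practice these would be sampled from real runs.
SAMPLES = [
    {"Name": "  Ada ", "Age": 36},
    {"CITY": "Recife\n", "active": True},
    {},
]

mismatches = find_mismatches(legacy_normalize, refactored_normalize, SAMPLES)
print(f"{len(mismatches)} mismatching sample(s)")
```

An empty result only says the two versions agree on the samples provided, so the value of the check depends on how representative those samples are.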
Key Learnings
What Worked
- Start with tests - AI is great at generating test cases when given examples
- Small, focused changes - Let the AI tackle one function at a time
- Human in the loop - Review everything; AI makes mistakes
- Iterative improvement - Don’t expect perfection on the first try
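To illustrate what "generating test cases when given examples" can look like, here is a minimal sketch of a characterization test, assuming a Python codebase. It pins the current behavior of a function using recorded input/output pairs. The helper `parse_amount` and the contents of `RECORDED_EXAMPLES` are hypothetical and exist only for this example.

```python
import unittest


def parse_amount(raw: str) -> float:
    """Hypothetical legacy helper whose current behavior we want to pin."""
    cleaned = raw.strip().replace(",", "")
    if cleaned.startswith("$"):
        cleaned = cleaned[1:]
    return float(cleaned) if cleaned else 0.0


# Input/output pairs observed from the existing code (illustrative values).
# Pairs like these are the "examples" handed to the AI as a starting point.
RECORDED_EXAMPLES = [
    ("1,200.50", 1200.5),
    (" $15 ", 15.0),
    ("", 0.0),
]


class ParseAmountCharacterization(unittest.TestCase):
    def test_recorded_examples(self):
        for raw, expected in RECORDED_EXAMPLES:
            # subTest reports each failing example separately.
            with self.subTest(raw=raw):
                self.assertEqual(parse_amount(raw), expected)
```

Saved in a test module, this runs with `python -m unittest`. The test asserts what the code does today, not what it should do, which is the property that makes it useful as a safety net during refactoring.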
What Didn’t
- Blind trust - AI-generated code needs careful review
- Large refactors - Breaking changes across many files were problematic
- Context limits - Very large files exceeded token limits
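One possible way around the context limit, sketched under the assumption that the oversized files are Python modules, is to split a file into top-level function and class chunks and hand them to the model one at a time. The function name `split_into_function_chunks` is hypothetical; `ast.parse` and `ast.get_source_segment` are from the standard library (Python 3.8+).

```python
import ast


def split_into_function_chunks(source: str) -> dict[str, str]:
    """Map each top-level function or class name to its source text."""
    tree = ast.parse(source)
    chunks = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # The segment starts at the def/class line, so decorators
            # are not included.
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                chunks[node.name] = segment
    return chunks


# Illustrative input standing in for a large legacy module.
sample = (
    "import os\n"
    "\n"
    "def extract(path):\n"
    "    return open(path).read()\n"
    "\n"
    "def transform(text):\n"
    "    return text.upper()\n"
)

for name, code in split_into_function_chunks(sample).items():
    print(f"--- {name} ({len(code)} chars)")
```

Chunking this way drops module-level context such as imports and globals, so a prompt built from a single chunk may need that context added back by hand.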
Results
After 3 months of gradual refactoring:
- Test coverage increased from 0% to 75%
- Cyclomatic complexity reduced by ~40%
- Documentation now exists and is maintained
- Zero downtime during the transition
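For readers unfamiliar with the complexity metric above, the sketch below approximates cyclomatic complexity for a piece of Python source by counting decision points. It is a simplified illustration, not the tool used for the figures in this post, and dedicated analyzers count more constructs. The names `approx_cyclomatic_complexity` and `EXAMPLE` are hypothetical.

```python
import ast

# Node types treated as decision points in this simplified count.
_BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.AsyncFor,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths: one per extra operand.
            complexity += len(node.values) - 1
    return complexity


# Illustrative function with a loop, two ifs, and one `and`.
EXAMPLE = '''
def load(rows):
    out = []
    for r in rows:
        if r is not None:
            if r.get("ok") and r.get("id"):
                out.append(r)
    return out
'''

print(approx_cyclomatic_complexity(EXAMPLE))
```

Tracking a number like this per function before and after each change gives a rough signal of whether a refactor actually simplified the control flow.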
Conclusion
AI agents are powerful tools for modernizing legacy code, but they’re tools, not magic. The key is treating them as pair programmers: helpful, capable, but requiring guidance and review.
We’re continuing this approach on other systems and will share more lessons learned along the way.