

Navigating and Rewriting Legacy Systems

As I navigate this job search, I've been asked several variants of the same questions: "How do you work with a legacy or poorly documented codebase? How would you rewrite one?" I figure my answer might be useful to others.

I approach this problem with every available tool.

I'll usually start by familiarizing myself at a high level with the codebase's goals and requirements, overall structure, and build system. Of course, I'll also talk to anyone with relevant knowledge to hear their impressions of what the codebase is supposed to do and how it does it.

I'll read for myself any areas of the code that seem particularly important or complex, to understand how they actually work.

I'll use my IDE's tools, in particular call graph and data flow analysis, to see how the system works and how data moves through it. I'll use these tools to work from an area I'm curious about back to the program's entry point, or to the origin of the data that area operates on. Conversely, it's often interesting to start from an entry point and look at how things like dependencies are set up.
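To make the "work backwards from an area I'm curious about" idea concrete, here's a minimal sketch of what a call graph tool does under the hood, using Python's standard `ast` module. It is not how any particular IDE implements this; the sample program and the function names `build_call_graph` and `callers_of` are made up for illustration, and it only handles direct calls to plain names in a single file.

```python
import ast
from collections import defaultdict

# Hypothetical sample program standing in for a legacy module.
SAMPLE = '''
def main():
    config = load_config()
    run(config)

def run(config):
    save(transform(config))

def load_config():
    return {}

def transform(data):
    return data

def save(data):
    print(data)
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the plain names it calls directly."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            callees = graph.setdefault(node.name, set())
            for child in ast.walk(node):
                # Only direct calls like foo(); method calls are ignored here.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    callees.add(child.func.id)
    return graph


def callers_of(graph: dict[str, set[str]], target: str) -> set[str]:
    """Walk backwards from target to every function that can reach it."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            reverse[callee].add(caller)

    seen: set[str] = set()
    stack = [target]
    while stack:
        current = stack.pop()
        for caller in reverse[current]:
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen


graph = build_call_graph(SAMPLE)
print(sorted(callers_of(graph, "save")))  # ['main', 'run']
```

A real tool also resolves methods, imports, and dynamic dispatch, which is exactly why I lean on the IDE rather than writing this myself.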

If I can run the system under a debugger, I will use it to poke at the running system to clarify anything I'm curious about.

I'll look at logs from the system in production to see what it's doing and which, if any, errors or warnings are logged during normal operation.
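As a rough illustration of that log review, here's a small sketch that tallies which warnings and errors show up and how often. The log format (`date time LEVEL message`) and the sample lines are assumptions on my part; a real system's format will differ, and the regular expression would need adjusting to match it.

```python
import re
from collections import Counter

# Assumed line format: "<date> <time> <LEVEL> <message>".
LINE_RE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def tally(lines, levels=("WARNING", "ERROR")):
    """Count log messages at the given levels, grouping similar messages."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match["level"] not in levels:
            continue
        # Replace numbers so "request 17" and "request 18" group together.
        normalized = re.sub(r"\d+", "<n>", match["message"])
        counts[(match["level"], normalized)] += 1
    return counts


# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    "2024-05-01 10:00:00 INFO started worker 3",
    "2024-05-01 10:00:01 WARNING retrying request 17",
    "2024-05-01 10:00:02 WARNING retrying request 18",
    "2024-05-01 10:00:03 ERROR timeout after 30s",
]

log_counts = tally(SAMPLE_LOG)
for (level, message), count in log_counts.most_common():
    print(f"{count:>4}  {level:<8} {message}")
```

Messages that appear constantly during normal operation are worth noting early, since they tell you which "errors" the team has learned to ignore.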

I'll read and refer back to any available external documentation, keeping in mind that the codebase itself has likely drifted from what external docs describe.

I'll manually review any unit/integration/regression tests that exist, since those can provide insight into requirements and areas where the codebase has failed before.

Finally, tools like Claude Code or Copilot can be very helpful in finding where certain functionality lies and how the codebase solves a problem I'm curious about.

(Aside: Using an LLM programming tool for this is a good way to introduce yourself to the technology. If you're an AI skeptic, you don't have to use it to write code, but why not use every available tool to help you explore a new codebase?)

The Rewrite

First: don't. That said…

To prepare for a rewrite, I'll capture all the requirements (functional, performance, and otherwise) I've discovered and work from there to design a high-level architecture that fits the bill.

I'll keep testability in mind with this design, so we can use TDD during the reimplementation.

Then I'll begin working from the inputs, outputs, and public APIs the codebase deals with, and outline the architecture in more detail with those external contracts in mind.
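One way to keep those external contracts honest during a rewrite is a characterization check: run the legacy and rewritten implementations against the same inputs and flag any difference. Below is a minimal sketch of the idea. The `PriceCalculator` contract, both implementations, and the test cases are entirely hypothetical; in practice the cases would come from recorded production inputs or the existing test suite.

```python
from typing import Iterable, Protocol


class PriceCalculator(Protocol):
    """Hypothetical external contract both implementations must honor."""

    def total(self, quantity: int, unit_price_cents: int) -> int: ...


class LegacyCalculator:
    """Stand-in for the old code: awkward, but its behavior is the spec."""

    def total(self, quantity: int, unit_price_cents: int) -> int:
        result = 0
        for _ in range(quantity):
            result += unit_price_cents
        if quantity >= 10:
            result = result - result // 10  # undocumented bulk discount
        return result


class RewrittenCalculator:
    """Stand-in for the rewrite, designed against the same contract."""

    def total(self, quantity: int, unit_price_cents: int) -> int:
        subtotal = quantity * unit_price_cents
        discount = subtotal // 10 if quantity >= 10 else 0
        return subtotal - discount


def find_mismatches(
    legacy: PriceCalculator,
    rewrite: PriceCalculator,
    cases: Iterable[tuple[int, int]],
) -> list[tuple[tuple[int, int], int, int]]:
    """Return (inputs, legacy_result, rewrite_result) for every disagreement."""
    mismatches = []
    for quantity, price in cases:
        expected = legacy.total(quantity, price)
        actual = rewrite.total(quantity, price)
        if expected != actual:
            mismatches.append(((quantity, price), expected, actual))
    return mismatches


# Made-up cases, chosen to straddle the discount boundary.
CASES = [(0, 500), (1, 500), (9, 199), (10, 199), (25, 1)]

mismatches = find_mismatches(LegacyCalculator(), RewrittenCalculator(), CASES)
print(mismatches)  # []
```

A check like this pairs well with TDD: each mismatch becomes a failing test, and each one is also a prompt to ask whether the legacy behavior was a requirement or a bug.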