How To Do Large Scale Refactoring

Refactoring, by definition, means changing the internal structure of a program without modifying its external functional behavior. It is mostly done to improve the program's non-functional attributes, chiefly code readability and maintainability. However, refactoring on a large scale often gives the jitters to even seasoned Agilists. The Agile community recently discussed some ways of handling large scale refactoring.

In a recent discussion, Andreas wanted to know which of three possible approaches was best for doing a large scale refactoring. His options were:

Big Bang - Define the structure for the final state and push code to its ultimate home.

Divide and conquer - Try to separate the big ball of mud into two pieces. Repeat until done.

Strangling - Strangle the old classes, gradually replacing them with new code.

Most respondents agreed that Big Bang was almost never going to succeed. Aaron Digulla said that he had been using the Strangling approach for his entire career. The idea is to gradually morph the bad code into shiny new code that has a test harness around it. The advantage of this strategy is that, because you start slowly with small pieces, the risk is usually small. David Hall and Shane MacLaughlin stressed the importance of dividing and conquering in small steps, writing enough tests around any portion of the program that is touched. Some people suggested a complete rewrite but, as per an earlier post on InfoQ, that has its own set of challenges.
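The Strangling approach described above can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the discussion: `legacy_price` stands in for a tangled piece of old code, `new_price` for its replacement, and `USE_NEW_PRICE` for whatever switch a team might use to route callers. The point is only that callers go through one entry point, and a small test harness checks that old and new code agree before the switch is flipped.

```python
# Hypothetical example: strangling one legacy function behind a stable entry point.

def legacy_price(quantity: int, unit_cents: int) -> int:
    """Stand-in for convoluted legacy code (assumed, for illustration)."""
    total = 0
    for _ in range(quantity):
        total += unit_cents
    if quantity >= 10:
        total = total - total // 10  # 10% bulk discount, buried in the loop logic
    return total


def new_price(quantity: int, unit_cents: int) -> int:
    """Cleaner replacement that must keep the same external behavior."""
    total = quantity * unit_cents
    discount = total // 10 if quantity >= 10 else 0
    return total - discount


# Hypothetical switch: flip to True once the harness below passes.
USE_NEW_PRICE = True


def price(quantity: int, unit_cents: int) -> int:
    """Single entry point for callers; the legacy code is strangled behind it."""
    implementation = new_price if USE_NEW_PRICE else legacy_price
    return implementation(quantity, unit_cents)


def implementations_agree(cases) -> bool:
    """Test harness: old and new code must return the same result for every case."""
    return all(legacy_price(q, u) == new_price(q, u) for q, u in cases)


# Non-negative quantities only; the sketch does not cover other inputs.
SAMPLE_CASES = [(0, 100), (1, 100), (9, 199), (10, 250), (25, 999)]
```

Once every caller uses `price` and the harness has passed for long enough, `legacy_price` can be deleted. Because each step covers one small piece, reverting is cheap if the two implementations ever disagree.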

Sibylle Peter and Sven Ehrke mentioned that, for large scale refactoring, they first conduct an assessment and create a master plan. For each refactoring step, they then work through three phases:

Analysis: definition of the desired result and HOW to achieve it.
Implementation: application of refactoring techniques to alter the code accordingly.
Stabilization: application of methods, which ensure that the result of the implementation is durable.

Another approach that is gaining momentum for large scale refactoring is the Mikado Method. The Mikado Method grew out of work done by Daniel Brolund and Ola Ellnestam and takes its name from the Mikado game. According to this method,

Code changes are like the Mikado game. When you want to make changes to a code base, you can rarely make the exact changes you want right away. You have to prepare some, move code, extract classes, and much more. Picking up the Mikado on your first grab is a rare thing! More often you make a sequence of moves before the Refactoring Mikado is available, working your way systematically to the bottom of the pile, to reach your goal.

The Mikado approach starts by keeping the end in mind. For any code base that needs to be refactored, the idea is to create a dependency graph starting with the end goal. The next step is to identify the immediate prerequisites for reaching the goal, and to continue identifying dependencies in that way until reaching a point, called a leaf, that has no prerequisites or dependencies of its own. A leaf is usually the best place to begin the refactoring. Once the dependency graph exists, the central idea is to work back from the leaves toward the goal, step by step.
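As a rough illustration of working back from the leaves toward the goal, the sketch below models a Mikado graph as a Python dictionary and derives one valid order of work with a depth-first traversal. The goal and prerequisite names in `MIKADO_GRAPH` are invented for this example, and `mikado_order` is a hypothetical helper, not something defined by the Mikado Method itself.

```python
from typing import Dict, List

# Hypothetical Mikado graph: each step maps to its immediate prerequisites.
# Steps with an empty list are leaves and can be done right away.
MIKADO_GRAPH: Dict[str, List[str]] = {
    "Replace legacy persistence layer": [
        "Extract repository interface",
        "Remove static database access",
    ],
    "Extract repository interface": ["Add tests around data access"],
    "Remove static database access": ["Add tests around data access"],
    "Add tests around data access": [],
}


def mikado_order(graph: Dict[str, List[str]], goal: str) -> List[str]:
    """Return the steps needed for `goal`, leaves first and the goal last."""
    order: List[str] = []
    done = set()
    in_progress = set()

    def visit(step: str) -> None:
        if step in done:
            return
        if step in in_progress:
            # A cycle means no leaf can be reached; the graph needs rethinking.
            raise ValueError(f"circular prerequisite involving {step!r}")
        in_progress.add(step)
        for prerequisite in graph.get(step, []):
            visit(prerequisite)
        in_progress.discard(step)
        done.add(step)
        order.append(step)  # appended only after all of its prerequisites

    visit(goal)
    return order


if __name__ == "__main__":
    for number, step in enumerate(
        mikado_order(MIKADO_GRAPH, "Replace legacy persistence layer"), start=1
    ):
        print(f"{number}. {step}")
```

In practice the graph is discovered incrementally rather than known up front: a team attempts a change, notes what breaks as new prerequisites, reverts, and tries again from a leaf. The traversal only shows how a finished graph translates into a sequence of small steps.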

The approach strongly emphasizes the importance of Undo: teams should not be afraid to revert and throw away their changes. It also recommends avoiding analysis paralysis by starting with a naïve step and then observing the consequences. A draft copy of the book on the Mikado Method is now available.

Thus, although large scale refactoring is hard to do, the key lies in identifying the starting points and then treading the path in small steps.
