The goal of both refactoring and rewriting is to improve the health of a system by making its code more readable, better structured and clearer. Clean code is easier to maintain and enhance. However, Agile teams often have a tough time deciding between the two.
Michael Dubakov suggested the following reasons why a code base gets worse over time:
- More and more features. It leads to increased complexity.
- Shortcuts and hacks to support “We need this fancy search till August. Period!” features
- Developers rotation. New developers don’t know all the fundamental decisions and ideas behind the architecture. Knowledge gets lost with transition inevitably.
- Development team growth. More people - less communication. Less communication - bad decisions.
Michael suggested that although both refactoring and rewriting can lead to cleaner code, both techniques bring chaos to the existing system. Refactoring is an incremental activity, so it touches only a portion of the system at a time. The chaos it creates is local and relatively easy to contain. A rewrite, on the other hand, is a more invasive change and causes greater chaos across the system. Because of this wider impact, the stabilization period of a rewrite is much longer than that of a refactoring. Of rewrites, he wrote:
We have the old system during rewrite, so chaos is constant. After the public release chaos increases significantly. Quite many new (and old) bugs and quirks are expected, so stabilization period is longer.
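To make the idea of a local, incremental change concrete, here is a minimal sketch of a single refactoring step. Every name and pricing rule in it is hypothetical and invented for illustration; none of it comes from the authors quoted here. The point is only that the refactored function can be checked against the old one on the same inputs, so any disruption stays confined to one function.

```python
# Hypothetical legacy function: the names and the pricing rule are invented.
def order_total_legacy(items, is_member):
    t = 0
    for i in items:
        if i["qty"] > 0:
            t = t + i["price"] * i["qty"]
    if is_member:
        if t > 100:
            t = t - t * 0.1
    return t


# Refactored version: same behaviour, split into two small named helpers.
def subtotal(items):
    # Sum price * quantity over items with a positive quantity.
    return sum(i["price"] * i["qty"] for i in items if i["qty"] > 0)


def member_discount(amount, is_member):
    # Assumed rule: members get 10% off amounts above 100.
    return amount * 0.1 if is_member and amount > 100 else 0


def order_total(items, is_member):
    amount = subtotal(items)
    return amount - member_discount(amount, is_member)


def behaviour_matches(cases):
    # True when old and new implementations agree on every sample case.
    return all(
        order_total_legacy(items, member) == order_total(items, member)
        for items, member in cases
    )
```

A rewrite would offer no such side-by-side check at this granularity, which is one way to read the longer stabilization period described above.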
Peter Schuh suggested that teams often use the two words interchangeably, thereby causing more confusion and chaos. Teams should understand that rewriting is a riskier proposition than refactoring and should use the terminology accordingly. According to him:
It’s just semantics. Well … it’s only semantics until someone gets hurt. Rewriting code is a risky and sometimes painful endeavor. It doesn’t always end well. If we execute a rewrite but call it a refactor and the whole thing goes pear-shaped, no business person is going to stop and think about semantics. They’re just going to cringe the next time they hear the word refactor.
Guido A.J. Stevens made an interesting observation. He suggested that the choice is not between refactoring and rewriting, but between refactoring alone and rewriting AND refactoring. Even when a team decides to rewrite the system, it eventually has two systems running in parallel: the old system, which still requires refactoring, and the new system, which is being written. The combination becomes an overly complex task. According to him:
Maintaining an aging code base, AND writing a new system, is going to be a huge drain on your resources. Your team is split and will run into delays. You'll have to plan and carefully execute a transition. Meanwhile, your competitors don't have your time-to-market problem and will try to steal your customers. If you can stare this reality in the eye and still want to bet the firm on a rewrite, you may have a chance of succeeding.
Naresh Jain had the following suggestions specifically for legacy code: refactor when the code is difficult to understand and the team is not sure what it does; rewrite when it is clear what the code does but the code itself is difficult to understand.
Thus, refactoring is the preferred way to improve the system incrementally. It is slower paced and improves quality through small, constant improvements. A rewrite has its advantages, but in most situations it is the riskier option, and teams can never be sure of the outcome. As Joel Spolsky wrote on Joel on Software:
It's important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time.