Key Takeaways
- Refactoring is a book that teaches relatively new developers a key practice they need to be effective (and helps senior developers teach that practice)
- The second edition carries much of the same message, but is less centered on objects and includes examples in JavaScript (although the book isn’t tied to any particular language)
- Even complex changes become manageable when they are broken down into small, behavior-preserving steps
- We use the metaphor of a (bad) smell to indicate code that needs refactoring; these include duplication and complex conditionals
- Refactoring can play an important part in code reviews, provided it is done quickly
The book Refactoring - Second Edition by Martin Fowler explores how you can improve the design and quality of your code in small steps, without changing external behavior. It consists of around seventy detailed descriptions of refactorings, including a motivation for doing them, the mechanics, and an example.
InfoQ readers can download a sample chapter of Refactoring from Thoughtworks.
InfoQ interviewed Fowler about the major changes in the 2nd edition of Refactoring, how to recognize code smells and refactor code, how code reviews and refactoring support each other, what tech leads can do to encourage refactoring, the benefits refactoring brings, using tools for refactoring, and mob programming.
InfoQ also asked Kent Beck, co-author of the chapter Bad Smells in Code, about dealing with code smells.
InfoQ: What made you decide to create a second edition?
Martin Fowler: The original has been around for a long time, and the code examples are rather dated (they use java.util.Vector). I was concerned that this age is increasingly making people feel that the book isn't relevant, although the techniques in the book are just as useful now as then. Secondarily, it allowed me to revisit the topic in a less constrained object-oriented environment. Java doesn't have top-level functions, which makes it impossible to illustrate refactorings with them.
InfoQ: For whom is this book intended?
Fowler: Primarily for relatively new developers, who once they've got the basic hang of programming are starting to think about how to program well.
But there's an important secondary audience: senior developers, who don't need the book to learn anything about refactoring, but will find it useful to help them teach others about refactoring.
InfoQ: What are the major changes in this second edition?
Fowler: In many ways, the book hasn't changed in any important way. The chapter structure is still there, and the core refactoring technique is still the same as it was twenty years ago. However, I've reviewed and mostly rewritten everything there. I've changed the code examples to use JavaScript (ECMAScript 2015), dropped and added some refactorings, and generalized some others. You can find a summary of the changes in "Changes for the 2nd Edition of Refactoring".
One notable change is that using JavaScript helps me talk about refactorings in a less object-oriented manner. Thus, I can include refactorings such as Combine Functions into Class.
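As a rough illustration of what a refactoring like Combine Functions into Class can look like in JavaScript, here is a minimal sketch. The record shape, function names, and numbers are hypothetical and chosen only for this example; they are not quoted from the book.

```javascript
// Before: top-level functions that all operate on the same plain record.
// (All names and values here are made up for illustration.)
function baseCharge(reading) {
  return reading.quantity * reading.ratePerUnit;
}

function taxableCharge(reading, taxThreshold) {
  return Math.max(0, baseCharge(reading) - taxThreshold);
}

// After: the shared data and the functions that use it live together in a class.
class Reading {
  constructor({ customer, quantity, ratePerUnit }) {
    this.customer = customer;
    this.quantity = quantity;
    this.ratePerUnit = ratePerUnit;
  }

  get baseCharge() {
    return this.quantity * this.ratePerUnit;
  }

  taxableCharge(taxThreshold) {
    return Math.max(0, this.baseCharge - taxThreshold);
  }
}

const raw = { customer: "ivan", quantity: 10, ratePerUnit: 3 };
const reading = new Reading(raw);

// Behavior is preserved: both styles compute the same values.
console.log(baseCharge(raw) === reading.baseCharge);
console.log(taxableCharge(raw, 20) === reading.taxableCharge(20));
```

The starting point is code built from plain functions, which is hard to show in a language without top-level functions.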
InfoQ: Are there refactorings that can be rather challenging to do? What precautions can developers take when doing them?
Fowler: I don't see particular refactorings as more challenging than others. Some are more long-winded; more steps of mechanics, or more pages to describe in the book. But even these longer refactorings aren't more difficult; each step is pretty straightforward. Where you get variation in difficulty is how the same refactoring varies with context and scope.
For example, let's take Change Function Declaration. This is the refactoring to use when I want to change a function name, or add/remove arguments (these were separate refactorings in the first edition, but I combined them as the steps to carry them out are the same). In the simplest context, I have a function that has just one caller next to it in the code. In that case, I just change it and its caller, test, and commit - even if I'm changing the name and altering three parameters. But the same Change Function Declaration refactoring gets far more complicated if I have a hundred callers, scattered all over a code base. Then I have to move more carefully: I extract the body of the old function into the new one (so that the old function works by calling the new one), then one at a time I replace the calls to the old function with calls to the new one. It's a more labored process, but it allows me to make small changes while keeping the code working at all times - which is the essence of successful refactoring.
With Change Function Declaration, this difference in context was so marked that I have two sets of mechanics for it - which I generally tried to avoid. And this challenge grows even more if it's a published API, as then I can't change it without coordinating with the clients of my code. Then the old function, which only delegates to the new one, may stay for a long, long time.
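The more careful, many-callers version of these mechanics might look roughly like the sketch below. The function names (`calc`, `orderTotalCents`), the added `discountCents` parameter, and the order data are assumptions invented for this example.

```javascript
// Step 1: extract the old body into a function with the new name and parameters.
function orderTotalCents(order, discountCents) {
  const subtotal = order.items.reduce(
    (sum, item) => sum + item.priceCents * item.quantity,
    0
  );
  return Math.max(0, subtotal - discountCents);
}

// Step 2: the old function now only delegates, so untouched callers keep working.
function calc(order) {
  return orderTotalCents(order, 0);
}

// Step 3: migrate callers one at a time, running the tests after each change.
const order = {
  items: [
    { priceCents: 250, quantity: 2 },
    { priceCents: 100, quantity: 1 },
  ],
};
const legacyCaller = calc(order);                 // not yet migrated
const migratedCaller = orderTotalCents(order, 0); // already migrated
console.log(legacyCaller === migratedCaller);

// Step 4: when no callers remain, delete calc. For a published API,
// the delegating function may have to stay for a long time.
```

At every point between steps the code still runs, so each step can be tested and committed separately.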
The general lesson here however is that refactorings may take a while, but we should always break them down into small steps that keep the code working. If I can do that, then refactoring may be long but not challenging. If I'm feeling challenged in a refactoring, it's a sign that I haven't broken it down enough into small steps. If I get into a hole while refactoring, it's a sign that I should revert to my last good commit, and start again with smaller steps.
InfoQ: What are some of the most common code smells? How can we deal with them?
Fowler: Still one of the biggest ones for me is duplication. Spotting duplicate code and figuring out how to remove it often leads me to an improved design. Like anything, it can be overdone, but also like most things, it usually isn't done enough.
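A small sketch of removing duplication by extracting a shared function follows. The discount rule and all names are hypothetical; the point is only that pulling the repeated expression out gives the rule a name and a single home.

```javascript
// Before: the same discount rule is written out in two places.
function invoiceTotalBefore(amountCents) {
  const discount = amountCents > 100000 ? Math.floor(amountCents / 10) : 0;
  return amountCents - discount;
}

function quoteTotalBefore(amountCents) {
  const discount = amountCents > 100000 ? Math.floor(amountCents / 10) : 0;
  return amountCents - discount;
}

// After: the rule is extracted once and named for what it means.
function volumeDiscount(amountCents) {
  return amountCents > 100000 ? Math.floor(amountCents / 10) : 0;
}

function invoiceTotal(amountCents) {
  return amountCents - volumeDiscount(amountCents);
}

function quoteTotal(amountCents) {
  return amountCents - volumeDiscount(amountCents);
}

// Behavior is unchanged for amounts on either side of the threshold.
console.log(invoiceTotal(200000) === invoiceTotalBefore(200000));
console.log(quoteTotal(50000) === quoteTotalBefore(50000));
```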
Kent Beck: Since Martin already picked duplication I'll choose complex conditional logic. When I see an if statement inside a for loop inside an if statement, I am immediately suspicious that there is a case that hasn't been considered. A slightly more abstract smell I look for is violations of Composed Method, which states that all the operations in a function should be at the same level of abstraction. For example, if I see a bunch of bit twiddling operations in the same function with calls to other functions, I'm pretty sure there is a better way to express the computation.
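The nested shape Beck describes, and one possible way to flatten it, can be sketched as follows. The account and invoice fields and the 30-day rule are assumptions made up for this example, and the guard clause plus small named helpers is just one of several reasonable ways to address the smell.

```javascript
// Before: an if inside a for inside an if.
function overdueTotalBefore(account) {
  let total = 0;
  if (account.active) {
    for (const invoice of account.invoices) {
      if (invoice.daysLate > 30) {
        total += invoice.amount;
      }
    }
  }
  return total;
}

// After: each function stays at one level of abstraction.
function isOverdue(invoice) {
  return invoice.daysLate > 30;
}

function sum(numbers) {
  return numbers.reduce((total, n) => total + n, 0);
}

function overdueTotal(account) {
  if (!account.active) return 0; // guard clause makes the inactive case explicit
  return sum(account.invoices.filter(isOverdue).map((invoice) => invoice.amount));
}

const account = {
  active: true,
  invoices: [
    { amount: 100, daysLate: 45 },
    { amount: 40, daysLate: 10 },
    { amount: 60, daysLate: 31 },
  ],
};
console.log(overdueTotal(account) === overdueTotalBefore(account));
```

Writing the guard clause forces a decision about what an inactive account should return, which is the kind of unconsidered case the nesting tends to hide.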
InfoQ: With refactoring, the code base often becomes smaller as the unnecessary code is removed. What if it becomes bigger, is that a code smell?
Fowler: No. Sometimes adding structure adds lines of code; indeed, this happens in the introductory example in the book. As systems get bigger, though, refactoring will usually reduce the amount of code, certainly when it removes duplication.
InfoQ: How do code reviews and refactoring support each other?
Fowler: Review is an important component of any intellectual effort. When I'm writing my technical prose, I find it essential to get reviews on my work before publication. They will spot ways in which people can misinterpret what I've written, and places where my explanations aren't clear. They don't hold assumptions that are so baked into my thinking that I don't even notice them.
Writing code has a lot of parallels with writing prose. We think of code as instructions to the computer, but that misses the true point of any language higher than machine code. Good code is about communication with whoever needs to use or modify that code later on, whether myself or someone else. A vital part of code review is to be a test of understandability, an immediate sense of whether code is clear or not for that future programmer. Code review can assess refactoring, providing feedback on whether the refactoring is actually making the code clearer.
Refactoring can also be used as a code review mechanism. If some code under review isn't clear, it's often valuable for the reviewer to get together with the writer and refactor the code together. This is particularly valuable in a mentoring situation where the reviewer is either a senior looking to grow the junior's skills, or where the reviewer is more familiar with the code base, and is looking to inform a new member of a team about the conventions of the code base.
For review to work well in this way, it needs to happen quickly, because any feedback is like milk on a hot day. So you need either a team that's disciplined to do code reviews quickly, or pair programming - which is a mechanism for continuous code review. This is true for all changes, whether refactoring or adding features, but refactoring has a greater need for low latency of review.
InfoQ: What can tech leads do to encourage refactoring?
Fowler: Looking at code, seeing where refactoring can be helpful, and pairing with people to do the refactoring is a good route. It's important to pass on the point that code isn't "done" merely because it works; it's done when it also communicates clearly what it's doing, so that it can be easily modified in the future.
InfoQ: What benefits does refactoring bring? How can I convince my manager?
Fowler: Herein lies a dangerous trap. Developers often justify high code quality as a matter of professionalism - essentially a moral justification. But what managers and customers care about is the features that the software has for its users; when they hear discussions about quality, they conclude that features will cost more and take longer to build. But the internal quality of software doesn't follow the usual relationships between quality and cost.
When I talk to developers, most can talk about software systems where adding new features gets slower and more expensive over time, where a simple change can take weeks because it's hard to understand how to modify the code base and easy to introduce bugs. But software systems don't have to be like that. Not only can you reduce this slowing down as a system grows, you can even reverse it. Then adding a feature becomes faster because I can build it by composing existing features quickly. Refactoring is valuable because it allows me to keep the software in a healthy state, avoiding the accumulation of technical debt.
The key point here is that the justification for refactoring, or any practice that improves the internal quality of software, is an *economic* argument. We keep a clean code base because it allows us to add new features for a lower cost, and more importantly to add new features faster. As developers we must always present the economic argument front and center; any argument based on professionalism will sound like it's in opposition to economics - and money beats morality every time.
Sadly, there isn't any hard data we can find to prove the economic benefit of internal quality, because we cannot measure the productivity of software development. But then there is no proper study that proves that parachutes are effective. I base my conclusions about software quality on conversations with experienced developers. It's not ideal evidence, but I consider it to be pretty strong - and certainly the best we currently have.
InfoQ: What is your opinion on tools that can help us to automate refactoring?
Fowler: If you have them, they are exceedingly helpful. It's one of the main reasons I prefer IntelliJ to my usual Emacs for Java programming - the refactoring support is just so helpful. But don't let a lack of tools stop you refactoring; most of my programming is in Ruby, and there's little refactoring tooling for that, but that doesn't stop me from doing it.
InfoQ: What's your view on mob programming?
Fowler: I don't have a strong opinion on it. It's something teams should consider trying, and see if it works for them.
InfoQ: If InfoQ readers want to learn more about refactoring, where can they go?
Fowler: I keep a website with information about refactoring at the appropriately URLed refactoring.com. The book is obviously a source I'd suggest too - see "Changes for the 2nd Edition of Refactoring" for details on how to get it.
About the Book Author
I am an author, speaker… essentially a loud-mouthed pundit on the topic of software development. I work for ThoughtWorks, a software delivery company, where I have the exceedingly inappropriate title of “chief scientist”. I’ve written half a dozen books on software development, including Refactoring and Patterns of Enterprise Application Architecture. I write regularly about software development on martinfowler.com.
(photo credit: Manuel Gomez Dardenne)