Key Takeaways
- Old systems are successful systems
- Legacy modernization is a valuable skill no matter how old your systems are
- Define the value-add of modernization up front and be specific; don’t assume newer is better
- Metrics can help reinforce agile development and keep everyone on the same page
- Communication pathways and incentives have an outsized impact on which strategies will work best
The book Kill it with Fire by Marianne Bellotti provides strategies that organizations can use to modernize, maintain, and future-proof their systems. She suggests choosing strategies based on the organizational context, and defining what value you’re hoping to see from modernization.
InfoQ readers can download a sample chapter of Kill it with Fire from the above publisher’s book page.
InfoQ interviewed Marianne Bellotti about upgrading technology, legacy challenges, and modernizing systems.
InfoQ: Why did you write this book?
Marianne Bellotti: I really like working on old systems. The anthropologist in me loves studying legacy systems as a way of understanding an organization’s past. At one point in the book, I describe old code as “artifacts of human thought, like pottery shards.” It really has that resonance for me. I get excited trying to recreate the challenges past programmers must have faced by studying how their product evolved.
At the same time, I think most technical people do not see old code that way. For most programmers, legacy is something to be ashamed of. The maintenance of old systems or the management of technical debt is a burden, a chore. Most of the time when people try to talk to me about my work on legacy systems, they want horror stories. They want the Ripley’s Believe It or Not episode. But it shouldn’t be about that. Legacy modernization isn’t something that begins and ends at a mainframe. Web applications built in 2015 are also legacy code. Learning how to enjoy the process of restoring something old to operational excellence should be part of a normal software engineering experience, because all engineering teams will have to maintain their old code.
I wanted to write a book about how to run large-scale legacy modernization projects, but I wanted to write it in a way that was also relevant to software engineers who only work on “new” systems. I wanted to highlight how these skills are broadly relevant to ALL software development, not just COBOL programs on mainframes.
InfoQ: For whom is this book intended?
Bellotti: Primarily software engineers, but the book focuses a lot on navigating the organizational factors that complicate things. Most modernization projects or major migrations do not fail because they are technically difficult. They fail because they require sustained political will to allocate resources to see a project to completion. So I talk a lot about how to order the technical tasks in order to build momentum and how to communicate value-add to non-technical stakeholders. In that sense, there’s a lot in the book for the CIO or senior executive who needs to oversee modernization work, but probably won’t touch a keyboard and write code.
InfoQ: You stated in the book that technology is cyclical. Can you elaborate what you mean by this?
Bellotti: We tend to think about technology advancing in a straight line, with each iteration better and more sophisticated than what came before. The reality is a little more complicated than that because there are no one-size-fits-all solutions. As we make incremental improvements to technology, we are only really optimizing for a specific set of use cases. Those same improvements might make other uses more difficult.
Over time what tends to happen is as one technology gets more and more optimized, the group of people for whom things are moving in the opposite direction of what they actually need gets larger and larger, until finally there are enough people to establish a market for a “new” technology to shift things back in the opposite direction. My favorite example of this is cell phone size: for a while cell phones were about staying connected to the office on the go, so each more advanced version was smaller and thinner. Then the emphasis shifted from work functions to entertainment functions, and suddenly cell phones started to get bigger and bigger. Technology is filled with these kinds of cycles where it feels like we’re reinventing or repackaging old solutions. That’s why it’s so important to understand what value an architectural change will bring to the table, rather than migrating to something new just because it is new.
InfoQ: What are the challenges of building a business case for upgrading to new technology? How should one deal with those challenges?
Bellotti: From the perspective of a non-technical person, modernization projects often look like periods where nothing actually happens. Big changes are made, but everything looks the same and behaves the same. Meanwhile, new features that could have been launched were delayed. It’s really important to tie projects to the value they are going to bring. We’re not taking the time to redo these systems because we think the code will look prettier this way. These changes will make the system faster. Or they will make the system cheaper. Or they will make engineers more efficient.
Service Level Objectives (SLOs) are a great way to start to bridge the perspectives of the business side and technical side of the organization. Both sides might agree that faster or cheaper is valuable, but to different degrees. Don’t assume that what engineering values is valued by, or in fact even visible to, the business side of the organization. Be deliberate in identifying what the value of the modernization is and structure the roadmap around delivering that value, front-loading it if possible.
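To make the SLO idea concrete, here is a minimal editorial sketch (not taken from the book) of how a latency SLO could be written down and checked against measurements. The class name `LatencySLO`, the 300 ms threshold, and the sample values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencySLO:
    """Hypothetical SLO: a target fraction of requests must finish within a threshold."""
    threshold_ms: float
    target_fraction: float


def compliance(latencies_ms: Sequence[float], slo: LatencySLO) -> float:
    """Return the fraction of requests that finished within the SLO threshold."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    within = sum(1 for ms in latencies_ms if ms <= slo.threshold_ms)
    return within / len(latencies_ms)


def meets_slo(latencies_ms: Sequence[float], slo: LatencySLO) -> bool:
    """True when the measured compliance reaches the agreed target."""
    return compliance(latencies_ms, slo) >= slo.target_fraction


if __name__ == "__main__":
    # Illustrative numbers only: 75% of requests should finish within 300 ms.
    slo = LatencySLO(threshold_ms=300, target_fraction=0.75)
    before_migration = [250, 320, 410, 280]
    after_migration = [210, 240, 310, 190]
    print("before:", compliance(before_migration, slo), meets_slo(before_migration, slo))
    print("after: ", compliance(after_migration, slo), meets_slo(after_migration, slo))
```

A number like this gives both sides something to agree on before work starts: the business side picks the target, and engineering can show whether each step of the roadmap moves the measurement toward it.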
InfoQ: You mentioned that there is still a lot of COBOL software around. What causes it to still be used?
Bellotti: Like all hard problems, there isn’t one reason. Most of these systems were not built intentionally, and by that I mean that they didn’t start as these massive computer systems that control everything. They started as odd jobs, outsourcing parts of a larger process to software. When that succeeds we keep outsourcing more and more, connecting the different parts and integrating them.
Systems built this way can be very successful, but they also lack a lot of the platform characteristics that make modifying them easier. There tends to be little or no structured testing, no monitoring, and the code may not be documented. It is very difficult to change a system in those conditions; you are essentially flying blind.
At the same time, mainframes are super fast and extremely resilient. So organizations can end up getting caught thinking, “Well, the system isn’t broken and modernization will probably break things really badly once or twice. And what’s the argument for getting rid of COBOL? It’s old? It’s not trendy anymore?”
InfoQ: What are the main issues that organizations face with legacy systems?
Bellotti: Loss of institutional memory is a big problem. Michael Feathers (Working Effectively with Legacy Code) defines legacy code as code without tests, and part of the reason why tests are so critical is that a well-written and comprehensive test suite documents the expected behavior of the system much more effectively than traditional documentation. Without any formal way of recording what the system is supposed to do, that institutional memory is stored in the people who built the system, people who will eventually leave the organization.
You know, COBOL is not a difficult language to learn. The column restrictions are a bit annoying on modern IDEs, but any programmer can learn COBOL. The problem is learning the ins and outs of a complex system, which we see in legacy systems in other languages too. Large complex systems take a long time to understand, especially if their expected behavior is not documented.
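One common way to capture undocumented behavior in tests is what Feathers calls a characterization test: record what the system does today, then check that later changes do not alter it. The sketch below is an editorial illustration, not an example from the book; `legacy_late_fee` is a hypothetical stand-in for real legacy logic, and the helper names are assumptions.

```python
from typing import Callable, Dict, Iterable, Tuple


def legacy_late_fee(days_overdue: int) -> int:
    """Hypothetical stand-in for undocumented legacy logic."""
    if days_overdue <= 0:
        return 0
    return min(5 + 2 * days_overdue, 50)  # flat fee plus daily charge, capped


def record_behavior(func: Callable[[int], int], inputs: Iterable[int]) -> Dict[int, int]:
    """Run the existing code on sample inputs and keep whatever it returns."""
    return {value: func(value) for value in inputs}


def find_regressions(
    func: Callable[[int], int], recorded: Dict[int, int]
) -> Dict[int, Tuple[int, int]]:
    """Return {input: (recorded, actual)} for every input whose result changed."""
    regressions = {}
    for value, expected in recorded.items():
        actual = func(value)
        if actual != expected:
            regressions[value] = (expected, actual)
    return regressions


if __name__ == "__main__":
    # Step 1: pin down current behavior, including edge cases nobody documented.
    recorded = record_behavior(legacy_late_fee, [-1, 0, 1, 10, 30])

    # Step 2: a hypothetical rewrite that forgets the cap on the fee.
    def rewritten_late_fee(days_overdue: int) -> int:
        return 0 if days_overdue <= 0 else 5 + 2 * days_overdue

    print(find_regressions(rewritten_late_fee, recorded))
```

The recorded results are not a statement of what the system should do, only of what it does. That is often the only specification available once the people who built the system have moved on.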
InfoQ: What different strategies exist for modernizing systems?
Bellotti: Too many to mention, that’s basically why I wrote a whole book about it! But on a high level, successful modernization strategies look to take very large, complex systems and define boundaries: various ways you can cut them up into smaller and simpler pieces. Boundaries, even imperfect ones, help you restrict the surface area of any given change. You can think of modernization as a heart transplant. Before you can replace the organ, you need to clamp the right blood vessels.
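As one possible reading of "defining boundaries" in code (an editorial sketch under stated assumptions, not a strategy prescribed in the book), callers can be pointed at a narrow interface so that the implementation behind it can be swapped piece by piece. All class names here, such as `AddressLookup` and `AddressBoundary`, are hypothetical.

```python
from typing import Protocol, Set


class AddressLookup(Protocol):
    """The boundary: the only operation callers are allowed to depend on."""

    def lookup(self, customer_id: str) -> str: ...


class LegacyAddressLookup:
    """Hypothetical wrapper around the old system."""

    def lookup(self, customer_id: str) -> str:
        return f"legacy-address-for-{customer_id}"


class NewAddressLookup:
    """Hypothetical replacement implementation."""

    def lookup(self, customer_id: str) -> str:
        return f"new-address-for-{customer_id}"


class AddressBoundary:
    """Routes each call to the old or new implementation behind one interface."""

    def __init__(self, legacy: AddressLookup, replacement: AddressLookup,
                 migrated_ids: Set[str]) -> None:
        self._legacy = legacy
        self._replacement = replacement
        self._migrated_ids = migrated_ids

    def lookup(self, customer_id: str) -> str:
        # Only customers explicitly marked as migrated reach the new code,
        # which limits the surface area of any single change.
        if customer_id in self._migrated_ids:
            return self._replacement.lookup(customer_id)
        return self._legacy.lookup(customer_id)


if __name__ == "__main__":
    boundary = AddressBoundary(LegacyAddressLookup(), NewAddressLookup(), {"c2"})
    print(boundary.lookup("c1"))
    print(boundary.lookup("c2"))
```

Even an imperfect boundary like this one means a change to the new implementation can only affect the customers routed to it, and rolling back is a matter of shrinking `migrated_ids`.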
InfoQ: How would we know what strategy to use?
Bellotti: I choose strategies based on the organizational context, rather than the technical context. The other day I was helping a team decide between a couple of different strategies, and the truth is every option they were considering had both significant risks and also made a lot of sense and could prove to be successful. We used to say all the time at United States Digital Service (USDS) that “hard problems are hard.” If there was one strategy that seemed more likely to be successful than all the others, there probably wouldn’t be a modernization project at all, because modernization would have just been done as part of the normal maintenance. If there was an obvious answer, someone else would have done it already.
So the technical challenges narrow the field of options, but ultimately the deciding factors for me come down to internal politics. Where do we have leadership buy-in? How are people incentivized to behave? Are there contractors here? If so, what are they doing? It’s ridiculous to expect someone to make their own job irrelevant, and shocking to me how often modernization strategies are structured in such a way that people are incentivized to sabotage them.
InfoQ: What kind of problems can happen in modernization projects, and how can we deal with them?
Bellotti: Most modernization work isn’t hard, it’s just tedious. It requires months, sometimes years, of prolonged detail-oriented work, work that often isn’t considered very interesting by the rest of the tech industry. Someone described it to me as “janitorial work” once, and that’s not far off. Just as really exciting machine learning and AI work relies on lots of extraordinarily boring data gathering and labelling work, great modernization work relies on lots and lots of boring refactoring, documenting, and testing work.
So the biggest threat to a modernization project is loss of momentum. Either the business side of the organization sees the project as a money pit and reprioritizes resources, or the tech workers get overwhelmed by what looks like an insurmountable mountain of horrible, uninteresting work. I spend a lot of time talking with people about building trust and communicating value. Just because these projects will likely take a long time to fully complete does not mean we abandon the agile development process. We need to clearly define what value we’re hoping to see from modernization, how we expect to see it, then break the project up into smaller tasks where we can demonstrate that value. Personally I’m very fond of picking something big and super scary as the first task, because nothing builds momentum better than succeeding in a place where many have failed before. A big win right away can turn a torturous project full of awful work into a dream job in people’s minds.
InfoQ: What organizational structures can we apply for modernization?
Bellotti: More than a specific structure, I think it’s important for people to understand that Conway’s Law is about communication pathways and incentives, not org charts. Lots of technologists fantasize about the freedom of a flat org structure, but you’re not going to be able to collaborate effectively with everyone in a five hundred person organization. Effective collaboration is about relationship building, and you can’t maintain 500 relationships. So one way or another, people are going to group themselves.
A common mistake that people make with legacy modernization projects is that they take the engineers who are knowledgeable about the old system and put them in charge of maintenance, while a different team of engineers gets to build the new system. Think about how that incentivizes people: if you’re maintaining the old system, the success of the team building the new system means you lose your job. So when there are questions about the behavior of the old systems, are you going to help the other team?
People tend to lose sight of the fact that teams are made up of people and that what looks neat and orderly on a PowerPoint slide deck doesn’t work if it positions people in such a way that success of the project means personal failure.
InfoQ: How can we find out if a modernization project is going in the right direction and that the system is becoming better?
Bellotti: The first two steps in any modernization project should be defining the value modernization is going to create, and what metrics we’re going to use to monitor that value add. In some ways this is an agreement between the technical team and the business side or other stakeholders. Do not fall back on the assumption that new technology is better simply because it is new. Develop a theory of how things will be improved and monitor those conditions from day one.
For example, most organizations will save money by moving to the cloud, but not all organizations. It’s awkward to get all the way through a difficult migration like that and realize that actually you’re spending way more money. Don’t assume! The more thought you put into constructing the case for value add in the beginning, the easier it is to run the project. When stakeholders can see the needle moving in the right direction, they will continue to support the effort. It also helps tech teams make better decisions. Tech teams often fall victim to pedantic arguments and distractions. Having a clear statement of value add helps organize the hundreds of little decisions that implementation will require.
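The cloud-cost example can be turned into a small check that runs from day one. The sketch below is an editorial illustration with made-up numbers; the function names and the choice of "latest period beats the baseline" as the success test are assumptions, not a method from the book.

```python
from typing import List, Sequence


def value_trend(baseline: float, observed: Sequence[float],
                lower_is_better: bool = True) -> List[float]:
    """Relative change versus the baseline per period; positive means improvement."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    sign = 1 if lower_is_better else -1
    return [sign * (baseline - value) / baseline for value in observed]


def is_delivering_value(baseline: float, observed: Sequence[float],
                        lower_is_better: bool = True) -> bool:
    """True when the most recent period is better than the pre-migration baseline."""
    trend = value_trend(baseline, observed, lower_is_better)
    return bool(trend) and trend[-1] > 0


if __name__ == "__main__":
    # Hypothetical monthly hosting cost (in thousands) before and during a migration.
    baseline_cost = 100
    monthly_cost = [110, 90, 80]  # costs often rise first while both systems run
    print(value_trend(baseline_cost, monthly_cost))
    print(is_delivering_value(baseline_cost, monthly_cost))
```

Tracking the trend rather than a single snapshot matters here: a temporary rise in cost early in a migration is expected, but a trend that never turns positive is the signal to revisit the original theory of value.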
InfoQ: What role do feedback loops play in modernization, and how can we keep them sufficiently open?
Bellotti: It goes back to agile development. It’s hard for tech leadership to avoid sliding into waterfall techniques when it comes to modernization because often it seems possible to map the modern system directly to the current system. What is there to iterate? You already know how everything should work! But by establishing our theory of value add and monitoring it, we’re creating a feedback loop that can drive that agile iteration. When a change doesn’t produce the benefits we hoped for, we can alter our direction before sunk costs have gotten too great.
Another way feedback loops are important is in avoiding getting into a bad legacy situation in the first place. In this context, feedback loops are less about status updates, more about how different organizational structures influence incentives and behavior. For example, a common problem when building a new system is how to avoid ending up with a big, messy monolith that needs a major effort to break up. You can design the system as a set of services instead, but each new service is additional overhead on an engineering team. Separate services need deploy pipelines, on-call rotations, and possibly their own product management structures. All that increases overhead. You only have so many people in your engineering organization and they only have so many hours in the day. Thinking about how different architectures create different interactions, or feedback loops, inside the organization can help you avoid digging yourself into a hole. It’s not enough to say, “We’re going to do a good job managing our tech debt.” You have to make sure people are empowered to do so.
About the Book Author
Marianne Bellotti has worked as a software engineer for over 15 years. She built data infrastructure for the United Nations to help humanitarian organizations share crisis data worldwide and tackled some of the oldest and most complicated computer systems in the world as part of United States Digital Service. At Auth0, she ran Platform Services, a portfolio that included shared services, untrusted code execution, and developer tools. Currently she runs Identity and Access Control at Rebellion Defense.