

‘Debt’ as a Guide on the Agile Journey: Technical Debt

Key Takeaways

  • Frameworks like Scrum@Scale can help you navigate an agile transformation by providing a holistic structure that identifies pains and potential for improvement. However, the framework does not explain what causes the pains or how to remove them.
  • Use your intuition to understand the root causes, and you are likely to find a pattern that revolves around debt: legacy technology and technical debt not only limit technical opportunities, but also consume mental resources and limit the ability to think out-of-the-box.
  • If you have a team that is under pressure to develop new solutions for your business but cannot deliver because they are drowning in the operation of old technology, consider temporarily splitting the team and dividing responsibility for Dev and Ops between the teams to allow the engineers to focus.
  • When you need your teams to step up and take responsibility for the quality of their products, such as increasing their release maturity, consider installing a framework that gives the teams freedom to decide, in a structured way, how to improve. Ensure that there are incentives for the teams to work on improvements, and provide inspiration for how to improve competencies.
  • The right experiments can help the organization move forward, but in all cases success depends on support from your leadership. Find ways to get your leadership on board with your ideas, so that they accept the flatter hierarchy that follows from delegating decisions to teams, and so that they support the teams in taking the time to increase competencies and make improvements.

For organizations that work with software development, agile practices are now the default approach - especially for start-ups - and the market is swamped with experience reports and frameworks that can help facilitate agile ways of working for teams that are "born agile". But where do you turn if 1) you don't work with software development, and 2) you are an organization with a long history of being project-led and hierarchical? In 2017, that was the question we asked ourselves when our CIO exclaimed, "We're going agile - and you have full freedom to figure out how".

Our context was that we were around 100 engineers in charge of the IT infrastructure at the LEGO Group, a Danish toy manufacturer that today has factories and offices across the globe and employs 18,000 people. Our department (Infrastructure & Security, I&S) handled everything from internal hosting, network, application integration, the ERP platform and cyber security to mobile phones and personal computers, and today we have expanded the portfolio to also include public cloud hosting platforms.

Our product catalogue was, in other words, diverse, and it was characterized by a significant amount of technical debt and time-consuming maintenance tasks, while the development agenda kept growing; technology and requirements were changing at an ever faster pace, and we and our legacy technologies were struggling to keep up.

Our approach to the agile transformation has been incremental - we have not tried to plan it. Our guiding star has been to constantly take steps towards a better and more agile way of working by introducing experiments and learning from them. To be structured in how we introduce experiments, we used a compass: Scrum@Scale [1], created by Jeff Sutherland. Scrum@Scale defines 12 components that each provide a specific view on the organization; it describes the desired behaviour of an agile organization but does not prescribe any solutions. One component is the Team Process: whether teams deliver value to customers at a sustainable cadence. Another component is Delivery: how satisfied our users are with our deliveries and the delivery process.

Over time, a pattern emerged: many of our pains were caused by technical debt rooted in legacy technology, which was preventing the organization from achieving its goals in several ways. The legacy technology did not provide the technical opportunities you would expect of a modern infrastructure, and managing it took up so much of our engineers' mental capacity that they could not work on new opportunities. We became attuned to the stale smell of debt.

In this article, the second in a series on how 'debt' can be used to guide an agile journey, we provide two examples of smells related to technical debt. For each, we explain the symptoms and the impact on the business and our organization, outline the experiments (countermeasures) we introduced in an effort to remove the smell, and offer some specific - hopefully practical - advice to inspire you, should you experience similar smells in your organization.

Smell #1. The ol’ switch(eroo)

All of us are consumers of network infrastructure, but few of us know what actually goes on in the department that manages the physical and logical layers of the network. If you do know - or if you go ask the engineers who do - you are likely to recognize the situation we were in a year ago: the architecture and the physical components in our global network infrastructure, i.e. the switches, routers and access points in our offices and factories, were old and outdated.

Many of the devices were end-of-life or nearing an age where retirement was the only option. In other words, our network infrastructure was a clear case of technical debt. In Scrum@Scale, we identified the issue as an impediment to the Network Team's "team process", as the team was challenged in increasing their performance and operating in a way that was sustainable and enriching for the team. As we will explain below, the team wasn't able to fix the issues on their own - they had to be escalated (to the "Executive Action Team").

Symptoms

The legacy network devices and management tools depended on hands-on resources for operations and maintenance, and the Network Team became reactive, constantly firefighting. Most of their time and energy was spent running the existing environment, and very few resources were left for preparing the network for the future. Network technology was quickly moving towards infrastructure-as-code and software-based solutions - an entirely new paradigm that the engineers would need time to embrace - but they couldn't find the time to develop their technical skills and imagine what the future could look like.
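To give a flavour of the paradigm shift the engineers were facing, here is a minimal, hypothetical sketch of the infrastructure-as-code idea: instead of configuring each switch by hand over a console session, the desired state is declared as data and rendered into device configuration by software. The data model, function name and vendor-style output below are purely illustrative - they are not the LEGO Group's actual tooling.

```python
# Hypothetical infrastructure-as-code sketch: declare desired VLAN state
# as data, then render it into vendor-style switch configuration.
# All names and fields are illustrative, not actual LEGO Group tooling.

VLANS = [
    {"id": 10, "name": "office"},
    {"id": 20, "name": "factory-iot"},
]

def render_switch_config(vlans):
    """Render a vendor-style VLAN configuration block from declarative data."""
    lines = []
    for vlan in vlans:
        lines.append(f"vlan {vlan['id']}")
        lines.append(f" name {vlan['name']}")
    return "\n".join(lines)

print(render_switch_config(VLANS))
```

The point of the paradigm is that the declarative data, not the individual device, becomes the source of truth: a change is made once in version-controlled data and rolled out consistently, rather than typed into hundreds of ageing switches by hand.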

Impact

If network infrastructure is not your specialty, you might ask how much requirements for connectivity could really change over 10 years. Does the Network Team really need to develop a completely new solution and live the DevOps dream? The answer to that is a resounding yes! Today's (not to mention tomorrow's) requirements for security features and performance are significantly different from 10 years ago; the network infrastructure is key to protecting vital business processes and applications by controlling data traffic, and the network must support the vastly increasing amount of data traffic resulting from new streaming and IoT services, for instance. The Network Team was not able to meet these expectations with the legacy technology we were fighting to operate and maintain, and thus the business was impacted.

Internally, the Network Team themselves were also impacted. They felt the heat from several CXOs who were frustrated that the team couldn't satisfactorily support top priorities such as the cyber security agenda. Ultimately, we saw a loss of pride and motivation in the team, as the engineers wanted to deliver what was expected but could not see how to set themselves up to succeed.

Countermeasure

The Network Team needed to focus on developing a modern solution, but they were swamped with operating the old network platform - and it wasn't an option to stop doing operations of the legacy platform as much of the business still depended on it.

The issue was escalated to the Executive Action Team, who introduced an experiment. We wanted to enable the engineers to focus, and to facilitate this we split the responsibility for the network between two teams. One team would manage the existing network and hosting infrastructure; another team (heavily supported by external consultants with knowledge of the new technology) would develop a new design and roll it out. The setup was temporary, with the ambition to merge responsibility for the network again sometime in the future.

Learnings

It worked. And it worked just in time: when the world was shut down by Covid-19 in Q1 2020, we were able to establish a new VPN setup in a matter of days, ensuring that all 14,000 white-collar workers could work from home without issues when they were sent home. A year later, the old network platform for the offices and factories has kept running and kept the business going, while a new network design for factories and offices has been developed and is in the process of being rolled out globally. This rollout, it turns out, comes with new challenges that we are planning experiments for, but that is another story.

We are trying out versions of this experiment in other teams that are heavily impacted by technical debt while under pressure to develop new solutions for our business. If you have a team that is drowning in operations and not developing the new solutions your business needs, we can recommend temporarily splitting the responsibility between separate teams to allow the engineers to focus.

Smell #2. Deployment whoopsies

If you are employed in the software industry, you are likely to know all about the hassle of orchestrating and automating releases (if you are not, a "release" is the equivalent of delivering something - new - to your customers). This smell is related to our ambition to release at a faster and faster pace. After all, most of the agile call to arms is about increased speed (of value delivered).

We used to have a top-down approach to controlling that our releases were of high enough quality to be introduced to the world. We had weekly Release Board Meetings run by line managers who would double-check and sign off on each release and be overall accountable for the outcome. This approach could no longer work, for two reasons: 1) to succeed in increasing speed, we needed to let go of top-down control - decision power had to be placed in the teams, and 2) we didn't have line managers anymore, so who was actually accountable? It felt like the most obvious opportunity to let the teams decide "the how".

To facilitate speed, we therefore increased the frequency [2] of Release Board Meetings to twice a week and told the teams that they were now in charge of quality and coordination. It was a small change made with the best intentions, but it created a vacuum for decisions, with little to no structure to support the teams in filling that vacuum and taking responsibility.

As the teams were busy delivering value to the business, it never became a priority to spend time on indirectly value-adding activities, such as finding new ways to ensure the quality of changes. It is an example of process debt, and in Scrum@Scale we placed it under "Delivery" (in the Scrum Master cycle), as it relates to delivering a consistent flow of valuable, finished products to customers - at high quality.

Symptoms

With an increasing number of releases in an environment that was becoming ever more complex and growing in size, we started seeing a rise in the number of incidents that - because releases did not live up to quality standards - had unintended and unexpected effects on business processes.

Impact

The impact on the business from these incidents was downtime - significant downtime - which is noticeable when daily business is a matter of millions of euros. The impact was also visible within the infrastructure teams. The infrastructure engineers experienced hard backlash from other parts of the digital organization, and with each major incident our credibility took a hit. The engineers became risk-averse, and the knee-jerk reaction from the management layers in the organization was to install top-down controls again.

Countermeasure

We took a deep breath and tried to remember what we wanted to achieve: we wanted teams to have the freedom to decide the how, because we wanted speed and agility. And we knew we could not achieve this by dictating improvements. So instead we created a structure for the teams to act within - a clear outline of the playing field and the rules. We called this structure the "Release Maturity Model" (RMM) [3], and it is illustrated in Figure 2.

The RMM outlined seven areas relevant to the release process (the playing field), including how source code is stored, how tests and validations are made, how risk is assessed, how change requests are documented and approved, how deployment is orchestrated, and how the team learns from the experience of their releases. The rules were that the teams were asked to 1) assess their current maturity (on a 4-level scale from "this is not something we do" to "this is fully automated in our setup"), 2) decide on two areas where they would commit to improving their maturity over the next quarter, based on their evaluation of where improvement would be most valuable in their specific situation, and 3) decide how they would improve. Over time, the idea was to make the teams' assessments and action plans transparent to everyone, to facilitate knowledge sharing and inspiration across the department.

Figure 2. The release maturity model, first draft
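To make the rules concrete, here is a small sketch of how a team's RMM self-assessment might be captured and used. The area names follow the article; the data structure, the scores and the "pick the lowest-maturity areas" heuristic are our invention for illustration - the model itself lets teams choose whichever two areas they judge most valuable, which need not be the lowest-scoring ones.

```python
# Hypothetical sketch of a team's Release Maturity Model self-assessment.
# Scores run 1 ("this is not something we do") to 4 ("fully automated");
# the intermediate levels, scores and helper are illustrative only.

assessment = {
    "source code storage": 3,
    "tests and validations": 2,
    "risk assessment": 2,
    "change request documentation": 3,
    "deployment orchestration": 1,
    "learning from releases": 1,
}

def pick_focus_areas(scores, n=2):
    """Suggest the n lowest-maturity areas as candidates for next quarter.

    A simple heuristic: start where maturity is lowest. In practice the
    RMM asks teams to choose based on where improvement is most valuable.
    """
    return sorted(scores, key=scores.get)[:n]

print(pick_focus_areas(assessment))
# → ['deployment orchestration', 'learning from releases']
```

Publishing such assessments side by side is what would make the transparency step possible: any team could see where others scored higher and go ask how they got there.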

Learnings

It didn't work. The playing-field-and-rules approach was very well received in the teams we tested it on - they eagerly discussed where and how to improve. But the initiative quickly failed, as the teams did not move past the discussion. Even though we had created a vacuum for them to make decisions AND provided a structure to support them, something was missing. Our conclusion was that there were two unsolved issues: 1) we didn't manage to make this a priority in the teams - the incentive to create customer value overruled considerations for improvement. We could perhaps have mitigated the lack of incentive if we had 2) made sure that the engineers had the time to focus on becoming great at releasing and knew where to get inspiration, tools and competencies, but we didn't.

So if you want your teams to take responsibility for release quality and continuously increase their release maturity, a Release Maturity Model can be a way forward - but you need to ensure that there is an incentive for the teams to work with release management, or at least ensure that the teams know where to find inspiration and tools and can focus on increasing their competencies.

Concluding remarks

Smells, pains, impediments - different ways of describing issues - can be inspected through many different lenses, and to make a successful agile transformation, we believe that using different lenses, viewpoints and levels of abstraction is paramount. Above, we have tried to be very practical and prescriptive in recommending experiments for specific issues similar to those we have described. The upside of this approach is that it is - well - practical: you can go home and start your experiment at once. The downside is that reality is complex, and if your situation is not exactly like ours (which complexity suggests it is not), you are not likely to get the same results we did.

To adjust for this risk, we want to conclude with a set of more high-level strategies that we also constantly navigate by. We believe that if you add these to the mix when planning your experiments, you will be much better able to judge whether an experiment is likely to be the right medicine for your ailments.

For the two experiments for removing technical debt described above, we found support in the following two strategies:

  • Foster faster decisions. When you experiment with anything aimed at removing pains and smells from your organization, remind yourself that the experiments must help decrease time-to-decision. The cornerstones of great decision-making are availability of information, involving people in understanding and addressing challenges, and pushing decisions as far down the organization as you can. Whatever you do, be prepared to accept a significant increase in personal discomfort as you move out of your comfort zone and let go of control to enable others to take on the responsibility.
  • Accept that leadership support is needed. When you take a deep breath to understand all of the smells and consider the two recommendations above, you are likely to realize that introducing the right experiments to help the organization move forward requires support from your leadership. Taking the time to make change happen in the organization - and perhaps accepting the flatter hierarchy that follows from delegating decisions to teams - requires a leadership that accepts this investment. Fostering fast decisions by increasing communication and letting go of control requires a leadership that dares to be brave. Whatever you do, find ways to get your leadership on board with what you are doing, and help them help your organization in any way possible, because you are not likely to succeed if they don't have your back.

About the Authors

Anne Abell is a strategic advisor at the LEGO Group. She advises the CTO on product and organizational strategy, and has worked as a scrum master and product owner. Abell holds a Doctor of Philosophy in Information Technology Project Portfolio Management, a Master of Science in Information Technology and a BSc in Political Science; she is a certified Scrum@Scale practitioner and certified in ITIL Foundation and ITIL Service Operation.

Carsten Ruseng Jakobsen is an agile coach at Grow Beyond. Jakobsen has more than 25 years of hands-on experience with change management in major Danish companies; he is one of the agile pioneers and has worked since 2005 on implementing agile mindsets and culture to achieve business agility. Jakobsen has presented his experiences at international conferences including SEPG 2007, Agile 2007, Agile 2008, Agile 2009, Agile 2011, OOP 2014 and XP 2020. Jakobsen holds a Diploma of Engineering Business Administration (EBA) and is a Scrum Trainer with Scrum Inc., Certified Scrum@Scale Trainer (CSaST), Certified Scrum@Scale Practitioner, Certified Scrum Professional (CSP), Certified Scrum Master (CSM), and Certified Scrum Product Owner (CSPO).

Rasmus Lund Jensen is an agile coach at the LEGO Group. Jensen teaches, coaches and develops leaders in their roles as scrum masters and product owners to facilitate agile transformation. Jensen holds a Master of Science in Engineering (Technology-Based Business Development) and a Bachelor of Science in Electrical Engineering. He is a certified Scrum Master, Scrum Alliance Certified Team Coach, and Scrum@Scale practitioner, and has attended Leading Organization and Change (MIT), the Summer Institute for General Management (Stanford University GSB), and Situational Leadership II (CfL).

Resources

1 For more information about Scrum@Scale, see Scrum Inc.'s website.

2 You might note that the more "correct agile" approach would be continuous release. That is also a guiding star we are working towards; however, we concluded that we are not mature enough as an organization to make this happen. Yet.

3 The model is "home-grown", as we were not able to find a framework that would provide enough structure AND freedom to fit the purpose - though we do not claim to have made an exhaustive search.
