Key Takeaways
- How coordinating work across different parts of your organization with DevOps depends on team size due to architectural coupling.
- Why transformations that start by addressing the biggest inefficiencies in development processes are more successful.
- Why people need a common understanding of how their current approaches are causing inefficiencies for the overall software development and delivery system to change their way of working.
- How mapping your current deployment pipeline, including metrics, with teams helps them understand the biggest inefficiencies in their software development processes.
- How increasing the frequency of deployment while maintaining or improving quality and security using DevOps forces out inefficiencies that have existed in your processes for years.
The book Starting and Scaling DevOps in the Enterprise by Gary Gruver provides a DevOps-based approach for continuously improving development and delivery processes in large organizations. It contains suggestions that can be used to optimize the deployment pipeline, release code frequently, and deliver to customers.
InfoQ readers can download an excerpt from Starting and Scaling DevOps in the Enterprise.
InfoQ interviewed Gary Gruver about what makes DevOps fundamentally different from other agile approaches, how DevOps can help to optimize requirements and planning activities, metrics for continuous integration, the difference between scaling with tightly coupled architectures and with loosely coupled architectures, types of waste in large organizations and how to deal with them, and why executives should lead continuous improvement and how they can do that.
InfoQ: Why did you write this book?
Gary Gruver: As I started working with more and more different large organizations that wanted to transform their software delivery processes, I discovered one of the biggest challenges was getting the continuous improvement journey started and aligning everyone on implementing the changes. For this to work I feel pretty strongly that you should start the continuous improvement process with the changes that will have the most significant impact on the organization so you build positive momentum. I found I was spending most of my time analyzing the processes in different organizations and helping them identify those areas. I saw a lot of common problems but I also found each organization had some unique challenges that resulted in different priorities for improvement. Over time I started refining the process I used for analyzing different businesses and felt it was important to document the approach to share as pre-work for my workshops. I would send this out ahead of time to a few key leaders in the organization to start mapping out the deployment pipeline so we would have a good straw dog for the workshops. The workshops would then really help get everyone aligned on the changes that would have the biggest impact and get everyone excited about starting the journey. The reason I decided to publish the book is that I realized I can’t do workshops for every company that needs to improve their processes and I thought others might find the approach helpful.
InfoQ: For whom is this book intended?
Gruver: The book is intended for large organizations that have tightly coupled architectures. Small organizations or large organizations like Amazon that have architected to enable small teams to work independently won’t learn much by reading this book. They would be better served by reading the DevOps cookbook to identify some best practices that they have not implemented yet. This book is not intended for them. It is instead for larger organizations that have to coordinate the development, qualification, and release of software across lots of people. This book provides them with a systematic approach for analyzing their processes and identifying changes that will help them improve the effectiveness of their organizations.
InfoQ: What makes DevOps fundamentally different from other agile approaches?
Gruver: I try not to get too caught up in the names. As long as the changes are helping you improve your software development and delivery processes then who cares what they are called. To me it is more important to understand the inefficiencies you are trying to address and then identify the practice that will help the most. In a lot of respects DevOps is just the agile principle of releasing code on a more frequent basis that got left behind when agile scaled to the Enterprise. Releasing code in large organizations with tightly coupled architectures is hard. It requires coordinating different code, environment definitions, and deployment processes across lots of different teams. These are improvements that small agile teams in large organizations were not well equipped to address. Therefore, this basic agile principle of releasing code to the customer on a frequent basis got dropped in most Enterprise agile implementations. These agile teams tended to focus on problems they could solve like getting signoff by the product owner in a dedicated environment that was isolated from the complexity of the larger organization.
The problem with that approach is that in my experience the biggest opportunity for improvement in most large organizations is not in how the individual teams work but more in how all the different teams come together to deliver value to the customer. This is where I believe the DevOps principle of releasing code on a more frequent basis while maintaining or improving quality really helps. You can hide a lot of inefficiencies with dedicated environments and branches but once you move to everyone working on a common trunk and more frequent releases those problems will have to be addressed. When you are building and releasing the Enterprise systems at a low frequency your teams can brute force their way through similar problems every release. Increasing the frequency will require people to address inefficiencies that have existed in your organization for years.
InfoQ: How can DevOps help to optimize requirements and planning activities?
Gruver: My view of DevOps is optimizing the flow of value through organizations all the way from a business idea to a solution in the hands of the customer. From this perspective, it requires analyzing waste and inefficiencies in your planning and requirements process. In fact, I see this as one of the biggest sources of waste in many large organizations because they build up way too much requirements inventory in front of developers. As lean manufacturing has taught us this excess inventory leads to waste in terms of rework and expediting so it should be minimized as much as possible. There are others in the DevOps community that tend to look at DevOps as starting at the developer and moving outward because that is where a lot of the technical solutions of automation and infrastructure as code are implemented. Again I would not get too caught up in the naming. This is not about doing DevOps. It is about addressing the biggest sources of waste and inefficiencies in your organization. If you can develop and release code in a day but it takes months for a new requirement to make it through your requirements backlog you probably need to be taking a broader end to end view of your deployment pipeline that includes your planning process and move to a more just in time process for requirements and planning.
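The comparison between backlog waiting time and delivery time can be sketched in a few lines of Python. This is only an illustration, not a method from the book: the requirement records, field names, and dates are invented, and the helper functions `wait_and_delivery_days` and `backlog_wait_share` are hypothetical.

```python
from datetime import date

# Hypothetical requirement records; every ID and date here is invented.
requirements = [
    {"id": "REQ-1", "requested": date(2017, 1, 9),
     "dev_started": date(2017, 4, 10), "released": date(2017, 4, 11)},
    {"id": "REQ-2", "requested": date(2017, 2, 1),
     "dev_started": date(2017, 3, 3), "released": date(2017, 3, 6)},
]

def wait_and_delivery_days(req):
    """Return (days waiting in the backlog, days from dev start to release)."""
    wait = (req["dev_started"] - req["requested"]).days
    delivery = (req["released"] - req["dev_started"]).days
    return wait, delivery

def backlog_wait_share(reqs):
    """Fraction of total lead time that requirements spent waiting in the backlog."""
    total_wait = sum(wait_and_delivery_days(r)[0] for r in reqs)
    total_delivery = sum(wait_and_delivery_days(r)[1] for r in reqs)
    total = total_wait + total_delivery
    return total_wait / total if total else 0.0

for req in requirements:
    wait, delivery = wait_and_delivery_days(req)
    print(f"{req['id']}: waited {wait} days, delivered in {delivery} days")
print(f"share of lead time spent waiting: {backlog_wait_share(requirements):.0%}")
```

A high waiting share in data like this would suggest that the planning and requirements process, rather than development and release, is where to look first.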
InfoQ: Which metrics do you recommend for continuous integration?
Gruver: The first step is understanding the types of issues you are finding with continuous integration. It is designed to provide quick feedback to developers on code issues. The problem that I frequently see though is that the continuous integration builds are failing for lots of other reasons. The tests may be flaky. The environments may not be configured correctly. The data for running the test may not be available. These issues will have to be addressed before you can expect the developers to be responding to the feedback from continuous integration. Therefore, I tend to start by analyzing why the builds are failing. This is one of the first steps you use to prioritize improvements in your process. Next it depends on what you are integrating. If you are integrating code from a small team, you probably want to measure how quickly the team is addressing and fixing build issues. If you have a complex integration of a large system, I am much more interested in keeping the build green and making sure the code base is stable by using quality gates to catch issues upstream because failures here impact the productivity of large groups of people. There is a lot more detail and metrics in the book because it really depends on what you are integrating at which stage in the deployment pipeline.
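As a rough illustration of analyzing why builds are failing, the sketch below tallies failed builds by cause. It is a minimal example under stated assumptions: the build records are invented, the category names simply mirror the causes mentioned above, and `failure_breakdown` and `non_code_failure_share` are hypothetical helpers, not metrics taken from the book.

```python
from collections import Counter

# Hypothetical build records: (build_id, passed, failure_category).
# The data is invented for illustration.
builds = [
    (101, False, "flaky_test"),
    (102, True, None),
    (103, False, "environment"),
    (104, False, "flaky_test"),
    (105, False, "code_defect"),
    (106, False, "test_data"),
    (107, True, None),
    (108, False, "flaky_test"),
]

def failure_breakdown(records):
    """Count failed builds per category, most frequent first."""
    counts = Counter(cat for _, passed, cat in records if not passed)
    return counts.most_common()

def non_code_failure_share(records):
    """Fraction of failed builds that were not caused by a code defect."""
    failed = [cat for _, passed, cat in records if not passed]
    if not failed:
        return 0.0
    return sum(1 for cat in failed if cat != "code_defect") / len(failed)

breakdown = failure_breakdown(builds)
share = non_code_failure_share(builds)
for category, count in breakdown:
    print(f"{category}: {count}")
print(f"non-code share of failures: {share:.0%}")
```

If most failures fall outside the code-defect category, as in this invented data set, developers have little reason to trust the feedback, which is why those causes would be addressed first.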
InfoQ: In the book you distinguish between scaling with tightly coupled architectures and with loosely coupled architectures. What makes them different, and how does that impact scaling?
Gruver: From my perspective, DevOps is a lot about coordinating work across different people in the organization and the number of people you have to coordinate depends on the size of your organization and the coupling of your architecture. If you have a small organization or a large organization with a loosely coupled architecture then you are working to coordinate the work across 5-10 people. This takes one type of approach. If on the other hand, you are in a large organization with a tightly coupled architecture that requires 100s of people to work together to develop, qualify, and release a system it takes a different approach. It is important to understand which problem you are trying to solve. If it is a small team environment, then DevOps is more about giving them the resources they need, removing barriers, and empowering the team because they can own the system end to end. If it is a large complex environment, it is more about designing and optimizing a large complex deployment pipeline. This is not the type of challenge that small empowered teams can or will address. It takes a more structured approach with people looking across the entire deployment pipeline and optimizing the system.
InfoQ: Which types of waste do you often see in large organizations?
Gruver: Most large organizations with tightly coupled systems spend more time and energy creating, debugging, and fixing defects in their complex Enterprise test environments than they spend writing the new code. A lot of times they don’t even really want to do DevOps; they just need to fix their environments. They are hearing from all their agile teams that they are making progress but are limited in their ability to release new capabilities due to all the environment issues. I usually start there. How many environments do they have between development and production? What are the issues they are seeing in each new environment? Are they using the same processes and tools for defining the environments, configuring the databases, and deploying the code in all the different environments? Is it a matter of automating these processes with common tools to gain consistency or are the environment problems really code issues by other teams that are impacting the ability of other groups to effectively use the environments for validating their new code? These are frequently some of the biggest sources of waste that need to be addressed.
InfoQ: What can be done to remove the waste?
Gruver: It depends a lot on the source of waste. A lot of the waste is driven by the time it takes to do repetitive tasks and manual errors that occur by having different people implement the process in different ways. This waste is addressed through automation. This requires moving to infrastructure as code, automating deployment and testing processes. The problem is that this effort takes some time so you should start where the improvements will provide the most value for your organization. This is why we do the mapping to determine where to start.
There is waste that is associated with having developers working on code that won’t work with other code in development, won’t work in production, or doesn’t meet the need of the customer. Reducing this waste requires improving the quality and frequency of feedback to developers. The developers want to write good code that is secure, performs well, and meets the need of the business but if it takes a long time for them to get that feedback you can’t really expect them to improve. Therefore, the key to removing this waste is improving the quality and frequency of the feedback.
Lastly, a lot of organizations waste a lot of time triaging issues to find the source of the problem. Moving to smaller batch sizes and creating quality gates that capture issues at their source before they can impact the broader organization is designed to address this waste.
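One way to picture the mapping exercise mentioned in this answer is as a small table of pipeline stages with a few metrics, ranked by how much manual effort each stage consumes. The sketch below is an assumption-laden illustration only: the stage names, the numbers, and the `monthly_hours` cost formula are invented and are not the metrics defined in the book.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    """One stage of a hypothetical deployment pipeline map."""
    name: str
    manual_hours_per_run: float  # hands-on effort for one run of the stage
    runs_per_month: int
    rework_rate: float           # fraction of runs that have to be repeated

    def monthly_hours(self) -> float:
        # Assumed cost model: base effort plus effort spent on repeated runs.
        return self.manual_hours_per_run * self.runs_per_month * (1 + self.rework_rate)

# Invented numbers, for illustration only.
stages = [
    Stage("provision test environment", 6.0, 4, 0.50),
    Stage("deploy to test", 2.0, 20, 0.25),
    Stage("regression testing", 16.0, 2, 0.0),
]

# Rank stages so the most expensive one is the first automation candidate.
ranked = sorted(stages, key=lambda s: s.monthly_hours(), reverse=True)
for stage in ranked:
    print(f"{stage.name}: {stage.monthly_hours():.0f} manual hours per month")
```

In this made-up example the frequent, error-prone deployment step outranks the longer but rarer regression run, which is the kind of result a team might not guess without writing the numbers down.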
InfoQ: Why should executives lead continuous improvement? How can they do that?
Gruver: In large tightly coupled systems somebody needs to be looking across the teams to optimize the flow through the deployment pipeline. As we discussed above this is just not something that small empowered teams are in a position to drive. It requires getting all the different teams to agree that they are going to work differently. I frequently see people start with grass roots efforts but most of these initiatives start to lose momentum as they get frustrated trying to convince peers and superiors to support the changes. If you are going to release code on a more frequent basis with a tightly coupled system then all the teams need to be committed to keeping their code base more stable on a day to day basis. If only 9 of the 10 teams focus on stability it won’t work. All the teams need to be committed to working differently. This is where the executives can help. They can pull the teams together, analyze the process, and get everyone to agree on the changes they will be making. They can then hold people accountable for following through on their commitments. This can’t be management by directive because before mapping out the process with the teams the executives typically don’t have a good feel for all the inefficiencies in their processes. It needs to be more executive led, where the executive is responsible for pulling everyone together and getting them to agree on the changes and then leading the continuous improvement process.
This advice on executive-led improvement is for large tightly coupled systems. For organizations that have small teams that can work independently this is not as important.
InfoQ: Any final advice for organizations that want to adopt DevOps?
Gruver: Don’t just go off and do DevOps, agile, lean or any other of the latest fads you are hearing about. Focus on the principles, analyze the unique characteristics of your organization, and start your own continuous improvement journey. Judgment is required. Understand what you are trying to fix by changes in your process and then hold regular checkpoints to evaluate if your changes are having the desired effect. Bring the team along with you on the journey. At each checkpoint review what got done, what didn’t get done, what you learned during that iteration, and agree on priorities for the next iteration. The important part is starting the continuous improvement journey and taking everyone with you. You will make some mistakes along the way and discover that what you thought was the issue was just the first layer of the onion but if you come together as a team on the journey you will achieve breakthroughs you never thought were possible.
About the Book Author
Gary Gruver is an experienced executive with a proven track record of transforming software development and delivery processes in large organizations, first as the R&D director of the LaserJet FW group that completely transformed how they developed embedded FW and then as VP of QA, Release, and Operations at Macy’s.com leading the journey toward Continuous Delivery. He now consults with large organizations and runs workshops to help them transform their software development and delivery processes. He is the author of Starting and Scaling DevOps in the Enterprise and co-author of Leading the Transformation “Applying Agile and DevOps principles at Scale” and A Practical Approach to Large Scale Agile Development “How HP Transformed LaserJet FutureSmart Firmware”.