Sanjeev Sharma, IBM’s CTO for DevOps adoption, will be speaking at DevOpsDays NZ in October, where he will give a closing keynote, When DevOps Met SRE: From Apollo 13 to Google SRE. Drawing on examples of incident response, such as the ill-fated Apollo mission, Sharma will discuss the intersection between DevOps and site reliability engineering (SRE) practices.
Sharma is an author, active commentator and thought leader in the DevOps space. He has also recently released his latest book, The DevOps Adoption Playbook: A Guide to Adopting DevOps in a Multi-Speed IT Enterprise, filled with experience- and observation-based "plays" and "game plans" for context-sensitive adoption of DevOps in the enterprise.
InfoQ caught up with Sanjeev Sharma to discuss DevOps, Apollo 13’s SREs and his new book.
InfoQ: Can you please tell us about your current role and areas of interest?
Sanjeev Sharma: Like most distinguished engineers at IBM, I wear multiple hats. The first role is as IBM's CTO for DevOps adoption. In this role, I am responsible for leading the worldwide technical sales and adoption community for DevOps offerings across IBM’s portfolio of tools and services. Personally, I work with IBM’s largest customers helping them drive DevOps adoption in their organizations, addressing all three areas of adoption - process, technology and culture.
In my second role, I am IBM's global leader of our cloud architecture practice. In this role, I lead our global cloud architect community as their 'guild leader', providing direction and guidance on architecting solutions, platforms and infrastructure for cloud adoption. I personally focus on ‘first of a kind’ (FOAK) and complex solutioning for our clients migrating their applications to the cloud.
My main area of interest right now is microservices and containers. With the broadening of the adoption of cloud-native applications, the playing field is changing. We are moving from large monolithic apps to apps made up of microservices, running in containers, or even as serverless functions. How these applications (or really microservices) are developed, tested, deployed, run and managed is what I am getting more and more focused on.
InfoQ: As a thought leader in the field, what does DevOps mean to you?
Sharma: I have a very simplistic view of DevOps. To me it is the application of lean principles to make the act of getting requirements coming from the business deployed as code running in production, delivering business value to clients, in the most efficient and effective manner. And from this code running in production, getting feedback to continuously improve what and how we just delivered to production. The feedback is used to improve three areas:
(i) The code we just delivered - is it functioning and performing as desired?
(ii) The environment(s) on which we just delivered the code - is it performing and behaving as desired?
(iii) The processes with which we delivered the code and the environment(s) - how can we make it more lean and efficient for the next cycle of delivery?
I like this definition as it is technology agnostic. It is process agnostic. It is not a methodology. It is not a job description for a role in the organization. It is what you do!
InfoQ: What can you tell us about your upcoming DevOpsDays talk?
Sharma: This talk started with a client asking me to differentiate between DevOps and SRE. I told her I would need to get back to her, as I was stumped. So, I came back home and went to work. I looked at all the literature I could find on SRE, trying to fit it into my mental model of DevOps. Then I did what I know best - wrote a series of blog posts explaining my understanding of the intersection between DevOps and SRE.
InfoQ: What can the Apollo 13 incident teach us about DevOps and SRE?
Sharma: One of the core aspects of SRE is incident response and management. If you remember the movie Apollo 13, which documented (dramatized) the Apollo 13 accident, it focuses a lot on the response by the engineers on the ground in mission control, which resulted in the saving of the lives of the astronauts. To me, these engineers were the real heroes. That aside, if one breaks down how the engineers on the ground triaged the incident, broke down the challenges that needed to be addressed and developed responses to address each by priority, one realizes that their approach was not very different to how today's SRE teams would respond. Incident response is still about situational awareness, triage and real-time response. The core remains the same, whether one is on a space mission, or running an application sharing emojis via the cloud.
InfoQ: How do you think DevOps-adopting organisations would benefit from also investing in SRE practices?
Sharma: Organizations moving to cloud-native applications have to invest in SRE. The availability and responsiveness expectations of users and the service level objectives (SLO) expected by the business can only be delivered with SRE practices and dedicated teams. We are no longer talking about managing hundreds of servers running a static set of applications. We are talking about hundreds of thousands of containers, elastically changing in scale, serving up a dynamic number of microservices, all being continuously updated with new versions, thanks to continuous delivery. These cannot be run and managed without both DevOps and SRE practices in place.
InfoQ: Since not all of our products are life-critical, what is the danger of an overfocus on antifragile and fault tolerant design, regardless of value or contextual risk?
Sharma: 'Antifragility' is a new area for IT organizations. Moving from 'traditional resilience' to antifragile systems has an underlying cost. So yes, one needs to understand the business need and the SLOs expected by the business to validate the need for antifragile systems. That being said, cloud services themselves, delivered by a cloud provider, whether an external cloud service provider or internal IT, should always be antifragile in nature. It is the very definition of a cloud service. It cannot be down. It cannot be unresponsive.
InfoQ: Please tell us how your new book came about.
Sharma: The DevOps Adoption Playbook was a passion project for me. It is the distillation (a long distillation at over 400 pages) of all the conversations, discussions and lessons learnt in my four years as the global DevOps technical leader at IBM. I have spoken to literally hundreds of clients around the world and learnt and shared a lot about DevOps and how to adopt it at scale. This book puts all of those discussions and lessons learnt to paper for anyone to read and learn from.
InfoQ: How well do these adoption patterns scale across different types and sizes of organisation?
Sharma: One has to realize that even within a single company, the IT organizations are neither homogeneous nor monolithic. They vary even within the same division, by practitioner and team maturity, technology stacks, processes, and culture. This makes DevOps adoption at any scale larger than the proverbial 'two pizza' team a challenge. That is why there is no one set of practices to adopt a methodology.
There are entire sets of approaches from which to adopt what fits. I call them 'plays' in my book, similar to plays in sports. How one tackles the next step in any sports game depends on several factors - my opponent, am I winning or losing, am I attacking or defending, which players of mine are on the field, how well are they doing today, what are the field conditions, etc. This determines how I am going to play. In the same way, the leadership of a team and an organization needs to decide which DevOps 'plays' they need to execute, given the current 'playing conditions', and in which order. My book helps them do exactly that. That is why I called it a 'Playbook', like a coach's playbook for a sports team. And yes, the book is chock full of sports stories and analogies.
InfoQ: What are you most looking forward to from DevOpsDays NZ?
Sharma: First time in NZ! I hear the wine is great...
On a serious note, I am looking forward to meeting new and interesting people. That is what I love the most about coming to such events. I look forward to learning from the speakers, from the organizers and above all the attendees. So, if you (the reader) are going to be there, come hit me up between sessions. Let's chat about how you are adopting DevOps. What are your lessons learnt? What are you working on that is new and exciting? I would love to hear about it.
DevOpsDays NZ will be running in Auckland, October 3-4, where Sharma and a number of international and local speakers will discuss a range of cultural and technical topics.