This is the Engineering Culture Podcast, from the people behind InfoQ.com and the QCon conferences.
In this podcast Shane Hastie, Lead Editor for Culture & Methods, spoke to Josh Evans, former engineering manager at Netflix, on how Netflix does DevOps and the freedom and responsibility culture that underpins their way of working.
Key Takeaways
- There are many interpretations of the term DevOps; it is a useful shorthand for a wide variety of technologies and approaches
- “You build it, you run it” is the concrete application of the freedom and responsibility culture
- When building a platform tool make it so easy to use that the product teams are not tempted to try and build something for themselves
- Product teams are free to experiment and learn, which can feel chaotic and is a valuable part of the freedom and responsibility culture
- The value of blameless and safe incident reviews – the goal is to learn and find patterns and use that information to prevent whole classes of failure from happening in the future
- Don't view the value stream in a fragmented way – see the whole end to end system with all its interactions and dependencies and optimize the system as a cohesive whole rather than different tools and domains
- 0:40 Introductions
- 1:50 There are many interpretations of the term DevOps; it is a useful shorthand for a wide variety of technologies and approaches
- 2:13 DevOps at Netflix starts with the company culture
- 2:35 “Freedom and responsibility” as an abstract class for how most areas of Netflix function
- 3:03 Each area manifests the culture in a way that is correct in their context
- 3:10 “You build it, you run it” is the concrete application of the freedom and responsibility culture
- 3:34 The importance of having teams that wholly own the things they build – development, deployment, instrumentation, monitoring and continually improving their components
- 4:23 The distinction between product teams and centralised platform teams
- 4:50 The centralised platform teams build the infrastructure and tools the product teams use
- 5:15 The value of having separate platform teams to accelerate product development
- 5:35 When building a platform tool make it so easy to use that the product teams are not tempted to try and build something for themselves
- 6:15 Platform teams being thought leaders and identifying what the product teams may want before they realise they want it
- 7:00 Know who your customers are and listen to them, value the squeaky wheels, talk to people and listen to feedback
- 8:15 Sometimes the product teams use a new language or technology before the centralised teams have identified the need, and that’s OK
- 8:57 When a centralised team picks up a product they harden it and share it across other teams
- 9:21 This can feel chaotic, and it allows for experimentation and learning and is part of the freedom and responsibility culture
- 10:30 The importance of context not control – visibility into the cost of adopting new technologies and the value of involving the centralised teams when exploring new tools or techniques
- 11:14 The secret sauce to Netflix’s freedom and responsibility culture is hiring very senior people and giving them a lot of autonomy
- 11:40 Autonomous engineers will generally make better decisions than if managers try to micromanage them
- 12:00 Hiring people who have a sense of responsibility, and letting go of people who are unable to take feedback
- 12:35 Netflix has the highest revenue per employee ratio in the industry; this comes from hiring senior people and giving them opportunities for autonomy, mastery and purpose
- 13:20 You can’t just copy the Netflix way of working in a different business with a different culture
- 13:38 Engineering leaders need to hold the space and create opportunities for freedom and responsibility within their scope of control, even if the whole organisation doesn’t change
- 13:52 Empower the people who feel the pain to solve their own problems
- 13:58 Adopt these changes in small steps – getting to the end state takes time, find the pain points and address them one by one
- 14:15 For Netflix, the initial driver was reliability
- 14:35 Pick one simple metric to monitor – for Netflix this was Start-plays per second
- 15:38 Picking just one metric enabled all the teams to focus on the same outcome and improve the metric over time
- 16:05 Explore the tools for continuous delivery and implement them - Spinnaker is one example
- 16:30 If necessary start with manual steps in the CD process and replace them with automation over time
- 17:20 The value of blameless and safe incident reviews – the goal is to learn and find patterns and use that information to prevent whole classes of failure from happening in the future
- 17:38 Never fail the same way twice
- 18:40 Making it very clear that the goal of the incident review is learning and improvement not blame, and ways to achieve this
- 20:02 The importance of emotional maturity, not as chronological age but as a personality trait, supplemented by experience
- 21:10 What to look for when interviewing – introspection and being open to feedback
- 21:55 People who will thrive in a freedom and responsibility culture need to be open to hearing what others say and to learn
- 23:35 Thinking holistically about the whole system – seeing the developer experience as well as the customer experience in an integrated way
- 24:44 Don’t view the value stream in a fragmented way – see the whole end to end system with all its interactions and dependencies and optimize the system as a cohesive whole rather than different tools and domains
- 25:31 An example of how Netflix achieved this using a “Canary” which compares the performance of new code against the old code, side by side, and the holistic set of tools, models and metrics which expose the results
- 26:27 The benefits that accrue to the whole organisation from having the monitoring and management tools fully integrated
- 27:36 Velocity with confidence – the ability to move quickly while having safety nets in place to catch and recover from mistakes quickly
- 27:58 What’s next for Josh?
Mentioned: