This is the Engineering Culture Podcast, from the people behind InfoQ.com and the QCon conferences.
In this podcast Shane Hastie, InfoQ Lead Editor for Culture & Methods, spoke to Jason Hand of VictorOps about DevOps culture, what ChatOps is, and powerful post-mortems.
Key Takeaways
- The misaligned incentives between development and operations in many organisations
- The need to instil a sense of ownership across the whole delivery organisation where everyone takes responsibility for solving problems, rather than saying “that’s not my job”
- There is no roadmap to change the culture of a company, because every company is different
- In complex systems you can’t avoid failure, so make sure you can learn from it and respond rapidly
00m:30s - Introductions
01m:30s - A proactive culture around incident management helps with uptime and visibility into infrastructure
01m:45s - Books on ChatOps and Post-mortems
02m:30s - Defining DevOps – bringing the entire IT organisation together to work as a single unit
03m:05s - The misaligned incentives between development and operations in many organisations
04m:05s - The “squishy stuff” of collaboration
04m:15s - Change to this extent needs top-down and bottom-up support and commitment
05m:20s - There is no roadmap to change the culture of a company because every company is different
06m:20s - We are all service providers supporting the greater organisation
06m:35s - The need to instil a sense of ownership across the whole delivery organisation, where everyone takes responsibility for solving problems, rather than saying “that’s not my job”
08m:10s - Becoming aligned on the business objectives and the whole company’s goals, rather than those of a single role or department
08m:25s - This gets harder as the organisation gets larger
09m:05s - How agile practices support this culture change
09m:30s - This is not a trivial transformation and requires significant time and investment
09m:45s - Example of the Target DevOps DoJo training exercise
10m:30s - Example of Verizon’s Learning Labs
11m:40s - These conversations need to happen across and outside of the existing structures in the organisation in order to see the bigger picture
12m:05s - The importance of being able to make mistakes and fail safely
09m:30s - Know when something is wrong and shorten the feedback loop for learning
13m:50s - The inability to learn is the cause of a lot of significant failures in the marketplace
14m:10s - In complex systems you can’t avoid failure, so make sure you can learn from it and respond rapidly
14m:30s - Trying to engineer failure out of a complex system is an exercise in futility
14m:45s - The value of post-mortems to identify the cause of a failure; explore how to identify it earlier in the future and find ways to recover from that failure quickly
15m:03s - Trying to find a root cause for a failure in a constantly changing system may be impossible
15m:40s - Use the retrospective to explore the human factors, without assigning blame
16m:20s - This requires a safe space to honestly admit to mistakes without experiencing negative consequences
16m:30s - Successful retrospectives need to invoke safety – we are here to learn, no one will be in trouble from what we learn, and we will celebrate and reward the learning
17m:35s - Example of Etsy’s “Three-armed sweater award”
17m:55s - This culture needs to come from the top and requires trust within and across teams
18m:45s - The importance of a just culture – treat people justly
19m:30s - The need to remove the desire to find blame and to fixate on a single cause
20m:10s - Introducing ChatOps – tapping into the group chat tools and making things visible and white boxing conversations; make conversations knowable to the whole team
21m:30s - Integrating outcomes from tools into the collective chat and accessing tools directly from the group chat
22m:55s - This requires a new way of approaching problem solving and empowering people through making information visible
24m:30s - Successful organisations make knowledge sharing a top priority
25m:30s - Avoiding the single point of failure risk through knowledge sharing
Mentioned:
- VictorOps
- Target – DevOps DoJo
- Verizon
- Etsy – Three Armed Sweater
- Book: ChatOps for Dummies
- Book: ChatOps: Managing Operations from Group Chat