
Recipes for Blameless Accountability


Summary

Michelle Brush provides a set of norms and practices, but also antipatterns, for balancing accountability and blamelessness in organizations.

Bio

Michelle Brush is a math geek turned computer geek with over 20 years of software development experience. In her current role as an SRE Manager for Google, she leads teams of SREs that ensure GCE's APIs are reliable. Previously, she served as the Director of HealtheIntent Architecture for Cerner Corporation. Before that, she was the lead engineer for Garmin's automotive routing algorithm.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Brush: I'm Michelle Brush. I'm an engineering manager at Google. I work in SRE, specifically SRE for Compute Engine. I'm going to start with the definition. In looking at this area of accountability that we talk about a lot in organizations, I realized that people don't always agree on what it means. Sometimes people think it means that you'll be punished for making bad decisions, or that people have to face the consequences of their actions. That's a very narrow, naive, and not terribly useful version of the word. I want to make sure we're all working from a good definition of the word. This is the definition I like: accountability is the organizational assurance that you'll be evaluated based on outcomes that resulted from your behavior related to something for which you're responsible. A lot of these words are load bearing. I wouldn't cut them out at all. They're very essential to the definition, most notably the ones I have underlined, of course. If we look at those, it has these ingredients. The first is that there's an organizational assurance. We have some trust or contract between the organization and the individuals that, when they are evaluated, they will be evaluated based on outcomes, but also behavior, in situations where there was some level of responsibility. All these ingredients are really important. You can't just do one and not the other, or do three and not all five. I say this because I have seen organizations try to do just these two. They've tried to heavily focus on evaluating folks, evaluating decisions, based on outcomes alone.

Sometimes you see this in organizations that call themselves results-driven organizations. This is a phrase that inherently is neither good nor bad. It's a neutral phrase. Sometimes organizations call themselves results driven, or outcomes driven, because they want to signal that they're going to give their people autonomy in how they do the work, that they're not going to be micromanaging and overly prescriptive. Sometimes they use this phrase because, basically, they are admitting to you that what they care about is the outcome and the outcome alone, and they don't care at all how you get there, no matter how many bridges you burn, no matter how many people you mistreat. These are the companies that end up full of brilliant jerks. Even when it isn't inherently bad, even in the best organizations, the ones that are just trying to give people autonomy, it does lead to some undesirable results.

To illustrate, I'm going to talk about poker. I used to play a lot of poker. I would play online. I would play in the casino sometimes, and play in tournaments. The way I thought about poker when I played it a lot was that it had these aspects to it. Of course, as a baseline, to play poker you have to understand the rules. That's very similar to an organization. With an organization, there's policy, there's procedure, there's rules. Then, you have to understand the likelihood of the action you're about to take, or the decision you're about to make, yielding a good or bad result. Basically, you had to assess the probability. Then we would use this concept called pot odds to understand the impact of the probability. The idea is that you would look at the likelihood you would win the hand based on what's on the table and what's in your hand. You would look at the money on the table. You would look at the money you were being asked to put in. Then, based on all that information, you take this calculated risk. Of course, this has a huge aspect of reading people and trying to guess what decisions they might make. Then occasionally, through all this, you might make money.
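To make the "calculated risk" part concrete, here is a minimal sketch, not from the talk, of the pot odds idea she describes: compare your estimated chance of winning the hand to the price of calling a bet. The dollar amounts and the 25% win estimate are made-up assumptions for illustration.

def break_even_equity(pot: float, call: float) -> float:
    # Fraction of the time you must win for a call to break even.
    return call / (pot + call)

def expected_value(win_prob: float, pot: float, call: float) -> float:
    # EV of calling: win the pot with probability win_prob, lose the call otherwise.
    return win_prob * pot - (1 - win_prob) * call

pot, call = 90.0, 10.0   # $90 already on the table, $10 to call
win_prob = 0.25          # you estimate a 25% chance of winning the hand

print(f"break-even equity: {break_even_equity(pot, call):.0%}")          # 10%
print(f"EV of calling:     {expected_value(win_prob, pot, call):+.2f}")  # +15.00

# A call can be the right decision (positive expected value) and still lose three
# times out of four, which is exactly why judging a decision only by its outcome
# misleads.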

When organizations only look at the results, it's like only looking at that number five, only looking at when you win money, and then deciding that no matter what happened leading up to that, it must have been the right decision. In poker, it's obvious that people win money all the time making a terrible decision. It's obvious that the best players in the world sometimes do everything right and still lose money. Annie Duke, who's a well-known poker player, has this book called "Thinking in Bets." She talks about this tendency for leaders in organizations and poker players early on to do this bad thing called resulting. Resulting is when you associate the goodness or badness of a decision with the outcome, independent of behavior. We don't want to do this, because as an organization, if we only look at the outcomes, we can't really learn, because there's this element of luck in everything we do. The world is too complex. The systems are too complex. The organizations are too complex for us to control every variable. We're sometimes trusting probability or likelihood or some fluke accident in order for things to go exactly right or exactly wrong. When that happens, if you only evaluate the outcomes, you end up with a very bad understanding of your organization, and of what's working and not working. That illustrates, again, why you have to look at all the pieces. You have to actually have the outcomes and the behavior as part of your evaluation process. The organization has to give assurances that we will look at all those things.

Recipe for Evaluation

If we're not going to evaluate just based on outcomes, we're not going to be that results-driven organization, and we do want to include behaviors, then there's a good question of how we decide what behaviors we want to evaluate, and what to reward and not to reward. Typically, I always bring it back to values. Organizations, teams, they have values. Those values can be things like we have a value for release velocity, or we have a value for frugality, or we have a value for innovation. Basically, whatever your company decides. From those values, the company usually derives some norms. If they value transparency, maybe they have a lot more communication, and a lot more meetings where there are these large forums where people can talk and ask questions. Basically, these norms result from the values. Then, from that, you can start to see what we expect of individuals' behaviors in order to fit in this value system. Then we want to see signs that those behaviors we're saying we value are actually yielding good outcomes. If you don't do this, and you just look at outcomes, like I talked about with the whole resulting example, then in addition to not being able to figure out what you should and shouldn't do, you also sometimes trend towards a very cutthroat environment, because people are rewarded for getting what they wanted, not necessarily for doing the right thing. If, on the other hand, you only reward behavior, like, "I don't trust the outcomes. I'm just going to make sure that people are always doing the right things. I'm going to reward them for that," you basically have this chance of losing innovation and efficacy in your organization, because the world is going to change around you, the system is going to change around you. Then you're going to keep doing the same things that you thought always worked, and it's going to stop working, and you're not going to notice.

Recipes for Improvement

How do we get better if we are going to have this mix of looking at behaviors and outcomes? How do we improve? Throughout this track, I'm sure you're going to hear from a lot of people about how you need to do incident analysis, and how you need to also look at successes. I'm just going to reiterate that. A piece of improvement requires that you retrospect when things went well. You have to, absolutely. You have to understand, what does good look like in your system? What does it look like when everything's working? What are the behaviors that are happening? How is that sustained? How is that reproducible? Sometimes you look and find it's not reproducible. It's all luck. Then you have to ask, are there day-to-day actions that my people are doing, where they're the ones making sure that this success is happening because they keep doing these behaviors? Then you need to understand what those behaviors were and you need to capture them. Then you need to look and think critically: what if those things stopped? What would change about the outcomes? What would change about the organization if people weren't doing that? Not because you want to stop them, but because you want to do the thought experiment of how essential these things are.

You want to be careful of survivorship bias, where you're only looking at the positive outcomes and saying all the things that led to those positive outcomes must be part of achieving them. You also want to look at near misses. You want to look at what prevented them from being a catastrophe, where you got lucky. You want to make sure the luck can happen again. Then, of course, you want to look at your accidents, your errors, your incidents, your outages, and ask yourself, could it have been worse? Were we lucky it wasn't worse? Also, how could it have been better, not in a prevention sense, not in the sense of how do we stop this from ever happening again. More in the sense of, if it does happen again, how will the organization respond differently? How will they make sure that everyone is in a better position to do the right thing at that point in time? I want to really stress, because I've seen this way too much, that organizations a lot of times when they do this analysis spend too much time thinking about how they could have stopped it. They forget that, at the end of the day, humans are always going to make mistakes. Software is always going to have defects. Hardware is always going to fail. If you fixate too much on prevention, you lose this rich understanding of how you could have made the software in your organization more resilient, which is really where you want to be. In addition, when folks over-fixate on prevention, they sometimes accidentally trend back into that pattern of root cause analysis.

Recipe for Making Retrospectives Punishment

Speaking of root cause analysis, sometimes folks have this idea that in order to have accountability in the organization, you must make people clean up their own mistakes. I did this early in my career as a manager. It was a mistake I made, and I've since learned from it. Basically, it's this idea that we want the people who were involved in the behaviors or decisions that led to the poor outcome to be the ones that own driving the solution. Whether we intended it or not, that's treating retrospectives as punishment. The reason is that in order for this to work, we have to find a single person to whom we can attribute the situation. We have to know who's to blame. To do that, we sometimes have to oversimplify what happened into a sequence of events so we can land at a decision point that everyone can buy into as the place where everything went wrong. Then we figure out who's responsible for that decision point, and we call that the root cause. Then we give that person, who is that decision point, the work of making sure the situation doesn't happen again. They have to write everything. They have to do all the work. Then, often, we actually couple this with some terrible process where they have to go in front of leadership and explain what happened and possibly get yelled at. All of this, whether we intended it or not, is actually punishment. It's a consequence for actions.

This is a bad idea, because punishing people when they make a mistake tends to cause them not to tell you when they made a mistake. It's even worse because of that first line item, that we had to identify a root cause. Root cause is an antipattern. You can Google "root cause considered harmful." There are tons of blogs and discussions of this. At the end of the day, any time you get stuck in this pattern of root cause analysis, what you're doing is leading yourself down a very narrow, biased, linear path to a single point where, whether you intended it or not, you're blaming something. You're missing out on the greater, richer organizational learning about all the chaos and all the complexity that was at play across the sociotechnical system when the bad thing happened. Don't just default to saying, I'm going to have accountability by having the person who caused the issue fix it, or write the document on how to fix it. Don't do this.

Carrots and Sticks

This naturally leads to a discussion of carrots and sticks. In every organization, there are all these micro and macro rewards and consequences. We could say the rewards are the carrots, and the consequences are the sticks. A lot of times we have this desire to manage outcomes and behaviors by saying, we're going to have consequences for things that don't go well and we're going to have carrots or rewards for things that do go well. That makes us have to ask the question, does punishing people actually work? Generally speaking, no. If you look at it, there are lots of studies and lots of discussion. Generally speaking, what we've found is that positive reinforcement works a lot better than negative reinforcement. Generally, if you want to have the right behavior in your organization, you want to be modeling the right behavior. You want to be showing signs that the organization is celebrating the right behavior. You don't want to be creating this culture of punishment and consequences. Because again, that creates a world where people tend to want to hide things from you, because they don't want to get in trouble.

There's a story from Daniel Kahneman about how he was trying to convince a flight instructor that yelling at fighter pilots was actually not the way to get better performance from them. The flight instructor didn't believe Kahneman, because he said, every time I yell at one of my students, I see improvement shortly after. Every time I praise one of my students, they screw up a little bit later, they make a mistake. So clearly, he reasoned, negative reinforcement works better than positive reinforcement. What's actually happening is this thing called regression to the mean, which is that there's this average performance that people might have, and then there's variability all over the place across the different times they're performing the activity or realizing the outcome. If you see a period where someone's doing everything right, and there's this average performance level, eventually you're probably going to see a period where things aren't going as well. It's all going to fit this general bell curve of human performance. When the flight instructor was saying, when I yell at people, they get better, he was just getting lucky that his yelling coincided with when folks were going to naturally get better anyway, because performance is variable. We have to be very careful when we're trying to decide whether these rewards or these consequences are working, that we're not falling victim to this pattern.
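To see why the instructor's evidence was misleading, here is a minimal sketch, not from the talk, that simulates regression to the mean: each flight score is an independent draw around a stable average, so performance after an unusually bad flight tends to look better on its own, yelling or not. The skill level, the noise model, and the thresholds are made-up assumptions.

import random

random.seed(0)
mean_skill, noise = 70.0, 10.0
flights = [random.gauss(mean_skill, noise) for _ in range(100_000)]

after_bad, after_good = [], []
for prev, nxt in zip(flights, flights[1:]):
    if prev < mean_skill - noise:        # unusually bad flight (the kind that gets yelled at)
        after_bad.append(nxt)
    elif prev > mean_skill + noise:      # unusually good flight (the kind that gets praised)
        after_good.append(nxt)

print(f"average flight after a bad one:  {sum(after_bad) / len(after_bad):.1f}")   # ~70, looks like "improvement"
print(f"average flight after a good one: {sum(after_good) / len(after_good):.1f}") # ~70, looks like "decline"

# Both groups drift back toward the average of 70 regardless of any feedback,
# which is what made the yelling look effective.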

What's interesting is that I actually see people fall victim to this when they're trying to address outages or incidents, in that during their incident analysis, they might have this really great idea that they come up with. They say, we're going to prevent future outages by doing this thing, slowing rollouts, or adding additional reviews. Then, lo and behold, for some period of time after the outage, when they instituted this idea, things seem to get better. Again, it's just that complexity fitting a curve. It really is regression to the mean again. It's also just another example of survivorship bias: we only look at when things go well, we say this behavior, or this activity, or this consequence led to things going well, and we don't look at all the other times that we did the exact same thing and it didn't go well. Generally, positive reinforcement beats negative reinforcement. We'd rather have rewards than consequences in order to motivate people to do the right thing.

Intrinsic Motivation Beats Extrinsic Motivation

The next level of understanding is that intrinsic motivation beats extrinsic motivation. If you have to use an external reward or consequence to get people to do the right thing, you're going to have less success than if you can get them to do it as part of who they think they are, as part of what they want to do. Even in a world where we're trying to get this big behavior change, because we saw this pattern and we want a different outcome, and so we're saying, as an organization we're going to learn and we're going to do something different, you don't want to motivate people by just immediately rewarding them for doing the right thing, or immediately punishing them for not. You actually want to look at how you can create the motivation for them to want to do it naturally. Daniel Pink has this book, "Drive," where he talks about purpose, autonomy, and mastery, and how that's a good model for what motivates people. Purpose being the why. Why are we doing the work? Why do we want this behavior? Autonomy being giving people some ability to make decisions, not being overly prescriptive. Mastery being allowing folks to improve their skills, and then demonstrate that they have improved those skills, so they can feel like more of what they're doing is a craft and less of it is rote, repetitive work.

Then adding to that, beyond Daniel Pink, there's this book called "If You're So Smart, Why Aren't You Happy?" That book talks about a lot of studies around happiness, and what makes people happy. Then it pivots to making people happy at work. One of the things it talked about is that people are happy when they have the state of flow. Flow is what we call being in the zone. It's basically anytime you feel hyper-productive and immersed in your work. According to "If You're So Smart, Why Aren't You Happy," the more we can give people flow, the more we're going to get them to want to do the right things. The way I like to look at that is the more we disrupt flow, the more we're going to get people to do the wrong thing, because we're not creating the incentive for people to do the right thing. Basically, if we want to motivate people to do the right things, we need to figure out how to approach it from this aspect. How do we play with these four ideas and get people to want to just intrinsically do the right thing?

The other thing about "If You're So Smart, Why Aren't You Happy" is that it does actually tackle the idea of whether or not people are made happier by having money, or by being promoted, or whatever. The reason people aren't made happier by those things, and therefore why they're poor motivators, is that they happen rarely, they happen infrequently. You get your raises at certain times of the year. You get your promotion at certain times of the year. Then, they only provide relative joy. You're happy because you're looking back on where you just were, and now you're in a new place, and you're like, "The comparison. This is good." Or you're looking at peers, and you're saying, "I'm doing better than this peer." Comparison-based joy or happiness is very short-lived, because you normalize to where you are now, to your new state. You really want to focus less on this when thinking about how you motivate people. I'm not saying don't give raises and don't give promotions. Definitely do those things. It's just that this isn't how you're going to get people to do the right things every day.

Then the other mistake I see, in addition to over-leaning on these very temporary rewards like raises and promotions, is leaning too much on purpose. I used to work for a health care organization. In that organization, they would often try to get us to write better quality software by bringing in nurses or physicians and having them talk about the consequences of us writing bad code. Like, if we had a bug, a patient could die. Despite the organization thinking that this would motivate us, give us a strong sense of purpose and why, and that we'd then naturally do the right things, it actually was a demotivator. It was a demotivator for two reasons. One, it's scary, and it created anxiety in people. Instead of trying to take more ownership and make sure the right outcomes happened, people tried to pull back their level of ownership, and scope it so that they had a smaller ball of ownership, so they could feel less guilty about the outcomes, so they wouldn't be blamed. The other thing that happened was that people got mad, because it was somewhat condescending and insulting to imply that the reason we were writing buggy software was not that maybe our build system needed improvement, or maybe we needed better testing tools, or maybe folks needed more time in the project lifecycle in order to invest in quality controls. No, it was, basically, you just didn't know any better. If we just told you, you would do better. Of course, that's not true. It had no value in terms of making people do the right thing. Be careful about an overabundance of purpose. It can go wrong.

I don't want to imply that you should never reward people. You just shouldn't think of it as a carrot you dangle to get people to do the right thing. You should come to people and you should reward them. Actually, if we're talking about day-to-day rewards, or rewards more frequent than what can happen in a promotion or compensation cycle, I like whimsical things. I think people really adore getting surprised and being delighted. We had a project where a lot of folks worked really hard in order to speed up something we were working on. I bought everyone little Hot Wheels cars, just a funny little token of my appreciation that people worked so hard on this performance improvement. I wrote little notes to everyone on the back. To this day, I still occasionally get someone messaging me that they're cleaning out a drawer, and they ran across one of these Hot Wheels, and they take a picture and send it to me.

Culture of Continuous Improvement

What I'm trending towards and where I'm going with this is that blameless accountability is really about building a culture of continuous improvement, or at least it goes hand in hand with a culture of continuous improvement. How you get a culture of continuous improvement is you have to empower people. You have to make sure that you're creating an environment where the right behaviors are consistently happening. You need to reward people for improvement, not for absolutes. What I mean by this is that sometimes organizations will set a really high bar target that they want everyone to hit. Then they wait until you hit that target to reward people. It ends up creating a situation where people are not motivated to improve, because the people that are already in a good state hit that target and just hang out there. They're like, "This is great, we're meeting the target." Then the folks that were way far away from that target may see reaching it as just unreasonable; they're just never going to get there. They end up giving up on the motivation of hitting that absolute target.

I worked in an organization where one of the targets was zero defects. You can imagine how demotivating it was to be held to that bar: we had to never make a mistake in the code. Instead, naturally, what you get is people redefining the word defect. We want to reward improvement. We want to make time for development and exploration so that people can be creative. They can find that mastery. They can find the autonomy to do things differently. Then we want to constantly seek and incorporate feedback. We do want to seek criticism. We want criticism to be part of the culture. In really stressing that we reward improvement, we want to say that getting better is just as good as staying great. Even teams that are doing extraordinarily well should be seeking criticism and feedback. Obviously, constructive criticism, but they should be seeking it.

Recipe for Toxic Positivity

That leads me to the next part, which is, how do people get blamelessness wrong? Usually what happens is they're so afraid of blaming someone, they don't actually want to critically evaluate any decision. They want to treat everything as if it was exactly the right thing to do. Or they want to create an environment where people can only do the right things. The combination of those two things leads to what I think of as toxic positivity. Toxic positivity is when the organization creates a culture where we only want to see things going well. It's like the "failure is not an option" culture. What happens is folks discourage or dismiss constructive criticism. Criticism is considered a bad thing, because, of course, you're blaming someone, or you're saying something is wrong. Then, they only reward positive messages, because you're praising people. You're saying everything's going well, and everything's amazing. That means when someone points out that things aren't really going well, and we have this risk, and we're not addressing this risk, and it's going to cause a problem, they get punished by the culture, because the culture is like, "No, we don't do that. We don't talk about that stuff." Then, because folks are punished if they bring up risks or admit failures, now we have this environment where folks are required to commit to any decision without any option to disagree or to improve on it. It's like you have to commit first and then don't do the disagree part at all. This leads to these toxic positivity environments.

I've worked in one for a while. The signal I had that it fit this pattern was one of our cultural statements. One of the pithy statements we had was, if you must oppose, you must propose. You might think, what's wrong with that? If someone's going to bring up a risk, they shouldn't just be like, "This sucks, we shouldn't do it." They should call out, how can we do it? How can we own a solution? The reason it's bad is that, often, folks will identify a risk, but it's not their day job. They're just there and they have their own day job, and they have other work. Then they say, I'm actually worried that this thing isn't going well and this thing isn't going right. I want to highlight to leadership that we have this area of concern that we need to address. If it's an "if you must oppose, you must propose" culture, they get handed the work of fixing the problem. Raising the risk means suddenly they get overloaded, they get extra work. They're like, "I already had three projects, I didn't need the fourth one." Be very careful about creating the sense that anyone who brings up a risk has to carry with them the responsibility of addressing the risk. It's ok to hear critical feedback, especially if it's constructive feedback that's trying to improve the overall organization, or the overall success of the system, even if the person doesn't necessarily have an answer for what to do about it. Don't do this.

4+1 View of Software Architecture

This just comes back to the fact that every piece of software sits in a very complex space that's a mixture of the people interacting with it, the timeline and sequences of events and messages, and threads, and processes. Then it's running on hardware, and that hardware fails. We have to make sure that when we're addressing systems, and we're looking at the world that we live in, we're looking at the whole of it. We're identifying areas for improvement based on the whole, and not over-fixating on a small slice of the world. There's this great essay called the "4+1 View of Software Architecture." It was my first introduction to the idea that I should look at my system through all these different lenses, because if I really want to find areas for improvement, I can't just fixate on the code, I have to think about the whole thing. I have to think about the fact that there are humans in the system. Those humans are both saving the day and also introducing risk, and I have to balance that. I have to understand that this system is very complex, and I have to make it more resilient. To do that means I have to be constantly looking for places to improve.

Passive Voice - Where Are the People?

Since we recognize that there are people interacting with the system, and there are people building the system, and there are people deploying the system sometimes, we have to make sure that when we do this whole constructive criticism thing, because we're avoiding that toxic positivity, we don't forget to actually talk about the people sometimes. This is an area where I see people misuse blamelessness. They say, to be blameless, we can never speak of the people. We can never speak of decisions. We can never speak of activities or behaviors. We must use passive voice: the system crashed, or the software was deployed, or the bug was written. Nobody ever mentions the human that was involved in that. When you do this, when you basically say, we're going to try to remove all the humans to make sure that we're completely blameless, you end up losing a lot of organizational learning about where people may have been confused by the information or priorities presented to them. Where were people missing skills? Where were the tools not helping the people at all? You want to make sure that you are thinking about who was involved here, that you're talking to those people, and that you're learning from those experiences.

Skills and Knowledge Gaps

To do this in a blameless way, and to get blamelessness right, you have to have what I call a charitable framing, which is, you have to assume people are competent. You have to assume that they're trying to do the right thing. Then you ask yourself, what made this decision seem the most optimal at the time? One of the things I've learned being a parent, but also working a long time in the industry, is that people really do funny things when they don't know what else to do. When they are left in a difficult position where all the constraints are conflicting, and maybe they lack skills, but maybe actually they don't and it's just that the system is a mess, they're like, "I don't know. I'll try this thing and see what happens." That's not their fault. It's not that they made a bad decision, it's that the system set them up so that this was the only thing they could do.

I don't want to be arguing for training as we traditionally think of it. I don't like root cause, but in my career, I've been in a number of root cause analysis meetings because I was forced to attend them. In them, basically, we get back to the fact that some human made a decision, and folks don't like the decision. They say, "There was training missing, we should have trained everyone." Then you end up with this horrible organizational cost of both developing training, which is very expensive, and taking your entire population and trying to send them to it in the hopes that they'll remember when the situation happens again, when almost always the same situation doesn't actually happen again. We don't want to have this training where we're going to give people all the knowledge and then they're going to make the right decisions. What we actually want to do when we're looking at this and trying to improve from a blameless perspective is to think in terms of systems. We want to think of humans as part of the systems: they're using technology, they're building technology, there's an environment around them, they have policies and procedures they're trying to follow, and they have incentives we've put in place. Then there are feedback loops amongst all of that that create this giant sociotechnical system that makes people do the things they do.

Then, because we understand that, we can talk about the people and we can figure out which of these things we need to adjust so that the outcome is better next time. Sometimes it is skill development. Sometimes it is that we did actually set these people up to fail, because we asked something of them that we hadn't actually prepared them to do yet. I definitely don't want people defaulting, when this happens, to web-based training and instructor-led training, because those things have a very low return on investment. If we're thinking about skills development, we need to think bigger: how do we get the information or the encouragement to people just in time? How do we focus on teaching them how to think differently, or different behaviors, not just telling them something different and giving them new knowledge? Of course, how do we make this constantly reinforced through continuous and ambient messaging? How do we give people avenues to interact with this training or skills development we're offering them? It ends up being this really complex thing that isn't as easy as just a slide deck, or a fun little quiz at the end. I had the opportunity once to meet with the folks doing the strategy for one of the very large, very popular online training systems in the industry. They asked me, if I could have one thing that we'd train people on, what would it be? What's the one thing missing in my organization that people really need to know? I said, systems thinking. Can you teach people systems thinking? Basically, they said, we were hoping you were going to say Hadoop or C#, unfortunately.

Legibility

Sometimes, when we accept that people are going to make mistakes, and we recognize that we can't really train everyone to make all the right decisions all the time, and we accept that the system is complex, and that we need to get the right outcomes, we go a little bit too far, and we try to create systems and structures that are overly prescriptive and try to ensure that nothing bad ever happens. It doesn't work. I call this legibility. I call this legibility because of the book, "Seeing Like a State." It's basically whenever we try to have leadership make decisions for everyone, either by instituting policy or rules, or we try to have leadership have visibility into everything by creating a lot of process that surfaces information up to leadership, making the things on the ground fit the report. Legibility is when we try to create an environment where leadership can make better decisions, or they can have more visibility into the organization, by making everything on the ground, all the folks actually doing the work, do extra work to provide that. "Seeing Like a State" has this great anecdote about how, in an attempt to count trees, a leader decided that the easiest way to count trees would be to make the forest fit the way it needed to be to be easier to count, and turn it into a math problem instead of an actual human problem of folks running around counting trees and making mistakes. They had this process by which they forced folks to reimagine the forest and rebuild the forest as a countable forest. What happened is that in their attempt to make the forest fit this very prescriptive, uniform world that was easy to count, they accidentally lost the whole forest to blight.

Legibility is a bad thing. When you have this rules-driven, top-down decision-making culture where you're taking away some of the ability for people to make decisions and deal with the chaos and complexity of the system, you're actually sacrificing some of the responsibility. That goes back all the way to that original definition of accountability. I said, you have to have responsibility. You can't have accountability without responsibility. When you're overly prescriptive, and you try to create a rule system in which no one can make a bad decision, you're basically creating a system where people do exactly what you said, but not exactly what you wanted. If you tell someone exactly how to do something, they'll do it exactly as you asked. In this way, legibility hurts innovation, because the reports and the policy become the contract with your employees, not delivering the outcomes. This leads to Goodhart's Law, which is captured really well in a book called "Measuring and Managing Performance in Organizations." That's one of my favorite all-time books on leadership. Basically, it talks about how, when you try to use measurements as the way you're managing something, and you're trying to conform to the measurements, you get into this really bad situation where the measurements you're watching are going great, but what you actually cared about is going terribly. The whole thesis of the book is that if you can't measure everything important, you have to trust and delegate. When leaders try to pull all the decision making up by pulling all the information up, they miss out on information. They're losing information, and they can't measure everything, and so we end up with bad outcomes. Then this ties back to that whole motivation idea. We do want to give people autonomy. We do want to give them mastery. We want to let them own how the work happens. Otherwise, we get malicious compliance. Basically, we get exactly what was prescribed.

We're not going to do this. We're not going to do legibility. We're not going to solve the problem by making it so that humans don't have to make decisions. Instead, we actually want to do the complete opposite. To actually have a culture of blameless accountability and really yield the improvement we want to see in our organization, we have to actively remove rules and policy that were meant to introduce safety. We have to simplify, get rid of bureaucracy, and eliminate hierarchical decision making, because we actually do want to give people that autonomy and that empowerment to do right by the system and right by the organization. One way I put this is: we move the decision to the information, not the other way around. Circling all the way back to that definition of accountability, we want an assurance that folks will be evaluated based on outcomes resulting from behavior for something for which they were responsible. We need that responsibility in order for this to be successful.

Final Recipe for Blameless Accountability

What's my final recipe for blameless accountability? I think, most importantly, you have to evaluate behavior and outcomes, not just one or the other. You need to be looking back when things go well, when things barely succeed, when things go poorly, and not just fixate on when things go poorly. We need to avoid the causality credo and root cause. We need to make sure that we're not creating this overly simplified narrative that actually leads us towards blamefulness. If we want to get people to own the solution and do the right thing, we should use purpose, autonomy, mastery, and flow, because that's what motivates people day-to-day. We definitely want to create a culture of continuous improvement. Then, part of that is actually eliminating rules, regulations, and measurements driven by legibility. Then, again, move the decision to the information, not the other way around.

Questions and Answers

Probst: One of the things that is really a big topic is psychological safety, and how we handle psychological safety and not confuse it with the ability to still deliver feedback. Some of us are managers. We're often in this position where we have to deliver some feedback and help people grow, but at the same time, we don't want to sacrifice psychological safety. Can you talk a little bit about how you think about those two things?

Brush: There's this book, "Radical Candor," which is one of my favorite books that I think relates to psychological safety, even though, if you've watched the show Silicon Valley or some other shows, they tend to mock it as if it doesn't. It talks about how the ability to give feedback is predicated on the fact that people have to believe you care about them, and they have to believe that you have concern for their well-being. Where psychological safety goes out the window is when folks fear that if you're giving feedback to them, it's because you're looking for reasons to fire them, or you're looking for reasons to punish them. They're not seeing it as you coming to them from a place of wanting them to succeed. One of the things I try to convey when I'm working with folks, when I'm telling them, your behavior here didn't align with our values, or, let's come together and talk about why we didn't get the outcomes we wanted, is that it is coming from this place of concern, around wanting the organization to have continuous improvement, around wanting the individual to continuously get better, so that they can reach their career goals and succeed in the organization.

Generally, when I frame it from that standpoint, it works 80% of the time. There's still that 20% of folks that really struggle with hearing feedback. I actually was one of those folks early in my career when I first got out of college. I was used to being the smartest person in the room. It was just because I went to a small college, and I did really well there. There were just a lot of times when I was the one that had all the answers, when I was the one tutoring everyone else. When I came into the workplace, and I started being around people smarter than me, it was jarring to get this feedback. I wasn't used to experiencing that. Someone told me, and I think this is what helped me get over it, "Michelle, it's not about you, it's about the work." Hearing that really helped me realize that we're all in this together to try to make things better. That's why we all need to hear feedback.

Probst: You talked about behaviors and outcomes. How do you deal with people and feedback and psychological safety when the outcome is there, but the behavior is not, and it's not necessarily an outrageous violation of principles, or a violation of a code of conduct, but something more subtle? How do you deal with those situations?

Brush: There's a degree of severity to it, to your point. There's one end, and we all know what to do if it's code of conduct. Then there's the other end, which I call the "wouldn't it have been better" or "this would have gone even more amazingly" level. If you're down at the level where someone could have just done a little bit better in how they interacted with others, or in how they supported other people, or maybe in how they dotted all the I's and crossed all the T's in the work they produced, then I tend to come at it like, "You did a great job. However, I would like to talk about how it could be even better for next time." I make sure not to turn that into a compliment sandwich. A compliment sandwich is compliment, feedback, compliment. People forget the meat of the sandwich every time you do that. If I am going to start in that way, I do the compliment, but then I make sure the conversation ends after the feedback. That's all the way on one side of the spectrum. You get progressively more intense with the coaching as you get closer to the part of the spectrum that's like, no, this behavior is trending towards not acceptable, or it's going to have long term consequences for the organization. When I start trending in that direction, I do try to center it in potential future outcomes. Like, you got what you wanted this time, but in the future, if you did it the same way again, this team might not want to work with you, or we might run into this risk in production and not have capacity, or whatever the problem is, and I try to center it there.

Probst: I had never heard the notion of root cause being an antipattern, though it's starting to make sense based on your talk. Do you have any recommendations for reading more about this? I've worked in engineering cultures where the whole notion probably sounds like heresy.

Brush: I have worked in those cultures as well. There's this great discussion of Safety-I thinking versus Safety-II thinking. It's coming out of a whole group of folks. There's Dr. Richard Cook. There's Sidney Dekker. There's the staff at Lund University. This whole group of folks started their research in fields with even higher stakes than technology organizations, like healthcare and aviation, and the Coast Guard, and marine accidents, around how you actually analyze accidents in a way that leads to long term better outcomes. One of the things they came across was that Safety-I thinking is how most organizations, ones I've been a part of, ones you've been a part of, think about preventing accidents. It is very much driven as: we're going to have a single narrative, we're going to have a single line. We're going to get to that point through the single narrative, and then we're going to fix everything in that single narrative. Then this won't happen again. Safety-II thinking is starting to realize that you really narrow your response and your organizational resiliency by fixating on that single line. It starts thinking about, how do we broaden? Definitely look up things on Safety-I thinking and Safety-II thinking. Definitely look up stuff from Dr. Richard Cook and Sidney Dekker.

Probst: Some people might be working in organizations where post-mortems are definitely not blameless and are more treated as punishment. How do you suggest working with executives within such companies on how to move the needle on the culture?

Brush: I said don't do "if you must oppose, you must propose." This is one of those cases where, I think, sometimes people have to see it to believe it. The idea is, show them how it should be done. Managers and leaders are really equipped to do this, because we have a lot of institutional credibility and authority. One of the things I have done is said, I will write the post-mortem and I will show you how it should be done. I will show you that you get a richer set of outcomes if you don't treat it as punishment, if you give it to a neutral third party who understands the organization and the technology, but isn't in the middle of the decision-making process, because they can come into it and look at every decision that was made in a new light, and they won't be hiding things. They won't be trying to tell a story about how everything could have gone well if it hadn't been for that pesky hardware failure, or whatever. I do think that helps a lot. If your leadership is open to reading things, there's the Howie guide from Nora Jones and company. There's work from John Allspaw. There's a ton of work in this space you can share.

 


 

Recorded at:

Jul 25, 2023
