Software Systems Need Skin in the Game

Key Takeaways

  • Skin in the game means having decision-makers bear the consequences of their decisions
  • All systems need that connection to consequences to drive evolutionary pressure
  • Software development is a “systems building” activity that has suffered greatly from a history of removing skin in the game
  • Modern software practices and management systems have reversed the trend and put skin back in the game 
  • On-call engineering is the quintessential modern engineering practice to create skin in the software development game 

In science, the test of all knowledge is the experiment. This is the sole source of scientific truth. But where does the knowledge to test come from? In physics, the thinking process is so difficult that there is a division of labor - theoretical physicists dream up new ideas, and experimental physicists create the experiments to test them.

This division of labor between ‘thinking’ and ‘doing’ might make sense if you are trying to understand the origins of the universe, but here on earth - in the real world - we do not have the luxury of letting others dream up experiments for our own lives. Consequential decisions need to be taken by the people who pay for the consequences, by the people with skin in the game.

Skin in the game is an important attribute of healthy systems. It creates symmetry of consequences. When bad decisions are made in a system where people have skin in the game, evolutionary processes will either alter the decision or eliminate the decision-makers. When skin in the game is removed, that evolutionary pressure is also removed.

In software, where work is made up entirely of system building, we have continuously missed this crucial point, lining up countless software developers in front of the firing squad of asymmetry.

There was a period in the history of software where managers tried to pull everything they could away from the developers writing the code. Theoretically this should make things faster: "With the division of labor we can create a software factory!" But in reality, we now know that this fails miserably. The systems crumble under the weight of decisions made without consequences. 

Systems of Software Engineering

Software systems are emergent. They come into being through the accumulation of many small, collectively-made decisions. We write those decisions into code, but they are based on imperfect information and will need to be updated as we learn from the system.

The learning comes from all the activities that surround the code: testing, monitoring, securing, maintaining, demonstrating, and using the system. That is where the "riskiness" of our decisions becomes obvious. Systems of engineering that move those tasks away from developers remove skin in the game, destroy evolutionary pressure, and generate software systems that are fragile, expensive and difficult to maintain.

We learned many hard lessons from this, and they run up and down the development stack:

  • It’s why we do DevOps. "You build it, you run it." Werner Vogels, CTO of Amazon, saw this clearly: if you don't run the code you write, you have no skin in the game. When one group does the building and another group does the running, you get the classic separation of doing and thinking, the division of decision-making and risk-bearing.

  • It’s why we killed QA teams. When one team is writing code and the other team is figuring out how to test it, you remove the direct contact between how it works and how it fails. Turns out “testing” teams do the exact opposite of what they intend. They remove the evolutionary pressure that is needed to stop bad code from entering the system. 

  • It’s why we build products, not projects. Projects have one group doing the planning, and another group doing the work. But it's easy to set a date when you don't have to figure out how to meet it. Ironically, separating the planning from the “doing” removes all predictability from the system. Developers end up burning out or shipping bad code to meet deadlines dreamed up by people with no skin in the game.

And we are still learning many hard lessons:

  • Why is security still so poor in software? Because it's mostly compliance frameworks created by the checklist mafia with zero skin in the game. Cringe hard when you hear about how security is now a C-level problem. What a terrible place to put the problem, in the hands furthest away from the people doing the doing.

  • I once heard someone proclaim that technical debt should be made an executive-level problem. My heart aches for the developers who have the decisions about their technical debt shipped off to what might as well be the other side of the world.

But we are making progress! Over the past two decades, synthetic management systems and technical practices have emerged that are designed to put the skin back in the game: full-stack teams that start together, and collective ownership of the product. Practices like Agile, DevOps, Continuous Integration and Continuous Delivery work together to remove encumbrances on the ability to own the risks that we create with our systems.

Software systems are under constant pressure to change, internally by the people contributing to the system, externally by the people using the system, and environmentally by the advancement of technology. There is an intense need to be able to evolve or die. To do this, software must be built in a way that is highly responsive to its evolutionary feedback loops.

Systems don’t learn because people learn individually – that’s the myth of modernity. Systems learn at the collective level by the mechanism of selection: by eliminating those elements that reduce the fitness of the whole, provided these have skin in the game... in the absence of the filtering of skin in the game, the mechanisms of evolution fail: if someone else dies in your stead, the build up of asymmetric risks and misfitness will cause the system to eventually blow-up. - Nassim Nicholas Taleb

The important question to ask is, how can we maximize skin in the game? How can we give developers the most direct contact with the risks they create? One practice provides exactly that contact, and it holds a privileged position: it creates a form of skin in the game that is unparalleled in the business world.

The Practice of On-call Engineering

Going “on-call” means that on a regular cadence, you put down your regular work and spend a week working directly on the system. This means two things: 1) carrying the pager for a system and responding to issues in real time as they happen, and 2) doing the work needed to maintain the system: tuning monitors and alerts, working on regularly occurring tasks, or making small improvements to optimize the on-call experience.

For a system where developers go on-call, the connection to the risk they create could not be greater. If monitors are generating a lot of unreasonable or unactionable alerts, if components are loaded up with manual tasks and un-automatable issues, if the system is complicated and difficult to debug, it is the people who created those problems who must suffer.

On-call engineering creates skin in the game in a way that deeply shapes our understanding of the world.

"What matters isn't what a person has or doesn't have; it is what he or she is afraid of losing." - Nassim Nicholas Taleb

An incomplete list of things that one loses when one goes on-call for a hard-to-operate system (in no particular order): sleep, weekends, evenings, time working on things they enjoy, general well-being.

When done correctly, on-call engineering unites the team and creates a powerful set of engineering ethics, a code of conduct that stems from overcoming adversity and the constant threat of ruin. The idea of taking on risk wantonly or haphazardly becomes deeply offensive, in a way that is hard for people who have never been on call to understand.

Get Your Hands Dirty

Unnecessary risks can take many forms, but most often they come disguised in the recommendations and requirements of those who do not need to get woken up in the night by their consequences. There are different ways to create skin in the game, but by far the most valuable is learning through experience: time spent in the hot seat.

This is a key point: until you have actually been on call for a system, it is hard to imagine just how deeply it shapes your decision making. Skin in the game therefore implies that we need people to come down from their ivory towers and experience the world.

"The knowledge we get by tinkering, via trial and error, experience, and the workings of time, in other words, contact with the earth, is vastly superior to that obtained through reasoning." - Nassim Nicholas Taleb, Skin in the Game, p. 7

If you want to make decisions about a product's technology or architecture or technical debt or timelines or backlog or security, then go join that team. If you can't join the team, you can offer your opinion, but you don’t get a vote. You can even help do research, gather information and present facts. But the decision comes down to those who bear the risk.

In the real world, when you bear both the risks and the rewards for your decisions, the outcomes are properly incentivized. The most powerful force in systems-building comes from the evolutionary pressure brought by decision-making when something is at stake. When we realize this, we can use that force to drive the evolution of our systems and generate the kind of value that customers pay for, and engineers love.

About the Author

John Rauser is a software engineering manager at Cisco Cloud Security, based out of Vancouver, Canada. His team is building the next generation of network and security products as cloud services. John has spent the last 10 years working in a variety of different roles across the spectrum of IT, from sysadmin to technology manager, network engineer to infosec lead, developer to engineering manager. John is passionate about synthetic management, new ways of working, and putting theory into practice. He speaks regularly at local and international conferences and writes for online publications.
