At the DevOps Enterprise Summit in London this year, Matthew Skelton and Manuel Pais took to the stage to share their thoughts with the audience. They are the authors of the soon-to-be-published book 'Team Topologies', which aims to offer a practical and adaptive model for organisational design.
Skelton and Pais explained that software that is too big for our heads works against organisational agility, and advised that leaders limit the size of services or products to the cognitive load that the team can handle. They also asserted that each service must be fully owned by a team with sufficient cognitive capacity to build and operate it. There are three types of cognitive load they consider: intrinsic (skills), extraneous (mechanism), and germane (the domain focus).
They have identified a number of techniques that can reduce intrinsic and extraneous load and so leave more space for germane load, which in turn frees the team to own more or larger services. These techniques include mobbing, domain-driven design (DDD), developer and operations experience, and Thinnest Viable Platform (TVP). In their research, they have identified four fundamental team topologies, the last three of which are designed to lower the cognitive load on the first:
- Stream-aligned team
- Enabling team
- Complicated subsystem team
- Platform team
A stream-aligned team is a cross-functional product team, although a team may own only part of a larger product. They explained that:
There is a need to be much more explicit about the ways in which teams interact. Teams don’t understand how or why they should interact with other teams.
Pais shared several case studies with the audience, highlighting their triggers for evolution: software too large, over-specialisation, increased coordination needs, awkward interactions, people uninvested or burned out, and frequent context switching. InfoQ took the opportunity to discuss the themes of the book in more detail with the authors.
InfoQ: How important is it for a leader to understand what’s going on in their teams’ brains at a cognitive and neurological level?
Matthew Skelton & Manuel Pais: It can be helpful, but is not necessary. Cognitive load is quite easy to understand on the surface. In many organisations nobody asks whether a team's cognitive load is appropriate, and people tend to ask successful teams to do more and more until the work exceeds the cognitive capacity available.
InfoQ: Is there a way to match software size/complexity to available cognitive load?
Skelton & Pais: There’s not an exact measure but you can have a relative measure; you can compare domain complexity. Some domains may require discovery (heavier work) and others more repetitive tasks. In terms of extraneous and germane types of work, or cognitive flow, drag can be created by asking people to use difficult technologies.
You can ask a team to score from one to five how well they are able to understand the system and track it weekly, or assess it quickly using some context and several questions. Leaders can look at the answers, which may highlight individuals who would benefit from extra assistance. It may also indicate that we need to do something architecturally with the software; if the team is constantly context switching then their working memory will be taken up with extraneous tasks. We want to minimise extraneous cognitive load.
Long-lived teams focused on a single service help us achieve these objectives, and this also implies that other teams can’t just come in and change it.
InfoQ: How ready do you think organisations are to have conversations about sociotechnical and neurotechnical patterns?
Skelton & Pais: Some already are: Justin Kitagawa, head of platforms at Twilio, talks about unlocking developer effectiveness, explicitly aiming at reducing cognitive load. We can extend the conversation to usability and user experience with both internal and external users. Usability is so important but there are so many gaps - like developer experience and operations experience.
InfoQ: What are the evolutionary reasons for team size?
Skelton & Pais: We can look at Dunbar’s number: psychologist Robin Dunbar put in the research around something people had identified intuitively, namely that there are natural group sizes based on trust. Around one hundred and fifty people is the limit for an important kind of trust or familiarity; group sizes beyond this can trigger the ‘us and them’ moment. W. L. Gore (Gore-Tex) realised that when a factory reached a certain size (150 people) it had reached its trust limit, so they would stop growing it and open another factory; they have scaled very effectively that way.
InfoQ: Who is responsible for deciding a team topology?
Skelton & Pais: Typically it should be an architect, but an architect with a remit for organisational design plus technology; a traditional solutions architect who is only thinking about the silicon part and not the social part of the system may be creating a problem.
It's about working with Conway's law and ensuring the team are not missing out on possible solutions due to being restricted by current team structures. Management often decides on the team structure but they don’t necessarily understand the technical boundaries. We need people who understand dynamics; it’s a collaborative effort between people who understand both the business and system drivers. They need to be ‘Conway-aware’.
InfoQ: How often should we listen for the evolution triggers?
Skelton & Pais: It depends how fast you are moving; if the environment is moving rapidly, do it more frequently. Somebody needs to facilitate the discussion and hear the sensors; team leads can have an overview and look out for signals, but you can have some more intentional inspection. There’s a set of heuristics or triggers in the book that show us if something’s not working and we have context for why it’s not.
A lot of organisations are not listening to the barrage of signals that are coming at them; creating a schema of signals that allows people to recognise them leads to better results and improved team interaction modes. If we have better-defined teams and better-defined interaction modes, then organisations will be better at detecting signals, which in turn will help them to innovate and evolve.