Implementing microservices is really challenging, and there are many ways to fail. Holly Cummins has identified seven ways to fail at microservices, and on this episode of the podcast Thomas Betts asks her to describe them, and how they can be avoided.
Key Takeaways
- When implementing microservices, you must have a goal, and that goal should not be, “to have microservices.”
- Creating a distributed system does not automatically give you a decoupled system. Contract tests, such as Pact tests, can help to explicitly identify and handle the coupling between two services.
- Large enterprises often have multiple layers in their software, such as presentation, services, integration, and databases. If your approach to microservices only addresses one layer, the others will remain bottlenecks.
- Real CI/CD is hard to achieve because it takes significant investment in automation, and the benefits of the automation can be invisible when they work as intended.
- Adopting microservices, and a cloud-native approach, requires a holistic view of all the changes to be made. Being able to have self-service, but locking it down with traditional governance, minimizes the benefits.
Transcript
Intro [00:17]
Thomas Betts: Hello, and welcome to another episode of The InfoQ Podcast. I'm Thomas Betts, co-host of the podcast, lead editor for architecture and design at InfoQ and an application architect at Blackbaud. Today I'm joined by Holly Cummins. Holly is a senior principal software engineer on the Red Hat Quarkus team and a Java Champion. Before joining Red Hat, Holly was a longtime IBM-er. She's used the power of cloud to understand climate risks, count fish, help a blind athlete run solo ultra-marathons in the desert, and invent stories, although not all at the same time. Holly was last on the podcast a year ago, discussing the 2021 architecture and design trends. That means I get to say welcome back, Holly.
Holly Cummins: Thank you.
Thomas Betts: So just a few weeks ago, some of our listeners may have seen Holly at QCon London, where she was hosting the modern Java track, and was a panelist for a discussion on getting the most out of microservices. I unfortunately couldn't be there, but I've heard really good things about the conference and that panel in particular. But today we're not talking about Java or how to get the most out of microservices. I wanted to talk to Holly about a presentation she gave at QCon Plus a few months ago on seven ways to fail at microservices.
I've done my homework and I have your list of ways to fail in front of me. I thought we could just walk down the list and discuss them if that works for you.
Holly Cummins: Yes. I mean, this is kind of how to get the least out of microservices.
The Murky Goal and Microservices Envy [01:27]
Thomas Betts: Right. So the first two are actually, in my head, fairly closely related. You called them the murky goal and microservices envy. So what should and should not be the goals of microservices?
Holly Cummins: I think the answer to that isn't necessarily prescriptive, but the really important thing is that you have to know what you're trying to get out of doing microservices, and I think what happens a lot of times is for really good reasons, organizations look around and they see that everybody else is doing microservices and of course, we all want to be doing our jobs to the best of our ability and we want to be keeping up with what seems to be the best practices in our industry. So we say, everybody else is doing microservices, I should do microservices, but we haven't really thought about what problem are we actually trying to solve?
So then it means that we sort of think of microservices as a goal, but they're not a goal, they're a means. So then if we think of microservices as the goal, we don't think about what else we need to have in place to achieve our actual goal. Whether that's being more agile, well, agile is a word I hate to use, it's so loaded, but whether we want to be able to deliver more quickly or whether we want to be more resilient, all of those are legitimate goals, or whether some of our team really want to program in Python and we want to support that. All of these are fair goals, but we need to know which one we're going for.
Thomas Betts: And where's microservices envy fit in? Just the idea that everyone else is doing it and I'm missing out? You said companies see other companies, is that also on an individual level?
Holly Cummins: I think so. Yes. That's a really good point because I think it's both. One of our fellows in the IBM garage, he used to have this rule that when he sort of did an initial session with clients, if they said Netflix, Netflix, Netflix, which did sometimes happen, he thought, okay, we're going to have an awkward conversation here because we see these sort of presentations and really, really good things coming out of some of these organizations. And we say, okay, well, if I do microservices, I'll be like them. And it's like, well, our problems are different, and the solutions are probably different, and the conditions for success are different as well and the culture is different.
So that's the sort of the organizational level, but then at the individual level as well, I was really surprised a while ago because Red Hat did a survey looking at the drivers for container development and the number one driver was career progression, which I think they didn't want to put CV driven development on the survey, because it looks a bit cynical, but that really is a nice way of saying CV driven development. So as individuals, we kind of look at our CV and we say, oh, there's this big gap where microservices should be, and that's not going to work out for me. So I need to re-architect my whole stack in order to fill in the gap in my CV. And again, not the best reason to re-architect the whole stack.
Thomas Betts: Yes. You're saying we shouldn't pay developers based on the number of microservices they create?
Holly Cummins: Yes. And I think that sometimes there is this culture of, well, if I do one container, that's good. If I do two containers, that's even better, and if I do 400, I'm a hero of containers. It's like, well, maybe.
Cloud-Native Spaghetti [04:19]
Thomas Betts: So next on the list was cloud-native spaghetti. Now, if I were to just take my spaghetti code that I've been running for years and it's building up into my monolith and I just lift and shift that into EC2, that doesn't make it cloud-native. It's just cloud hosted. So what's cloud-native spaghetti? What's the special flavor there?
Holly Cummins: I think you probably have given a good description of part of it and what makes it cloud-native spaghetti is when you then decide that really you need to have it running across 600 containers. And I think we sometimes assume that decoupled and distributed are synonyms. And we sort of think, well, if I distribute my application, that's going to ensure that it's decoupled. But all it does is ensure that if it is coupled, the costs of that coupling are going to be way more painful and way higher. And so we end up with this situation where we've got a distributed monolith and everything is connected to everything else and we don't necessarily even understand the connections, it's quite hard to get a system-level view because it's distributed and it's in the cloud.
Thomas Betts: I think someone took the Italian food metaphor and stretched it. If you've heard, we have spaghetti and we'd rather have lasagna because it's got these clean layers and then we want to go to microservices and we have ravioli, these individual pods that we can eat and it's got everything you need in there. It's a different pattern. You have to start differently. You can't just take everything out of the pot and turn it into ravioli. It does take that re-architecture, it's where that metaphor actually keeps going and working.
So how do people get around that cloud native spaghetti? Does it go back to the goals of we want to be decoupled? And so how do we design for decoupling, not just distribute for the sake of distributing?
Holly Cummins: I think continuing the sort of the ravioli metaphor, because I sometimes make ravioli and it's really skilled. And if I don't get it exactly right, what I end up with is the filling sort of escapes in the pot and then you sort of look and there's spinach all over the inside of the pot. And I think it's quite similar with microservices that at the design stage, you may think they're all encapsulated, but then no plan survives contact with reality. And then this coupling can creep in. And so one of the things that I think is quite useful, so you have to do it at the sort of the design stage, but then you have to continue doing it at the development stage. And you have to ensure that you've got something like contract testing to really make sure that you're aware of your dependencies and managing your dependencies and discovering leaks in your dependencies where a change in one microservice will break other microservices and you didn't maybe know about it.
The Need for Contract Testing [06:20]
Thomas Betts: I remember that contract testing. I just shared that with some coworkers, the example of how to do Pact testing, how do I test what happens if my service changes or my dependency changes? How do we both agree that if either of us breaks the other, we'll know that ahead of time? And that's hard because no one gets it for free. You either get half of it and then you fail or you have to take the time to make the contract test something you both agree on.
Holly Cummins: And I think Pact isn't as popular as it should be. And I think there's a couple of reasons for that. I think one is that it can be quite hard to sort of get your head around the full capabilities, and if you don't, then you sort of think, well, this is just OpenAPI validation, and it's a lot more than that. But as well, when you do something like Pact testing, you pick up a whole bunch of burdens. So you pick up the risk that your build might fail because of something someone else does, and you pick up this obligation to talk to other people, to figure out what your combined behavior should be. But of course those are the exact same things that you're going to need to be doing anyway, if you're going to have a successful microservices architecture. So it's just front-loading it so that you have to have the conversations at development time, rather than having the conversations when you've just had a ticket come in, because everything is broken in production.
Thomas Betts: Right. It defines where you have that coupling. You don't just get to ignore the fact that, oh, we have two separate microservices. They get to deploy independently. Well, if you have APIs that are changing, you can't just change that and hope that everyone else just magically upgrades, you have to think about API versioning then. You said OpenAPI validation. Yes. Your schema validation and saying, here's what it does. That's sometimes just telling you that it's valid. It doesn't tell you if anyone else agrees with what that expectation is.
Holly Cummins: Yes.
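The difference between "the schema is valid" and "the two teams agree" can be sketched in a few lines. This is a minimal illustration of the consumer-driven contract idea only; it does not use Pact or its API, and the contract, field names, and provider functions are all hypothetical.

```python
from typing import Any

# Hypothetical contract: the consumer team writes down only the fields
# (and types) that its own code actually relies on.
ORDER_CONTRACT: dict[str, type] = {
    "order_id": str,
    "total_cents": int,
    "currency": str,
}


def contract_violations(contract: dict[str, type], response: dict[str, Any]) -> list[str]:
    """List every way the provider's response breaks the consumer's contract."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(response[field]).__name__}"
            )
    return problems


def provider_v1() -> dict[str, Any]:
    # Extra fields are fine; the consumer does not depend on them.
    return {"order_id": "A-17", "total_cents": 1999, "currency": "EUR", "note": "gift"}


def provider_v2() -> dict[str, Any]:
    # A well-meaning refactor on the provider side: the total was renamed
    # and is now a decimal. The provider's own schema can still be valid.
    return {"order_id": "A-17", "total": 19.99, "currency": "EUR"}


if __name__ == "__main__":
    print(contract_violations(ORDER_CONTRACT, provider_v1()))
    print(contract_violations(ORDER_CONTRACT, provider_v2()))
```

If a check like this runs in the provider's build, the renamed field fails that build at development time rather than surfacing as a production ticket, which is the front-loading described above.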
Thomas Betts: I think that's also in the presentation where you went into the Mars Climate Explorer as distribution doesn't solve your problems. Can you go into that example?
Holly Cummins: Yes. I think that's a good example of where if they'd had OpenAPI, it probably would've passed because some of these misunderstandings can be quite subtle. And so the syntax can be correct, but the semantics can be wrong. The Mars Climate Explorer was probably about 20 years ago now. It was a rather sad unsuccessful NASA mission to Mars, and it was sort of going along fairly well until it got close to Mars, and then it was supposed to orbit round Mars. And instead of orbiting round Mars, it got a little bit too close to Mars, and then it got pulled in by Mars's gravity and that was the end of the Mars Climate Explorer.
And when they did the postmortem to try and figure out, "something seems to have gone a little bit wrong here." What they discovered was, I mean, it's hilarious in retrospect, but they had kind of a distributed architecture. I mean, it was literally distributed, part of it was in space and part of it was on Earth, but they had two control units, one on the ship and then one on Earth and they were developed by different teams. So again, we're living the microservices dream of we have this very distributed thing and we have different teams developing in the different parts of it, but they had a problem with the contract. They had a problem with the understanding between the two teams. And one of the teams worked in metric because that's a really sensible unit to use for scientific things. And the other team worked in Imperial because that's quite a typical unit. And for the units that they were using, the scales weren't a million miles apart. It was sort of maybe like one and a half times out. So it was a subtle enough difference that things mostly seemed to behave and nobody really tested that interaction. And so it was only when it was in space that they sort of started to think, oh, this doesn't seem to be behaving quite the way we expect, but by then it was too late.
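A small sketch of "syntax correct, semantics wrong": a bare number passes any type check whichever unit the sender meant, while a value that carries its unit makes the assumption explicit. The units (kilometres and miles), the `Distance` type, and the 80 km threshold are illustrative assumptions, not the mission's actual quantities.

```python
from dataclasses import dataclass

# Conversion factors to kilometres (1 international mile = 1.609344 km).
KM_PER_UNIT = {"km": 1.0, "mi": 1.609344}


@dataclass(frozen=True)
class Distance:
    """A number that travels with its unit instead of relying on convention."""
    value: float
    unit: str  # must be a key of KM_PER_UNIT

    def to_km(self) -> float:
        if self.unit not in KM_PER_UNIT:
            raise ValueError(f"unknown unit: {self.unit!r}")
        return self.value * KM_PER_UNIT[self.unit]


def is_safe_altitude(altitude: Distance, minimum_km: float = 80.0) -> bool:
    # Hypothetical safety threshold, used only for illustration.
    return altitude.to_km() >= minimum_km


if __name__ == "__main__":
    # The same bare number, 60.0, gives opposite answers depending on
    # which unit the sending team had in mind.
    print(is_safe_altitude(Distance(60.0, "mi")))
    print(is_safe_altitude(Distance(60.0, "km")))
```

A contract that includes the unit turns a silent disagreement between two teams into something a test can catch on the ground.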
Thomas Betts: Yes. I think that, "oh, we distributed our system and why do we still have problems?" is a common example. It goes back to that cloud native spaghetti. You have all these things spread out, but if they don't work well cohesively as a system, you can still have the problems. So identifying those, going back to your goals of what are you actually trying to accomplish, not just split this up into small pieces. I do like that something in space is pretty much as distributed a system as you can get. Like that's worse than US east one and two.
Holly Cummins: Yes. Completely. And you're pushing some of the complexity of your system. It's not going away, but it's getting pushed out to the connections and the communication between the services. And I think that can make problem solving quite challenging because each team thinks, oh, I only have this very small thing to do. I can completely understand my code base. And then there's the question about, okay, but what happens if something goes wrong in the system as a whole, when do we discover that? How do we make sure that we can discover that without doing sort of huge end-to-end integration testing? Because if we have to do that, we lose a lot of the benefit that we were hoping to achieve with microservices, but if we don't do that, it doesn't make the problems go away. It just means that they are more subtle and more expensive to fix.
Thomas Betts: Yes. I think that goes to the bigger question of who's in charge of the architecture? There's been discussion of, do we still need architects? Who should be doing the architecture? And solving those problems, being allowed to distribute the design decisions so that each team can act independently, but someone still has to be responsible for the big system. And when it was all in a monolith, it was easy, like I said, to run the integration test because all the code can just interact. Once you get to, oh, I have these services depend on other services, you start mocking things out and the mocks let you down fairly easily, I think in some cases, the mock doesn't actually behave the way the system does or the mock hasn't been updated based on what the dependency now does. It's now out of date. That's part of the, if you take on microservices, you have to assume someone has to do this work to say these two services work together and that communication still has to happen. You don't get to eliminate that from your design anymore.
Holly Cummins: Absolutely. And I think one of the beauties of contract tests is that they take that overall system level governance and they manage to kind of push it down the stack so that we still have that system-level view of does it work, but without the expense of sort of a system-level chief oversight or try everything out or architecture board, because again, those sort of layers remove a lot of the flexibility and a lot of the nimbleness that is usually one of the goals of microservices.
Thomas Betts: Right. Going back to, if you wanted your goal to be, we wanted to deliver, not to use agile, but more frequently or smaller increments, get rid of that big three month process of review. We want to be able to release every day. Well, you got to be able to test those things every day.
The Enterprise Hairball [12:44]
Thomas Betts: And I think we'll come back to that in a minute, but I'm going to move us on to number four. What is an enterprise hairball?
Holly Cummins: Oh yes. So I love this mental image mostly, because it makes me think of cats and the internet is for cats, but the enterprise hairball is something that, again, I think we can sometimes forget when we first start thinking about microservices because we think about our business layer and we divide it up into microservices and we think job done. But then we realize that, of course our architecture has a bunch of other layers as well. So there's the front end layer. And I think we're getting a lot better now at dividing that up into micro front ends as well. So then we think, okay, I've divided two layers into microservices, I'm completely job done. But in any kind of enterprise system of a reasonable size there's going to be the other layers as well. So there's going to be the databases. And again, we're sort of starting to learn that we want to divide those up, but then usually there's the integration layer as well, all of the messaging and those layers often can be like a hairball.
They connect everything else in the system together and it can be really difficult to tease them apart. And we're starting to sort of learn the techniques for micro integration and dividing these things up into much smaller modules, but that's still a work in progress, and if we don't get that right, we end up in that same situation where we have some things that are decomposed, but then when we actually try and deliver it and push it to production, we have to put it into the integration team and they can only release every few months and they've got too much work. And so then it ends up sort of running into this brick wall again and getting blocked.
Thomas Betts: I think the example I like to go back to is the, if you've read the article on data mesh, we've brought it up a few times on the podcast, but it's that paradigm shift of you can't have the bottleneck in your system where, oh, we put everything to a single data warehouse and it's got to match the schema and everything has to be exactly right, and we've got to take all the time. And every time you change something upstream in the source system or downstream in the reports, the integration team has to say, okay, we've got to change all of our data adapters to fit that. And data mesh is kind of the DDD approach to data that you say, okay, we're going to attack that specific part of this problem of I'm going to make our data integration really on the product team that owns that data, they now understand the entire process up and down the stack. And if we want to then build in integrations between two or three data services, we can put them together. Is that the same idea you're getting to, that's just one aspect of those integration hairballs that always exist in a giant company?
Holly Cummins: Yes, exactly that.
Thomas Betts: Also I do like the cat metaphor because I just see cats coughing up hairballs over a data center and that's totally legit.
The (Not Actually Continuous) Continuous Integration and Continuous Deployment [15:19]
Thomas Betts: All right. For the next fail, I'm going to pull out my Princess Bride quote and say, you keep using that word, I do not think it means what you think it means. The not actually continuous continuous integration and continuous deployment. So I have CICD, aren't I done? What is not actually continuous?
Holly Cummins: We're starting to use CICD as a noun rather than a verb and we think it's something that we can buy and then put on the shelf and then we have CICD. But if we sort of think about the words in CICD it's continuous integration and continuous delivery or deployment, confusingly. And so what I often see is I'll see teams where they're using feature branches and they'll integrate their feature branch once a week. So that of course is not continuous integration. It's better than every six months, but it's fundamentally not continuous. And really, I think if you're doing continuous integration, which you should be, everybody should be aiming to integrate at least once a day. And that does mean that you have to have some different habits in terms of your code, you sort of need to start coding with the things that aren't visible and then go on to the things that are visible and other things like that. You need to make sure that your quality's in place so that you've got the tests in place first so that you don't accidentally deliver something terrible.
But I think it's something that we should be aspiring to. And I see even more with the continuous delivery and continuous deployment, I see the gap between the number of teams who talk about CD and the number of teams who are actually delivering on any kind of continuous schedule is there's a lot of us who are talking about it and very few are actually doing it. And particularly with microservices, what we should be aiming for is that should be getting us more to a genuinely continuous continuous delivery, because we should have independent deployability, but that's hard, that's scary. And so what I often see is that we'll put in these patterns. So I heard about one shop and what they did was they, it was quite the idea, quite scary, the idea that the microservices might deploy independently, because what if they didn't work together? So they made sure that they had a single pipeline so that everything went through this one pipeline and deployed at the same time.
I heard another story where it was a bank and they were getting really taken apart by the challenger banks. And so they looked around and they said, well, we've got this big COBOL estate, that's probably not helping us, I don't think the challenger banks have COBOL estates, which is probably true, and we need to change to microservices. Probably also a reasonable thing to do. But then they added that their release board met every six months. So if your release board only meets every six months, there's no point talking about CD, you just don't have CD. And so that was the sort of the thing that they probably needed to be looking at first, before they started thinking about the re-architecture to microservices.
Thomas Betts: I think the CI part of CICD, so they get lumped together, they get turned from a verb to a noun. CI I think has been around long enough people understand it. I check in my code, tests run, I find out that even if the tests don't run, what if someone checks in code and it no longer compiles because they broke something of mine. That was my first experience with CI more than 15 years ago, right? Somebody checked in code and I saw someone else check in code and we didn't know what we were working on. And the build broke. It was like, yay, we found out the build broke. That's great. And the compiler was our first unit test, might have been our only unit test at that point. And the continuous delivery didn't exist. It was like, well, at least I can just download that latest thing and manually deploy it because we have a manual deploy process.
I think people sometimes get hung up of what the D means. I think you did say both of them, continuous delivery and continuous deployment. Where do you see the difference between those two? And where do you think people are on the spectrum of actually doing either or both?
Holly Cummins: I always get this one backwards, but you can correct me if I get it wrong. I think part of the reason I get it backwards is I'm sure sometimes you see contradictory definitions, but so one D is delivery. And that is, I could actually, if I wanted to, push this out into the wild, but I'm not actually going to push it anywhere. And then the other D is deployment, which is okay. I have actually pushed this to prod, but there's another one, which is the R, which is release. As we start to think more about CD, we need to draw a distinction between deployment and release. So we can have things that are running on the production server, but no one's ever going to see them because they're switched off either using feature flags or using some other mechanism, or if we want to be sophisticated and get some confidence, we have 1% of our traffic routing to them. We have friends and family routing to them. So we can try these things out in production because they're going to behave differently in production, but without the risk of that full blown, we have just deployed this and now everybody sees it and the only way to go back to unseen it is to undeploy and do a rollback, because that's scary and risky.
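The split between deployment and release can be sketched as a percentage-based feature flag: the code is running in production, but only an allowlist ("friends and family") and a small, stable slice of users see it. This is a minimal sketch, not any particular flag library's API; the function names, flag name, and user IDs are hypothetical.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..9999."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def is_enabled(flag_name: str, user_id: str, percent: float,
               allowlist: frozenset = frozenset()) -> bool:
    """Decide whether a deployed feature is released to this user.

    percent is the share of users (0 to 100) who should see the feature.
    Users in the allowlist always see it.
    """
    if user_id in allowlist:
        return True
    # 1% of 10,000 buckets is 100 buckets, so percent=1.0 admits buckets 0..99.
    return rollout_bucket(flag_name, user_id) < round(percent * 100)


if __name__ == "__main__":
    friends_and_family = frozenset({"holly", "thomas"})
    for user in ("holly", "user-1", "user-2"):
        print(user, is_enabled("new-checkout", user, 1.0, friends_and_family))
```

Hashing the user ID keeps each user's experience stable between requests, and raising `percent` only ever adds users. Going back to "unseen" means lowering the percentage to zero instead of undeploying and rolling back.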
Thomas Betts: I think you're right that the terms don't have very clear definitions and you can probably Google and find the conflicting ones fairly easily.
Holly Cummins: Yes.
Thomas Betts: I think you got to the fundamentals that I've understood, which is, is my code running on the server in production? That doesn't necessarily mean anyone's using it. That's either the release that you described or the delivery because I've heard delivering value to the customer is sometimes the stretch and that's where you have to have some way to turn it off so that it's not delivered to everybody. And that's the feature flags or the percentage the slider scale of some people get it, some people don't, but that's complexity you have to bake in.
Holly Cummins: It's not free for sure. I think with the D, I think it seems to be sort of defined at both ends of the spectrum. So one of the D's is as a developer, I have done my job, I have pushed my code and it's somewhere and it kind of works. So it could go out and if it doesn't go out, that's not my fault. That's the fault of the release process. And then as you say, the other end is, well, this thing is out there. People are actually using it. But I think with all of it, the key is to make doing it so boring because we're doing it so often. So instead of it being this big celebration, it's just something, well of course, you know, I pushed it and it ended up out in the wild and nobody noticed, which is a good thing.
Thomas Betts: That "it's not my job anymore," really gets to the heart of DevOps as an actual, like I've got DevOps in my bones. We don't even talk about it because it's just the thing we do. When I was in a small shop and there were only three developers, we had to be DevOps because there were no ops. Like we had to run it. When you get to the bigger companies, when you have development teams and operation teams, it's like, well, that's no longer my responsibility. And that's kind of the pattern that we're trying to break from. But does it get to a point where the ability to release that to customers now becomes something that developers have to be concerned with, they have to build in the feature flags, they have to check for the feature flags, whatever mechanism they're using to handle the release. That becomes part of the code base, just like infrastructure as code becomes part of the mindset.
Holly Cummins: It's a good question, because I think we sort of push it in both directions. I think as a developer, it's really a positive thing and I should have an interest in getting my stuff out and if that means doing extra things with feature flags and that kind of thing, but it means my stuff gets consumed within a week of me writing it rather than having to sit on the shelf for six months and I don't get any feedback, as a developer, I want feedback. So that's a good thing. But on the other hand, with sort of the general shifting left of everything, that does seem to put more and more burden on developers because now we have to know the code, we have to know ops, we have to know security, and some people like those aspects and I think some people are a bit uncomfortable that their job is expanded and the expectations on them have expanded, but the time has not expanded.
Thomas Betts: I think it's that level of abstraction of where do you work and how do I make it easier? So how do you focus on your code? And you still get to write your Java, or your C#, your Angular, whatever it is, but now you have to be a Kubernetes expert. Like I don't have time to learn Kubernetes. How do I abstract that away? So yes, I'm still responsible for deploying the services, but you don't have to be a full backend engineer, infrastructure engineer. Giving people the paved road, doing the supporting teams that can take care of some of that for you so you don't have to worry about it. Let me give you an easy way to implement feature flags and now it's just, you just call it, but that has to be a solved problem. Again, it's that idea of you must be this tall to ride the microservices roller coaster mindset. These are one of those things that you have to be considering, and maybe it's not day one that you have them, but you're going to have to think about it before you get to the 400 microservices.
Holly Cummins: I think that the paved road is really important and automation is incredibly important. And what both of those I think are doing is bringing, depending how sympathetic you're feeling, either that empathy for your team into your architectural practices. So you do have to be this high as an organization, but we do want to make it so that it's easier to do the right thing than to do the wrong thing. And we want to make it that people can avoid making mistakes fairly easily because that's good for everybody. So another way of thinking of that is we want to be idiot proofing it. We want it to be that there's a seamless flow and things just work. And you don't end up in this situation where we have these really challenging processes. And with Kubernetes, for example, that's something where I think as an industry, we're struggling a little bit because it's really hard. And so how do we get the value from Kubernetes without having this extra cognitive load on everybody in the stack?
Thomas Betts: I always like the term falling into the pit of success. Like how do we define it? So you just go through and like, boom, oh, it just worked. That's a great mindset I like.
The Someday Automation [24:45]
Thomas Betts: You did a great segue because you mentioned automation being really important. Number six of the ways to fail is the someday automation. So what happens if we put off our automated tests or releases or compliance, what do we end up with at the, oh, we'll get to that someday.
Holly Cummins: The problem with automation is everybody loves the idea, but automation is expensive. And the value that automation delivers is kind of subtle because what the value that automation delivers is the mistakes that we don't make. And that's hard to measure. So if, as an organization, we're up against it, we have deadlines. It's really easy to say, well, yes, I know that I should be doing this automation, but I have this much more pressing thing, so I'll do the automation second. Oh, I have this other more pressing thing. I'll do the automation. And then we end up that the system has become more and more complex. And so automating it becomes more and more complex and it just keeps getting pushed off because there's always something.
Thomas Betts: That's the same mindset as, oh, we'll implement security at the end. We'll just patch it at the end or we'll bolt it on. And you don't know what problems you didn't encounter when they don't happen. Like, oh, the data center didn't catch on fire. The servers didn't crash. Was that, they just didn't crash, or we took measures to prevent that from happening. We have all of our antivirus scanning, whatever other preventative measure to prevent us from being hacked, we do all those things, or did we just get lucky? And you can't test for the false negative situation and the automation, yes, it's great when it catches it and I love seeing something prevented me from deploying bad code. I always benefit from that when it works correctly. But when it lets it go through, I always scratch my head. I'm like, are you sure it went through? Did I actually write the right test? I hope it works. Hope is not a plan. And so doing that automation, I think you also brought up the idea of the longer you wait, the bigger the problem becomes, and having that paved road as being part of it, like make it easier for people to write the automation, the tests that they need so that you can get to the point where I feel more confident that the code I'm deploying, that I'm releasing to customers, I'm delivering value and it works.
Holly Cummins: And expectation is such a big part of this as well, because automation is hard. So it does take skill, but the more we do it, the cheaper it becomes. And so my team, when we started a new client project, on day one, the first thing we would do is the build pipeline because we knew we'd get the most value from it if we did it on day one and the cost would be the same whenever we did it. And I worked with another team recently and they sort of were two months into the project and they hadn't yet found the time to do the build pipeline. And I thought, oh, there's so many things that could have been avoided if we'd done this every day, a bit like brushing your teeth. You just say, of course, I'm going to brush my teeth because it would be really bad not to.
Thomas Betts: Let me build the build pipeline before I write the hello world app.
Holly Cummins: Yes, totally.
Thomas Betts: And the automation, I think you said, it's the same thing as microservices. If the release is hard and you only do it every six months, if something's hard, do it more frequently. How do we get to the point where I can deploy our code more frequently? How do we test our code more frequently? How do we do all of these things more frequently? And that means each of those steps has to be smaller. You can't have the three-month process every day. You have to ask, how do I get this done quickly? And how do I get a computer to do that, so I don't have to?
Holly Cummins: That actually goes with the tooth analogy as well, because a dentist told me once teeth are one of the only body parts where if it hurts, you should poke at it more. And so if a tooth is hurting, you should brush it more enthusiastically. And that will often sort out the problem. There you go. That's just the value that we provide to InfoQ: free dentistry advice.
Thomas Betts: We do, we've got the pasta metaphors, we've got the teeth metaphors, apparently lunch is coming up.
The Un-Cloudy Cloud [28:08]
Thomas Betts: But we're getting to the end, the number seven on your list was finally the un-cloudy cloud. So what does a cloud look like when it's not a cloud? Is this a bigger plate of cloud-native spaghetti? What are we talking about?
Holly Cummins: So this is the cloud-native spaghetti where we looked at it and we thought, this spaghetti, people might actually eat this spaghetti if we're not careful. We need to make sure that doesn't happen. Let's keep this spaghetti under lock and key. And so what a lot of organizations are discovering about cloud is it's kind of scary. You can spin services up and they're not free, and anybody can spin services up. And so the same governance that was applied before the cloud gets applied to the cloud. And so it means that in order to get an instance, you have to fill in loads of forms and you have a big long wait and these are sort of put in place to try and manage the cost and to try and manage the security, but I'm not sure they're very effective at either of those. And they introduce all this friction into the process, which means that you're carrying some of the costs of the cloud without getting the benefits of the cloud. You don't have that self service.
Thomas Betts: I think it's the analogy to people switching from waterfall to agile processes and still saying, oh, I want to have all my requirements done up front before we go into our agile sprints. Well, why? And then, oh, we have to do all these same processes, we're just going to do all the processes more quickly. You need to change the processes, right? You have to adapt to being cloud-native. All of your processes from 10 years ago don't necessarily apply. They might have been good at the time and done for good reasons, but you have to rethink everything.
Holly Cummins: And it's not just that people are lazy and don't want to do work. It's that these processes aren't actually going to give you the outcome that you're hoping for. At the recent QCon London, Sam Newman, he had a lovely story where he went into an organization and he talked to the developers and he said, what do you really want? And they said, oh, we would love to have a cloud. And he talked to management and they said, we have a cloud. And then he sort of had to go around the circle a few times and then he realized that what was missing was the self-service. It was someone else's data center, but the layers of governance were so oppressive that it felt like it was on-prem.
Some Hope and Good News [30:06]
Thomas Betts: So we've covered a lot of ways to fail. I think we gave a little bit of hope. Is there anything else? Because I love talking about anti-patterns, I think sometimes they're a guilty pleasure of like, oh, look at how badly that went. Can we end on a little bit of hope for the listeners? What should they really be doing? What are some guiding principles to avoid these failures and hopefully end in the pit of success?
Holly Cummins: I think the good news is I think we really are going in the right direction as an industry. A lot of the things that we're sort of wishing we were doing now, even if we're not doing them, we weren't even thinking about doing 10 years ago.
But really I think the path to success is, first of all, have that really hard and clear conversation about what problem are we trying to solve? And then look at the holistic picture of what do we need to do in order to achieve that. Some of it is going to be technology, but not all of it is going to be technology. Some of it's going to be about the processes and the people and the business flows. And then once you've got that definition of your problem, then it's all about optimizing for feedback. So how can I make my feedback as fast as possible so I know if I'm actually on the right path? Put in the psychological safety, because that enables better feedback and it also just makes it a nicer place to work, so that we can try things out, we can do the little experiments. If it goes wrong, we fix it quickly rather than trying to pretend it didn't go wrong in case anybody comes and punishes us for the bad thing.
Thomas Betts: I think that wraps up our seven ways to fail at microservices. Holly, you'll be at QCon Plus in May, is that correct?
Holly Cummins: I will. Yes. I'm really looking forward to it.
Thomas Betts: So if you want to see Holly, sign up for that, and you'll see her at the modern Java track. And otherwise, I hope the listeners join us again for a future episode of The InfoQ Podcast.