
Sam Newman: Monolith to Microservices

Today on the InfoQ Podcast, Wes Reisz talks with one of the thought leaders in Microservices, CI/CD, and Cloud -- Sam Newman. The podcast covers many of the topics, techniques, and patterns that Sam writes about in his latest book, Monolith to Microservices: Evolutionary Patterns to Transform Your Monolith. Topics covered in the podcast include understanding the problem you’re trying to solve, organizational/people changes when it comes to microservice architectures, database strategies for decomposing monolithic datastores, and why we’re seeing projects reverting from microservices to monoliths. 

Key Takeaways

  • Fundamentally, microservices are distributed systems, and distributed systems come with baggage (complexity). The best way to deal with that complexity is often to avoid it: try to solve the problem in other ways before choosing to take an organization to microservices.
  • A common issue in large enterprises, and a strong indicator for microservices, is lots of developers working on the same problem and getting in each other's way.
  • A useful structure to follow with microservices is to make sure each service is owned by exactly one team. One team can own more than one service, but clear ownership of each service helps with some of the operational challenges of microservices.
  • A release train should be a stop on the journey towards continuous delivery, not the destination. If you find that you can only release via a release train, you are likely building a distributed monolith.
  • Operating microservices is challenging when the end customer has to run and manage them. These challenges are part of why we're seeing projects move from microservices back to process monoliths.

 

Correctly or not, microservices have become the de facto way to build apps today. However, it's tough to put a clear beginning to the idea of microservices. A commonly cited birth of the architectural pattern is around 2011 with James Lewis. James, a consultant at ThoughtWorks, was at the time interested in something he called micro apps, a pattern that was becoming increasingly used at places like Netflix. You can actually find a talk from Adrian Cockcroft at QCon SF in 2010 on InfoQ, where he talked about cloud services at Netflix. In 2012, several of these players gathered at a conference and debated this architecture. At that conference, the term microservices became an architectural pattern with a name.

 

Today on the podcast, we're talking to one of the thought leaders who was at that conference, and who through his consulting, his books, his talks, and his passion has helped shape what we think of as microservice architectures today. Today, we're talking to Sam Newman. Sam is a consultant in the cloud and CI/CD space, perhaps most well known for his thought leadership around microservices. The big focus of this podcast is to chat about his new book, "Monolith to Microservices: Evolutionary Patterns to Transform Your Monolith." That story about James Lewis is in part taken from that book.

Outline

Today on the podcast, we're going to be talking about understanding the problem that you're really trying to solve when it comes to microservices, before you make that choice. We're going to talk about technical and organizational challenges around microservices, and Sam's thoughts on how to address them. We'll spend some time discussing decomposing the database and some of the patterns to think about when you're dealing with that monolithic database. Then, once you're on that path, we'll talk about some of Sam's thoughts on how to deal with the growing pains that inevitably come with microservices. On today's podcast, expect to hear stories, advice, patterns and anti-patterns, and some best practices on microservices from Sam Newman. As always, thank you so much for joining us on "The InfoQ Podcast."

Sam, welcome back to the podcast.

Newman: Thanks so much for having me back again.

Reisz: Late last year, November time-frame, you published a new book, "Monolith to Microservices: Evolutionary Patterns to Transform Your Monolith." What have you been doing with all the copious amounts of free time since then?

What I've Been Doing since Writing My Last Book

Newman: Writing a second book.

Reisz: A third book.

Newman: Yes. A third book, really, yes. The reason I wrote that book was that I started doing a second edition of my first book, "Building Microservices." The chapter in "Building Microservices", I think it's Chapter 5, was where I looked at how you break systems apart. I started rewriting the second edition of my book by looking at that chapter. That chapter went from being 5,000 words to being 40,000 words in about a month. I thought, "This needs to be its own book." I split it off as its own book. You can read it by itself, but it also works as a companion. It's a deeper dive on those topics. Since I wrote that book, I've been dealing with the current situation, being locked down in my home. That's, as an independent consultant, getting your customers happy with doing things online. It's actually been easier than I thought. People are much more open to doing online consulting now than they were in the past. In the past, they wanted to see you in front of them. Literally, just about half an hour ago, I was working on some stuff around the second edition of "Building Microservices", and doing a lot of exploration of computer science papers written in the early 1970s. I was descending down that particular rabbit hole this morning.

Reisz: It's definitely an interesting time now. I can't say what we're doing now is exactly normal remote working culture. It's an interesting time, to say the least. One of the questions that I always have when I'm talking to someone who writes books and consults is, what does your writing process look like? I'm curious.

What the Writing Process Looks Like

Newman: I've spoken to lots of authors, and they're all different, as in how it works. It's all quite personal. I'm what I consider to be a bit of a momentum writer. When I'm writing an initial draft of a chapter, I have to have at least a rough structure already mapped out. That's the thing that will bounce around in my head for months beforehand. I'll sketch that out. Then literally, I'd like to sit down and over a period of four or five days just brain dump. For that to be most effective, I really need to write blocks at a time. I need to have three or four days in a row to really get up to speed on that. That then helps me get out that first initial draft. Then I can review a few times.

Writing Days

For me, I can get my stream of consciousness down as prose fairly effectively. I'm not somebody that could write that initial bit of work an hour here, half an hour there. That means, really, from a work-life point of view, or work-book balance, I have days where I'm writing, and I put nothing else in that day apart from writing. If I've got calls, or I'm doing some online training, I'll mix those days with other administration-type things I've got going on, or things to get done around the house. I try and keep writing days as writing days. I can't write for more than five or six hours a day, rare exceptions aside. Today was a writing day. I finished at 3:30. After that, anything else I get done is gravy. Once I've written that, I start getting review feedback from people. At that stage, I can go through and process that review feedback in little chunks here and there around other bits of work.

Reisz: For me when I'm writing, I find I have to write code and then shape the text for the code. Do you find code first, or write first, and then shape the code to what you wrote?

Order of Writing Code

Newman: It's interesting, because I think most of mine actually start as a story. I think most of the things I write start off life as a presentation, or a workshop. In those things, I'm taking somebody on a journey. Then I'm writing that journey down. The books I write, I don't write code-centric books, because there are other people that do that really well. I want to make my books more broadly applicable to people. I don't want to alienate the .NET'ers because I'm talking about Java, or whatever. For me, it's more that if I want to share these ideas, how would I do that if I was chatting to somebody in front of me? I'm quite fortunate that I do training as part of how I make my living. I get to go and take those messages and those stories out, and almost road test them by actually delivering training. Then after iterating on that a while and getting the flow, the beginning, the middle, and the end of each of the topics right in that forum, I then almost write that down. That's the process that works best for me. I do still try and think in terms of that narrative arc. That tends to be the way I work, more than just throwing in lots of topics I want to fit in. It's more that I'll do that narrative arc. Then I'll come back and say, "I've missed some stuff. How do I shoehorn it back in again?"

Reisz: Then you have a second and third book?

Focus on One Book

Newman: Yes. I've got lots of ideas about books that I want to write after this one is done, but you got to focus on what's in front of you.

Reisz: Speaking of that, let's dive in and start talking about, "Monolith to Microservices." Tell me about this book. You already mentioned briefly the origin but tell me more. Tell me about why this book and why now?

Why Monolith to Microservices Book

Newman: I often ask this question of the people who come to my workshops and conference talks: who here has got a system that's too big? Everyone puts their hands up. The vast majority of people don't start with a blank sheet of paper. They start with an existing system. They think, "Microservices are attractive." How do you make those two things work? Do you ditch the entire system and rebuild something from scratch? I think that's extremely problematic in most situations. The reality for the vast majority of people is that if they're interested in a microservice architecture, then they're going to have to find some way to take what they've already got and migrate it towards a microservice architecture.

The People Process Side

Even if you start microservices first, you get to situations where you find further decomposition is needed. I really wanted to go a bit deeper into that area to give people some concrete tips and advice about how they could make that journey happen for themselves. As part of that, you're looking at the people and process side of things, then looking at the code, how you pull the code apart and the patterns around that, and then spending a lot of time on the data. The idea is that I can share some concrete patterns. I've got more case studies in this book than I had in "Building Microservices" to show what is possible. I actually think it's healthier for people to take that migratory approach in an iterative fashion. I think it's a much more sensible approach than trying to build a microservice architecture from scratch, even if you're rebuilding an entire system from scratch that you already know. I still think migrating the existing monolithic architecture, or whatever you want to call your current system, is a much healthier approach.

Reisz: Back in March at QCon London, I was hosting the Architectures You've Always Wondered About track. Ian Thomas from The Stars Group was in that track. He started his talk off by referring back to your talk from the previous day. Ian was part of a project that took the sportsbook for PokerStars and the sportsbook for Sky Bet and merged them together. He had a greenfield app that they were building from these two things. In your talk, you said, look, microservices isn't the point where you start. It might be the architecture you get to, but really, you're starting at that MVP, the simplest thing that works. He put up this picture, I remember, and everyone cracked up, because he showed this picture of you looking straight at the camera. He said he felt like when you were saying don't start with microservices, you were looking directly at him. One of the things that people who come to one of your talks for the first time are initially surprised by, I think, is that you're not ringing the bell that microservices are the jumping-off point. Can you talk a bit more about that?

Microservices Is Not the Jumping Off Point

Newman: I don't think that anybody who has spent a lot of time looking seriously at the challenges associated with building a microservice architecture thinks that they are right for all situations. Fundamentally, a microservice architecture is a distributed system, and distributed systems are complicated. They have baggage associated with them. There's a lot of complexity that comes with these systems. The idea that everybody is aware of those issues and knows how to deal with those problems just isn't true. We may have learned about some of those ideas at university perhaps, but they're not things that most people have to deal with. For me, it's like, "Microservices give you all these things. Yes, that's cool. They also come with all these problems as well." When I'm working with a client, my mindset typically is not, why shouldn't we use microservices? It's normally, why should we? I think I got asked, what's the one thing you'd say about when you would use microservices? The answer normally is, when we've tried everything else. Because for an awful lot of things that people want to do, there are often easier ways to achieve the same goal. A lot of people want improved scale. Have you tried having multiple copies of your monolith? Have you tried getting a bigger machine? Those are often quicker things that you can explore and experiment with beforehand.

The Extent of Microservices

Even then, if you do want to do microservices, a lot of this comes down to the extent to which you do them. The analogy I've always tried to use with adopting microservices is that it's not like flicking a switch. It's not an off or an on state. It's much more like a dial. If you're interested in trying microservices out, then just create one service. Create one service. Integrate it with your existing monolithic system. That's turning the dial a little bit. Just try one and see how one works. Because the vast majority of the problems associated with a microservice architecture are not really evident until you're in production. It's super important that even if you think microservices are right for you, you don't jump all the way in. You dip your toe in the water first and see how that goes before you turn that dial up.

Reisz: One of the things that I liked in the book, and I've fallen into this trap before, of thinking of the monolith as this one nebulous thing, just one classification of an app. You break it down and talk about different types of monolith, and how even these different types of monolith can move you along this journey towards microservices. Can you talk a bit about that?

Different Types of Monolith towards Microservices

Newman: I talk primarily about a monolith as being a unit of deployment. I say that could be all of my code in one process. It could also be a distributed monolith where I've got a distributed system that I'm all deploying together. One of those monolithic patterns that can often work well, as a migratory step, would be what we now call the modular monolith. Where all of your code is packaged together in a single process. That code is actually compartmentalized into separate modules. This is a cutting edge idea from the late 1960s that we've just realized existed. If you look at all the work behind structured programming, information hiding, it's all, how do you organize code into these modules that can be worked on independently? Actually, for a lot of people, if you have all of your code running on a single process, you sidestep a lot of the challenges of distributed systems. If that code is organized around modules, and if you've got your module boundaries right, then that also is a potential migratory path. You could say, "My first microservice, I'm going to take one of those module boundaries, and potentially use that now as a service." There are different ways you can make that journey happen.
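
As a rough illustration of that idea (the module names here are hypothetical, not from the book), a modular monolith keeps everything in one process but hides each module behind a small public interface; if the boundary is right, that interface can later become the API of your first microservice:

```java
// Hypothetical "billing" module inside a modular monolith.
// Other modules only ever see this interface, never the implementation.
public interface BillingModule {
    Invoice createInvoice(String orderId);
}

// Package-private implementation: the logic and data access stay hidden
// inside the module, even though everything runs in a single process.
class InProcessBillingModule implements BillingModule {
    @Override
    public Invoice createInvoice(String orderId) {
        // ... billing rules, local database access, etc. ...
        return new Invoice(orderId);
    }
}

record Invoice(String orderId) {}
```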

Reisz: Without necessarily having to deal with the network partition in the whole process.

Newman: Absolutely.

Taking People on a Microservices Journey

Reisz: Another part of the book that I really enjoyed was about organizational challenges. There was a phrase you used about taking people on a journey. When you say take people on a journey when it comes to microservices, what did you mean?

Newman: It can happen at a lot of different levels. One of the common questions I get is, "How do I convince my boss that we should do microservices?" That's one of the ones I get a lot. This often comes from developers. I always say, why should they care? What's in it for them? You've got to get that side of it across. Implementing a microservice architecture or moving to a microservice architecture won't be cheap, and it won't be quick. If you're going to take your boss on a journey towards that microservice outcome, you've really got to explain why you're doing it. I talk in the book about separating activity from outcome. You're implementing microservices as an activity. The outcome shouldn't be microservices. You don't win by doing microservices. The only person that wins if you do microservices is me, because I sell books about microservices. You're implementing microservices to achieve something. What is it you're actually trying to achieve? It's amazing how many people I chat to who can't give me a clear articulation of why they're doing microservices. It starts with having a really clear vision of what it is you're trying to achieve. Then you can say, is microservices the right approach for this? Are we using microservices in the right way?

Problems That Require a Microservices Approach

Reisz: Decompose that a bit. What are the smells? You're running a monolith and you start to have a set of problems? What are some of those smells? What are some of those problems that indicate maybe microservices is something you want to look at?

Newman: I would say the most common one that I see for larger enterprise organizations is that they want to have lots of developers working on a given problem, and those developers are getting in each other's way. What they want is for those developers to be able to work more independently from each other, to reduce what I call delivery contention. How do I have different autonomous teams working more in isolation, but still have the things that they create come together to form an overall system? That's a big part of it: how do I get more people working more effectively and efficiently? Scaling comes up sometimes, but much less frequently than you'd think. There are some aspects around scaling that can be pretty beneficial with microservice architectures. Data partitioning is something I'm seeing a lot more. If you isolate, for example, where your personally identifiable information is, you can sidestep some GDPR concerns. You could say, "This part of my system has to have PCI sign-off or GDPR sign-off. This part of my system doesn't." Those are three quick examples of types of problems which can often lend themselves quite well to a microservice architecture.

Things to Have Right before Decomposing Your Monolith into Microservices

Reisz: You start to get these smells of team velocity, and you start to see some areas where privacy concerns or scaling needs differ across parts of your system. There was famously a blog post years ago from Martin Thompson, "You must be this tall for microservices." Before you can really jump off that cliff and even incrementally decompose your monolith into microservices, what are some of the things that you really need to make sure you have right?

Newman: Because I see that process of dipping your toe in the water as being so gradual, and it is a very gentle first step, I don't have a big shopping list of things that I say people have to do. Some people say, "We've got to do X, Y, and Z." Actually, you probably should have automated deployment of your system, but if you don't, and you're only adding one more thing to deploy, it's not the end of the world. There's one big prerequisite I always say people should do first. That's to implement some form of log aggregation. By log aggregation, I mean some means by which a process can log to files locally, and those files can automatically be aggregated in a central location where they can be queried and so on. Traditionally, you think of things like the ELK Stack. I really like Humio for this. That's the one prerequisite I really insist on and I'm quite firm about. The reasons for that are actually quite straightforward. The first is, it's really useful, and it's the thing that is going to help you a lot early on.
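
As a minimal sketch of the local side of that setup (the class and field names are hypothetical, not from the podcast), each service just writes structured log lines with a correlation ID, and a shipper such as Logstash or the Humio agent forwards those files to the central store where they can be queried:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class OrderHandler {
    private static final Logger log = LoggerFactory.getLogger(OrderHandler.class);

    public void handle(String orderId, String correlationId) {
        // Put the correlation ID into the mapped diagnostic context so every
        // log line from this request carries it; the aggregation platform can
        // then be queried for all lines belonging to one request across services.
        MDC.put("correlationId", correlationId);
        try {
            log.info("processing order {}", orderId);
            // ... business logic ...
        } finally {
            MDC.clear();
        }
    }
}
```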

A Good Test of an Organization

The second thing is it's often a good test of an organization: whether you, and the different parts of your organization, can work out how to choose a tool and implement that tool effectively. A log aggregation stack is a very simple thing in the grand scheme of things. If that's something that you can't do as an organization, the chances are that all the other problems you're going to have to face with microservice architectures are going to be much tougher for you, and probably not things you're ready for. I always use it as a test of the organization. Because rolling out log aggregation means you've got to find a tool, pick the tool, get the operations team on board, get it bought, get it installed, and get it configured with your existing applications. It's not a massive amount of work, theoretically, but if it is for you, you might be thinking maybe we should sidestep this whole microservices thing for a little bit.

How the Organization Changes to Support Microservices Architecture

Reisz: I definitely want to come back and hit on logs, particularly as you start to scale microservices out and those logs and metrics become more voluminous. Before we get there, though, I want to talk about organizational culture and the impact that has on your architecture. Conway's Law says your architecture ends up modeling your organization's communication structures. You've got this monolithic application, which probably presumes you have an organization that models that. As you start moving towards microservices, how does the organization need to change its shape to be able to support that architecture?

Newman: It's interesting because I find that these things are often talked about in isolation from each other. Ideally, you get the most benefit where your organization and your architecture are aligned. I talk about ownership models. I've got a talk called "Rip It Up and Start Again," where I look at the different types of ownership models. The model which I find most effective for a microservice architecture is that a service is owned by exactly one team; one team might own more than one service. That keeps your ownership lines nice and clear. This really is what you would think of as a strong ownership model. That's the model I've seen work most effectively. Virtually every large-scale microservices organization I've looked at has adopted that model. My starting point is often, what is the existing organization? Often, the organization is already aligned around business concepts. That's a more modern shift that's been happening, away from siloed IT teams to organizational structures which are more aligned around the nature of the product, maybe around product verticals. If you're in that world already, your life becomes much easier. Then you'd be looking to create services that sit cleanly within those organizational lines.

Three-tiered Architecture Approach

I think it becomes much more problematic where you've got things like the classic three-tiered architecture type of approach, where you still have the database team, the services team, and the front-end team, who just deal with tickets that cut across every single different part of the system. In that world and in that environment, it becomes very difficult to bring microservices in, because microservices are fundamentally a vertical slice, and hopefully a vertical slice through presentation, business, and data tiers. That becomes more difficult. In those environments, it's much more like saying, "This is what a microservice is. We need to get a team that is working in a more poly-skilled fashion to own this stuff." You don't do the whole organization. You convince people on the model, and then you pick one team and see how that model works.

The UI Tier

I think what I've been quite unhappy with since I wrote "Building Microservices" is the fact that, although I always felt this was evident from what I was saying, so many people stop that end-to-end slice of functionality at the UI tier. They don't include the UI. I go to organization after organization, and they still talk about front-end and back-end developers. I have no problem with you being a front-end specialist or a back-end specialist. I think calling yourself a full stack developer is daft, because as Charity Majors says, "You're not a full stack developer unless you design the chips." Having poly-skilled teams makes sense to me. Many organizations have kept their UI as a silo. That's been really disheartening. I think part of this is maybe because we didn't talk about it enough early on. I didn't highlight these challenges enough. I think it's also partly been around the prevalence of single-page app technology, which does cause issues around UI decomposition.

Micro Frontend

Reisz: What are your thoughts on the term micro frontend? Is that just another name for microservices? Or is it a complementary technology?

Newman: Micro frontend is very specifically the concept whereby you take a single-page app and make it not a single-page app anymore. It's talked about right now in the context of single-page apps. With micro frontends, I don't have one single-page app representing my user interface; I now potentially have multiple single-page apps. We could then debate the word single at this point. I can have them coexist inside the same browser pane. This is really about how you work with single-page app frameworks in such a way that you can break that work apart into different applications, which can be worked on and deployed independently. You're now dealing with issues like, how would you sandbox your whole NPM chain and all this stuff? It absolutely, I think, would be a supporting technology for me. The use of micro frontends would be how I would decompose my user interface if I was trying to do that with single-page apps. If I've got a website, I could use pages as my unit of decomposition. I don't even need to think about micro frontends. I'm just serving up different pages from different services. It's definitely a supporting piece of technology and also a supporting concept.

Release Train as a Point in Time towards CI/CD and Not a Destination

Reisz: One of the things that you talked about in the book was a release train being a point in time in the journey towards CI/CD, but not necessarily the destination itself. Can you talk a bit about that?

Newman: I used to do a lot of CI/CD work, and I still do. A release train is an idea where you set a cadence. You say every four weeks the release train leaves, and all functionality that's ready goes out on the release train. That's the idea. It's a really effective technique to help people get used to delivering software on a regular cadence. The idea then is that you increase how frequently the release train leaves, and then eventually you get rid of the release train and move towards delivering on demand. I've always described a release train as a useful set of training wheels on your bike, but you want to learn how to cycle properly, and so eventually you get rid of them. I talk about it in the context of the book because I see some organizations adopt a release train for a services-based system. That can codify the idea that lots of services get released together. I talk about, in those situations, if you've got 20 different services, each individual service should have its own release train. You don't have a release train for the whole system.

Sarah Wells has talked about this as well, I think, when she's seen this happen, where organizations have teams which say, "We release every four weeks." You end up with a distributed monolith as a result. I say, "If you've got a release train, that's fine, but understand why you've got it." You want to move beyond it. If you've got one release train for all of your teams, allow each team to have its own release train. That's a good first step. Then start to increase the cadence of those releases and eventually move to release on demand. That's just what's in the "Continuous Delivery" book; it's hardly cutting-edge information, really. I didn't feel I'd have to have this conversation, but with something like SAFe, for example, which effectively codifies the release train as being the way to release software, it does come up more now. I think it just needs to be talked about a bit more to explain that this isn't an aspirational technique. This is something which for many people is a stage you move through towards a proper release-on-demand, or release-when-ready, type of continuous delivery flow.

Patterns in the Book Equivalent to the Strangler Pattern

Reisz: It made a lot of sense to me when you made the connection that if you can't move beyond the release train, you may be building, and reinforcing the patterns of, a distributed monolith. One of the big aspects of this book, as you're reading through it, is the number of patterns. You'll see familiar patterns in there, like the strangler pattern, to break off units of a monolith and move them out to microservices. There are other patterns in there: branch by abstraction, parallel runs, decorating collaborators. What are some of the other patterns in the book that might give an architect the same mileage they got out of the strangler pattern?

Newman: I think branch by abstraction is a big one. I think it's overlooked because I think the branch by abstraction pattern is typically only talked about in the context of trunk-based development. Some people find trunk-based development controversial. We won't have that argument again, because people who think trunk-based development is right are also correct. Because of that, that pattern is only looked at in the context of trunk-based development. People don't even look at it unless they're doing trunk-based development. I should probably explain it, shouldn't I?

Branch by Abstraction

Reisz: I was just about to ask that. I'll set it up for you. What is branch by abstraction?

Newman: You could distill it down to its smallest form and say that this is really the Liskov Substitution Principle at one level. The idea behind branch by abstraction is that you want to reimplement a piece of functionality, potentially to change that functionality, or in the context of a microservice architecture, to migrate that functionality to be a new service. What you're doing is basically this: you've got the existing code, and you don't want to break the existing code, but you need to come up with a new version of that code. I could do that with source code branching, or I could do that in the same code base. How do I do that? With branch by abstraction: effectively, I create an abstraction point that allows me to toggle between one or the other implementation. At the code level, it's really straightforward how I might do that.

I might have an order processor class that wants to send updates when the order is being dispatched. In an [inaudible 00:26:11] system, I would inject a notifications interface. That notifications interface could have more than one implementation. If I use that, I could have my existing notifications implementation, which actually does all the functionality inside my monolith, and I could have a microservice notification implementation that calls out to a separate notification service. At runtime, I could then change which one of those implementations I'm using; that could be done with a feature flag or something else.
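
A minimal sketch of that abstraction point in Java (the names, endpoint, and feature-flag mechanism are illustrative assumptions, not code from the book): the order processor only knows about a Notifications interface, and a flag chooses between the in-monolith implementation and one that calls the new notification service:

```java
// The abstraction point: order processing depends only on this interface.
interface Notifications {
    void orderDispatched(String orderId, String customerEmail);
}

// Existing behavior, still living inside the monolith.
class InProcessNotifications implements Notifications {
    @Override
    public void orderDispatched(String orderId, String customerEmail) {
        // ... existing monolith code: render the email, write to the outbox table ...
    }
}

// New behavior: call out to the separate notification microservice.
class RemoteNotifications implements Notifications {
    private final java.net.http.HttpClient http = java.net.http.HttpClient.newHttpClient();

    @Override
    public void orderDispatched(String orderId, String customerEmail) {
        var request = java.net.http.HttpRequest.newBuilder()
                .uri(java.net.URI.create("http://notifications/dispatched")) // hypothetical endpoint
                .POST(java.net.http.HttpRequest.BodyPublishers.ofString(
                        "{\"orderId\":\"" + orderId + "\",\"email\":\"" + customerEmail + "\"}"))
                .build();
        http.sendAsync(request, java.net.http.HttpResponse.BodyHandlers.discarding());
    }
}

class OrderProcessor {
    private final Notifications notifications;

    OrderProcessor(FeatureFlags flags) {
        // The toggle: a feature flag (or config) picks the implementation at runtime.
        this.notifications = flags.isEnabled("use-notification-service")
                ? new RemoteNotifications()
                : new InProcessNotifications();
    }

    void dispatch(String orderId, String customerEmail) {
        // ... dispatch logic ...
        notifications.orderDispatched(orderId, customerEmail);
    }
}

interface FeatureFlags { boolean isEnabled(String flag); }
```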

Refactoring

At one level, this is just abstractions and toggleable abstractions. When we do it in the context of a microservice migration, what we're typically looking for is for it to be a refactoring. A refactoring is something that changes the structure of the code without changing its behavior. When you're migrating functionality from an existing application to a microservice, we typically want to keep the functionality exactly the same so that we can compare and make sure we've done it well. This is effectively creating two different implementations of the same abstraction with exactly the same behavior, which is really the Liskov Substitution Principle. Often, when you do branch by abstraction, you're varying the behavior; we typically don't want to do that in a microservice migration. Because if we can run both those implementations and make sure they're working in the same way, we can compare results, and so on.

If You Do Branch By Abstraction, Do You Release It as a Canary?

Reisz: Let's take that to its natural extension. If you follow along and you do branch by abstraction, you have the ability to inject, maybe with a feature flag. Do you release that as a canary and then instrument it to compare the results? What's the follow-on once you've done that, to know whether this is really the right approach?

Newman: A lot of it does depend a bit on a feature-by-feature basis. It's down to the risk of this change breaking everything else. Probably the ultra-cautious end of this would be to do a parallel run. A parallel run would be where you run all calls through both implementations of that piece of functionality and compare the results. Then say, "It's working." Typically, in that situation, you'd have the old implementation, which is the one you trust, and your new microservice is the one which you don't yet trust. Whenever a call comes in to use that abstraction, you pass that call on to both implementations of the abstraction. Then you're getting a complete like-for-like comparison. That comparison can be done live. That comparison could be done offline. Then when you get to a place where you're confident, you can say, "I can now go on with the new version that's working appropriately. I can remove the old version."

Comparing the Results from a Parallel Run

The thing with a parallel run is, you're getting to compare the results. If the microservice implementation misbehaves, that functionality has never actually been made visible to the customer. Because the results of the implementation you surface to the customer are from the old implementation. If I'm calling the monolith calculation, and I'm calling the microservice calculation, I want the answer to be the same. I'm never going to surface any answer to the customer other than the one that comes from the monolith implementation until I'm at a point where I trust it. That means that any problems with your microservice execution are completely hidden from the customer.
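
A sketch of what that parallel run might look like in code (the class and method names are hypothetical): both implementations are invoked, any divergence is only logged for later analysis, and the customer always gets the trusted monolith's answer:

```java
import java.math.BigDecimal;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

interface PricingCalculator {
    BigDecimal priceFor(String orderId);
}

class ParallelRunPricing implements PricingCalculator {
    private static final Logger log = LoggerFactory.getLogger(ParallelRunPricing.class);

    private final PricingCalculator trusted;   // old, in-monolith implementation
    private final PricingCalculator candidate; // new microservice-backed implementation

    ParallelRunPricing(PricingCalculator trusted, PricingCalculator candidate) {
        this.trusted = trusted;
        this.candidate = candidate;
    }

    @Override
    public BigDecimal priceFor(String orderId) {
        BigDecimal oldResult = trusted.priceFor(orderId);
        try {
            BigDecimal newResult = candidate.priceFor(orderId);
            if (oldResult.compareTo(newResult) != 0) {
                // Record the divergence for later comparison; never surface it.
                log.warn("parallel-run mismatch for {}: trusted={} candidate={}",
                        orderId, oldResult, newResult);
            }
        } catch (Exception e) {
            // A failing candidate must not affect the customer either.
            log.warn("candidate pricing failed for {}", orderId, e);
        }
        // The customer only ever sees the trusted implementation's answer.
        return oldResult;
    }
}
```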

A Canary

A canary is a different track on that. Of course, it would have a set portion of the traffic go to the microservice implementation. That means if that had a problem, then those people who are in that canary group would see that issue. If your canary group is actually your internal customers, or your internal beta test team, that might be acceptable. A lot of it would depend on the business context. I talk about parallel runs. Branch by abstraction, as a pattern, makes it quite easy to implement parallel runs. It also makes it quite easy to implement internal canaries. I talk about those things as being under the umbrella of what we call progressive delivery. The different techniques around how you roll out functionality to end-users.

Dealing with COTS

Reisz: It totally makes sense. What happens when you have a monolith and there is COTS functionality that is embedded within the monolith? A CRM, or a CMS, or something along those lines, how do you deal with that COTS? Do you wrap it? What are some of the strategies there?

Newman: Wrapping it can work. Wrapping is a good technique. I can't remember if it's in this book or in "Building Microservices", but I talk about our experiences of helping move away from a Salesforce-based system. The Salesforce-based application dealt with accounts, and revenue, and project information, and everything else. What we started doing was wrapping it with multiple services. We stopped people going straight to Salesforce for the account information, and said, you're going to the account service. Behind the scenes, the account service was really just shelling out to talk to Salesforce. Initially, it was providing almost an adapter layer on top. That got you to the point where, rather than going direct to the COTS system, we could then look at doing the migration behind the scenes. That approach can work fairly effectively. At that point, you can start doing things like strangler figs inside those adapters to divert functionality away from the COTS system.
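
A rough sketch of that wrapping idea (the service and client types below are hypothetical, not the actual project code): callers depend on an account service whose first implementation is just an adapter over the vendor API, which leaves room to migrate the backing store later without touching callers:

```java
// What the rest of the system depends on; nobody talks to the COTS product directly.
interface AccountService {
    Account accountById(String accountId);
}

record Account(String id, String name) {}

// Hypothetical thin client over the vendor's API.
interface VendorCrmClient {
    VendorAccountRecord fetchAccount(String accountId);
    record VendorAccountRecord(String id, String name) {}
}

// First implementation: purely an adapter layer over the COTS system.
class VendorBackedAccountService implements AccountService {
    private final VendorCrmClient crm;

    VendorBackedAccountService(VendorCrmClient crm) {
        this.crm = crm;
    }

    @Override
    public Account accountById(String accountId) {
        var record = crm.fetchAccount(accountId);
        // Translate the vendor's representation into our own model, so callers
        // never depend on the COTS data shapes and the backend can change later.
        return new Account(record.id(), record.name());
    }
}
```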

Another option is just good old-fashioned getting hold of the database. Some COTS products don't have a nice API. Salesforce, relatively speaking, has a pretty easy-to-use API, so we've got a lot of things we can do at that layer. Some software doesn't have that. Sometimes all you've got is the database the data is stored in. In those situations, this becomes more problematic. There you might have to do some weird Change Data Capture things to extract the data out. With COTS-based migrations of functionality, you don't have the ability to change the monolith itself, because the monolith is the COTS product. You can't use patterns like branch by abstraction, because they require an internal change to the structure of the code. Instead, you're much more limited in your options. That can often result in you having to take maybe bigger slices of functionality out of that system. A COTS product with a decent API gives you a lot more options for how to do that. If you've got a GUI-based, very much black-box type of COTS product, sometimes your best bet is to go into the database or to take bigger slices of functionality out of that system.

Decomposing the Database

Reisz: Speaking of the database, there's quite a bit in this book that talks about decomposing the database. Why so much attention in the book for decomposing the database?

Newman: If you want to have services that can be worked on and deployed independently, then you need to avoid the quite nasty, often pathological coupling that comes from sharing the same database or reaching into somebody else's database. With microservice architectures, typically, if a microservice needs to store, manage, or take state, it does so in a database that it owns, that it controls, and that it hides from the outside world. If we want to move from a monolithic system, where our data is probably in one big database, to microservices, we've got to pull that database apart. This tends to be where a lot of these migrations falter. People can find ways to pull the code apart, but they don't bother to pull the database apart, and they leave themselves without really getting the benefits from that microservice migration.

If we're to break the database apart, that's difficult. They're often relational databases, and a lot of their power comes from being in one place. They are amazing things, databases. We're going to have to do some horrible things to a relational database in order to get that data out. Things like breaking foreign key relationships, messing up join queries, and potentially having to give up on transactional integrity in some areas. I really wanted to give people a whole lot of patterns and different techniques in that space to show what is possible, but also to show that this change doesn't have to be daunting. You're not saying, "We're going to do it all today." You're saying, "We're going to make one step today. That step will get us further down the line towards where we want to be." I'm trying to show patterns that, like refactorings in application code, are small changes you can make to a database that allow you to edge yourself in the right direction.

Leveraging Change Data Capture in Decomposing a Monolithic Database

Reisz: I love that phrase, pathological coupling. That's a great phrase. One of the techniques you talk about in there is Change Data Capture. It's one of the ones I think is fantastic for incrementally moving towards more of an event-driven system. Can you talk a bit about Change Data Capture and how you might leverage that technique in decomposing a monolithic database?

Newman: A lot of the patterns in the book are patterns that preexist the microservices world. I've just tried to take those patterns and show how they can be used in the context of microservices. The strangler pattern is one example of that. Change Data Capture is a really straightforward idea. Basically, once a change is made in one data source, you can capture that change and pass it on somewhere else. A lot of CDC systems basically work by looking at the transaction log of the database, and large numbers of people do this now; it's one of the biggest use cases for Kafka. You use something like Debezium: it looks at the transaction log of your database, and when a bit of data gets inserted, an event gets pumped out over Kafka, and things can subscribe to those events. This can be used for data replication; it's a classic part of normal ETL processes. Change Data Capture can be really useful if you need to replicate state between two different systems, potentially because you're in something like a transitional mode where you've got maybe two sources of truth for a period of time. I share a couple of patterns around that, like the example from Square and the tracer-write type of situation, although they didn't do CDC there. It can also just be very useful in terms of attaching behavior to changes in a monolithic system.

Awarding Points When an Order is Placed

I think I gave the example of what happens if I want to award points when an order is placed, and the only way I know an order has been placed is when an order arrives in the database. When an order arrives in the database, I could have it fire an event via a Change Data Capture pipeline, maybe over Kafka. I could then receive the event that says the order has been placed. My little loyalty microservice could then start awarding points to you in exchange for your order. It can be a really nice system, actually. I think the one challenge with any Change Data Capture system is that it requires your CDC pipeline to know about the schema. At this point, you're normally talking about a monolithic database, which is so difficult to change that no one is changing it, so it often is quite stable. There are some good modern toolchains in that space. I don't think it's just narrowly looked at in the world of ETL anymore. There was a really good case study recently from Airbnb talking about how they did CDC as part of their microservice transition. I think they've built their own internal system for that.
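
As a sketch of the consuming side of that pipeline (the topic name, group ID, and awardPoints logic are assumptions for illustration; Debezium's default topic naming is server.schema.table), the loyalty service simply subscribes to the change events that Debezium pumps onto Kafka for the orders table:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LoyaltyPointsConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "loyalty-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Hypothetical topic produced by Debezium for the monolith's orders table.
            consumer.subscribe(List.of("monolith.public.orders"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    // record.value() holds the Debezium change event for the inserted order row.
                    awardPoints(record.value());
                }
            }
        }
    }

    static void awardPoints(String changeEventJson) {
        // Parse the event, look up the customer, add loyalty points (omitted).
    }
}
```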

Are Frameworks a Foregone Conclusion With Scale?

Reisz: Debezium with Kafka that you mentioned, those are two that I'm familiar with. They're pretty powerful in that space. We're nearing the end here. I wanted to touch on just a few growing pains. We've talked about a handful of services, but I want to talk about, as you continue to scale, some of the challenges that people face. I want to talk a bit about frameworks, things like service meshes, and maybe just touch briefly on logging before we wrap up. There's a quote that you have in the book, more services, more pain. As you continue to scale out your services, are frameworks just a foregone conclusion, things like service meshes?

Newman: Not necessarily a foregone conclusion. I think service mesh is a great example. Service mesh is a conceptually simple idea but the quality of the implementations in that space is still variable. I have been giving people the same advice about service mesh for the last three-and-a-half years, which is, if you can wait six months to work out which service mesh you pick, wait six months. I'm still giving that advice. I think, conceptually, they are good. The whole point of a service mesh is that you get to push a bunch of common behavior into the platform. The difference between maybe that and a normal framework like Spring Boot, is that there you're relying on the framework running inside the service itself. Maintaining consistent behavior across a whole bunch of services is difficult. You can't say, "Everybody, upgrade to Spring Boot version 15," or whatever because you don't want to be doing those deployments. With service mesh, you can have some degree of common behavior running in the platform.

Service Mesh for Synchronous Communication

The other thing to understand with service meshes is that they really are primarily only for synchronous communication. They don't really give you many solutions, any help at all really, in the world of async communication. Just look at the variety of different options you've got around right now. The Istio team just realized the other day that they needed to completely rearchitect and rebuild how Istio is built, run, and managed. I love the idea. I think something like that might make sense as part of your future. I think what's more likely is not that you've got to run on Kubernetes and on a service mesh; it's more that most development in the next 5 to 10 years will be done on some FaaS-type platform. I think that's much more likely going to be the future for most developers, even if under the hood it's a giant, hellish Jenga stack of Kubernetes, and Istio, and Knative, or something else.

Dealing With Volume of Logs and Metrics

Reisz: What advice do you give on the observability front, talking about logs, metrics, and tracing? As you deploy more services and get more scale, those logs continue to get bigger? How do you deal with the volume of logs and metrics?

Newman: I think volume-wise, there are some balancing forces here, because on the one hand, you don't know what you need until you need to ask a question of the data. At one level, you want to lean towards logging as much as you can. To be honest with you, with logging and log volumes, if you're logging a lot of data, you need to quantify what a lot of data is and look at the capabilities of the log aggregation platform you're looking at. With something like the ELK Stack, for example (Elasticsearch, Logstash, Kibana), the thing that's most concerning in terms of large volumes is Elasticsearch. I spoke to a big SaaS-based company that had a dedicated team just running Elasticsearch for the ELK Stack cluster. They're dealing with quite large volumes. A lot of people aren't actually in that space. If you are at the massive-volume end, where you're generating that many logs, then there are dedicated log aggregation tools that are great at doing that, if you pay some money. Get Humio and you'll be happy.

Logs versus Other Data Types and Metrics

I think it's a bit different with logs versus other types of data, like metrics and things like that. Often, when it comes to large volumes of data, if you can collect that information in a semi-asynchronous or batch-oriented fashion, which is how most log aggregation is done, that allows you to sidestep a bunch of the problems. There are other situations where you need to collect data in such a way that the data comes through now. Distributed tracing is a great example of that. You need a single wall clock to do effective timing, to look at how long something takes. You look at systems like Jaeger, Honeycomb, or LightStep. Those are quite different bits of data you're gathering there, because the logs only need to get to me within the next couple of minutes, whereas when I'm sending traces as part of a distributed trace, they have to come through in a much more synchronous fashion. Because of the large volumes of data that you often deal with there, and because you don't want to affect the running system, you do sampling.

Getting Data Off Of Machines

To be honest with you, it's very hard to say, this is what you're going to need. This comes back to the idea of the dial. I always come back to the key point: you need to get the data off of those machines and stored somewhere central, where you can actually go and ask questions of that data. That's the first thing. You need to get your logs out, you need to get your metrics out, and you stick them somewhere where you can query them. Ideally, these are structured, repeatable queries. For the tools I tend to look at for metrics, and certainly for tracing: if I've got no money, I'm looking at maybe using a mix of Jaeger and Prometheus. If I've got money, I'm looking at LightStep and Honeycomb, just because they are tools that have been built with these kinds of systems in mind. I think the biggest challenge is they look quite alien to some people.

Going from Microservices Back To a Monolith

Reisz: Recently, we've seen some projects, some companies, some different architectures that have been reverting, going from microservices back to a monolith, at least a process monolith. Istiod comes to mind. Why do you think that is?

Newman: It's really straightforward. They didn't read my books. One of the things I've talked about a number of times is the challenge of microservice-related architectures being operated by the end customer. When you create a piece of software that you're giving to somebody else to own and manage, that's an issue. With a microservice architecture, you push a lot of complexity into the operational space, and all of that complexity is visible to the customer. With the original Istio architecture, you effectively had to run a little mini microservice stack in order to run your microservice stack. By moving it all back into a single monolithic process, they make the operation of it much simpler.

I think if people can't see the inherent irony in this, I don't know what to do. That's why they've done it. Actually, I think it's good and it's healthy. The fact it has taken so long for them to do this is a bit of a worry, especially given that Istio is effectively the underpinning of what is now going to be Knative. Although, now that it's solely in Google's hands, who knows what's going to happen there? It was totally understandable. I'm glad they did it. I think it was the right thing to do. I think it was a brave decision. I don't read anything into it about Istio not being fit for purpose. This is a fairly fundamental change that has come two years after they stabilized what Istio was supposed to be. I think this is why I say the space of service meshes is still not stabilized.

Reisz: I think it was pretty brave, though. You got to eat your own dog food, so you want to build on top of the patterns that you're espousing. They also understood that customers were fighting with the complexity of what was there. I think it made a lot of sense.

Newman: That was very brave.

Reisz: What's next?

What's Next?

Newman: Writing and lots of remote consulting and remote training for my clients and hoping my internet holds up. That's basically my next life for the next six months.

Reisz: I hear you, Sam. Sam, thank you so much for joining us on "The InfoQ Podcast." As always, it's great to chat with you.

Newman: Thanks so much.
