Everyone likes the idea of building something new. So much freedom. But what about making changes after you have users? In this episode, Thomas Betts talks with Brandon Byars about how you can evolve your API without versioning, a topic he spoke about at QCon San Francisco.
Key Takeaways
- When an API has to evolve, creating a new version creates work for every consumer.
- Strategies exist for API producers to modify an API without versioning. This centralizes the cost of the changes, instead of distributing it to all consumers.
- Full backwards compatibility can be maintained by chaining together the compatibility code in chronological order.
- Obviousness, elegance, and stability are three criteria for evaluating patterns for API evolution.
- Mountebank is an open-source project that has used various strategies to evolve its public API without versioning.
Transcript
Promo [00:01]
Hi everyone. Registration is now open for QCon London 2023, taking place from March 27th to the 29th. QCon International Software Development Conferences focus on the people that develop and work with future technologies. You'll gain practical inspiration from over 60 software leaders deep in the trenches creating software, scaling architectures, and fine-tuning their technical leadership, to help you adopt the right patterns and practices. Learn more at qconlondon.com.
Intro [00:29]
Thomas Betts: Everyone likes the idea of building something new, so much freedom. But what about making changes after you have users? Today I'm talking with Brandon Byars about how you can evolve your API without versioning, a topic he spoke about at QCon San Francisco. Brandon is a passionate technologist, consultant, author, speaker, and open source maintainer. As head of technology for Thoughtworks North America, Brandon is part of the group that puts together the Thoughtworks Technology Radar, a biannual opinionated perspective on technology trends. He is the creator of Mountebank, a widely used service virtualization tool, and wrote a related book on testing microservices. Brandon, welcome to the InfoQ podcast.
Brandon Byars: Oh thanks. Happy to be here.
The pain of API versioning [01:05]
Thomas Betts: I set this up a little in the intro. Let's imagine we have a successful API and it's in use by many people and other services calling it, but now it's time to make a change. In general, if we're adding completely new features, that's easy, but when we need to change something that's already being used, that's when we run into trouble. Why is that so difficult and who's impacted by those changes?
Brandon Byars: Yes, it's a really hard problem, and the pain of absorbing the change is often overlooked. So let's start with that second question first. For the API consumers, when you see a new major version, regardless of how that's represented, whether as API versioning, SemVer, or some equivalent, that's indicative of breaking changes, because the API producer wanted to either fix something or change the contract in a breaking way. That is work for you to consume. And that work is oftentimes easy to overlook because it's federated amongst the entire population of API consumers. And a lot of times you don't even have a direct connection with them for a public API. Mountebank, for example, is a public command line tool and a hybrid REST API with some interesting nuance behind it.
The standard strategy that you always hear about is versioning, and of course versioning works. You can communicate to the consumers that they need to change their code to consume the breaking changes in the contract. But that is work, that is friction. And what I tried to do very intentionally with Mountebank, which is open source, so I had a bit more room to play, it's just a volunteer project, was to really try to come up with strategies outside of versioning that make that adoption easier, so consumers aren't frustrated with changes over time. And Mountebank itself is nine years old. It, itself, depends on APIs. It's a Node.js project, so it depends on Node.js libraries.
And I've spent more volunteer nights-and-weekends time than I care to admit not adding features, simply keeping up with changes to some of the library APIs that had breaking changes. They legitimately cleaned up their interface, but they cleaned up the interface at the cost of me doing additional work, and that adds up over time. And so I really pushed hard to come up with other strategies that still allow me to improve the interface over time, or evolve it in ways that would typically be a breaking change, but without forcing the consumers to bear the work associated with that breaking change.
Thomas Betts: And I like how you mentioned that, in that case, you were a consumer that's also a producer. A lot of us software developers straddle both lines. We're creating something that someone else consumes, and sometimes that's a customer-facing product, a UI, but sometimes it is an API that's a product, which is more like what you're describing with Mountebank.
Brandon Byars: Yes, of course, API is a broad term, application programming interface. I mentioned Node.js libraries; those are in-process, and the JavaScript function definition, for example, might be the interface. Mountebank has a REST API, but it also has embedded programmable logic inside of it that is similar to what you might expect of a JavaScript function interface, because you can pass JavaScript functions into it as well. So it works on a couple of different levels. But you're absolutely right, it is an API, it's a product released publicly. I don't have a direct line of communication to each of the individual users of it. I do have a support channel, but I would prefer, for my own sanity, that they don't use the support channel for just simple upgrade questions. I would prefer to take that work off of both them and me in terms of the hand-holding around it.
What is Mountebank and how does it provide useful examples? [04:40]
Thomas Betts: And so what exactly is Mountebank and then why was it a good system that allowed you to explore these ways of how to evolve an API?
Brandon Byars: Mountebank is what's called a service virtualization tool. That phrase I stumbled across after writing Mountebank; I hadn't come across it previously, and I considered it an out-of-process stub. So if you're familiar with JMock, or one of those mocking tools that do in-process stubbing, this allows you to take that out of process. If I want to have black box tests against my application, and my application has runtime dependencies on another service that another team maintains, perhaps, then anytime I run my tests, I need an environment where both my application and the service are deployed. Especially if another team is controlling the release cycle of that dependency, that can introduce non-determinism into your testing.
And so service virtualization allows you, in testing your application, to directly control the responses from the dependent service. You can test the happy paths, and, once you understand what the real service should respond like in exceptional scenarios, it's much easier to test the sad paths as well, allowing you a lot more flexibility in test data setup and test determinism.
And of course, it still needs to be balanced with other test approaches, like contract testing, to validate your environmental assumptions. But it allows you to give higher-level tests, integration or service or component tests, the same type of determinism that we're used to in-process.
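For a concrete flavor of what that means, here is a minimal sketch of standing up a virtual service (an "imposter", in Mountebank's terms) through its REST API; the port numbers, path, and response body are illustrative:

```javascript
// A minimal sketch: stand up a virtual HTTP service ("imposter") on
// port 4545 that returns a canned response for GET /products, via
// Mountebank's admin API (which listens on port 2525 by default).
// The ports, path, and response body here are illustrative.
const imposter = {
  port: 4545,
  protocol: "http",
  stubs: [{
    predicates: [{ equals: { method: "GET", path: "/products" } }],
    responses: [{ is: { statusCode: 200, body: JSON.stringify([{ id: 1, name: "widget" }]) } }]
  }]
};

// Run inside an async test setup (Node 18+ for the built-in fetch).
async function createImposter() {
  await fetch("http://localhost:2525/imposters", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(imposter)
  });
}
```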
So why is it a good platform for exploring these concerns? Part of that is just social. It's a single-owner open source product. I manage it, so I have full autonomy to experiment. It's also because of the interesting hybrid nature of it that I mentioned previously, where it's a command line interface that starts up a process that listens on a socket and exposes a REST API, and of course, it can spin up other sockets, because those need to be the virtual services that you're configuring.
And the programmable nature of it, where you can pass in certain JavaScript under certain conditions that try to cover off security concerns, allows for some really interesting evolutions beyond what you would normally represent in something like an OpenAPI specification. And recognize that an OpenAPI specification will never be rich enough to give you the full interface of the programmable interface that's embedded inside the REST interface. So it allowed me to explore a lot of nuance around what it means to provide an API specification, and to have the autonomy to do that. And it's a tool that, fortunately, had some pretty healthy adoption early on. So I was doing this in the face of real users, in the natural course of work, not trying to do something artificial that was just a science experiment on the side.
APIs are promises, not contracts [07:34]
Thomas Betts: So one of the things we usually talk about APIs, we describe them as contracts, but I remember in your QCon talk, you said that the better word was promises. Can you explain the difference there?
Brandon Byars: Yes, and it's really just trying to set expectations with users the right way, and have a more nuanced conversation around what we mean by the interface of an API. So we talk about contracts and we have specifications. Of course, if you remember, we went through that awkward transition from SOAP to REST in the 2008-era time frame, when we really didn't have any specification language for REST. There was a lot of backlash against WSDL for SOAP. It was very verbose, and so we went a few years without having a standard like what Swagger ultimately became.
So there was a period in my career where we experimented without these contracts, but we obviously still had an interface, and we would document that, maybe on wikis or whatever that might be, to try to give consumers an indication of how to use the API. We could only get so far with that; it still had flaws. And so we filled that hole appropriately with tools like Swagger and OpenAPI. There were other alternatives as well. These allowed us to communicate with consumers in a way that made it easier to build SDKs, and that enabled generic tools, like the graphical UI that you might see on a webpage describing the documentation, the Swagger docs. But it's never rich enough to really define the surface area of the API.
And that is particularly true when you have a complex API like Mountebank with an embedded programmable interface inside of it, because now you're talking about what is just a string on the JSON interface. But inside that string might be a function declaration that also has to have a specific interface for it to work inside the JavaScript context that it's executed inside of. And that's an example, but it's a more easily spotted example than what tends to happen even when you don't have a programmable interface, because you still have edge cases of your API that are always difficult to demonstrate through the contract.
And this idea of promises came out of the configuration management world. Mark Burgess, who helped create CFEngine, one of the early progenitors of Puppet and Chef and the modern infrastructure-as-code practices, defined a mathematical theory around promises that allowed him to build CFEngine. But it was also a recognition that promises can be broken in the real world. When I promise you something, what I'm really signaling is that I'm going to make a best-faith effort to fulfill that promise on your behalf. And that's a good lens to think about APIs, because under load, under exceptional circumstances, they will respond in ways that the producers could not always predict. And if we walk into it with this ironclad architectural mentality that the contract directly specifies what the API is and how it's going to behave, we're missing a lot of nuance. The promise framing allows us to have richer conversations around API evolution.
Trade-offs and evaluation criteria for API evolutionary patterns [10:32]
Thomas Betts: I want to go back to something you said. A lot of this is about communication, and that's where you got, in your talk, into the evolution patterns, the different ways to evolve an API. You had evaluation criteria, and communication seemed to be the focal point of that, and architects love to discuss trade-offs. What are the important trade-offs and evaluation criteria that we need to consider when we're looking at these various evolution patterns?
Brandon Byars: There's an implicit one that I didn't talk about much, because it's the one that everybody's familiar with, and that is implementation complexity. A lot of the time, we version APIs because we want to minimize implementation complexity, and the new version, the V2, allows us to delete a bunch of now-dead code so that we, as the maintainers, don't have to look at it.
What I tried to do was look at criteria from a consumer's perspective and the consumers don't care what the code inside your API looks like.
Obviousness [11:23]
I listed three dimensions. The first one I called obviousness. A lot of times it goes by the name, in the industry, of the principle of least surprise. Does the API, the naming behind the fields, the nesting structure, the endpoint layout, match your intuitive sense of how an API should respond? Because that eases the adoption curve. That makes it much easier to embrace, and you always have the documentation as a backup. But if it does what you expect, because we, as developers, are tinkerers and experimenters, that's how we learn how to work through an API. Obviousness goes a long way towards helping us adopt it cleanly.
Elegance/Usability [12:02]
I listed a second one that I called elegance, which is really just a rough proxy for usability: the learning curve of the API, consistency of language, consistency of style, the surface area of the API. A simple way to avoid versioning, for example, is to leave Endpoint1 alone and just add Endpoint1V2 as a separate endpoint; that allows you to not version. And it's a legitimate technique, but it decreases elegance, because now you have two endpoints that the consumer has to keep in mind, and they have to have some understanding of the evolution of the API over time.
Stability [12:40]
And then the third one is stability, which is how much effort a consumer has to put into keeping up with changes to the API over time. And of course, versioning is stable within a version, but oftentimes requires effort to move between versions. The techniques that I talked about in the talk meet stability to varying degrees. Sometimes it can't be a perfect guarantee of stability, and this is where the promise notion kicks in, but you can make a best-faith effort of providing a stable upgrade path to consumers.
Change by addition [13:12]
Thomas Betts: So that gets us to the meat of your talk, which was about these evolution patterns. I don't know if we'll get through all of them, but we'll step through as many as we can in our time. The first was change by addition, which, as I said in the intro, is considered the easy and safe thing to do. But can you give us an example and talk about the pros and cons of when you would or wouldn't want to change by addition?
Brandon Byars: Yes, the simplest example is just adding a new field, simply adding to the object structure of your API, and that should not be a breaking change for consumers. Of course, there are exceptions. It will break consumers who have strict deserialization turned on, who configure their deserializer to throw errors if it sees a field it doesn't recognize. But in general, we have to abide by what's known as Postel's Law, which says that you should be strict in what you send out and liberal in what you accept. And that was a principle that helped scale the internet.
Postel was involved in a lot of the protocols, like TCP, that helped to scale the internet. And it's a good principle to think in terms of for API design as well: having a tolerant reader. A more controversial example might be the one I just gave, where I decided that I got something wrong about Endpoint1's behavior, but I don't want to create a new version, so I just create Endpoint1V2 as a separate endpoint. That's a change by addition too, but it's an inelegant one, because it means consumers now have to understand the nuance between those two endpoints. So it increases the surface area of the API for what is fundamentally the same capability.
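As a hypothetical illustration of that tolerant reader idea, a consumer that picks out only the fields it needs is unaffected when the producer adds new ones:

```javascript
// Hypothetical tolerant reader: destructure only the fields this consumer
// cares about and ignore everything else. If the producer later adds a
// "discount" field to the payload, nothing here breaks.
function readProduct(json) {
  const { id, name, price } = JSON.parse(json); // unknown fields are ignored
  return { id, name, price };
}
```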
Multi-typing [14:39]
Thomas Betts: Yes, I can see that. GetProducts and GetProductsV2 return different types, and then what do you do with the results if you want to drill into them? That can quickly become a spaghetti pile of mess. The next one was multi-typing. What does that look like in an API?
Brandon Byars: Yes, so I did this one time in Mountebank and I regretted it, because I don't think it's a particularly obvious or elegant solution. I had added a field that allows you to specify some degree of latency in the response from the virtual service, as just a number of milliseconds that you wait. And then somebody asked to be able to make the number of milliseconds dynamic. I mentioned in passing this programmable embedded API inside the REST API; there was a way of passing a JavaScript function in another context. So I decided that was a solution that sort of fit within the spirit of Mountebank. But because I didn't want the GetProducts and GetProductsV2 situation, I didn't want to have a wait behavior, which is what it's called, and a separate waitDynamic behavior. I just overloaded the type of the wait behavior.
So if you pass a number, it interprets it as milliseconds. If you pass something that can't be interpreted as a number, it expects it to be a JavaScript function that will output the number of milliseconds to wait, and that works without having to add a new field. But it's a clumsy approach in retrospect, because it makes building a client SDK harder. It's an unexpected behavior of the API. So in retrospect, I would've gone with a less elegant solution that increased the surface area of the API, just to make it more obvious to consumers.
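A rough sketch of what interpreting that overloaded wait behavior might look like on the producer side; this is simplified, and the function name and eval-based injection are illustrative rather than what Mountebank actually does:

```javascript
// Simplified sketch of a multi-typed "wait" behavior: a number means a
// fixed latency in milliseconds; any other value is assumed to be the
// source text of a JavaScript function that computes the latency.
function resolveWait(wait) {
  if (typeof wait === "number") {
    return wait;
  }
  // Illustrative only: Mountebank runs user JavaScript in a controlled
  // context; a bare eval here just demonstrates the overloaded type.
  const fn = eval(`(${wait})`);
  return fn();
}

resolveWait(500);                                            // => 500
resolveWait("function () { return Math.random() * 1000; }"); // => dynamic latency
```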
Thomas Betts: The idea of having an overload makes sense when it's inside your code. I write C# mostly, and I can overload a function with different parameters and specified defaults, and that's intuitively easy to tell when it's inside your code. When you're getting to an API surface, that raises the level of complexity, because of how we're communicating those changes; it's not as obvious. You don't necessarily know what language is going to be calling into your service and what it's able to do.
Brandon Byars: Yes, that's exactly right, and that's why I mentioned it in passing, because I did do it. That was one of the very first changes I made in Mountebank, but I regretted it, and I don't think it's a robust strategy moving forward.
Thomas Betts: Yes, it's also a case of making all the decisions based on the best information you have at that point in time; the 500 milliseconds sounded like a good option but quickly ran into limitations. I think people can relate to that.
Upcasting [16:58]
I know I've run into the next one myself and that's upcasting. So take a single string and oh, I actually want to handle an array of strings. How does that look in an API and do you have any advice on how to do that effectively?
Brandon Byars: Yes, upcasting is probably my favorite technique as an alternative to versioning. The name really captures this idea of taking something that looks like an old request and transforming it to the new interface that the code expects. And I did something very similar to what you just described. I had something that was a single string; it was this notion that I could shell out to a program that could augment the response that the virtual service returns. But I quickly realized that it needed to be an array, because people wanted to have a pipeline of middleware, like other tools supported, so they could have multiple programs in that list. And the way that I went about that in Mountebank was I changed the interface. So if you go to the published interface on the documentation site, it lists the array. That is the only thing that's documented.
Because this was the request processing pipeline, every request came through the same code path. So I was able to just insert one spot in that code path, for all requests, that said, "Check if we need to do any upcasting." And what it would do is go to that field and say, "Hey, is the type a string? If it is, then just wrap an array around that string." And so the rest of the code only had to care about the new interface. That reduced the implementation complexity: instead of having to scatter a lot of this logic all throughout the code, I was able to centralize it in one spot.
It's also really effective because you can nest upcasts. In fact, this happened in the example that we're talking about, where it went from a string to an array and then, without getting into too much detail, it actually needed to turn back into a string, but with an array at the outer level. And so I had to have a second upcast that said, "Hey, is this an array? Turn it back into a string. And is this outer thing an array or an object?" It makes sure everything is the right type and goes through the transformation to fix it if it's not.
But again, it's very simple and very deterministic, because all requests in the pipeline go through the same code path. It centralizes the logic, and as long as you execute the upcasts in chronological order of when you made those changes, what would otherwise be versions, then it's a deterministic output, and you're accepting requests from basically anybody who has any previous version of your API; it will still work. Even if it doesn't match what's documented as the published interface, if it matches what used to be documented, the code will transform it to the current contract.
And so that's a really powerful technique that balances those concerns that we talked about around obviousness, and elegance, and stability. It's a very stable approach. There still are edge cases where you can break a consumer: if they retrieve the representation of a resource that has had its contract transformed by the upcast, that could break some client code that they have. You can still imagine scenarios where that could happen, but it's quite stable, and very elegant because it requires no additional work for the consumer.
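A minimal sketch of that kind of upcast, assuming the old interface was a single string and the new one is an array; the field name here is illustrative:

```javascript
// Upcast sketch: if an incoming request still uses the old interface
// (a single string), rewrite it in place to the new interface (an array
// of strings), so the rest of the code only ever sees the documented shape.
// The field name is illustrative.
function upcastShellTransformToArray(request) {
  if (typeof request.shellTransform === "string") {
    request.shellTransform = [request.shellTransform];
  }
}
```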
Centralize the cost of changes inside the API [20:05]
Thomas Betts: Yes, that's the key point you're trying to get to: minimizing the impact on the consumers. Having a version pushes the cost of the breaking change to them. But here, you're saying it is a breaking change, but you're accepting the cost as the producer of the API.
Brandon Byars: Yes, and what I like so much about upcasting is that accepting the cost is centralized and easy to manage. Whereas every consumer who used that field would've had to make that change with a new version, only I, as the producer, have to make this change with an upcast, and I can centralize it, and it's not a lot of change. And I have all of the context around why the change happened, because I'm the producer of the API, so I can manage it in probably a safer way than a lot of consumers could. I know where a lot of the minefields are that you might step on during the transformation process.
Thomas Betts: Yes, and I like that idea compared to having these versions. You talked about versioning increasing the surface area of the API. It's also a matter of increasing the surface area of the code that you're maintaining. And here, by implementing that one upcast, it's in one place and it's very clear, as opposed to now I've got the two endpoints, I've got double the code to maintain, and how do I support that going forward? You've effectively deprecated the old one by subsuming all of its functionality in the new one automatically.
Brandon Byars: Yes, so it's a clean technique because what you document as your published interface or contract is exactly what you would've otherwise done with a new version. It represents the new interface and the transformation code itself is very easy to manage with an upcast in my experience, at least with the upcasts I've done to date. And even when it's a complicated transformation, well that same transformation you would be asking your consumers to do were you to release a new version.
Thomas Betts: And like you said, with this specific change, you changed the published specification. You said, "I accept an array," but if someone still sent you a single string, which no longer abides by your published contract, you're like, "Oh, that's still good." And so there's no impact to them. But how do you resolve that discrepancy of, "Here's what I say works, but that's not all I accept"? It's like an undocumented feature.
Brandon Byars: That's where you run some risk, because now this is an undocumented feature; in fact, a subsequent example that hopefully we'll get to tripped over this. Those undocumented features can cause bugs. So you have to be thoughtful about that. You have to be careful. But it's part of the trade-offs. We talked about architectural trade-offs, and this is allowing us to have a clean interface that represents the contract we want without passing complexity to the consumers to migrate from one version to the next. So it reduces the friction of me changing the interface, because I have to worry less about the cost to the consumers, while maintaining the clean interface that I want, as long as I don't run into too much risk of these hidden transformations causing bugs.
And in the case that we just talked about, where it was simple type changes, I feel really confident that those don't cause bugs. The only bugs would be people round-tripping the request, getting the subsequent resource definition back into their code, and doing some additional transformations on the client side. So there are broader ecosystem bugs that could happen, but then it's the same cost that the consumer would've had to pay if I had released a new version. So it's not making their life any worse than a new version would.
Thomas Betts: And then you said that you just apply these in chronological order. So it's almost like a history. You have comments in there that say, "Hey, this was version zero, then version one, then version two," and you can see the history of I had to do this, and then I had to do that. And so is your code self-documenting, just for your benefit of, "Oh yes, I remember that decision that I had to make and this is how I solved it"?
Brandon Byars: Close. So I have a module called compatibility, and in just one spot in the request processing pipeline, I say Compatibility.Upcast() and I pass in the request. And then that upcast function calls a bunch of sub-functions. Every one of those sub-functions represents, effectively, a version, a point in time. And so the first one might have been "change the string to an array" and the second one might have been "change this outer structure into an object", whatever it is. But each of those is named appropriately and the transformation's obvious. And then the documentation, the comments, and so forth around the code give you the context. And I just have the advantage of being a single maintainer, the advantage and disadvantage. There are other disadvantages of being a single maintainer, but the advantage is that I know all the history, so it's well contained and very easy to follow.
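Putting that together, the module might look something like this sketch, with each sub-function standing in for what would otherwise have been a version bump; the function names and transformations are illustrative, not Mountebank's actual ones:

```javascript
// compatibility.js, sketched: each sub-function represents what would
// otherwise have been a new major version, applied oldest-first so that
// any historical request shape is walked forward to the current contract.
function upcastShellTransformToArray(request) {
  // What would have been v1 -> v2: a single string becomes an array.
  if (typeof request.shellTransform === "string") {
    request.shellTransform = [request.shellTransform];
  }
}

function upcastResponseToObject(request) {
  // What would have been v2 -> v3: a bare array gains an outer object.
  if (Array.isArray(request.response)) {
    request.response = { values: request.response };
  }
}

function upcast(request) {
  // Chronological order matters: each transform assumes its predecessors ran.
  [upcastShellTransformToArray, upcastResponseToObject]
    .forEach(transform => transform(request));
}

module.exports = { upcast };
```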
Downcasting [24:24]
Thomas Betts: So that's a lot of talk about upcasting. What's the opposite of that? Downcasting?
Brandon Byars: Yes, downcasting is a little bit harder to think through. This is taking something that looks like the new interface and making it look like the old interface. I had to do this at a couple of points in Mountebank's history, and the implementation logic for it is more complex. The reason I had to do this is because of that embedded programmable API that I mentioned. The REST API was the same; it just accepted a string. The string represented a JavaScript function that ran in a certain context. And over time, as often happens with functions where people are adding features, it just took on more and more parameters. And some of the parameters really needed to be deprecated, so it was starting to look inelegant.
The usual solution for this in the refactoring world is to introduce a parameter object: a single config parameter that has properties representing all the historical parameters that were passed to the function. So I did that. The challenge is I needed the code to work for the consumers who passed in the new interface and for those who passed in the old interface. The only thing that's documented is the new interface; it just takes a single parameter object. But what the code does on the downcast is secretly pass the second, third, fourth, and fifth parameters as well. And it secretly adds properties to the first parameter, which is now the parameter object, so that it has all of the properties of what was previously the first parameter. So for anybody who is passing the old interface, the code has been changed so that it will still pass what effectively looks like the same information, especially if you consider some duck typing on that first parameter, because it'll have more properties than it used to have.
And for people who are passing the new interface where they just have a single parameter object, everything works great. If they want to inspect the function definition, they can tap into those other parameters, but they have no need to. That's just there for backwards compatibility purposes.
So that code had to be sprinkled around; I couldn't centralize it. I could centralize the transformation to an extent, but I had to call the downcast everywhere it was needed. There wasn't a single point in the request processing pipeline where I could do that. So it was a little bit harder to manage downcasting.
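A sketch of what that downcast might look like, assuming a user-supplied function that historically took positional parameters and now takes a single config object; all names here are illustrative:

```javascript
// Downcast sketch: the documented interface is userFn(config), but older
// user functions were written as userFn(request, state, logger). We copy
// the legacy request properties onto the config object (duck typing) and
// still pass the old positional arguments, so both generations keep working.
// All names are illustrative.
function callUserFunction(userFn, request, state, logger) {
  const config = { request, state, logger };

  // Make the first parameter also look like the old first parameter,
  // for functions written against the old interface.
  Object.assign(config, request);

  // New-style functions read only config; old-style functions still find
  // their second and third parameters where they always were.
  return userFn(config, state, logger);
}
```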
Thomas Betts: But that, again, is your problem that you, as a maintainer, have to absorb, versus having your consumers figure out "do this here and do that there" and make all these little selective changes to how they consume the API. You just accepted that cost for them. Does it make it easier for them to understand, because you haven't added the complexity to the API surface?
Brandon Byars: Yes, and this one would've been awkward for certain API consumers to embrace, because the change isn't directly visible from the REST API contract. It's an embedded API inside the REST API contract. So I was particularly concerned about how to roll this change out in a way that was stable for consumers, that didn't cause a lot of friction or have them scratching their heads and poring over documentation to understand the nature of the change. I wanted it to be as seamless as possible for them, while giving everybody who's adopting the API for the first time, or getting started with it, what is a much cleaner interface.
Intentionally hidden interfaces [27:33]
Thomas Betts: I wanted to go back to... You mentioned that hidden interfaces were another landmine to worry about. Is it just a matter of you didn't provide documentation, but you accept something? And is that from laziness, or is it actually an intentional choice to say, "I'm not going to document this part of the API"?
Brandon Byars: Yes, it's intentional. I certainly have examples of laziness too, so I'm not trying to dismiss that as an approach. But in the cases that I wanted to call out, what happened was I got something wrong to begin with. And of course, when you get something wrong inside a code base, you just refactor it. But when you get something wrong in a way that is exposed to consumers you don't control, in a public API, it's harder to fix. And this is generally where versioning kicks in: it allows me a path to fix it, and then it's the consumer's problem to upgrade.
I had an example where, as I mentioned, one of the bits of functionality in Mountebank was shelling out to another program to transform something about the response from the virtual service. Originally, I passed these parameters as command line arguments, and it turns out that I just was not clever enough to figure out how to quote them for shell escaping across all the different shells. The Windows ones are where a lot of the complexity kicks in; especially the older cmd.exe gives you a lot of complexity around shell quoting that isn't portable to the POSIX shell-based terminals.
So I got it wrong, and I spent probably a full day trying to fix it. And I remember asking myself at one point, "Why am I doing this? Just pass the arguments as environment variables, problem solved." And eventually, I did that. I changed this programmable interface to pass in environment variables instead of command line parameters, because I couldn't figure out how to pass the command line arguments the right way with the right shell escaping of quotes. And to try to strike a balance on stability while giving the new interface that I wanted, I wrote it in such a way that Mountebank was still passing the command line arguments. If that didn't work because it broke something around shell escaping on a Windows shell, well, it never worked, so that's fine. And if it used to work for you, it should continue to work for you. Everybody else, the new adopters, just use the environment variables and get a much more scalable solution.
But this is where those hidden mines that we talked about can trip you up, because it turns out that escaping quotes on the shell wasn't the only problem. Shells also have a limit on how many characters you can pass to a command line program, and cmd.exe has the lowest limit. So I ended up having to truncate the amount of information passed, in ways that actually could break previous consumers, just to get over that limitation.
And it was a really interesting exercise to go through, because I had to trade off the lesser of two evils. Should I cut a new version and force everybody to upgrade, when in fact I had no evidence that anybody was tripping over this bug? If I truncated the number of characters passed to the shell, I had no evidence that would break anybody, and to this day, I don't have any evidence that it did. So I ended up making the change where I hid the previous interface. It's not in the documentation; it still passes the command line parameters, but shortens them in ways that could have broken somebody, something that used to work in the past and no longer does.
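A sketch of that compromise, assuming Node's child_process; the environment variable names and the character limit are illustrative:

```javascript
const { execSync } = require("child_process");

// Sketch of the compromise: the documented interface passes the request
// and response as environment variables; the old, undocumented interface
// (command line arguments) is still passed for backwards compatibility,
// truncated so shells like cmd.exe don't hit their command-length limit.
// The variable names and the limit are illustrative.
const SHELL_ARG_LIMIT = 4000;

function shellTransform(command, request, response) {
  const requestJSON = JSON.stringify(request);
  const responseJSON = JSON.stringify(response);

  // Hidden interface: truncated, crudely quoted CLI args could break an
  // old consumer; anyone needing the full payload reads the env vars.
  const args = [requestJSON, responseJSON]
    .map(arg => JSON.stringify(arg.substring(0, SHELL_ARG_LIMIT)));

  const stdout = execSync(`${command} ${args.join(" ")}`, {
    encoding: "utf8",
    env: { ...process.env, MB_REQUEST: requestJSON, MB_RESPONSE: responseJSON }
  });
  return JSON.parse(stdout);
}
```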
Thinking like a product manager [30:50]
And I left some notes in the release notes that get pushed out with every release of Mountebank. But instead of treating it as a pure architectural guarantee of correctness, I put on my product manager hat and asked, "As a user, what's the lesser of two evils?" If somebody runs into this bug, is the path to resolution pretty clear? Can I give them as much direction as possible in the documentation, the release notes, and so forth, if it's something they're running into? And can I do this in a way that hopefully impacts nobody, but if it does, impacts as few people as possible? And it felt like that was a path with less friction than releasing a new version that would've affected everybody in the upgrade path. But that was a really different way for me of thinking about API evolution, because I had to think about it more like a product manager than an architect.
Versioning is generally an architectural concern, but it's really part of your public interface that you release to users. It's also part of your product interface. And when you come at it with a product mentality and you think about how to minimize friction, you have a more nuanced understanding of the trade-offs. And I certainly did in that case.
Thomas Betts: Yes, that gets to where I wanted to wrap up: talking about how developers and architects should think about API evolution, not just the programming problems that you have. And I like that last example. Actually, I want to go back to it, because you had a bug, and sometimes, when you have a coding bug, you're like, "Oh, I can solve this inside this function and no one will know anything about it." But sometimes, you realize the bug is only a bug because of what's being passed in as input. And the fix is you have to change the input, and that, in this case, changes the API. Tell me more about that product management thinking of saying, "Well, we haven't seen any evidence that our customers are using this, and we think it'll be a minimal impact, an acceptable impact, for them."
Brandon Byars: There's a lot there. This is where it's a judgment call; it always is, anytime you're managing a product. But if you never risk upsetting some users with some feature changes as a product manager, then your product is going to be stuck in stasis. So you know you have to evolve. But you also know that you want to introduce as little friction as possible, because in your quest for new users, you don't want to lose the users you already have. It's one of the difficult parts of product management. And so in this case, it felt like the way to walk that tightrope was to take into consideration a few facts. The feature under consideration had not been out in the wild for very long before the bug was reported. So it's not like it had seen wide adoption yet.
The first change, switching to environment variables, happened pretty quickly. So most people who had used it should have been using the new interface. And the problem was some of those people were running into a bug because they passed large strings of text to the command line. They had no idea why this was breaking, because to them it was just an environment variable. They didn't understand that it was also being passed on the command line. That was more confusing to them than stripping that functionality out.
So I was breaking people using the new interface, and I had no evidence that there was any adoption of the old interface, because it had this bug. And so it was a risk/reward trade-off that said, "Hey, this feels like the path of least friction for the most people that leads to the cleanest outcome, so let's go down that path." And I haven't regretted it in that instance, but it's certainly something that requires a lot of nuance.
Brandon’s upcoming article [34:00]
Thomas Betts: I remember in your talk, you briefly mentioned that you were working on an article about this that'll be published soon. Is that still in the works?
Brandon Byars: That is still in the works. I had started an article last year and put it on ice, and this QCon talk that I gave and this podcast, Thomas, are a good nudge, 'cause I'm hoping over the winter break to get it over the line. If I can, then, since I've had a first-pass review with Martin Fowler, who's posted some of my other work, I'm hoping that we can get it on his blog in the new year.
Thomas Betts: All right. Well hopefully, people will be able to find that on... Is it martinfowler.com?
Brandon Byars: That's it. Yes. The one and only.
Thomas Betts: All right. Hopefully, that'll be coming out soon, early next year.
Brandon Byars: That's my hope. Yes.
Thomas Betts: Well, I want to thank you again, Brandon Byars, for joining me on another episode of the InfoQ podcast.
Brandon Byars: Thank you so much, Thomas, for having me.