Jessica Tai, a self-described "ex-monolith developer" at Airbnb, spoke at QCon San Francisco 2018 about her company's move from a Ruby on Rails monolith to a service-oriented architecture. The company has expanded from 200 engineers in 2015 to 1,000. It now has less downtime due to rollbacks and, Tai said, page load times that are up to 10x faster.
As the company grew, deployments became more complex, which provided the motivation for change. On average, reverts and rollbacks delayed Airbnb engineers from deploying their code to production by 15 hours each week.
Airbnb managed the complexity of splitting the monolith, which the company called Monorail, by phasing the migration and comparing Monorail's functionality with that of the new services. They started by sending 1% of the load to the new services and compared the results from both paths. They then progressively increased the load on the services until the comparison was clean against 100% of the load.
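The ramp-up described above can be sketched as a small comparator that always answers from the existing path and sends a sampled share of reads down the new path as well. This is a minimal illustration under assumptions, not Airbnb's implementation: `ReadComparator`, `monolith_read`, `service_read`, and `sample_rate` are hypothetical names.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ReadComparator:
    """Sends a sampled share of reads down both paths and records mismatches."""

    monolith_read: Callable[[int], dict]  # existing path, remains the source of truth
    service_read: Callable[[int], dict]   # new service path under evaluation
    sample_rate: float = 0.01             # hypothetical knob: start at 1% of the load
    rng: random.Random = field(default_factory=random.Random)
    compared: int = 0
    mismatches: List[Tuple[int, dict, dict]] = field(default_factory=list)

    def read(self, key: int) -> dict:
        expected = self.monolith_read(key)
        # Reads have no side effects, so issuing the same query twice is safe.
        if self.rng.random() < self.sample_rate:
            actual = self.service_read(key)
            self.compared += 1
            if actual != expected:
                self.mismatches.append((key, expected, actual))
        # Callers always get the monolith's answer while the comparison runs.
        return expected

    def is_clean(self) -> bool:
        """True once at least one comparison ran and none of them differed."""
        return self.compared > 0 and not self.mismatches
```

In use, an operator would raise `sample_rate` step by step (for example 0.01, 0.1, 0.5, 1.0) and move on only while `is_clean()` holds at the current step.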
This was relatively straightforward for reads, because they are idempotent: the same read query can be issued twice and the results compared with no side effects. Write comparisons, however, required a different approach. Airbnb achieved this by having the service write to a shadow database, then issuing a read request to both the production database and the shadow one and comparing the results.
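A rough sketch of the shadow-write comparison follows. It assumes the existing path keeps writing to the production database while the new service writes only to the shadow copy. The in-memory `InMemoryDB` class and the `monolith_write`, `service_write`, and `compare_write` functions are hypothetical stand-ins for illustration.

```python
from typing import Callable, Dict, Optional


class InMemoryDB:
    """Hypothetical stand-in for a database, keyed by record id."""

    def __init__(self) -> None:
        self._rows: Dict[int, dict] = {}

    def write(self, key: int, row: dict) -> None:
        self._rows[key] = dict(row)

    def read(self, key: int) -> Optional[dict]:
        row = self._rows.get(key)
        return dict(row) if row is not None else None


Writer = Callable[[InMemoryDB, int, dict], None]


def monolith_write(db: InMemoryDB, key: int, payload: dict) -> None:
    # Existing write path (assumed to remain the one that touches production).
    db.write(key, {**payload, "status": "pending"})


def service_write(db: InMemoryDB, key: int, payload: dict) -> None:
    # New service's write path; it should produce the same stored row.
    db.write(key, {**payload, "status": "pending"})


def compare_write(
    key: int,
    payload: dict,
    production_db: InMemoryDB,
    shadow_db: InMemoryDB,
    old_writer: Writer = monolith_write,
    new_writer: Writer = service_write,
) -> bool:
    """Write through both paths, then read both databases and compare."""
    old_writer(production_db, key, payload)
    new_writer(shadow_db, key, payload)  # production data is never touched here
    return production_db.read(key) == shadow_db.read(key)
```

Because the new service only ever writes to the shadow database, a bug in it shows up as a mismatch in the comparison rather than as corrupted production data.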
Airbnb used open-source tools as part of the new architecture. They developed Spinaltap, which they have open-sourced, to listen to changes in databases and put them onto a Kafka queue.
Airbnb ensured standardisation of best practices in coding their services. Airbnb looked to auto-generate the boilerplate code using Apache Thrift, an open-source framework that auto-generates code in multiple languages, including Ruby and Java.
Tai used the terms SOA and microservices interchangeably during her presentation, and it's clear that Airbnb borrowed a number of concepts from each architectural style. "Airbnb's approach is closer to service-oriented architecture (SOA)," Tai told InfoQ after the presentation. "Like traditional SOA, we focus on reusing common components as much as possible, including business functionality and storage." More akin to a microservices architecture is the ability of Airbnb engineers to code and deploy their own services independently.
Airbnb built their architecture so that all client requests go through an API gateway, which routes each request to the relevant service. Every service is built, deployed and scaled independently. The services initially ran on EC2 instances in AWS; Airbnb later moved them to a Kubernetes cluster to simplify scaling.
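The gateway's routing role can be illustrated with a toy dispatcher. The talk summary does not say how Airbnb's gateway selects a service, so the path-prefix matching below is an assumption, and `ApiGateway` and the example routes are hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class ApiGateway:
    """Toy gateway: a single entry point that dispatches to registered services."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        self._routes[prefix] = handler

    def handle(self, path: str) -> str:
        # Assumption: route by path prefix, with the longest matching prefix winning.
        for prefix in sorted(self._routes, key=len, reverse=True):
            if path.startswith(prefix):
                return self._routes[prefix](path)
        raise LookupError(f"no service registered for {path}")


gateway = ApiGateway()
gateway.register("/listings", lambda path: f"listings-service handled {path}")
gateway.register("/reservations", lambda path: f"reservations-service handled {path}")
```

Clients only need to know the gateway's address; each service behind it can be deployed and scaled without clients noticing.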
To ensure a standard approach to building services, Airbnb defined a set of tenets to design the architecture and services, as well as a service interaction design. These were created up front before the execution phase of the migration. According to Tai, "these service design principles were inspired by the widespread problems we were seeing with our monolith, especially regarding unclear ownership and tight coupling".
Tai describes four tenets in her talk:
- Services should own both reads and writes to their own data. If several services need access to the same data, this is done via the owning service's API.
- Services should address a specific concern. For Airbnb, this was about trying to find the right granularity when creating services. They wished to avoid services having too much functionality, becoming monoliths in their own right, while also avoiding such fine-grained services that there would be a "polylith" or distributed monolith.
- Avoid duplicate functionality by using shared services and libraries.
- Data mutations should publish via standard events.
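Two of these tenets, data ownership and publishing mutations as events, can be sketched together. This is a minimal in-process illustration with hypothetical names (`EventBus`, `ListingService`, `SearchIndexer`, and the `listing.updated` topic); a real deployment would use a message broker rather than direct callbacks.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """In-process stand-in for a message broker."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(event)


class ListingService:
    """Owns both reads and writes for listing data; nothing else touches its store."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._listings: Dict[int, dict] = {}

    def get(self, listing_id: int) -> dict:
        return dict(self._listings[listing_id])

    def update(self, listing_id: int, **fields: str) -> None:
        listing = self._listings.setdefault(listing_id, {"id": listing_id})
        listing.update(fields)
        # Every mutation is announced through one standard event shape.
        self._bus.publish("listing.updated", {"id": listing_id, "fields": dict(fields)})


class SearchIndexer:
    """Another service: reads listings only through the owning service's API."""

    def __init__(self, bus: EventBus, listings: ListingService) -> None:
        self._listings = listings
        self.index: Dict[int, str] = {}
        bus.subscribe("listing.updated", self._on_listing_updated)

    def _on_listing_updated(self, event: dict) -> None:
        listing = self._listings.get(event["id"])
        self.index[listing["id"]] = listing.get("title", "")
```

The indexer never reaches into the listing store directly, so the owning service remains free to change its storage without breaking consumers.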
In a nod to traditional SOA service design, Airbnb designed their services to "interact with a specific direction as well to have a strict flow of dependencies". Below is a diagram for the interaction design they ended up with. Airbnb added the middle-tier service type at a later date, once they identified a need for shared validation logic.
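A strict, one-directional flow of dependencies is something that can be checked mechanically. The sketch below assumes three placeholder tier names ("presentation", "middle", "data") purely for illustration; they are not taken from Airbnb's diagram, and `TIER_ORDER`, `is_allowed`, and `find_violations` are hypothetical.

```python
from typing import Dict, List, Tuple

# Placeholder tiers, ordered from the top of the call flow to the bottom.
TIER_ORDER: Dict[str, int] = {"presentation": 0, "middle": 1, "data": 2}


def is_allowed(caller_tier: str, callee_tier: str) -> bool:
    """A call is allowed only when it flows strictly downward."""
    return TIER_ORDER[caller_tier] < TIER_ORDER[callee_tier]


def find_violations(
    service_tiers: Dict[str, str],
    calls: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return every (caller, callee) pair that breaks the dependency direction."""
    return [
        (caller, callee)
        for caller, callee in calls
        if not is_allowed(service_tiers[caller], service_tiers[callee])
    ]
```

A check like this could run in continuous integration against a declared list of service-to-service calls, flagging upward or same-tier calls before they ship.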
To avoid changing every database query within the monolith, Airbnb created a custom ActiveRecord adapter that effectively diverts database calls to the new service. Tai described the benefits in her presentation: "By having it at the bottom layer, this allowed product engineers to continue with their interactions of the ActiveRecord methods and not need to change the workflow, while under the hood, we were calling our new service."
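The idea of swapping the bottom layer while leaving calling code untouched can be shown with a language-neutral sketch. The example below is Python, not Ruby, and does not use or imitate the real ActiveRecord API; `Model`, `DatabaseAdapter`, `ServiceAdapter`, and `Listing` are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple


class DatabaseAdapter:
    """Original bottom layer: answers lookups straight from the database."""

    def __init__(self, tables: Dict[str, Dict[int, dict]]) -> None:
        self._tables = tables

    def select(self, table: str, record_id: int) -> Optional[dict]:
        return self._tables.get(table, {}).get(record_id)


class ServiceAdapter:
    """Replacement bottom layer: same interface, but it calls the new service."""

    def __init__(self, client: Callable[[str, int], Optional[dict]]) -> None:
        self._client = client
        self.calls: List[Tuple[str, int]] = []  # kept only to make the diversion visible

    def select(self, table: str, record_id: int) -> Optional[dict]:
        self.calls.append((table, record_id))
        return self._client(table, record_id)


class Model:
    """Minimal ORM-style base class that delegates every lookup to the adapter."""

    adapter = None  # set once at application start-up
    table = ""

    @classmethod
    def find(cls, record_id: int) -> dict:
        row = cls.adapter.select(cls.table, record_id)
        if row is None:
            raise KeyError(f"{cls.table} record {record_id} not found")
        return row


class Listing(Model):
    table = "listings"
```

Product code keeps calling `Listing.find(1)` in both configurations; only the single assignment to `Model.adapter` decides whether the lookup hits the database or the service.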
While Diffy is not an SOA-specific tool, Airbnb decided to use it to replay traffic and compare releases. By replaying production traffic in their staging environment, they could compare the behaviour of a new service against real-world traffic and confirm that it would not degrade the customer experience.
A video recording of the presentation, slides, and a full transcript are available on InfoQ.