At QCon San Francisco 2022, Frank Yu, senior engineering manager at Coinbase, presented Leveraging Determinism, a talk that draws on direct experience and examples from building and running financial exchanges. This talk is part of the editorial track Building Modern Backends.
Yu began his talk by making two claims for deterministic logic:
- If you have important logic, make it deterministic.
- If you have deterministic logic, don’t be afraid to replay it anywhere (for efficiency and profit).
After a brief introduction to the history of the Coinbase derivatives exchange, Yu described the work his team does for a trading exchange as "mission-critical code" that must:
- Be correct. The amount of money flowing through an exchange is orders of magnitude larger than the revenue the exchange expects to make.
- Have consistent and predictable performance. For Coinbase specifically, that means keeping the 99th-percentile response time comfortably under 1 ms.
- Remember everything for auditability. Regulations require that the exchange be able to reproduce its exact state at any millisecond over the last seven years.
To continue adding features and evolving the underlying system in a reasonable time without introducing critical bugs to the production system, Yu emphasized that:
"We’ve got to make sure the core logic stays simple. We also avoid concurrency at all cost at our logic."
A system is deterministic if, given the exact same set of inputs in the same order, it produces the exact same state and outputs. In practice, if all requests are sequenced into a log and fed through deterministic logic, replaying that log on any node effectively yields replicated state and output events.
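As an illustration of this idea, the sketch below (a hypothetical ledger with made-up names, not Coinbase's actual code) shows a tiny state machine whose transitions depend only on the sequenced inputs, so any node that replays the same log converges on the same state and output events.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: a deterministic state machine whose state depends only
// on the sequenced input log, so replaying the same log on any node
// reproduces the same state and the same output events.
public final class DeterministicLedger {

    // A sequenced input record as it would appear in the log.
    public record Deposit(long sequence, long amount) {}

    private long balance;
    private final List<String> outputs = new ArrayList<>();

    public void apply(Deposit input) {
        // No wall clocks, randomness, or external I/O inside the transition.
        balance += input.amount();
        outputs.add("seq=" + input.sequence() + " balance=" + balance);
    }

    public static DeterministicLedger replay(List<Deposit> log) {
        DeterministicLedger node = new DeterministicLedger();
        log.forEach(node::apply);
        return node;
    }

    public static void main(String[] args) {
        List<Deposit> log = List.of(new Deposit(1, 100), new Deposit(2, 250));
        // Two independent replays of the same log converge on the same state.
        System.out.println(replay(log).balance); // 350
        System.out.println(replay(log).balance); // 350
    }
}
```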
Since Coinbase operates at microsecond timescales, a very fast consensus implementation was needed. Yu and his team leveraged Aeron Cluster, which supports fault-tolerant services as replicated state machines based on the Raft consensus algorithm.
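To give a sense of how deterministic logic plugs into such a cluster, the skeleton below implements Aeron's ClusteredService callbacks. The class name and comments are illustrative rather than Coinbase's actual service, and the exact callback set can vary slightly between Aeron versions; the cluster sequences inbound messages via Raft and delivers them to every replica in the same order.

```java
import io.aeron.ExclusivePublication;
import io.aeron.Image;
import io.aeron.cluster.codecs.CloseReason;
import io.aeron.cluster.service.ClientSession;
import io.aeron.cluster.service.Cluster;
import io.aeron.cluster.service.ClusteredService;
import io.aeron.logbuffer.Header;
import org.agrona.DirectBuffer;

// Skeleton of a deterministic service hosted by Aeron Cluster (illustrative).
public class MatchingEngineService implements ClusteredService
{
    public void onStart(final Cluster cluster, final Image snapshotImage)
    {
        // Rebuild in-memory state from the latest snapshot, if one exists.
    }

    public void onSessionOpen(final ClientSession session, final long timestamp)
    {
    }

    public void onSessionClose(final ClientSession session, final long timestamp, final CloseReason closeReason)
    {
    }

    public void onSessionMessage(
        final ClientSession session,
        final long timestamp,
        final DirectBuffer buffer,
        final int offset,
        final int length,
        final Header header)
    {
        // Apply the sequenced request to the deterministic core logic here,
        // using the cluster-supplied timestamp rather than wall-clock time.
    }

    public void onTimerEvent(final long correlationId, final long timestamp)
    {
        // Cluster timers are also sequenced, keeping timer-driven logic deterministic.
    }

    public void onTakeSnapshot(final ExclusivePublication snapshotPublication)
    {
        // Serialize in-memory state so future replays can start from the snapshot.
    }

    public void onRoleChange(final Cluster.Role newRole)
    {
    }

    public void onTerminate(final Cluster cluster)
    {
    }
}
```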
Input and output event disparity can lead to instability and scaling bottlenecks. A system’s input sizes and rates are often consistent and predictable; its output sizes and rates, however, are often unpredictable and difficult to validate. Yu further explained this phenomenon with a direct example from the Coinbase system where one event can lead to multiple new events, risking the thundering herd problem and incurring very expensive data ingress and egress costs from their cloud provider.
Since a deterministic system offers consistent computation, generating the same output given the same input and system state, a replica fed the original request will produce the same output as if that output had been sent from the previous node in the system. This suggests a new school of thought: in a deterministic system, rather than replicating data, compute can be replicated to achieve a more stable and predictable system.
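The sketch below (assumed names and a toy fill model, not from the talk) illustrates both points: a single sequenced input fans out into many output events, and because the processing is deterministic, a downstream replica can regenerate those outputs locally from the input log instead of having them shipped across the network.

```java
import java.util.List;

// Illustrative sketch: replicate compute instead of data. Each replica replays
// the sequenced inputs and derives the (potentially fanned-out) outputs locally.
public final class ReplicatedCompute {

    // One sequenced input can fan out into many outputs, e.g. one large order
    // matching against many resting orders.
    static List<String> process(long sequence, long quantity) {
        return java.util.stream.LongStream.rangeClosed(1, quantity)
            .mapToObj(fill -> "seq=" + sequence + " fill=" + fill)
            .toList();
    }

    public static void main(String[] args) {
        long[][] inputLog = {{1, 3}, {2, 5}}; // {sequence, quantity}

        // An "upstream" replica and a "downstream" replica both replay the same inputs...
        for (long[] input : inputLog) {
            List<String> upstreamOutputs = process(input[0], input[1]);
            List<String> downstreamOutputs = process(input[0], input[1]);
            // ...and derive identical outputs without any output crossing the wire.
            System.out.println(upstreamOutputs.equals(downstreamOutputs)); // true
        }
    }
}
```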
Yu closed his talk by summarizing what we should think about when building a system like this:
- Replicate well-tested code, or bugs will replicate too
- No drift: old behavior must be respected when replaying inputs
- Enable new behavior with a request to the monolith after deploy
- Use a seed for deterministic pseudorandom outputs (see the sketch after this list)
- Divide large chunks of work into stages
- Everything should fit in memory
- You’d be surprised how much data fits in memory
- You’d be surprised how much work fits on one CPU core
- Keep your p99 and p99.9 latencies down
- Protect your monolith from chatty clients
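On the pseudorandomness point, the minimal example below (with an arbitrary, hypothetical seed) shows why seeding matters: two replicas seeded identically draw the same sequence of values, so replaying inputs reproduces the exact same "random" outputs.

```java
import java.util.Random;

// Sketch of the "use a seed" guideline: a pseudorandom generator seeded from a
// value carried in the sequenced input stream is reproducible on every replica.
public final class SeededRandomness {
    public static void main(String[] args) {
        long seed = 42L; // in practice, derive the seed from sequenced input, never from wall-clock time

        Random original = new Random(seed);
        Random replay = new Random(seed);

        for (int i = 0; i < 3; i++) {
            // Both generators draw identical values, keeping replicas in lockstep.
            System.out.println(original.nextLong() == replay.nextLong()); // true
        }
    }
}
```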
Todd Montgomery previously gave a talk on Aeron Cluster at QCon New York 2018, and Martin Thompson gave a different talk on Cluster Consensus with Aeron at QCon London 2018.
Other talks on Building Modern Backends will be recorded and made available on InfoQ over the coming months.