Google has launched the public beta of Cloud Spanner, its globally distributed relational database service. Part of Google Cloud Platform, it delivers both ACID transactions and high availability, appearing to violate the CAP theorem.
Google already uses Spanner internally and is now making it public. Because it is a managed cloud database, users have no access to the underlying infrastructure.
Spanner behaves like a traditional relational database, with ACID transactions, SQL, and relational schemas. However, it can also scale horizontally across Google's infrastructure, allowing it to handle increasingly large transaction workloads. Despite this, it remains strongly consistent, with single-digit millisecond latencies when serving data.
The CAP theorem states that a distributed database cannot provide more than two of the following three properties: availability, consistency and partition tolerance. Relational databases tend to sacrifice availability, whereas alternatives in the NoSQL space can be highly available in exchange for eventual consistency.
Whilst Spanner does not technically break the CAP theorem, it can be functionally treated as if it does. Eric Brewer, vice president of infrastructure at Google Cloud, explains:
Eric Brewer: Does this mean that Spanner is a CA system as defined by CAP? The short answer is "no" technically, but "yes" in effect and its users can and do assume CA.
Brewer summarises that with Spanner the chance of a network partition is one in 10^5. If a partition does occur, the system chooses consistency, technically making it CP. But because partitions are so unlikely, it can also be treated as available.
In his whitepaper, Brewer explains that this level of reliability comes from running Spanner on Google's global private network. Spanner packets never reach the public internet, and with such high levels of redundancy, catastrophic events like cut fibre lines do not lead to outages.
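To put odds of that order in perspective, the short calculation below converts a failure fraction into expected minutes per year. It is only an illustration: reading the one-in-10^5 figure as a fraction of time is an assumption made here, and the function name is hypothetical.

```python
# Illustrative arithmetic only. Treating the failure odds as a fraction of
# time is an assumption for this sketch, not something the article states.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def expected_affected_minutes(failure_fraction: float) -> float:
    """Expected minutes per year affected, given a failure fraction."""
    return failure_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # One in 10^5 corresponds to a fraction of 1e-5.
    print(round(expected_affected_minutes(1e-5), 2))
```

Under that reading, one in 10^5 works out to a little over five minutes a year.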
Third parties such as Henry Robinson, distributed systems engineer at Cloudera, have also validated this claim, explaining:
Henry Robinson: Think of it this way: CAP tells us that every system has an achilles heel, or Kryptonite, that means giving up C or A for some period. Google has taken their Kryptonite and buried it deep in some black hole somewhere.
To provide ACID guarantees, Spanner uses two-phase commit, a typical distributed transaction pattern. Brewer explains that whilst this can be an "anti-availability" pattern because all members need to be up, Spanner gets around this by making each member a Paxos group. A member's decision rests on a majority vote among its replicas, so it remains valid even if some of those replicas are unavailable.
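The idea can be sketched in a few lines of Python. This is a toy model rather than Spanner's implementation: the ReplicaGroup class and two_phase_commit function are hypothetical names, replicas are reduced to up/down flags, and no real Paxos rounds or locking take place.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReplicaGroup:
    """Hypothetical stand-in for a Paxos group acting as one commit participant."""
    name: str
    replicas_up: List[bool]  # True for each replica that is currently reachable

    def has_quorum(self) -> bool:
        # A strict majority of replicas must be reachable for the group to act.
        return sum(self.replicas_up) > len(self.replicas_up) // 2


def two_phase_commit(groups: List[ReplicaGroup]) -> str:
    # Phase 1 (prepare): every participant must vote yes.
    # In this toy model a group can vote yes only while it holds a quorum.
    votes = [group.has_quorum() for group in groups]
    # Phase 2 (decide): commit only if all participants voted yes.
    return "commit" if all(votes) else "abort"


if __name__ == "__main__":
    healthy = ReplicaGroup("group-a", [True, True, True])
    degraded = ReplicaGroup("group-b", [True, False, True])   # one replica down
    broken = ReplicaGroup("group-c", [True, False, False])    # majority down
    print(two_phase_commit([healthy, degraded]))  # degraded group still has quorum
    print(two_phase_commit([healthy, broken]))    # no quorum, so the commit aborts
```

With a single server per participant, one failure would block the commit. With a replicated group, the participant keeps voting as long as a majority of its replicas is reachable.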
Spanner also makes use of TrueTime, Google's globally synchronized clock. Brewer explains that TrueTime combines GPS receivers and atomic clocks to keep time accurately across data centres. It enables external consistency by correctly timestamping distributed transactions.
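Google's published descriptions of Spanner present TrueTime as returning an interval that is guaranteed to contain the true time, and describe a "commit wait" in which a transaction holds back until its timestamp is certainly in the past. The sketch below imitates that idea with a simulated clock. The class names, the fixed uncertainty value and the polling loop are all assumptions for illustration, not Google's API.

```python
import time
from dataclasses import dataclass


@dataclass
class TTInterval:
    earliest: float  # true time is assumed to be no earlier than this
    latest: float    # true time is assumed to be no later than this


class SimulatedTrueTime:
    """Hypothetical clock: local monotonic time plus or minus a fixed uncertainty."""

    def __init__(self, epsilon_s: float = 0.005):
        self.epsilon_s = epsilon_s  # assumed uncertainty bound, in seconds

    def now(self) -> TTInterval:
        t = time.monotonic()
        return TTInterval(t - self.epsilon_s, t + self.epsilon_s)


def commit_with_wait(tt: SimulatedTrueTime) -> float:
    # Pick a timestamp no earlier than the true time at this moment.
    commit_ts = tt.now().latest
    # Commit wait: hold back until the timestamp is definitely in the past,
    # so any transaction that starts afterwards receives a later timestamp.
    while tt.now().earliest <= commit_ts:
        time.sleep(tt.epsilon_s / 10)
    return commit_ts


if __name__ == "__main__":
    tt = SimulatedTrueTime(epsilon_s=0.005)
    first = commit_with_wait(tt)
    second = commit_with_wait(tt)
    print(second > first)  # the later transaction gets the later timestamp
```

In this model the wait lasts roughly twice the clock uncertainty, which is why a tight uncertainty bound matters for commit latency.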
Cloud Spanner is still in beta, and is available to try online for free.