Microsoft recently discussed the results of distributed PostgreSQL benchmarks, comparing transaction processing and price performance for Azure Cosmos DB for PostgreSQL, CockroachDB, and Yugabyte. With different implementation trade-offs, the results show a higher throughput for Azure Cosmos DB but highlight the challenges of benchmarking distributed databases.
According to the GigaOm benchmark, Azure Cosmos DB for PostgreSQL with Citus distributed tables provides better transaction performance and price performance than CockroachDB Dedicated and Yugabyte Managed, two alternative fully managed distributed databases.
As previously reported on InfoQ, PostgreSQL is becoming the new standard for cloud-distributed databases, with different vendors extending, reimplementing, or forking the popular open-source relational database. The Microsoft-sponsored benchmark has been performed using HammerDB, the open-source tool for benchmarking relational databases managed by the Transaction Performance Council (TPC). Marco Slot, principal software engineer at Microsoft, writes:
GigaOM used HammerDB TPROC-C to benchmark Azure Cosmos DB for PostgreSQL and two comparable managed service offerings (...) GigaOM initially used 1,000 warehouses for their benchmarks, which results in ~100GB of data. However, both CockroachDB and Yugabyte gave surprisingly low throughput. They could get better performance by increasing the number of warehouses for both, without changing the number of connections.
Source: https://devblogs.microsoft.com/cosmosdb/distributed-postgresql-benchmarks-using-hammerdb-by-gigaom/
The critical and contentious difference is the usage of Citus, the open-source extension to distribute tables in PostgreSQL that requires developers to specify a distribution column, the shard key:
A core tenet of Citus has always been that Distributed PostgreSQL is about performance at scale because PostgreSQL is good enough for everything else.
The other distributed databases tested do not rely on the definition of a distribution column. On Reddit, Slot acknowledges the differences:
The performance difference is almost a little awkward. I wanted to highlight that using Citus does require some additional steps (e.g. create_distributed_table) to define distribution columns and co-location (otherwise, you're just using a single node). Our experience is that without co-locating related data your typical transactional PostgreSQL workload will perform much worse than a single server.
Tweeting about the benchmark, Franck Pachot, developer advocate at YugabyteDB, questions:
Is this comparing Citus (sharding over SQL databases with two-phase commit protocol) with YugabyteDB & CockroachDB (SQL over distributed storage with global ACID transactions)? Not the same level of resilience, global consistency, elasticity, auto-splitting/rebalancing. They have different use cases.
The report acknowledges that different distributed databases might excel for different characteristics, including response time, concurrency, fault-tolerance, functionality, consistency, or durability, targeting different deployments. Slot concludes:
Distributed systems, and distributed databases especially, are all about trade-offs at every level. CockroachDB and Yugabyte make different trade-offs and do not require a distribution column (...) The decision to extend Postgres (as Citus did), fork Postgres (as Yugabyte did), or reimplement Postgres (as CockroachDB did) is also a trade-off with major implications on the end user experience, some good, some bad.
To encourage customers to run benchmarks that match their workloads, Microsoft shared helper scripts to run HammerDB benchmarks on Azure Cosmos DB. Jelte Fennema, senior software engineer at Microsoft, shows how to run a benchmark automatically, including cluster setup and tear down.
According to GigaOm, Google Spanner Postgres Interface was not included in the comparison as the service does not provide the level of Postgres compatibility required to run the benchmark.