At the recent re:Invent conference, AWS announced the general availability of DocumentDB Elastic Clusters, a service that manages the elasticity and sharding for MongoDB workloads.
One of the main announcements of Sivasubramanian's keynote during the conference, Elastic Clusters supports hash-based partitioning across DocumentDB’s distributed storage system, allowing a database to scale out beyond the vertical scaling limits of a single instance. According to the cloud provider, the new service can handle up to 300K connections and millions of reads and writes, with petabytes of storage capacity. Veliswa Boya, senior developer advocate at AWS, writes:
When creating a cluster, you will specify the vCPUs that you want for your Elastic Clusters at provisioning. With the size of vCPUs that you provision, you will also get a proportionate amount of memory, expressed in vCPUs. Elastic Clusters automatically provisions the necessary infrastructure (shards and instances) on your behalf.
Source: https://aws.amazon.com/documentdb/
The new option provides aggregation pipelines to filter, group, process, and sort data across shards and supports some of the management features of Amazon DocumentDB instance-based clusters, including multi-AZ support and automated backups. Released in 2019 as a managed MongoDB-compatible service, DocumentDB is a database structured around JSON documents that supports the MongoDB 3.6 and 4.0 APIs. Boya adds:
If you need more compute and storage to handle an increase in traffic, modify the shard-count parameter. Elastic Clusters scales the underlying infrastructure up or out to give you additional compute and storage capacity.
As with any sharded database, AWS recommends choosing an evenly distributed hash key with high frequency and high cardinality to avoid hotspots. Scaling operations cause brief periods of intermittent database and network errors; vertical scaling of the shard capacity (changing the vCPU count per shard) is the fastest option. The cloud provider claims that Elastic Clusters is different from MongoDB sharding:
With Elastic Clusters, you can easily scale out or scale in your workload on Amazon DocumentDB typically with little to no application downtime or impact to performance regardless of data size. A similar operation on MongoDB would impact application performance and take hours and in some cases days.
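To illustrate why the shard key recommendation above matters, the following minimal Python sketch compares how a high-cardinality and a low-cardinality key spread documents across shards under hash-based partitioning. It is not how DocumentDB routes documents internally: the MD5-modulo scheme, the function names shard_for and distribution, the four-shard setup, and the sample key values are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Map a shard key value to a shard index using a stable hash.

    Illustrative only: the real service's hash function is not documented here.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribution(keys, shard_count: int) -> list:
    """Count how many documents land on each shard for the given key values."""
    counts = Counter(shard_for(k, shard_count) for k in keys)
    return [counts.get(i, 0) for i in range(shard_count)]


SHARDS = 4  # hypothetical shard count

# High-cardinality key: 10,000 distinct (made-up) user IDs.
high = distribution((f"user-{i}" for i in range(10_000)), SHARDS)

# Low-cardinality key: only two distinct values across 10,000 documents.
low = distribution(
    ("active" if i % 2 else "inactive" for i in range(10_000)), SHARDS
)

print("high cardinality:", high)
print("low cardinality: ", low)
```

With only two distinct key values, at most two shards can ever receive documents no matter how many shards exist, which is the kind of hotspot the recommendation is meant to prevent.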
DocumentDB is not the only option to run MongoDB workloads on AWS. MongoDB Atlas is available on the AWS Marketplace, among other options from the NoSQL database company. MongoDB Atlas is also available on Google Cloud and Azure.
It is currently not possible to convert an existing Amazon DocumentDB database to Elastic Clusters, and customers have to use AWS DMS or a native MongoDB tool such as mongodump to migrate workloads to the new option. Some experts find the name of the service confusing. Ganesh Swaminathan, head of technology at T. Rowe Price, tweets:
Interesting to see the "Elastic Clusters" terminology. Maybe something like this should replace the recent "Serverless but no scale to Zero" debate.
Customers pay for the amount of compute measured in vCPUs, the database storage, and the backup storage. DocumentDB Elastic Clusters is available in a subset of AWS regions, including Ohio, Northern Virginia, Frankfurt, and Ireland.