10gen recently announced a limited release of its MongoDB Backup Service, which provides incremental backups and point-in-time recovery.
To back up or restore MongoDB, you would usually use the mongodump and mongorestore utilities, optionally with --oplog to get a point-in-time snapshot. However, backing up the entire database every time consumes more and more time and disk space as the database grows. This is where the new service from 10gen comes in: it provides continuous incremental backup, which allows for point-in-time restore. And because it is a cloud-based backup service, users pay for what they need without having to plan storage capacity up front.
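As a rough illustration of the traditional full-dump workflow described above, the following Python sketch builds the mongodump and mongorestore command lines. The helper names, the example directory, and the default arguments are assumptions made for this example; only the `--out`, `--host`, `--oplog`, and `--oplogReplay` options come from the MongoDB tools themselves.

```python
from typing import List, Optional


def mongodump_command(out_dir: str, host: Optional[str] = None,
                      with_oplog: bool = True) -> List[str]:
    """Build the argument list for a full dump (hypothetical helper)."""
    cmd = ["mongodump", "--out", out_dir]
    if host:
        cmd += ["--host", host]
    if with_oplog:
        # --oplog also captures operations that happen while the dump runs,
        # so the dump can be restored to a single point in time.
        cmd.append("--oplog")
    return cmd


def mongorestore_command(dump_dir: str, host: Optional[str] = None,
                         replay_oplog: bool = True) -> List[str]:
    """Build the argument list for restoring a dump (hypothetical helper)."""
    cmd = ["mongorestore"]
    if host:
        cmd += ["--host", host]
    if replay_oplog:
        # --oplogReplay applies the oplog captured by `mongodump --oplog`.
        cmd.append("--oplogReplay")
    cmd.append(dump_dir)
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them; pass the lists to
    # subprocess.run(..., check=True) to execute against a real deployment.
    print(" ".join(mongodump_command("/backups/full")))
    print(" ".join(mongorestore_command("/backups/full")))
```

Every run of this workflow copies the whole database again, which is the cost the incremental service is meant to avoid.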
Key features include:
- SSL encryption for data transfers
- High availability
- Point-in-time recovery
- Sharded cluster support
- Low overhead
General availability is expected later in the year.
10gen explains how the solution works from a technical point of view:
A lightweight agent gathers oplogs from all the replica sets that are being backed up, compresses and encrypts them, then sends them over SSL to data centers where the Backup Service operates. This approach has a number of benefits, including: 1) data is incrementally backed up, so the data in motion is relatively small, 2) the data in the Backup Service is very close in time to that of the primary system; 3) the impact to the primary system is no more than adding another replica to a replica set, which is very low; 4) the oplog allows us to restore a replica set to any point in time.
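To make the incremental idea concrete, here is a minimal Python sketch of what one agent cycle might look like. It is not 10gen's implementation. The oplog is modelled as a plain list of dictionaries with an integer `ts` field, all function names are hypothetical, and the encryption and SSL transport steps are left out.

```python
import json
import zlib
from typing import Any, Dict, List, Tuple

# Hypothetical, simplified oplog entry: {"ts": <int>, "op": <str>, ...}
OplogEntry = Dict[str, Any]


def collect_new_entries(oplog: List[OplogEntry], last_ts: int) -> List[OplogEntry]:
    """Return only the entries written after the last shipped timestamp."""
    return [entry for entry in oplog if entry["ts"] > last_ts]


def pack_batch(entries: List[OplogEntry]) -> bytes:
    """Serialize and compress a batch before it would be sent."""
    raw = json.dumps(entries, sort_keys=True).encode("utf-8")
    return zlib.compress(raw)


def unpack_batch(payload: bytes) -> List[OplogEntry]:
    """Reverse of pack_batch, as the receiving side would do."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


def ship_increment(oplog: List[OplogEntry], last_ts: int) -> Tuple[bytes, int]:
    """Pack everything newer than last_ts and return the new high-water mark."""
    entries = collect_new_entries(oplog, last_ts)
    new_last_ts = max((entry["ts"] for entry in entries), default=last_ts)
    return pack_batch(entries), new_last_ts


if __name__ == "__main__":
    oplog = [
        {"ts": 1, "op": "insert", "_id": "a"},
        {"ts": 2, "op": "update", "_id": "a"},
        {"ts": 3, "op": "insert", "_id": "b"},
    ]
    payload, last_ts = ship_increment(oplog, last_ts=1)
    print(len(unpack_batch(payload)), "entries shipped, high-water mark", last_ts)
```

Because each cycle ships only the entries newer than the previous high-water mark, the amount of data in motion tracks the write rate rather than the size of the database.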
There are two options for restore: snapshots and custom snapshots. The Backup Service creates and maintains snapshots of the backups according to a policy. Any of these snapshots are available for a restore. Alternately, a user can specify a precise point in time they would like to use for creating a snapshot. In this case, the most recent snapshot preceding the point in time is used and the oplog is applied up to the point in time the user specifies.
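The custom-snapshot restore described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the service's actual logic: snapshots are modelled as `(timestamp, state)` pairs, the oplog uses a simplified, hypothetical entry format with `insert`, `update`, and `delete` operations, and the function name `restore_to` is invented for the example.

```python
import copy
from bisect import bisect_right
from typing import Any, Dict, List, Tuple

State = Dict[str, Dict[str, Any]]   # maps a document _id to the document
Snapshot = Tuple[int, State]        # (timestamp, full state at that time)
OplogEntry = Dict[str, Any]         # hypothetical simplified format


def restore_to(snapshots: List[Snapshot], oplog: List[OplogEntry],
               target_ts: int) -> State:
    """Rebuild the state as of target_ts from a snapshot plus the oplog."""
    snapshots = sorted(snapshots, key=lambda s: s[0])
    times = [ts for ts, _ in snapshots]

    # Pick the most recent snapshot taken at or before the target time.
    idx = bisect_right(times, target_ts) - 1
    if idx < 0:
        raise ValueError("no snapshot at or before the requested time")
    snap_ts, snap_state = snapshots[idx]
    state = copy.deepcopy(snap_state)

    # Replay, in order, the operations between the snapshot and the target.
    for entry in sorted(oplog, key=lambda e: e["ts"]):
        if not (snap_ts < entry["ts"] <= target_ts):
            continue
        if entry["op"] in ("insert", "update"):
            state[entry["_id"]] = copy.deepcopy(entry["doc"])
        elif entry["op"] == "delete":
            state.pop(entry["_id"], None)
    return state


if __name__ == "__main__":
    snapshots = [(10, {"a": {"v": 1}})]
    oplog = [
        {"ts": 12, "op": "update", "_id": "a", "doc": {"v": 2}},
        {"ts": 15, "op": "insert", "_id": "b", "doc": {"v": 1}},
    ]
    print(restore_to(snapshots, oplog, target_ts=13))
```

Starting from the nearest earlier snapshot keeps the replay short: only the operations between that snapshot and the requested time have to be applied.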
As with MMS, 10gen might make this service available on-premises for larger enterprises. The company has decided not to open-source the software powering this service for now.
There is also an open source project that uses the oplogs from replica sets to create incremental backups: Tayra from EqualExperts. Its documentation lists all the features it provides, including selective restore and point-in-time restore. However, it does not support sharded systems.