MongoDB 2.4 was recently released with new features such as text search, hash-based sharding, improved geospatial capabilities with GeoJSON support, and several performance and tooling improvements. We also spoke with 10gen about what’s next on the roadmap.
Some of the key improvements are as follows –
- Text search, introduced as a beta feature, supporting stemming and tokenization in 15 languages
- Hash-based sharding, for cases where the distribution of data across a natural shard key cannot be easily predicted
- Geospatial indexes with GeoJSON support
- Security improvements: a new modular authentication system, integration with Kerberos, and role-based access control
- Several performance improvements, with significant gains in specific scenarios such as counts and aggregation
- V8 as the default JavaScript engine (replacing SpiderMonkey), which improves performance and concurrency for JavaScript-based operations
- Additional metrics for monitoring cluster status
10gen also introduced an enterprise version of MongoDB along with the 2.4 release.
We got in touch with Kelly Stirman, director of product marketing at 10gen, to learn more about the new features and what to expect next.
Kelly explains why collection-level locks may not make sense for MongoDB –
“The improvements to lock yielding in 2.2 provide substantial benefits to write throughput by reducing lock contention. There was a good write up on this subject by David Mytton.
“MongoDB 2.4 does not include any additional granularity of locks beyond the improvements provided in 2.0 and 2.2. We are considering document-level locks for 2.6. The lock yielding improvements were substantial enough that collection-level locks might not provide a major additional improvement, and so document-level locks may be the next step.”
On when to use range-based sharding instead of the new hash-based sharding –
“When using range-based sharding, if your application requests data based on a shard key range, then those queries will be routed to the appropriate shards, which is typically just one shard, or perhaps a few shards. The same query in a system that has used hash-based sharding will route the request to a greater number of shards, perhaps all the shards. Ideally, queries are routed to a single shard or as few shards as possible as this scales better than routing all queries to all shards. So, if you understand your data and queries well, it is possible range-based sharding is the best option.”
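The routing trade-off described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how MongoDB's router is implemented: the four-shard cluster, the `RANGE_BOUNDS` split points, and the helper names are all hypothetical, MD5 simply stands in for a hash function, and the router is modelled by enumerating every key in the queried range rather than by consulting chunk metadata.

```python
import bisect
import hashlib

# Hypothetical cluster of 4 shards (an assumption for illustration only).
NUM_SHARDS = 4

# Range-based placement: shard i owns keys in [RANGE_BOUNDS[i], RANGE_BOUNDS[i + 1]).
RANGE_BOUNDS = [0, 250, 500, 750, 1000]


def range_shard(key: int) -> int:
    """Return the shard owning `key` when shards hold contiguous key ranges."""
    return bisect.bisect_right(RANGE_BOUNDS, key) - 1


def hash_shard(key: int) -> int:
    """Return the shard owning `key` when placement is decided by a hash of the key."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def shards_for_range_query(lo: int, hi: int, placement) -> set:
    """Shards that hold at least one key in [lo, hi) under the given placement."""
    return {placement(key) for key in range(lo, hi)}


if __name__ == "__main__":
    # A query for keys 100..199 touches a single shard under range-based placement.
    print("range-based:", sorted(shards_for_range_query(100, 200, range_shard)))
    # Under hash-based placement, neighbouring keys are scattered across shards.
    print("hash-based: ", sorted(shards_for_range_query(100, 200, hash_shard)))
    # The flip side: ever-increasing keys all land on one shard with ranges.
    print("new keys (range):", sorted(shards_for_range_query(990, 1000, range_shard)))
```

In this toy model the range query on keys 100 to 199 is served by one shard under range-based placement, while hashing spreads the same keys over several shards, so the query has to fan out. The last line shows the opposite concern: with range-based placement, inserts of steadily increasing keys concentrate on a single shard, which is the kind of hard-to-predict distribution that hash-based sharding is meant to even out.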
With MongoDB 2.4, counts can be up to 20x faster, and the Aggregation Framework is 3 to 5 times faster on average. Kelly explains that the improved count performance comes from improvements to B-tree traversal in MongoDB; low-cardinality index-based counts are where you see the biggest improvements. The improvements to the Aggregation Framework reflect many smaller changes in MongoDB’s internal implementation that add up to big benefits.
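One way to see why low-cardinality counts are sensitive to how an index is traversed: when a field has few distinct values, each value covers a long run of adjacent index entries. A count that visits entries one at a time does work proportional to the length of that run, while a count that only locates the two ends of the run does not. The toy below uses a sorted Python list in place of a B-tree; it is not MongoDB's implementation, and the function names and sample data are made up for illustration.

```python
import bisect


def count_by_scan(index: list, value) -> int:
    """Count matching entries by visiting every entry in the index."""
    return sum(1 for entry in index if entry == value)


def count_by_bounds(index: list, value) -> int:
    """Count matching entries by finding both ends of the run with binary search."""
    left = bisect.bisect_left(index, value)
    right = bisect.bisect_right(index, value)
    return right - left


# A low-cardinality "index": three distinct values over ten entries, kept sorted.
index = sorted(["active"] * 6 + ["closed"] * 3 + ["pending"] * 1)

if __name__ == "__main__":
    for status in ("active", "closed", "pending", "missing"):
        print(status, count_by_scan(index, status), count_by_bounds(index, status))
```

Both functions return the same counts; the difference is that the second one's cost depends on the size of the index only logarithmically, not on how many entries share the value being counted.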
On what’s coming next in the enterprise features –
“MongoDB 2.4 makes some major steps forward in the areas of security and monitoring, but we have much more planned for future releases. We think of security along the dimensions of authentication, authorization, and auditing. Future releases of MongoDB will continue to focus in these areas, and we will continue to enhance the tooling we provide with MongoDB. MongoDB Monitoring Service (MMS) has been hugely popular in the MongoDB community with over 15,000 users and growing quickly. We will continue to invest in MMS and to provide both free, cloud-based tools as well as on-prem offerings as part of our Enterprise subscriptions.”
You can read more about the new features in MongoDB 2.4 in the release notes as well as the overview.