Version 2.0 of Hazelcast, a Java-based caching, clustering and data distribution solution, has recently been released. As part of this, the product is now offered in both commercial Enterprise and free open-source Community Editions.
The Community Edition is released under the Apache License, Version 2.0, and hosted on Google Code. Version 2 includes a distributed backup feature designed to ensure that each node is backed up evenly by all the other nodes. "I believe that our backup distribution is totally new," Hazelcast founder Talip Ozturk told InfoQ.
"Backup data is distributed in a way that losing a node has very little effect on the cluster. This matters a lot when you have big-data in memory."
The mechanism works by distributing roughly equal portions of each node's data across all the other nodes in the cluster. So, for example, in a 50-node cluster with each node storing 20GB of primary data and 20GB of backup data, 1/49th of the data from node 1 will be backed up by each of the remaining 49 nodes. If node 1 goes down, no migration is required and the cluster remains balanced. As new nodes are added to the system, Hazelcast will slowly migrate data to them so that all nodes are eventually equally loaded.
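The arithmetic behind the 50-node example can be sketched in a few lines of Java. This is an illustration only, not Hazelcast code: the class and method names are invented here, and the last method assumes that surviving nodes simply promote the backup slice they already hold when a node fails.

```java
// Illustrative arithmetic only; class and method names are hypothetical, not Hazelcast's API.
final class BackupShare {

    /** Backup data (in GB) that each peer holds on behalf of one given node. */
    static double backupPerPeerGb(double primaryGbPerNode, int clusterSize) {
        if (clusterSize < 2) {
            throw new IllegalArgumentException("need at least two nodes to hold a backup");
        }
        return primaryGbPerNode / (clusterSize - 1);
    }

    /** Total backup data (in GB) one node holds when every peer spreads its backups evenly. */
    static double totalBackupHeldGb(double primaryGbPerNode, int clusterSize) {
        // Each of the (clusterSize - 1) peers contributes one equal slice.
        return backupPerPeerGb(primaryGbPerNode, clusterSize) * (clusterSize - 1);
    }

    /**
     * Primary data (in GB) per surviving node after one node fails, assuming each
     * survivor promotes the slice it was already backing up (an assumption of this sketch).
     */
    static double primaryAfterFailureGb(double primaryGbPerNode, int clusterSize) {
        return primaryGbPerNode + backupPerPeerGb(primaryGbPerNode, clusterSize);
    }

    public static void main(String[] args) {
        System.out.printf("Slice of node 1 held by each peer: %.3f GB%n",
                backupPerPeerGb(20.0, 50));
        System.out.printf("Backups held per node: %.1f GB%n",
                totalBackupHeldGb(20.0, 50));
        System.out.printf("Primary data per survivor after node 1 fails: %.3f GB%n",
                primaryAfterFailureGb(20.0, 50));
    }
}
```

Under these assumptions, losing one node adds roughly 0.4GB of primary data to each of the 49 survivors, which is why the load stays even without bulk data movement.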
Other new version 2 features include:
- Parallel IO, which combines In and Out communications into a single thread (in version 1.0, each member had one In thread and one Out thread to handle communication with other members using NIO channels).
- Improvements to Connection Management such that Hazelcast will attempt to repair a broken connection before declaring it dead.
- New event containers for Queue, List, Set and Topic.
The Enterprise Edition adds off-heap storage (which Hazelcast call Elastic Memory), additional security capabilities, and a native C# client.
For security, the product includes a JAAS-based implementation, which can be used to authenticate both cluster members and clients, and to perform access control checks on client operations. Access control can be managed according to endpoint principal and/or endpoint address. Security can be enabled and configured either using XML or via an API.
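To make the "principal and/or address" idea concrete, here is a minimal sketch of how such a rule might be evaluated. It is a hypothetical model rather than Hazelcast's security API: the `AccessRule` class, its fields, the exact-match semantics and the action names are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of an access-control rule; not Hazelcast's API.
final class AccessRule {
    private final String principal;     // null means "any principal"
    private final String endpoint;      // null means "any endpoint address"
    private final Set<String> actions;  // operations this rule permits

    AccessRule(String principal, String endpoint, Set<String> actions) {
        this.principal = principal;
        this.endpoint = endpoint;
        this.actions = new HashSet<>(actions); // defensive copy
    }

    /** True when the client matches every constraint the rule sets and the action is listed. */
    boolean permits(String clientPrincipal, String clientAddress, String action) {
        boolean principalOk = principal == null || principal.equals(clientPrincipal);
        boolean endpointOk = endpoint == null || endpoint.equals(clientAddress);
        return principalOk && endpointOk && actions.contains(action);
    }

    public static void main(String[] args) {
        // Rule constrained by both principal and address (all values are made up).
        AccessRule rule = new AccessRule("reporting", "10.0.0.5",
                new HashSet<>(Arrays.asList("read")));
        System.out.println(rule.permits("reporting", "10.0.0.5", "read"));
        System.out.println(rule.permits("reporting", "10.0.0.9", "read"));
    }
}
```

Leaving either field as null models the "and/or" in the description: a rule can constrain the principal alone, the address alone, or both together.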
Elastic Memory is essentially a workaround for lengthy Garbage Collection (GC) pause times. With the exception of Azul's C4 collector, which has eliminated GC pauses altogether, garbage collection pause times in commercial JVMs increase broadly in line with heap size. By moving cached data off the heap, where the garbage collector does not manage it, Elastic Memory can be used to reduce the size of the JVM heap, thereby reducing the length of GC pauses. As a rough guide, Ozturk suggested:
"If you have 10GB+ data per JVM to store and if your values are 1KB+ then elastic memory will help. We don't recommend Elastic Memory if you have around 4GB data per JVM and/or values less than a KB."
Hazelcast's Elastic Memory implementation is built using direct byte buffers, with each buffer divided into blocks with a default size of 1KB. The feature is similar to those offered by Oracle's Coherence, Terracotta's Ehcache and a number of other cache vendors.
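The general technique of carving a direct byte buffer into fixed-size blocks can be sketched as follows. This is not Hazelcast's implementation: `OffHeapBlockStore`, its free-list design and the requirement that the caller remember each value's length are assumptions of this example. Only `ByteBuffer.allocateDirect` and the related buffer calls are standard Java.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of block-based off-heap storage; not Hazelcast's implementation.
final class OffHeapBlockStore {
    static final int BLOCK_SIZE = 1024; // 1KB, mirroring the default block size described above

    private final ByteBuffer buffer;    // memory allocated outside the Java heap
    private final Deque<Integer> freeBlocks = new ArrayDeque<>();

    OffHeapBlockStore(int blockCount) {
        buffer = ByteBuffer.allocateDirect(blockCount * BLOCK_SIZE);
        for (int i = 0; i < blockCount; i++) {
            freeBlocks.add(i);
        }
    }

    /** Copies the value into as many free blocks as it needs and returns their indexes. */
    int[] put(byte[] value) {
        int needed = Math.max(1, (value.length + BLOCK_SIZE - 1) / BLOCK_SIZE);
        if (needed > freeBlocks.size()) {
            throw new IllegalStateException("not enough free blocks");
        }
        int[] blocks = new int[needed];
        for (int i = 0; i < needed; i++) {
            blocks[i] = freeBlocks.poll();
            int offset = i * BLOCK_SIZE;
            int length = Math.min(BLOCK_SIZE, value.length - offset);
            ByteBuffer view = buffer.duplicate(); // own position and limit, same memory
            view.position(blocks[i] * BLOCK_SIZE);
            view.put(value, offset, length);
        }
        return blocks;
    }

    /** Reads a value of the given length back from its blocks. */
    byte[] get(int[] blocks, int length) {
        byte[] value = new byte[length];
        for (int i = 0; i < blocks.length; i++) {
            int offset = i * BLOCK_SIZE;
            int chunk = Math.min(BLOCK_SIZE, length - offset);
            ByteBuffer view = buffer.duplicate();
            view.position(blocks[i] * BLOCK_SIZE);
            view.get(value, offset, chunk);
        }
        return value;
    }

    /** Returns the blocks to the free list so that later values can reuse them. */
    void remove(int[] blocks) {
        for (int block : blocks) {
            freeBlocks.add(block);
        }
    }

    int freeBlockCount() {
        return freeBlocks.size();
    }
}
```

In this sketch a 2,500-byte value occupies three blocks, with the unused tail of the last block wasted. That overhead is proportionally larger for small values, which may be one reason behind the guidance that values under 1KB benefit less.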
The Enterprise Edition licensing model is based on an annual per-node subscription, with pricing information available via sales@hazelcast.com. Hazelcast also offer two different levels of support for the Community Edition and publish indicative pricing on their website.