Terracotta last week announced the general availability of the BigMemory module for its Enterprise Ehcache product. BigMemory provides an in-process, off-heap cache for storing large sets of data closer to the application. It is part of the standard Ehcache API and is enabled by defining two new attributes on a cache, overflowToOffHeap and maxMemoryOffHeap, as shown in the following code snippet:
<cache name="sample-offheap-cache"
       maxElementsInMemory="10000"
       eternal="true"
       memoryStoreEvictionPolicy="LRU"
       overflowToOffHeap="true"
       maxMemoryOffHeap="1G"/>
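Note that the off-heap store is backed by direct (native) memory, so the JVM must also be launched with a direct-memory limit at least as large as the configured off-heap size, using the standard -XX:MaxDirectMemorySize flag. The size and application name below are illustrative:

java -XX:MaxDirectMemorySize=2G -jar my-cache-app.jar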
BigMemory differs from traditional caching solutions in its memory storage strategy: it avoids Java Virtual Machine (JVM) garbage collection (GC) problems by not storing data on the Java heap. This extra BigMemory store is referred to as the Off-Heap Store. Traditionally, caching solutions have sought to avoid GC issues by distributing the data over a cluster of caching nodes. BigMemory provides a new architectural alternative, allowing an application to run on a JVM with less than a gigabyte of heap while using off-heap memory for faster access to data.
InfoQ caught up with Ari Zilka, CTO of Terracotta, about the new BigMemory feature of the Ehcache framework, the use cases where it helps application performance, and its limitations.
InfoQ: What was the main motivation behind the development of the BigMemory feature in the Ehcache framework?
The primary motivation was to solve GC issues we were having in the Terracotta server. GC in the server caused variation in response times, and a large GC pause could cause cache clients (L1s) to fail over to a backup Terracotta server. Once we realized how good the solution was, we expanded its use to include an additional memory store for standalone Ehcache, which became BigMemory, an add-on for Enterprise Ehcache.
InfoQ: Can you discuss the technical details of how the Off-Heap Store (BigMemory) avoids the traditional complexities of Java garbage collection?
BigMemory stores its cache objects outside of the Java heap but still within the Java process. So it is still an in-process cache, with all of the high performance associated with that, but it does not use the heap, and therefore allows applications to be configured with very small heaps, thus avoiding GC issues. BigMemory uses DirectByteBuffers, which were introduced into Java in JDK 1.4. All Java implementations since then can run BigMemory, so everyone can use it without needing to change JDKs.
We perform much the same function as an operating system's memory manager: we allocate memory on put and free it on remove, something we can do because we are a cache rather than a general-purpose Java program. DirectByteBuffers are slow to allocate but very fast to use, so we grab all the memory we need from the operating system right at startup.
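As a minimal illustration of the mechanism Ari describes (a sketch, not Terracotta's implementation), a direct buffer is allocated outside the heap once, at startup, and then reused:

import java.nio.ByteBuffer;

public class OffHeapArena {
    // Allocated once at startup: allocateDirect is slow, but the resulting
    // 512MB region lives outside the heap, so the GC never scans its contents.
    static final ByteBuffer ARENA = ByteBuffer.allocateDirect(512 * 1024 * 1024);
}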
The key to BigMemory, and the thing many people find hardest to understand initially, is how we are able to tell when an object is no longer being used so that the associated memory can be freed. For a cache, it is dead simple: a map is basically puts, gets, and removes. We allocate memory on put (malloc) and free it on remove (free). We implement a memory manager that leverages well-understood computer science algorithms, combined with our own proprietary enhancements.
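The following toy sketch illustrates the malloc-on-put / free-on-remove idea. It is purely illustrative and far simpler than Terracotta's proprietary memory manager: the heap holds only a small index, while values live in a single direct buffer whose fixed-size slots are claimed on put and reclaimed on remove.

import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy illustration of malloc-on-put / free-on-remove over a direct buffer.
// Real BigMemory handles variable sizes, fragmentation, and concurrency;
// this sketch uses fixed-size slots to keep the bookkeeping obvious.
public class ToyOffHeapStore {
    private static final int SLOT_SIZE = 1024;               // bytes per slot
    private final ByteBuffer arena;                          // off-heap region, allocated once
    private final Map<String, Integer> index = new HashMap<String, Integer>();
    private final Deque<Integer> freeSlots = new ArrayDeque<Integer>();

    public ToyOffHeapStore(int slots) {
        arena = ByteBuffer.allocateDirect(slots * SLOT_SIZE);
        for (int i = 0; i < slots; i++) freeSlots.push(i);
    }

    // put = malloc: claim a slot and copy the serialized bytes into it.
    public void put(String key, byte[] value) {
        if (value.length > SLOT_SIZE - 4) throw new IllegalArgumentException("value too large");
        Integer slot = index.containsKey(key) ? index.get(key) : freeSlots.pop();
        ByteBuffer view = arena.duplicate();
        view.position(slot * SLOT_SIZE);
        view.putInt(value.length);
        view.put(value);
        index.put(key, slot);
    }

    // get: copy the bytes back onto the heap (where deserialization happens).
    public byte[] get(String key) {
        Integer slot = index.get(key);
        if (slot == null) return null;
        ByteBuffer view = arena.duplicate();
        view.position(slot * SLOT_SIZE);
        byte[] out = new byte[view.getInt()];
        view.get(out);
        return out;
    }

    // remove = free: the slot goes back on the free list; no GC involved.
    public void remove(String key) {
        Integer slot = index.remove(key);
        if (slot != null) freeSlots.push(slot);
    }
}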
Responding to a question on the use cases where BigMemory helps most with application performance (read-only, read-mostly, or read-write operations), Ari said they saw good performance results both for common 90% read / 10% write workloads and for write-heavy 50% read / 50% write workloads, because the cache is in-process. The read/write mix matters mainly for distributed caches, where the hot set can be read much more quickly than the rest of the data, which must be fetched over the network.
InfoQ: What are the limitations of BigMemory solution?
Given that it is pure Java, in-process, and compatible with all common JVMs and containers, it does not have any obvious limitations. We have tested it on the largest-memory boxes we could find - with 384GB of RAM - and shown linear performance with no noticeable increase in latency all the way up to 350GB of BigMemory.
The only constraint we highlight to users is that an off-heap store requires objects to be serialized before they can be placed in BigMemory. For the types of data normally stored in a cache, this is not a problem.
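In practice this simply means that cached value types implement java.io.Serializable. The StockQuote class below is a hypothetical example:

import java.io.Serializable;

// Hypothetical value type: anything placed in the off-heap store must be
// serializable so it can be copied out of the heap as bytes.
public class StockQuote implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String symbol;
    private final double price;

    public StockQuote(String symbol, double price) {
        this.symbol = symbol;
        this.price = price;
    }
}

// Usage with the cache configured earlier:
//   Cache cache = CacheManager.getInstance().getCache("sample-offheap-cache");
//   cache.put(new Element("IBM", new StockQuote("IBM", 142.50)));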
Once an object has been serialized, it must be deserialized back into the Java heap before it can be used, which incurs a performance overhead. Thus, garbage collection aside, BigMemory is slower than on-heap storage. It is, however, much faster than the next available tier of storage, be it local disk, network store, or going back to the original system of record - such as an RDBMS - for the data.
It should also be noted that the serialization/deserialization overhead is much lower than many users assume. BigMemory is optimized for byte buffers and has built-in optimizations for objects serialized using standard Java serialization. For example, with the optimizations made between the alpha and GA releases, we were able to double performance for complex Java objects and quadruple it for byte arrays - which is also how the Terracotta Server Array stores data. Using custom serialization can reduce this overhead even further.
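One standard route to custom serialization in Java is java.io.Externalizable, which lets a class write exactly the fields it needs rather than relying on reflection. A sketch, reusing the hypothetical StockQuote from above:

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// The same hypothetical value type, now controlling its own wire format.
public class StockQuote implements Externalizable {
    private String symbol;
    private double price;

    public StockQuote() {}   // Externalizable requires a public no-arg constructor

    public StockQuote(String symbol, double price) {
        this.symbol = symbol;
        this.price = price;
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(symbol);
        out.writeDouble(price);
    }

    public void readExternal(ObjectInput in) throws IOException {
        symbol = in.readUTF();
        price = in.readDouble();
    }
}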
InfoQ asked about the best practices and gotchas that architects and developers should keep in mind when using BigMemory in their applications. Ari said that all successful business applications face scale limitations, and caching is one of the least disruptive and easiest solutions to implement. What is new is that caching no longer necessarily involves a caching cluster.
The best practice is to look at your performance architecture with new eyes and see whether you can benefit from very large in-process caches. BigMemory lets architects optimize server and process density to meet their specific needs, rather than being held hostage to the limits of Java.
The biggest gotcha is that most people have already optimized around the limitations of Java. For example, the majority of Ehcache users still run 32-bit JVMs. 32-bit Java has an address space of 2 to 4GB, depending on the OS, so these users have given up on using large amounts of memory with Java. Chances are they are currently running on hardware with small amounts of RAM, so if they want to use BigMemory to run a 100GB cache in-process, it probably means new (albeit now cheap) hardware.
InfoQ: What is the future road map of the Ehcache framework in general and BigMemory in particular?
Work on the next release of Ehcache and Terracotta, code-named Freo (an Australian nickname for Fremantle), is already well underway, with a beta release planned for this month. We plan to include a series of feature and performance enhancements. One example is Ehcache Search, which gives Ehcache users the ability to search a cache the way you can search a database. An alpha release of the code and full documentation for Ehcache Search are already available.
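Based on the published alpha documentation, a search looks roughly like the sketch below; the cache and its searchable "age" attribute are assumed to be configured in ehcache.xml:

import net.sf.ehcache.Cache;
import net.sf.ehcache.search.Attribute;
import net.sf.ehcache.search.Query;
import net.sf.ehcache.search.Results;

public class SearchExample {
    // Find the keys of all entries whose "age" attribute exceeds minAge.
    public static Results findOlderThan(Cache cache, int minAge) {
        Attribute<Integer> age = cache.getSearchAttribute("age");
        Query query = cache.createQuery().includeKeys().addCriteria(age.gt(minAge));
        return query.execute();
    }
}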
For BigMemory, we are continuing to work on enhancing performance, as well as making a series of practical enhancements, such as more tooling to help people determine the optimum configuration for their use case.