Hazelcast, the distributed computation and storage platform, has announced the release of Hazelcast Platform 5.0. The new platform unifies the existing products Hazelcast IMDG and Hazelcast Jet: the former provided fast ways to store, retrieve, and modify data, while the latter provided fast data processing.
Version 5.0 also introduces new features such as expanded SQL support, streaming capabilities (provided by the Jet module), a redesigned UI console, and a new serialization format (as a preview feature).
The SQL engine in the Hazelcast Platform now supports basic data manipulation language (DML) functionality, enabling INSERT, UPDATE, and DELETE statements on data stored in Hazelcast. The platform also adds sorting, aggregations, and new SQL expressions.
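The new statements can be issued through the platform's Java SQL API. The following sketch is illustrative rather than taken from the release: the cities mapping, its columns, and the json-flat value format are assumptions made for the example.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.sql.SqlResult;
import com.hazelcast.sql.SqlRow;

public class SqlDmlExample {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Expose a map to the SQL engine; the columns and json-flat format are example choices.
        hz.getSql().execute(
                "CREATE MAPPING cities (__key INT, country VARCHAR, city VARCHAR) "
              + "TYPE IMap OPTIONS ('keyFormat'='int', 'valueFormat'='json-flat')");

        // New DML support: INSERT, UPDATE and DELETE operate directly on Hazelcast data.
        hz.getSql().execute("INSERT INTO cities VALUES (1, 'United Kingdom', 'London')");
        hz.getSql().execute("INSERT INTO cities VALUES (2, 'Turkey', 'Ankara')");
        hz.getSql().execute("UPDATE cities SET city = 'Istanbul' WHERE __key = 2");
        hz.getSql().execute("DELETE FROM cities WHERE __key = 1");

        // Sorting and aggregations are also part of the expanded SQL engine.
        try (SqlResult result = hz.getSql().execute("SELECT city FROM cities ORDER BY city")) {
            for (SqlRow row : result) {
                String city = row.getObject("city");
                System.out.println(city);
            }
        }

        hz.shutdown();
    }
}
```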
The streaming capabilities, gained by merging Hazelcast Jet into Hazelcast Platform 5.0, enable stateful, fault-tolerant processing and querying of data streams and data at rest, using either SQL or the dataflow API.
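As a rough illustration of the dataflow API, the minimal pipeline below reads one of the built-in test sources, filters the stream, and logs the result; the event rate and the filter are arbitrary choices for the example.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.test.TestSources;

public class StreamingExample {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Dataflow API: a streaming pipeline that filters a synthetic event stream and logs it.
        Pipeline pipeline = Pipeline.create();
        pipeline.readFrom(TestSources.itemStream(10))       // 10 test events per second
                .withIngestionTimestamps()
                .filter(event -> event.sequence() % 2 == 0) // keep even-numbered events
                .writeTo(Sinks.logger());

        // The streaming engine runs on the same unified runtime as the data structures;
        // join() blocks because a streaming job runs until it is cancelled.
        hz.getJet().newJob(pipeline).join();
    }
}
```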
Additionally, the Hazelcast Platform provides a comprehensive library of connectors for Kafka, Hadoop, S3, RDBMS, JMS, and many more, distributed messaging via queues or pub/sub topics, and a cloud-native architecture.
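The pub/sub messaging can be used with a few lines of code; in the small sketch below, the topic name and payload are invented for illustration.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.topic.ITopic;

public class PubSubExample {
    public static void main(String[] args) throws InterruptedException {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Distributed pub/sub: every member or client subscribed to the topic receives the message.
        ITopic<String> topic = hz.getTopic("notifications");
        topic.addMessageListener(message ->
                System.out.println("Received: " + message.getMessageObject()));

        topic.publish("order-created");

        Thread.sleep(1000); // listeners are invoked asynchronously
        hz.shutdown();
    }
}
```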
The use cases Hazelcast proposes range from caching, fraud detection, and real-time streaming analytics to fast data lookups in a microservices architecture. Its documentation states that its unique data processing architecture delivers latencies of under 10 ms for 99.99% of streaming queries while handling millions of events per second.
John DesJardins, chief technology officer at Hazelcast, spoke to InfoQ about this new platform.
InfoQ: What are the motivations and advantages of joining Hazelcast IMDG and Hazelcast Jet in the recent 5.0 release?
John DesJardins: Unifying these products is very powerful, in several ways.
First is simplicity - This simplicity starts with developers, who can now leverage the full power of these capabilities by simply adding one library to their project, giving them a simplified developer experience. That continues through to architectural simplicity - a unified runtime is very powerful when deploying distributed applications. This means deploying, scaling, and managing is simplified. That also translates into operational simplicity. And all of these, combined, lead to simplified DevOps processes and greater agility. This simplicity also means simplified scaling of data and compute, and simplified resilience.
Second is performance - Unifying compute and data, when coupled with a distributed architecture, data-aware compute, and an in-memory-optimized architecture, is very powerful. Data does not have to be moved to compute and analytics, which reduces overhead tremendously thanks to data locality. This alone can shave off huge latency and throughput impacts when compared with the well-known "Lambda Architecture", or what has recently been discussed as a "Delta Architecture", both of which involve quite a bit of ping-ponging of data between compute and storage. When this is combined with in-memory-optimized compute and data, those advantages are amplified dramatically. This can mean processing goes from minutes to seconds, seconds to milliseconds, or even milliseconds to microseconds.
Third is scaling and resilience - because BOTH compute AND storage are distributed together, you can scale performance in a very linear way while still delivering resilience. This is demonstrated, for example, in our ability to easily scale to over one billion events per second, as discussed in this blog post.
Finally, unifying these products, and their Open Source projects, enables faster release cycles for us, allowing us to innovate faster and bring more value to our ecosystem of Open Source and Enterprise users.
InfoQ: What are the contributions from the community?
DesJardins: Our community actively contributes in a range of areas, including support for many languages, connectors to data sources, and support for other frameworks or projects. Community contributions come from many areas including other ISVs, small and large companies across industries, as well as systems integrators, and from all parts of the globe.
InfoQ: What's on the horizon for Hazelcast?
DesJardins: Hazelcast has tremendous plans for 2022, including expanding our capabilities around storage, SQL and analytics, as well as connectivity to other data platforms and Open Source projects, building on our DNA expressed by our vision - Empower the world to act instantaneously on data everywhere.
Instantaneous compute and data processing is our DNA, as is delivering a resilient product that enables zero downtime architectures.
We will expand on this to add more capabilities around streaming, machine learning, and more advanced connectors, as well as innovations around Microservices and Cloud. We are also working to grow our technology partnerships across these areas. There are some other exciting announcements to come in these areas, and about how we are working to co-innovate with this ecosystem. More to come on that soon, including news around our 5.1 release in Q1.
InfoQ: What has the response been from the Java community?
DesJardins: As a technology built on Java, the Java Community has always been a strong ecosystem for us. We have seen this continue with our 5.0 release. Unifying our capabilities means that Java developers can leverage streaming and data grid capabilities with one line added to their Maven or Gradle project. We have also seen strong interest from Java ISV partners. I won't mention names here, as I don't want to speak for them, but Hazelcast has been incorporated into over a dozen other projects.
An Enterprise Edition is also available; it adds new persistence functionality and raises the high-density memory limit to 200 GB per member.