Twitter announced the open sourcing of Heron, a stream-processing engine that is a successor to Apache Storm. Heron is backwards compatible with Apache Storm, which eases its adoption amongst developers. Heron has replaced Apache Storm as the stream data processing engine inside Twitter due to its scalability, debug-ability, ability to work in a shared cluster infrastructure and better performance. The documentation provides a comprehensive list of features.
InfoQ caught up with Karthik Ramasamy, engineering manager at Twitter and co-creator of the project, regarding the announcement.
InfoQ: Let’s start with Apache Storm. What were the technical limitations of Apache Storm that caused you to start a new project instead of enhancing the original project?
Karthik Ramasamy: At Twitter, Apache Storm at our current scale was becoming increasingly challenging due to issues related to scalability, debug-ability, manageability, and efficient sharing of cluster resources with other data services.
In Storm, work from multiple components of a topology is bundled into one operating system process, which makes debugging very challenging. In addition, Storm needs dedicated cluster resources, which requires special hardware allocation to run Storm topologies. This approach leads to inefficiencies in using precious cluster resources, and also limits the ability to scale on demand.
With Storm, provisioning a new production topology requires manual isolation of machines, and conversely, when a topology is no longer needed, the machines allocated to serve that topology now have to be decommissioned. Managing machine provisioning in this way is cumbersome.
Finally, we wanted to meet all the goals outlined above without forcing a rewrite of the large number of applications that have already been written for Storm.
After examining various options, we concluded that we needed to design a new stream processing system to meet the design goals outlined above. This new system is called Heron. Heron is API-compatible with Storm, which makes it easy for existing Storm users to migrate to Heron and new users to develop on Heron. All production topologies inside Twitter now run on Heron. Besides providing us significant performance improvements and lower resource consumption over Storm, Heron also has big advantages in terms of debug-ability, scalability, and manageability.
InfoQ: There are so many streaming frameworks in Apache alone. How does Heron differ from these projects?
Ramasamy: We outlined our design considerations for Heron in our SIGMOD '15 paper Twitter Heron: Stream Processing at Scale (May, 2015). Twitter invested heavily in Storm topology development, as have many other companies operating or aspiring to operate at significant scale. In practice, Storm did not scale to our needs, and other solutions appeared immature and would have required a complete and costly re-engineering of existing topologies.
The current landscape bears out our initial prediction that the world is trending toward real time. The need for real-time analytics with extremely low latency continues to grow. Heron is being applied to rapidly expanding real-time use cases, including anomaly/fraud detection, IoT/IoE applications, embedded systems, VR/AR, advertisement bidding, financial, security, and social media.
The main unknown in the current landscape is proven reliability at scale. Storm was one of the first solutions, and Heron was needed as an upgrade; it brought a 10x reduction in incident reports in addition to being efficient. Current systems may have these problems at scale, but it is unknown since Twitter has a unique set of requirements that other companies may have yet to experience. We are excited to see the ecosystem grow and will continue to be a leader in the open source space.
InfoQ: You mention in your blog that Heron is a fundamental change in streaming architecture. How so?
Ramasamy: The fundamental change in streaming architecture is from Storm’s thread-based to Heron’s process-based architecture, allowing users to easily reason about how their topologies work and to profile and debug their components in isolation. This fundamental change required a complete rewrite of Storm, but allows easier development and debuggability for Heron topologies. We feel the development investment has already paid off with a 10x reduction in incident reports, demonstrating that Heron “just works” at Twitter-scale.
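As a rough illustration of the thread-based versus process-based distinction described in this answer, the Python sketch below runs the same three components once as threads inside a single process and once as separate operating system processes. It is not Storm or Heron code: the component names and both helper functions are hypothetical, and it assumes only the Python standard library. The point is that in the threaded case every component reports the same process ID, while in the process-based case each component has its own ID that a profiler or debugger could attach to individually.

```python
import os
import subprocess
import sys
import threading

# Hypothetical component names, used only for illustration.
COMPONENTS = ["spout", "split-bolt", "count-bolt"]


def run_in_threads(components):
    """Thread-based style: every component runs inside the caller's process."""
    pids = {}

    def work(name):
        # All threads share the parent's process, so the PID is the same.
        pids[name] = os.getpid()

    threads = [threading.Thread(target=work, args=(name,)) for name in components]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pids


def run_in_processes(components):
    """Process-based style: each component gets its own OS process."""
    child_code = "import os; print(os.getpid())"
    # Start every child before reading any output, so all of them exist
    # at the same time and therefore hold distinct PIDs.
    children = {
        name: subprocess.Popen(
            [sys.executable, "-c", child_code],
            stdout=subprocess.PIPE,
            text=True,
        )
        for name in components
    }
    pids = {}
    for name, child in children.items():
        out, _ = child.communicate()
        pids[name] = int(out.strip())
    return pids


if __name__ == "__main__":
    print("threads:  ", run_in_threads(COMPONENTS))
    print("processes:", run_in_processes(COMPONENTS))
```

With one process per component, standard per-process tools (resource limits, profilers, log files) apply to a single component at a time, which is the isolation benefit the answer refers to.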
InfoQ: Is Twitter the biggest contributor to Heron? Which other companies are contributing?
Ramasamy: Absolutely, Twitter is currently the leading contributor to Heron since we recently open sourced the platform, but we have been surprised by the interest from other large-scale companies in adopting and contributing to Heron. Heron is proven at Twitter-scale, which we feel is unique among streaming frameworks, and other companies are leveraging our designs and implementation to further improve the system. In particular, as publicly disclosed in our initial release, Microsoft has led with key contributions enabling Heron on YARN with Apache REEF, along with sophisticated packing algorithms to further improve computing efficiencies. We already have 50 contributors who have submitted more than 900 pull requests, and we expect that number to grow.
We’ve been excited as well to see extremely large scale Chinese companies benchmark Heron and adopt it on some of the largest clusters we know of outside of Twitter. We are collaborating closely and hope to announce a few more names publicly.
We are committed to the open-source community. Prior to Heron, we donated Apache Storm, which we open sourced in 2011, along with cluster manager Apache Mesos, MapReduce streaming framework Summingbird, and more recently, the high-performance replicated DistributedLog. We developed Heron with the intention of open sourcing the project. Ever since the initial SIGMOD paper we’ve received constant inquiries about our plans to open source Heron, especially from other companies operating at serious scale with real-time needs, and those experiencing the same production issues we experienced with currently available systems. It has been exciting to see the interest, and now exciting to see the adoption.
InfoQ: Backwards compatibility with Apache Storm is seen as key to developer adoption for Heron. Is this guaranteed for all future versions? Talk a bit about the caveats (if any) from a developer perspective and the challenges from an implementer’s perspective.
Ramasamy: It was essential for Twitter’s use cases to be seamlessly supported, which mainly meant backwards compatibility with our usage of Storm. However, Apache Storm includes new use cases that we never needed and thus never supported, mainly Distributed Remote Procedure Calls (DRPC). Now that Heron is open sourced, we are finding motivated contributors interested in supporting more functionality, and therefore our hope is that future versions of Apache Storm continue to be supported. It benefits both ecosystems, in our opinion, to remain as compatible as possible, with any needed syntax supported as quickly as possible.
InfoQ: Can you elaborate on the roadmap for Heron?
Ramasamy: Much of the roadmap is focused on quickly and efficiently supporting major schedulers in addition to Apache Aurora, and we’ve already received support for YARN via REEF, SLURM, and stable but still experimental support for Apache Mesos. We anticipate DC/OS, Marathon and Kubernetes support with implementations already being developed. We aim for simple installation and reliable operations. Furthermore, we are expanding Heron’s capabilities operationally to include hot deploys, manually scaling topologies during big events, and scaling topologies automatically. We are also working on some groundbreaking algorithms for continuing to process streams, especially in the presence of stragglers and network partitioning.
Heron continues to focus on core use cases for Twitter, but features are being added that we are testing in production. If you are interested in contributing a specific feature or want to help on existing milestones, please reach out to join the Heron community and heronstreaming developer Slack channel. We look forward to growing the Heron community.
There is a getting started guide that goes into the details of downloading Heron binaries and creating some example topologies.