Event Stream Processing Content on InfoQ
-
Scaling a Distributed Stream Processor in a Containerized Environment
This article presents our experience of scaling a distributed stream processor in Kubernetes. The stream processor should provide support for maintaining the optimal level of parallelism. However, adding more resources incurs additional cost and does not guarantee performance improvements. Instead, the stream processor should identify the level of resources required and scale accordingly.
-
Apache Kafka: Ten Best Practices to Optimize Your Deployment
Author Ben Bromhead discusses the latest Kafka best practices for developers to manage the data streaming platform more effectively. Best practices include log configuration, proper hardware usage, ZooKeeper configuration, replication factor, and partition count.
-
Democratizing Stream Processing with Apache Kafka® and KSQL - Part 2
In this article, author Robin Moffatt shows how to use Apache Kafka and KSQL to build data integration and processing applications with the help of an e-commerce sample application. Three use cases are discussed: customer operations, operational dashboard, and ad-hoc analytics.
-
A Critique of Resizable Hash Tables: Riak Core & Random Slicing
This fall, Wallaroo Labs will be releasing a large new feature set to our distributed data stream processing framework, Wallaroo. One of the new features requires a size-adjustable, distributed data structure to support growing & shrinking of compute clusters. It might be a good idea to use a distributed hash table to support the new feature, but what distributed hash algorithm should we choose?
-
How to Choose a Stream Processor for Your App
Choosing a stream processor for your app can be challenging with many options to choose from. The best choice depends on individual use cases. In this article, the authors discuss a stream processor reference architecture, key features required by most streaming applications and optional features that can be selected based on specific use cases.
-
Democratizing Stream Processing with Apache Kafka and KSQL - Part 1
In this article, author Michael Noll discusses stream processing with KSQL, the streaming SQL engine for Apache Kafka. Topics covered include the challenges of stateful stream processing, how KSQL addresses them, and how KSQL helps to bridge the world of streams and databases through streams and tables.
-
Migrating Batch ETL to Stream Processing: A Netflix Case Study with Kafka and Flink
At QCon New York, Shriya Arora presented “Personalising Netflix with Streaming Datasets” and discussed the trials and tribulations of a recent migration of a Netflix data processing job from the traditional approach of batch-style ETL to stream processing using Apache Flink.
-
Exploring the Fundamentals of Stream Processing with the Dataflow Model and Apache Beam
At QCon San Francisco 2016, Frances Perry and Tyler Akidau presented “Fundamentals of Stream Processing with Apache Beam”, and discussed Google's Dataflow model and its associated implementation, Apache Beam.
-
Is Batch ETL Dead, and is Apache Kafka the Future of Data Processing?
At QCon San Francisco 2016, Neha Narkhede presented “ETL is Dead; Long Live Streams”, and discussed the changing landscape of enterprise data processing. A core premise of the talk was that the open source Apache Kafka streaming platform can provide a flexible and uniform framework that supports modern requirements for data transformation and processing.
-
Processing Streaming Human Trajectories with WSO2 CEP
Extracting useful information from an inaccurate data stream is a significant issue in data stream processing for IoT applications. This article describes the use of Kalman filters to smooth human trajectory information gathered from an iBeacon sensor network and demonstrates its effectiveness. The solution has been built with WSO2 CEP, a complex event processing middleware.
-
Key Takeaway Points and Lessons Learned from QCon San Francisco 2016
The 10th annual QCon San Francisco was the biggest yet, bringing together over 1500 team leads, architects, project managers, and engineering directors. Over 125 practitioner-speakers presented 92 full-length technical sessions and 32 in-depth tutorials, providing deep insights into real-world architectures and state-of-the-art software development practices from a practitioner’s perspective.
-
Big Data Processing with Apache Spark - Part 3: Spark Streaming
In this article, the third installment of the Apache Spark series, author Srini Penchikala discusses the Apache Spark Streaming framework for processing real-time streaming data using a log analytics sample application.