Event Stream Processing Content on InfoQ
-
Tales of Kafka at Cloudflare: Lessons Learnt on the Way to 1 Trillion Messages
Cloudflare uses Kafka clusters to decouple microservices and to communicate the creation, change, or deletion of various resources in a fault-tolerant manner, using protobuf as a common data format. The authors suggest investing in metrics for problem detection, prioritizing clear SDK documentation, and balancing flexibility and simplicity for standardized pipelines.
-
Billions of Messages Per Minute Over TCP/IP
Chronicle Wire offers an alternative way of transferring data between systems, delivering more messages, faster, than common JSON/XML approaches. This approach to data serialization improves both latency and throughput.
-
Streaming-First Infrastructure for Real-Time Machine Learning
This article covers the benefits of streaming-first infrastructure for two scenarios of real-time ML: online prediction, where a model receives a request and makes predictions as soon as the request arrives, and continual learning, where machine learning models continually adapt to changes in data distributions in production.
-
How to Create a Network Proxy Using Stream Processor Pipy
In this article we are going to introduce Pipy, an open-source cloud-native network stream processor. After describing its modular design, we will see how to rapidly build a high-performance network proxy to serve our specific needs. Pipy has been battle-tested and is already in use by multiple commercial clients.
-
Beyond the Database, and beyond the Stream Processor: What's the Next Step for Data Management?
Databases have been around forever with the same shape: you make a request to your data and then you receive an answer. Now, stream processors came along with a different approach: data isn’t locked up, it is in motion. Understand how stream processors and databases relate and why there is an emerging new category of databases that focus on data that stays in place as well as data that moves.
-
Real Time APIs in the Context of Apache Kafka
Events offer a Goldilocks-style approach in which real-time APIs can serve as a foundation for applications that is flexible yet performant, loosely coupled yet efficient. Apache Kafka offers a scalable event streaming platform with which you can build applications around the powerful concept of events.
-
The Challenges of Building a Reliable Real-Time Event-Driven Ecosystem
Globally, there is an increasing appetite for data delivered in real time; we are witnessing the emergence of the real-time API. When it comes to event-driven APIs, engineers can choose between multiple protocols. In addition to choosing a protocol, engineers also have to think about subscription models: server-initiated (push-based) or client-initiated (pull-based).
-
Applied Probability - Counting Large Set of Unstructured Events with Theta Sketches
In this article, author Ronen Cohen discusses a solution for processing event data using Theta Sketches and technologies like HBase and Kafka.
-
Is Edge Computing a Thing?
Edge Computing is definitely a thing, but the computing need not occur at the edge. Instead, what is needed is an ability to compute (anywhere) on streaming data from large numbers of dynamically changing devices in the edge environment. This in turn demands an architectural pattern for stateful, distributed computing.
-
The Kongo Problem: Building a Scalable IoT Application with Apache Kafka
In this article, author Paul Brebner discusses the best practices for developing IoT projects using Apache Kafka and Kafka Streams technologies and how to maximize Kafka scalability.
-
Rethinking Flink’s APIs for a Unified Data Processing Framework
Since its very early days, Apache Flink has followed the philosophy of taking a unified approach to batch and streaming. The core building block is the “continuous processing of unbounded data streams, with batch as a special, bounded set of those streams.” Recent updates to the Flink APIs include architectural designs by the community to support batch and streaming unification in Apache Flink.
-
Increasing the Quality of Patient Care through Stream Processing
Today’s healthcare technology landscape is disaggregated and siloed. Physicians analyse patient data streams from different systems without much correlation. Even though the health-tech domain is mature and rich with data, its value is not directed towards increasing the quality of patient care. This article presents a stream processing solution in which streams are correlated.