Streaming Content on InfoQ
-
Tales of Kafka at Cloudflare: Lessons Learnt on the Way to 1 Trillion Messages
Cloudflare uses Kafka clusters to decouple microservices and to communicate the creation, change, or deletion of various resources in a fault-tolerant manner, using protobuf as a common data format. The authors suggest investing in metrics for problem detection, prioritizing clear SDK documentation, and balancing flexibility and simplicity for standardized pipelines.
-
Billions of Messages Per Minute Over TCP/IP
Chronicle Wire offers an alternative way of transferring data between systems, delivering more messages, faster, than common JSON/XML approaches. This approach to data serialization improves both latency and throughput.
-
Building & Operating High-Fidelity Data Streams
At QCon Plus 2021 last November, Sid Anand, chief architect at Datazoom and PMC Member at Apache Airflow, presented on building high-fidelity nearline data streams as a service within a lean team. In this talk, Anand provides a master class on building high-fidelity data streams from the ground up.
-
Streaming-First Infrastructure for Real-Time Machine Learning
This article covers the benefits of streaming-first infrastructure for two scenarios of real-time ML: online prediction, where a model can receive a request and make predictions as soon as the request arrives, and continual learning, where machine learning models continually adapt to changes in data distributions in production.
-
Designing IoT Solutions with Microsoft Azure
In this article, we will learn how IoT solutions can work with Microsoft Azure and what services are available to perform different operations across multiple domains. Furthermore, it covers a few common case studies that offer hands-on experience with Azure IoT and provide a good starting point for utilizing cloud-based IoT services.
-
How to Create a Network Proxy Using Stream Processor Pipy
In this article we are going to introduce Pipy, an open-source cloud-native network stream processor. After describing its modular design, we will see how to rapidly build a high-performance network proxy to serve our specific needs. Pipy has been battle-tested and is already in use by multiple commercial clients.
-
Indestructible Storage in the Cloud with Apache Bookkeeper
At Salesforce, we required a storage system that could work with two kinds of streams, one stream for write-ahead logs and one for data. But the two streams have competing requirements. Being pioneers in cloud computing, we also required our storage system to be cloud-aware, as the requirements for availability and durability are ever increasing.
-
The Future of Data Engineering
Chris Riccomini examines the current and future states of the art in data pipelines, data streaming, and data warehousing. He presents a six-stage evolution that data ecosystems follow, from a simple monolith to a complex data-microwarehouse architecture as the data engineers who manage them solve problems and clarify their roles as infrastructure engineers, rather than data stewards.
-
How Apache Pulsar is Helping Iterable Scale its Customer Engagement Platform
In this article, author Greg Methvin discusses his experience implementing a distributed messaging platform based on Apache Pulsar.
-
Beyond the Database, and beyond the Stream Processor: What's the Next Step for Data Management?
Databases have been around forever with the same shape: you make a request to your data and then you receive an answer. Then stream processors came along with a different approach: data isn’t locked up, it is in motion. Understand how stream processors and databases relate and why there is an emerging new category of databases that focus on data that stays in place as well as data that moves.
-
Real Time APIs in the Context of Apache Kafka
Events offer a Goldilocks-style approach in which real-time APIs can be used as the foundation for applications that are flexible yet performant, loosely coupled yet efficient. Apache Kafka offers a scalable event streaming platform with which you can build applications around the powerful concept of events.
-
The Challenges of Building a Reliable Real-Time Event-Driven Ecosystem
Globally, there is an increasing appetite for data delivered in real time; we are witnessing the emergence of the real-time API. When it comes to event-driven APIs, engineers can choose between multiple protocols. In addition to choosing a protocol, engineers also have to think about subscription models: server-initiated (push-based) or client-initiated (pull-based).