Streaming Content on InfoQ

Articles

Architecture & Design

A Critique of Resizable Hash Tables: Riak Core & Random Slicing

This fall, Wallaroo Labs will be releasing a large new feature set for our distributed data stream processing framework, Wallaroo. One of the new features requires a size-adjustable, distributed data structure to support growing and shrinking compute clusters. A distributed hash table might be a good fit for the new feature, but which distributed hash algorithm should we choose?

Scott Lystig Fritchie
on Aug 26, 2018
AI, ML & Data Engineering

How to Choose a Stream Processor for Your App

With so many options available, choosing a stream processor for your app can be challenging. The best choice depends on the individual use case. In this article, the authors discuss a stream processor reference architecture, the key features required by most streaming applications, and optional features that can be selected based on specific use cases.

Miyuru Dayarathna, Srinath Perera
on Aug 21, 2018
AI, ML & Data Engineering

Democratizing Stream Processing with Apache Kafka and KSQL - Part 1

In this article, author Michael Noll discusses stream processing with KSQL, the streaming SQL engine for Apache Kafka. Topics covered include the challenges of stateful stream processing and how KSQL addresses them, and how KSQL helps to bridge the world of streams and databases through streams and tables.

Michael Noll
on Jun 15, 2018
AI, ML & Data Engineering

Migrating Batch ETL to Stream Processing: A Netflix Case Study with Kafka and Flink

At QCon New York, Shriya Arora presented “Personalising Netflix with Streaming Datasets” and discussed the trials and tribulations of a recent migration of a Netflix data processing job from the traditional approach of batch-style ETL to stream processing using Apache Flink.

Daniel Bryant
on Feb 08, 2018
Architecture & Design

Exploring the Fundamentals of Stream Processing with the Dataflow Model and Apache Beam

At QCon San Francisco 2016, Frances Perry and Tyler Akidau presented “Fundamentals of Stream Processing with Apache Beam”, and discussed Google's Dataflow model and its associated implementation, Apache Beam.

Daniel Bryant
on Jan 30, 2018
Architecture & Design

Is Batch ETL Dead, and is Apache Kafka the Future of Data Processing?

At QCon San Francisco 2016, Neha Narkhede presented “ETL is Dead; Long Live Streams”, and discussed the changing landscape of enterprise data processing. A core premise of the talk was that the open source Apache Kafka streaming platform can provide a flexible and uniform framework that supports modern requirements for data transformation and processing.

Daniel Bryant
on Jan 22, 2018
Mobile

Processing Streaming Human Trajectories with WSO2 CEP

Extracting useful information from an inaccurate data stream is a significant issue in data stream processing for IoT applications. This article describes the use of Kalman filters to smooth human trajectory information gathered from an iBeacon sensor network and demonstrates its effectiveness. The solution has been built with WSO2 CEP, a complex event processing middleware.

Ramindu De Silva, Miyuru Dayarathna
on Feb 27, 2017
Architecture & Design

Key Takeaway Points and Lessons Learned from QCon San Francisco 2016

The 10th annual QCon San Francisco was the biggest yet, bringing together over 1500 team leads, architects, project managers, and engineering directors. Over 125 practitioner-speakers presented 92 full-length technical sessions and 32 in-depth tutorials, providing deep insights into real-world architectures and state-of-the-art software development practices from a practitioner’s perspective.

Abel Avram
on Dec 14, 2016
AI, ML & Data Engineering

Traffic Data Monitoring Using IoT, Kafka and Spark Streaming

The Internet of Things (IoT) is an emerging disruptive technology and a topic of increasing interest. One area of IoT application is connected vehicles. In this article we'll use Apache Spark and Kafka to analyse and process data from IoT-connected vehicles and send the processed data to a real-time traffic monitoring dashboard.

Amit Baghel
on Sep 28, 2016
AI, ML & Data Engineering

Chris Fregly on the PANCAKE STACK Workshop and Data Pipelines

InfoQ interviews Chris Fregly, organizer of the 4000+ member Advanced Spark and TensorFlow Meetup, about the PANCAKE STACK workshop, Spark, and building data pipelines for machine learning.

Dylan Raithel
on Aug 29, 2016
AI, ML & Data Engineering

Big Data Processing with Apache Spark - Part 3: Spark Streaming

In this article, the third installment of the Apache Spark series, author Srini Penchikala discusses the Apache Spark Streaming framework for processing real-time streaming data, using a log analytics sample application.

Srini Penchikala
on Jan 07, 2016
Storm Applied Review and Q&A with the Authors

Storm is a distributed, fault-tolerant, real-time computation system that was originally developed at BackType and later open sourced by Twitter. Storm Applied is a new book from Manning that aims to provide a practical guide on using Storm, both in a development and in a production setting. InfoQ has spoken with two of the book’s authors, Sean T. Allen and Matthew Jankowski.

Sergio De Simone
on Jul 27, 2015