Data Pipelines Content on InfoQ
-
Future of Data Engineering
Chris Riccomini talks about the current state of the art in data pipelines & data warehousing, and shares some solutions to current problems in data streaming & warehousing.
-
Building and Operating a Serverless Data Pipeline
Will Norman discusses the motivations for switching to a serverless infrastructure, and lessons learned while building and operating such a system at scale.
-
Announcing Broadway
José Valim discusses how Broadway connects multiple stages and producers, how it leverages GenStage to provide back-pressure, and other features such as batching, rate-limiting, partitioning and more.
-
Cloud-Native Data Pipelines with Apache Kafka
Gwen Shapira discusses how data engineering requirements changed in a cloud-native world, and how the solutions change with them.
-
Event-Driven Architectures with Apache Geode and Spring Integration
Charlie Black deploys Spring Integration pipelines to react to changes of the data stored in Apache Geode.
-
Designing Automated Pipelines for Unseen Custom Data
Kevin Moore discusses some challenges in designing automated machine learning pipelines that can deal with custom user data they have never seen before, as well as some of Salesforce’s solutions.
-
Zero to Multi-Cloud
Marcin Grzejszczak and Jon Schneider discuss using Spring Cloud Pipelines and Spinnaker together.
-
ML Data Pipelines for Real-Time Fraud Prevention @PayPal
Mikhail Kourjanski focuses on the architectural approach behind PayPal’s real-time service platform, which leverages ML models and delivers both performance and quality of decisions.
-
Simplifying ML Workflows with Apache Beam
Tyler Akidau discusses how Apache Beam is simplifying pre- and post-processing for ML pipelines.
-
Orchestrating Data Microservices with Spring Cloud Data Flow
Mark Pollack discusses how to create data integration and real-time data processing pipelines using Spring Cloud Data Flow and deploy them to multiple platforms – Cloud Foundry, Kubernetes, and YARN.
-
Data Pipelines for Real-Time Fraud Prevention at Scale
Mikhail Kourjanski discusses the architecture of PayPal’s data service, which combines a Big Data approach with real-time data delivery for decision making in fraud detection.
-
Developing Data and ML Pipelines at Stitch Fix
Jeff Magnusson shares thoughts and guidelines on how Stitch Fix develops, schedules, and maintains its data and ML pipelines.