Apache Flink Content on InfoQ
-
Netflix Studio Search: Using Elasticsearch and Apache Flink to Index Federated GraphQL Data
Netflix engineers recently described how they built Studio Search, using Apache Kafka streams, an Apache Flink-based Data Mesh process, and Elasticsearch to manage the index. They designed the platform to take a portion of Netflix's federated GraphQL graph and make it searchable. Today, Studio Search powers a significant portion of the user experience for many applications within the organisation.
-
Real-Time Exactly-Once Event Processing at Uber with Apache Flink, Kafka, and Pinot
Uber faced some challenges after introducing ads on UberEats. The events they generated had to be processed quickly, reliably and accurately. These requirements were fulfilled by a system based on Apache Flink, Kafka, and Pinot that can process streams of ad events in real time with exactly-once semantics. An article describing its architecture was published recently on the Uber Engineering blog.
-
ApacheCon 2019 Keynote: Google Cloud Enhances Big-Data Processing with Kubernetes
At ApacheCon North America, Christopher Crosbie gave a keynote talk titled "Yet Another Resource Negotiator for Big Data? How Google Cloud is Enhancing Data Lake Processing with Kubernetes." He highlighted Google's efforts to make Apache big-data software "cloud native" by developing open-source Kubernetes Operators to provide control planes for running Apache software in a Kubernetes cluster.
-
Netflix Keystone Real-Time Stream Processing Platform
Netflix recently published a post on their tech blog discussing the design considerations behind Keystone, their real-time stream processing platform, and the insights gained from it. Keystone has been operational since December 2015 and has grown significantly as Netflix's subscriber base has grown from 65 million to over 130 million in the past 3 years. This article follows on the latest state of Keystone platform...
-
Data Artisans Announces Serializable ACID Transactions on Streaming Data
Data Artisans has announced the general availability of Streaming Ledger, which extends Apache Flink with capabilities to perform serializable ACID transactions across tables, keys, and event streams. The patent-pending technology is a proprietary add-on for Flink and allows going beyond the current standard where operations could only consistently work on a single key at a time.
-
Julien Le Dem on the Future of Column-Oriented Data Processing with Apache Arrow
Julien Le Dem, the PMC chair of the Apache Arrow project, presented at Data Eng Conf NY on the future of column-oriented data processing. Apache Arrow is an open-source standard for columnar in-memory execution. InfoQ interviewed Le Dem to find out the differences between Arrow and Parquet.
-
Microservices and Stream Processing Architecture at Zalando Using Apache Flink
Javier Lopez and Mihail Vieru spoke at the Reactive Summit 2016 conference about a cloud-based data integration and distribution platform used for stream processing in business intelligence use cases. Their solution is based on technologies such as Flink, Kafka and Elasticsearch.
-
Stream Processing and Lambda Architecture Challenges
Lambda architecture has been a popular solution that combines batch and stream processing. Kartik Paramasivam at LinkedIn wrote about how his team addressed stream processing and Lambda architecture challenges using Apache Samza for data processing. The challenges described are the late arrival of events and the processing of duplicated messages.
-
Data Streaming Architecture with Apache Flink
Jamie Grier recently spoke at the OSCON 2016 conference about data streaming architecture using Apache Flink. He talked about the building blocks of data streaming applications and stateful stream processing, with code examples of Flink applications and monitoring.
-
Apache Flink 1.0.0 is Released
InfoQ's Rags Srinivas caught up with Stephan Ewen, a project committer for Apache Flink, about the 1.0.0 release and the roadmap.
-
Yahoo! Benchmarks Apache Flink, Spark and Storm
Yahoo! has benchmarked three of the main stream processing frameworks: Apache Flink, Spark and Storm.
-
Adatao Launches Full Stack Data Intelligence Platform
Adatao recently announced the general availability of its Data Intelligence platform. Its platform aims to make data analysis and predictive analytics available to everyone in large organizations. Adatao had secured an investment of $13 million last year from a group of investors including Bloomberg Beta, Lightspeed Venture Partners and Andreessen Horowitz.
-
Fabian Hueske on Apache Flink Framework
Apache Flink is a distributed data flow processing system for performing analytics on large data sets. It can be used for real-time data streams as well as batch data processing, and it offers APIs in Java and Scala. Fabian Hueske, PMC member of Apache Flink, spoke about the data processing framework at the recent ApacheCon conference.