Streaming Content on InfoQ
Integrate 2017 Recap: Adding Intelligence to Integration
Integrate 2017, an annual integration event focused on Microsoft Integration technologies, took place in London from June 26th to 28th. Key themes included the role of cognitive computing in integration, API orchestration, SaaS connectivity, cloud native integration, the impact of serverless on integration, and cloud messaging at scale.
Nikita Ivanov on Apache Ignite In-Memory Computing Platform
Apache Ignite is an in-memory computing platform with transactional support that handles key-value persistence as well as streaming and complex-event processing. Ignite was open-sourced by GridGain in late 2014 and accepted into the Apache Incubator program. InfoQ interviewed Nikita Ivanov, CTO of GridGain, to find out more about Apache Ignite.
Overview of the Reliable Event Delivery System at Spotify
Spotify clients generate up to 1.5 million events per second at peak hours, and all of them are handled by the company's Event Delivery System, which is designed to have predictable latency and to never lose an event. Igor Maravic described the system in his presentation at the recent QCon London conference, where he gave a high-level overview of it and some of the key operational aspects.
Challenges Building Facebook Live Streams
Facebook Live started in a hackathon two years ago and was launched to users eight months later. One of the challenges has been dealing with the unpredictable number of viewers of a single stream, Sachin Kulkarni noted in his presentation at the recent QCon London conference, where he described his team's architecture and design considerations when building Facebook Live streams.
Apache Flink 1.2 Released with Dynamic Rescaling, Security and Queryable State
Apache Flink 1.2 was announced and features dynamic rescaling, security, queryable state, and more. The release resolved 650 issues, maintains compatibility with all public APIs and ships with Apache Kafka 0.10 and Apache Mesos support. Flink’s dynamic rescaling allows one to change the parallelism of a streaming job or of an operator within the job.
Hazelcast Releases Jet, Open-Source Stream Processing Engine
Hazelcast, known for its open-source caching and in-memory data grid technologies, has announced a major release of its new stream processing engine, Jet.
Chaperone - A Kafka Auditing Tool from the Uber Engineering Team
The Uber Engineering team released their Kafka auditing tool called Chaperone as an open-source project. Chaperone allows for auditing and detection of data loss, latency, and duplication of messages in the multi-datacenter and high-volume Kafka setup at Uber.
Apache Eagle, Originally from eBay, Graduates to Top-Level Project
Apache Eagle, an open-source solution for identifying security and performance issues on big data platforms, graduated to an Apache top-level project on January 10, 2017. First open-sourced by eBay in October 2015, Eagle was created to instantly detect access to sensitive data or malicious activities and to take action in a timely fashion.
Mathieu Ripert on Instacart's Machine Learning Optimizations
Instacart is an online service that delivers groceries in under one hour. Customers order items on the website or in the mobile app, and a group of Instacart’s shoppers go to local stores, purchase the items, and deliver them to the customer. InfoQ interviewed Mathieu Ripert, data scientist at Instacart, to find out how machine learning is used to improve the customer experience.
Julien Nioche on StormCrawler, Open-Source Crawler Pipelines Backed by Apache Storm
Julien Nioche, director of DigitalPebble and PMC member and committer of the Apache Nutch web crawler project, talks about StormCrawler, a collection of reusable components for building distributed web crawlers on the streaming framework Apache Storm. InfoQ interviewed Nioche, the main contributor to the project, to find out more about StormCrawler and how it compares to similar technologies.
Azure Functions Reach General Availability
Microsoft recently announced an addition to its Platform as a Service (PaaS) offering called Azure Functions. Initially launched as a preview service in March 2016, Azure Functions provides developers with an event-driven serverless compute platform that allows organizations to pay only for what they consume.
Julien Le Dem on the Future of Column-Oriented Data Processing with Apache Arrow
Julien Le Dem, the PMC chair of the Apache Arrow project, presented at Data Eng Conf NY on the future of column-oriented data processing. Apache Arrow is an open-source standard for columnar in-memory execution. InfoQ interviewed Le Dem to find out the differences between Arrow and Parquet.
Microservices and Stream Processing Architecture at Zalando Using Apache Flink
Javier Lopez and Mihail Vieru spoke at the Reactive Summit 2016 conference about a cloud-based data integration and distribution platform used for stream processing in business intelligence use cases. Their solution is based on technologies such as Flink, Kafka and Elasticsearch.
Stream Processing and Lambda Architecture Challenges
Lambda architecture has been a popular solution that combines batch and stream processing. Kartik Paramasivam at LinkedIn wrote about how his team addressed stream processing and Lambda architecture challenges using Apache Samza for data processing. The challenges described are the late arrival of events and the processing of duplicated messages.
Jay Kreps on Distributed Stream Processing with Apache Kafka and Kafka Streams
The Apache Kafka and Kafka Streams frameworks help with developing stream-centric architectures and distributed stream processing applications. Jay Kreps, CEO of Confluent, gave the keynote presentation on stream processing and microservices at the Reactive Summit 2016 conference last week.