-
Netflix Keystone - How We Built a 700B/day Stream Processing Cloud Platform in a Year
Peter Bakas presents in detail how Netflix has used Kafka, Samza, Docker, and Linux to implement a multi-tenant pipeline processing 700B events/day in the AWS cloud.
-
Hunting Criminals with Hybrid Analytics
David Talby demos using Python libraries to build an ML model for fraud detection, scaling it up to billions of events using Spark, and what it took to make the system performant and ready for production.
-
Resilient Predictive Data Pipelines
Sid Anand discusses how Agari is applying big data best practices to the problem of securing its customers from email-borne threats, presenting a system that leverages big data in the cloud.
-
Big-Data Analytics Misconceptions
Irad Ben-Gal discusses Big Data analytics misconceptions, presenting a technology predicting consumer behavior patterns that can be translated into wins, revenue gains, and localized assortments.
-
How Comcast Uses Data Science and ML to Improve the Customer Experience
Jan Neumann presents how Comcast uses machine learning and big data processing for user search, capacity planning, and predictive caching.
-
Immutable Infrastructure: Rise of the Machine Images
Axel Fontaine looks at what Immutable Infrastructure is and how it affects scaling, logging, sessions, configuration, service discovery and more.
-
The Mechanics of Testing Large Data Pipelines
Mathieu Bastian explores the mechanics of unit, integration, data and performance testing for large, complex data workflows, along with the tools for Hadoop, Pig and Spark.
-
How to Have Your Causality and Wall Clocks Too
Jon Moore talks about distributed monotonic clocks (DMC) whose timestamps can reflect causality but which have a component that stays close to wall clock time.
-
Stream Processing with Apache Flink
Robert Metzger provides an overview of the Apache Flink internals and its streaming-first philosophy, as well as the programming APIs.
-
Rethinking Streaming Analytics for Scale
Helena Edelson addresses new architectures emerging for large-scale streaming analytics based on Spark, Mesos, Akka, Cassandra and Kafka (SMACK), Apache Flink, or Gearpump.
-
Monkeys in Lab Coats: Applying Failure Testing Research @Netflix
The authors present how lineage-driven fault injection evolved from a theoretical model into an automated failure testing system that leverages Netflix’s fault injection and tracing infrastructures.
-
Understanding Hardware Transactional Memory
Gil Tene explores the underlying mechanics that power HTM on current platforms, focusing on things developers need to understand when contemplating the use of HTM in new and existing code.