Big Data Infrastructure Content on InfoQ
Presentations
Big Data Infrastructure @ LinkedIn
Shirshanka Das describes LinkedIn’s Big Data Infrastructure and its evolution through the years, including details on the motivation and architecture of Gobblin, Pinot and WhereHows.
-
Petabytes Scale Analytics Infrastructure @Netflix
Tom Gianos and Dan Weeks discuss Netflix's overall big data platform architecture, focusing on storage and orchestration, and how they use Parquet on AWS S3 as their data warehouse storage layer.
-
The Game of Big Data: Scalable, Reliable Analytics Infrastructure at KIXEYE
Randy Shoup describes KIXEYE's analytics infrastructure from Kafka queues through Hadoop 2 to Hive and Redshift, built for flexibility, experimentation, iteration, testability, and reliability.
-
Data & Infrastructure at Airbnb
Brenden Matthews describes the infrastructure Airbnb built on Mesos to support Hadoop and Storm.
-
Making the Internet a Better Place: Scaling AppNexus
Mike Nolet shares lessons learned scaling AppNexus and architectural details of their system processing 30TB/day: Hadoop, DNS built in GSLB and Keepalived, and real-time data streaming built in C.
-
Lean Data Architecture: Minimize Investment, Maximize Value
Manvir Singh Grewal and Brandon Byars propose a business intelligence workflow along with Lean principles and practices for implementing a data warehouse and reporting capability.
-
Big Data Problems in Monitoring at eBay
Bhaven Avalani and Yuri Finklestein discuss four aspects of handling monitoring data at eBay: reduction of data entropy, robust data distribution, metric extraction, and efficient storage.
-
Big Data, Small Computers
Cliff Click discusses RAIN, H2O, JMM, parallel computation, and Fork/Join in the context of performing big data analysis on large amounts of commodity hardware.
-
Facebook News Feed: Social Data at Scale
Serkan Piantino discusses news feeds at Facebook: the basics, infrastructure used, how feed data is stored, and Centrifuge – a storage solution.
-
Saving the World (from|with) Big Data
Bruce Durling discusses the impact of cloud computing on the climate and what can be done to reduce the amount of CO2 generated by data centers in order to process big data.
-
"Big Data" and the Future of DevOps
Ram C Singh discusses using Big Data for infrastructure telemetry along with good practices and an autonomic engine to create an autonomic computing infrastructure that might prevent downtime.