Hadoop Content on InfoQ
-
What Can Hadoop Do for You?
Eva Andreasson presents categories of problems commonly solved with Hadoop, along with concrete examples in each category.
-
Design Patterns for Large-Scale Real-Time Learning
Sean Owen provides examples of operational analytics projects, presenting a reference architecture and algorithm design choices for a successful implementation based on his experience with Oryx/Cloudera.
-
From The Lab To The Factory: Building A Production Machine Learning Infrastructure
Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.
-
Data & Infrastructure at Airbnb
Brenden Matthews describes the infrastructure built at Airbnb using Mesos in order to support Hadoop and Storm.
-
Graph Computing at Scale
Matthias Broecheler discusses graph computing and introduces the Aurelius graph cluster, which enables graph computing at scale by building on distributed systems like Cassandra, HBase, and Hadoop.
-
REEF: Retainable Evaluator Execution Framework
Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.
-
Apache Tez: Accelerating Hadoop Query Processing
Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.
-
Big Data Platform as a Service at Netflix
Jeff Magnusson details some of Netflix's key services: Franklin, Sting and Lipstick.
-
High Speed Smart Data Ingest into Hadoop
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time high speed data ingest into Hadoop.
-
A Guide to Python Frameworks for Hadoop
Uri Laserson reviews the different available Python frameworks for Hadoop, including a comparison of performance, ease of use/installation, differences in implementation, and other features.
-
Leveraging Your Hadoop Cluster Better - Running Performant Code at Scale
Michael Kopp explains how to run performant code at scale with Hadoop and how to analyze and optimize Hadoop jobs.
-
Building Applications using Apache Hadoop
Eli Collins gives an overview of how to build new applications with Hadoop and how to integrate Hadoop with existing applications, providing an update on the state of the Hadoop ecosystem, frameworks and APIs.