Hadoop Content on InfoQ
-
Building an Impenetrable ZooKeeper
Kathleen Ting details 8 misconfigurations that can bring ZooKeeper down.
-
How to Build Big Data Pipelines for Hadoop Using OSS
Costin Leau discusses Big Data, the currently available tools for dealing with it, and how Spring can be used to create Big Data pipelines.
-
Storm: Distributed and Fault-Tolerant Real-time Computation
Nathan Marz introduces Twitter Storm, outlining its architecture and use cases, and looks at features planned for future releases.
-
Extending the Enterprise Data Warehouse with Hadoop
Rob Lancaster explains the steps Orbitz took to bridge the gap between the data in its data warehouse and the data in Hadoop.
-
Introducing Apache Hadoop: The Modern Data Operating System
Eli Collins introduces Hadoop: why it came about, the benefits it brings, its history, its architecture, and its use cases and applications.
-
Petabyte Scale Data at Facebook
Dhruba Borthakur discusses the different types of data used by Facebook and how they are stored, including graph data, semi-OLTP data, immutable data for pictures, and Hadoop/Hive for analytics.
-
Big Time: Introducing Hadoop on Azure
Yaniv Rodenski introduces Hadoop, then covers running Hadoop on Azure and the available tools and frameworks.
-
MapReduce and Its Discontents
Dean Wampler discusses the strengths and weaknesses of MapReduce, and the newer variants for big data processing: Pregel and Storm.
-
Hadoop: Scalable Infrastructure for Big Data
Parand Tony Darugar gives an overview of Hadoop, its processing model, and the associated ecosystem and tools, and discusses some real-life uses of Hadoop for analyzing and processing large amounts of data.
-
Big Data Architectures at Facebook
Ashish Thusoo presents the data scalability issues at Facebook and the evolution of its data architecture from EDW to Hadoop to Puma.
-
NetApp Case Study
Kumar Palaniapan and Scott Fleming present how NetApp deals with big data using Hadoop, HBase, Flume, and Solr, collecting and analyzing TBs of log data with Think Big Analytics.
-
Hadoop and Cassandra, Sitting in a Tree ...
Jake Luciani introduces Brisk, a Hadoop and Hive distribution using Cassandra for core services and storage, presenting the benefits of running Hadoop in a peer-to-peer masterless architecture.