Distributed Data Content on InfoQ
-
A Distributed Transactional Database on Hadoop
John Leach explains using HBase co-processors to support a full ANSI SQL RDBMS without modifying the core HBase source, showing how Hadoop/HBase can replace traditional RDBMS solutions.
-
Building a Distributed Data Ingestion System with RabbitMQ
Alvaro Videla shows how to build a system that can ingest data produced at separate locations and replicate it across regions using RabbitMQ.
-
How Facebook Scales Big Data Systems
Jeff Johnson introduces Apollo, a hierarchical NoSQL data system meant to deal with Facebook's distributed storage needs.
-
Building a Distributed Data Ingestion System with RabbitMQ
Alvaro Videla presents the more advanced features of RabbitMQ: federated brokers, HA queues and support for many protocols and languages.
-
An API for Distributed Computing
Cliff Click introduces a coding style and API for in-memory analytics that handles datasets from 1K to 1TB, and clusters with terabytes of RAM and hundreds of CPUs, without changing a line of code.
-
Spanner - Google's Distributed Database
Sebastian Kanthak details how Spanner relies on GPS and atomic clocks to provide two of its innovative features: lock-free strong reads and global snapshots consistent with external events.
-
Next Top Data Model
Ian Plosker shares a number of techniques for establishing the data query patterns from the outset of application development, designing a data model to fit those patterns.
-
Architectural Patterns for High Availability
Adrian Cockcroft presents Netflix's globally distributed architecture, the benchmarks used, scalability issues, and the open source components the implementation is based upon.
-
Running the Largest Hadoop DFS Cluster
Hairong Kuang explains how Facebook uses HDFS to store and analyze over 100PB of user log data.
-
Executing Queries on a Sharded Database
Neha Narula provides advice on choosing a data store for web applications and executing distributed queries.
-
Getting started with Spring Data and Distributed Database Grids
Mark Johnson and David Turanski introduce Spring Data for GemFire, demoing its use for persistence across multiple distributed database grids.
-
Big Time: Introducing Hadoop on Azure
Yaniv Rodenski introduces Hadoop, then covers running Hadoop on Azure and the available tools and frameworks.