MapReduce Content on InfoQ
Articles
-
Matt Schumpert on Datameer Smart Execution
Datameer, a big data analytics application for Hadoop, introduced Datameer 5.0 with Smart Execution, which dynamically selects the optimal compute framework at each step in the big data analytics process. InfoQ spoke with Matt Schumpert from the Datameer team about the new product and how it helps with big data analytics needs.
-
Interview with Raffi Krikorian on Twitter's Infrastructure
Raffi Krikorian, Vice President of Platform Engineering at Twitter, offers insight into how Twitter prepares for unexpected traffic peaks and how its system architecture is designed to handle failure.
-
Building Scalable Applications in .NET: Introducing the FatDB Distributed Computing Platform
Justin Weiler introduces FatDB, a NoSQL DB and a distributed platform built on Mission Oriented Architecture meant to abstract and generalize the essential characteristics of enterprise applications.
-
Apache Crunch: A Java Library for Easier MapReduce Programming
In his new article, Josh Wills introduces Crunch, a new Apache incubating project that provides a Java library for creating MapReduce pipelines. Crunch is based on a set of high-level abstractions that simplify MapReduce application design, and it provides a library of patterns for implementing common tasks such as data joins, aggregations, and sorting.
-
Unit Testing Hadoop MapReduce Jobs With MRUnit, Mockito, & PowerMock
Hadoop MapReduce jobs have a unique code architecture that raises interesting issues for test-driven development. In this article, Michael Spicuzza provides a real-world example that uses MRUnit, Mockito, and PowerMock to solve these problems.
-
The State of NoSQL
Stefan Edlich, Senior Lecturer at Beuth HS of Technology Berlin, Germany, reviews NoSQL, considering its evolution, financial impact, the standards or lack thereof, the current landscape, books, the leaders, and some newcomers, and concludes that NoSQL is here to stay.
-
Implementing Aggregation Functions in MongoDB
In this article, authors Arun Viswanathan and Shruthi Kumar discuss how to implement common aggregation functions on a MongoDB document database using its MapReduce functionality. They also discuss a typical application of aggregations, including business reporting of sales data.
-
Uncovering mysteries of InputFormat: Providing better control for your Map Reduce execution.
In their article, authors Boris Lublinsky and Mike Segel show how to use a custom InputFormat class implementation to more tightly control the execution strategy of maps in Hadoop MapReduce jobs.
-
Data Mining in the Swamp: Taming Unruly Data With Cloud Computing
Matrix presents a white paper on using the open source tool Hadoop to implement a MapReduce strategy, together with a cloud computing strategy, to solve business intelligence problems.
-
SOA Agents: Grid Computing meets SOA
Grid technology can improve scalability, high availability, and throughput in SOA implementations. In this article, Boris Lublinsky explains how Grid computing can be used in the overall SOA architecture and introduces a programming model for Grid utilization in service implementation. He also introduces an experimental Grid implementation that can support this proposed architecture.