Big Data Content on InfoQ

  • Spoilt for Choice – How to choose the right Big Data / Hadoop Platform?

    In his new article, Kai Wähner compares several alternatives for installing a version of Hadoop and implementing big data processes. He compares distributions and tooling from Apache and many other vendors, including Cloudera, Hortonworks, MapR, Amazon, IBM, Oracle, and Microsoft. He also describes the pros and cons of each distribution and provides a decision tree for choosing the most appropriate one.

  • Mike Barlow on Real-Time Big Data Analytics

    The white paper "Real-Time Big Data Analytics: Emerging Architecture" by Mike Barlow covers big data analytics and how real-time big data analytics (RTBDA) differs from traditional analytics. InfoQ spoke with Mike about the current state of real-time big data analytics and emerging trends in the Big Data space, such as Decision Science.

  • Interview and Video Review: Working with Big Data: Infrastructure, Algorithms, and Visualizations

    Paul Dix leads a practical exploration of Big Data in this video training series. The first five lessons span multiple server systems, with a focus on the end-to-end processing of large quantities of XML data from real Stack Exchange posts. He completes the training with a lesson on developing visualizations for gaining insights from macro-level analysis of Big Data.

  • Apache Crunch: A Java Library for Easier MapReduce Programming

    In his new article, Josh Wills introduces Crunch, a new Apache incubating project that provides a Java library for creating MapReduce pipelines. Crunch is based on a set of high-level abstractions that simplify MapReduce application design, and it provides a library of patterns for implementing common tasks such as data joins, aggregations, and sorting.

  • Unit Testing Hadoop MapReduce Jobs With MRUnit, Mockito, & PowerMock

    Hadoop MapReduce jobs have a unique code architecture that raises interesting issues for test-driven development. In this article, Michael Spicuzza provides a real-world example that uses MRUnit, Mockito, and PowerMock to solve these problems.

  • Interview and Book Review: NoSQL Distilled

    InfoQ spoke with both authors of the book, Pramod Sadalage and Martin Fowler, about the NoSQL database space and emerging trends in NoSQL.

  • The State of NoSQL

    Stefan Edlich, Senior Lecturer at Beuth HS of Technology Berlin, Germany, reviews NoSQL, considering its evolution, its financial impact, the standards or lack thereof, the current landscape, books, the leaders, and some newcomers, and concludes that NoSQL is here to stay.

  • Hadoop Virtual Panel

    In this virtual panel, InfoQ talks to several Hadoop vendors and users about their views on the current and future state of Hadoop and what matters most for Hadoop’s further adoption and success.

  • The Architecture of Datomic

    Rich Hickey, the author of Clojure, explains the architecture of Datomic, a new database designed as a composition of simple services that combines the capabilities of an RDBMS with the scalability of NoSQL.

  • Julien Nioche on Apache Nutch 2 Features and Product Roadmap

    Version 2 of the open source web-search framework Apache Nutch supports large-scale crawling, a link-graph database, and HTML parsing. InfoQ spoke with Julien Nioche, VP of the Apache Nutch project, about the framework's new features and its future roadmap.

  • Blueprint for a Big Data Solution

    In his new article, Jonathan Natkins explains how to use components of Apache Hadoop, including Flume, Hive, and Oozie, to implement a typical data management system. He also gives a practical example of such an architecture, used to measure Twitter users’ influence.

  • Inside the Complexity of Delivering Cloud Computing

    There's a lot more to cloud computing than meets the eye. This article presents an insider's view of what is really entailed in designing and deploying a cloud-based solution.
