Big Data Content on InfoQ
-
Nikita Ivanov on GridGain’s In-Memory Accelerator for Hadoop
GridGain recently announced the In-Memory Accelerator for Hadoop, offering the benefits of in-memory computing to Hadoop-based applications. It includes two components: an in-memory file system and a MapReduce implementation. InfoQ spoke with Nikita Ivanov, CTO of GridGain, about the architecture of the product.
-
Introducing Spring XD, a Runtime Environment for Big Data Applications
Spring XD (eXtreme Data) is Pivotal’s Big Data play. It joins Spring Boot and Grails as part of the execution portion of the Spring IO platform. Whilst Spring XD makes use of a number of existing Spring projects, it is a runtime environment rather than a library or framework, comprising a bin directory with servers that you start up and interact with via a shell.
-
MLConf NYC 2014 Highlights
MLConf took place in NYC on April 11th, a full day packed with talks on Machine Learning and Big Data, featuring speakers from many prominent companies.
-
Lambda Architecture: Design Simpler, Resilient, Maintainable and Scalable Big Data Solutions
Lambda Architecture proposes a simpler, elegant paradigm designed to store and process large amounts of data. In this article, author Daniel Jebaraj presents the motivation behind the Lambda Architecture and reviews its structure with the help of a sample Java application.
-
Embedded Analytics and Statistics for Big Data
This article provides an overview of tools and libraries available for embedded data analytics and statistics, both stand-alone software packages and programming languages with statistical capabilities. The authors also discuss how to combine and integrate these embedded analytics technologies to handle big data.
-
Big Data Analytics for Security
In this article, the authors discuss the role of big data and Hadoop in the security analytics space and how to use MapReduce to efficiently process data for security analysis, in use cases such as Security Information and Event Management (SIEM) and Fraud Detection.
-
Building Applications With Hadoop
When building applications using Hadoop, it is common to have input data coming from various sources in various formats. In his presentation, “New Tools for Building Applications on Apache Hadoop”, Eli Collins gives an overview of how to build better products with Hadoop and of various tools that can help, such as Apache Avro, Apache Crunch, Cloudera ML and the Cloudera Development Kit.
-
Interview with Raffi Krikorian on Twitter's Infrastructure
Raffi Krikorian, Vice President of Platform Engineering at Twitter, gives insight into how Twitter prepares for unexpected traffic peaks and how its system architecture is designed to support failure.
-
Building a Real-time, Personalized Recommendation System with Kiji
In this article, Jon Natkins explains how to create a personalized recommendation system fed with large amounts of real-time data using Kiji, which leverages HBase, Avro, MapReduce and Scalding.
-
Agility, Big Data, and Analytics
How do you bring agility into big data analytics? Learn what makes analytics different from application development, and how to adapt agile principles and practices to the nuances of analytics. Examine how the disciplines of data science and software development complement one another, and how they intersect in an agile project environment.
-
Costin Leau on Elasticsearch, BigData and Hadoop
Elasticsearch is an open source, distributed, real-time search and analytics engine for the cloud. The first milestone of elasticsearch-hadoop, 1.3.M1, was released last month. InfoQ spoke with Costin Leau about Elasticsearch and how it integrates with Hadoop and other Big Data technologies.
-
Building Scalable Applications in .NET: Introducing the FatDB Distributed Computing Platform
Justin Weiler introduces FatDB, a NoSQL DB and a distributed platform built on Mission Oriented Architecture meant to abstract and generalize the essential characteristics of enterprise applications.