Database Content on InfoQ
-
Interview and Book Review: The LogStash Book, Log Management Made Easy
James Turnbull makes a compelling case for using Logstash to centralize logging by explaining its implementation details within the context of a logging project. The book targets both small companies and large enterprises with a two-sided case: a low barrier to entry and the ability to scale.
-
Interview and Video Review: Working with Big Data: Infrastructure, Algorithms, and Visualizations
Paul Dix leads a practical exploration of Big Data in this video training series. The first five lessons span multiple server systems, with a focus on the end-to-end processing of large quantities of XML data from real Stack Exchange posts. He completes the training with a lesson on developing visualizations to gain insights from macro-level analysis of Big Data.
-
Refactoring Legacy Applications: A Case Study
To refactor legacy code, the ideal is to have a suite of unit tests to prevent regressions. However, it's not always that easy. This article describes a methodology for safely refactoring legacy code.
-
The Datomic Information Model
Rich Hickey, the author of Clojure, explains the information model of Datomic, a new database designed as a composition of simple services that combines the capabilities of an RDBMS with the scalability of NoSQL.
-
Apache Crunch: A Java Library for Easier MapReduce Programming
In his new article, Josh Wills introduces Crunch, a new Apache Incubator project that provides a Java library for creating MapReduce pipelines. Crunch is based on a set of high-level abstractions that simplify the design of MapReduce applications, and it provides a library of patterns for implementing common tasks such as data joins, aggregations, and sorting.
-
Using AWS Cloud Search
Many of today’s applications rely heavily on search functionality. In this article, Boris Lublinsky explains how to build Java APIs for uploading data and implementing search with Amazon Cloud Search. Using these APIs can simplify embedding Amazon Cloud Search functionality into custom applications.
-
Unit Testing Hadoop MapReduce Jobs With MRUnit, Mockito, & PowerMock
Hadoop MapReduce jobs have a unique code architecture that raises interesting issues for test-driven development. In this article, Michael Spicuzza provides a real-world example that uses MRUnit, Mockito, and PowerMock to solve these problems.
-
Interview and Book Review: NoSQL Distilled
InfoQ spoke with both authors of the book, Pramod Sadalage and Martin Fowler, about the NoSQL database space and emerging trends in NoSQL.
-
The State of NoSQL
Stefan Edlich, Senior Lecturer at Beuth HS of Technology Berlin, Germany, reviews NoSQL, considering its evolution, its financial impact, the standards or the lack thereof, the current landscape, books, the leaders, and some newcomers, and concludes that NoSQL is here to stay.
-
Hadoop Virtual Panel
In this virtual panel, InfoQ talks to several Hadoop vendors and users about their views on the current and future state of Hadoop and on what matters most for Hadoop’s further adoption and success.
-
The Architecture of Datomic
Rich Hickey, the author of Clojure, explains the architecture of Datomic, a new database designed as a composition of simple services that combines the capabilities of an RDBMS with the scalability of NoSQL.
-
Julien Nioche on Apache Nutch 2 Features and Product Roadmap
Version 2 of Apache Nutch, the open source web-search framework, supports large-scale crawling, a link-graph database, and HTML parsing. InfoQ spoke with Julien Nioche, VP of the Apache Nutch project, about the framework's new features and its future roadmap.