Streaming Content on InfoQ
-
Big Data Processing with Apache Spark - Part 2: Spark SQL
Spark SQL, part of the Apache Spark big data framework, is used for structured data processing and allows running SQL-like queries on Spark data. In this article, Srini Penchikala discusses the Spark SQL module and how it simplifies running data analytics through a SQL interface. He also talks about the new features in Spark SQL, such as DataFrames and JDBC data sources.
-
Big Data Processing with Apache Spark – Part 1: Introduction
Apache Spark is an open source big data processing framework built around speed, ease of use, and sophisticated analytics. In this article, Srini Penchikala talks about how the Apache Spark framework helps with big data processing and analytics through its standard API. He also discusses how Spark compares with traditional MapReduce implementations such as Apache Hadoop.
-
Real-Time Stream Processing as Game Changer in a Big Data World with Hadoop and Data Warehouse
This article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), when stream processing makes sense, and what technologies and products you can choose from.
-
Using SEDA to Ensure Service Availability
This article presents a new strategy for incorporating event-driven architecture to improve the scalability and availability of services in the context of SOA. The strategy is based on queuing research pioneered for highly available and scalable services, initially in the Web context and later in the SOA and Web services context. An actual implementation is described in the context of Mule.