Big Data Content on InfoQ

Presentations

Understanding Cloud, Big Data, Mobile and Security – Do They Play Nicely Together?

Colin Mower discusses the challenges of using Cloud, Big Data, Mobile and Security together, and how they can be combined to achieve business value.

Colin Mower
on May 12, 2015

41:57
A Taste of Random Decision Forests on Apache Spark

Sean Owen introduces Spark, Scala and random decision forests, and demonstrates the process of analyzing a real-world data set with them.

Sean Owen
on Apr 28, 2015

48:14
Big Data in Memory

John Davies shows a Spring workflow that consumes 7.4 kB XML messages and binds them to 25 kB Java objects, yet stores them in just 450 bytes each, holding 10 million derivative contracts in memory on a laptop.

John Davies
on Mar 14, 2015

01:06:43
Gobblin: A Framework for Solving Big Data Ingestion Problem

Lin Qiao discusses the architecture of Gobblin, LinkedIn’s framework for addressing the need for high-quality, high-velocity data ingestion.

Lin Qiao
on Mar 12, 2015

44:13
Better Together - Using Spark and Redshift to Combine Your Data with Public Datasets

Eugene Mandel discusses challenges of conforming data sources and compares processing stacks: Hadoop+Redshift vs Spark, showing how the technology drives the way the problem is modeled.

Eugene Mandel
on Mar 12, 2015

35:16
High Performance Computing Contributions to the World of Big Data

Sharan Kalwani presents the history of HPC and the technologies and trends that helped create the world of big data, including HPC applications that gave rise to big data technologies.

Sharan Kalwani
on Jan 11, 2015

52:07
A Distributed Transactional Database on Hadoop

John Leach explains using HBase co-processors to support a full ANSI SQL RDBMS without modifying the core HBase source, showing how Hadoop/HBase can replace traditional RDBMS solutions.

John Leach
on Jan 02, 2015

55:55
Hadoop 201 -- Deeper into the Elephant

Roman Shaposhnik discusses more advanced features of HDFS, in addition to how YARN has enabled businesses to massively scale their systems beyond what was previously possible.

Roman Shaposhnik
on Dec 28, 2014

01:31:56
Why Would You Integrate Solr and Hadoop?

Yann Yu discusses how Solr and Hadoop complement each other, and how to use Solr as a real-time, analytical, full-text search front-end to data stored in Hadoop.

Yann Yu
on Dec 28, 2014

19:26
1.5 Million Log Lines Per Second: Building and Maintaining Flume Flows at Conversant

Mike Keane presents how Conversant migrated to Flume, managing 1000 agents across 4 data centers, processing over 50B log lines per day with peak hourly averages of over 1.5 million log lines/sec.

Mike Keane
on Dec 21, 2014

44:03
The Big Data Imperative: Discovering & Protecting Sensitive Data in Hadoop

Jeremy Stieglitz discusses best practices for a data-centric approach to security, compliance and data governance, with a particular focus on two customer use cases.

Jeremy Stieglitz
on Dec 21, 2014

39:55
Why Spark Is the Next Top (Compute) Model

Dean Wampler argues that Spark/Scala is a better data processing engine than MapReduce/Java because tools inspired by mathematics, such as FP, are ideal tools for working with data.

Dean Wampler
on Dec 15, 2014

42:05
