Database Content on InfoQ

Presentations

AI, ML & Data Engineering

Scio: Moving Big Data to Google Cloud, a Spotify Story

Neville Li tells Spotify’s story of migrating their big data infrastructure to Google Cloud, replacing Hive and Scalding with BigQuery and Scio, which helped them iterate faster.

Neville Li
on May 26, 2017

54:50
Architecture & Design

In-Memory Caching: Curb Tail Latency with Pelikan

Yao Yue introduces Pelikan, a framework for implementing distributed caches such as Memcached and Redis. She discusses the system aspects that are important to the performance of such services.

Yao Yue
on May 02, 2017

47:56
AI, ML & Data Engineering

Data Preparation for Data Science: A Field Guide

Casey Stella presents a utility written with Apache Spark to automate data preparation by discovering missing values, values with skewed distributions, and likely errors within data.

Casey Stella
on Apr 23, 2017

45:00
Architecture & Design

Building Reliability in an Unreliable World

Greg Murphy describes how GameSparks has designed their platform to be tolerant of many things: unreliable and slow internet connectivity, cloud resources that can fail without warning, and more.

Greg Murphy
on Apr 20, 2017

50:39
AI, ML & Data Engineering

AI from an Investment Perspective

The panelists discuss AI from an investment perspective: the challenges, the risks, the trends, the role of Deep Learning, successful AI use cases, and more.

Pankaj Mitra, Doug Dooley, Sanjit Dang, Kiersten Stead, Yashwanth Hemaraj, Leonard Speiser, Kartik Gada
on Apr 18, 2017

42:48
Architecture & Design

Causal Consistency for Large Neo4j Clusters

Jim Webber explores the new Causal clustering architecture for Neo4j and how it allows users to read writes straightforwardly, and explains why this is difficult to achieve in distributed systems.

Jim Webber
on Apr 07, 2017

49:40
AI, ML & Data Engineering

Big Data Infrastructure @ LinkedIn

Shirshanka Das describes LinkedIn’s Big Data Infrastructure and its evolution through the years, including details on the motivation and architecture of Gobblin, Pinot and WhereHows.

Shirshanka Das
on Apr 02, 2017

50:48
Development

Performance and Search

Dan Luu discusses how to estimate performance using back-of-the-envelope calculations that can be done in minutes or hours, even for applications that take months or years to implement.

Dan Luu
on Apr 01, 2017

41:09
Architecture & Design

Scaling up Near Real-Time Analytics @Uber & LinkedIn

Chinmay Soman and Yi Pan discuss how Uber and LinkedIn use Apache Samza, Calcite and Pinot, along with the analytics platform AthenaX, to transform data and make it available for querying in minutes.

Yi Pan, Chinmay Soman
on Mar 30, 2017

46:03
AI, ML & Data Engineering

Real-Time Recommendations Using Spark Streaming

Elliot Chow discusses the data pipeline that they built with Kafka, Spark Streaming, and Cassandra to process Netflix user activities in real time for the Trending Now row.

Elliot Chow
on Mar 30, 2017

47:03
Architecture & Design

Stream Processing & Analytics with Flink @Uber

Danny Yuan discusses how Uber builds its next-generation stream processing system to support real-time analytics as well as complex event processing.

Danny Yuan
on Mar 25, 2017

47:47
Architecture & Design

Demystifying DynamoDB Streams

Akshat Vig and Khawaja Shams discuss DynamoDB Streams and what it takes to build an ordered, highly available, durable, performant, and scalable replicated log stream.

Akshat Vig, Khawaja Shams
on Mar 25, 2017

39:21