Spark Content on InfoQ

Presentations

AI, ML & Data Engineering

From Spark to Elasticsearch and Back - Learning Large-Scale Models for Content Recommendation

Sonya Liberman shares an algorithmic architecture that enables running complex models under difficult scale constraints and shortens the cycle between research and production.

Sonya Liberman
on Mar 14, 2020

25:48
AI, ML & Data Engineering

Streaming for Personalization Datasets at Netflix

Shriya Arora discusses challenges faced with stream processing unbounded datasets, comparing microbatch with event-based approaches using Spark and Flink.

Shriya Arora
on Jul 26, 2017

41:46
AI, ML & Data Engineering

When Streams Fail: Kafka Off the Shore

Anton Gorshkov discusses how to evaluate and architect a resilient streaming platform, focusing on Kafka and Spark streaming and sharing his experience of using them to process financial transactions.

Anton Gorshkov
on Jul 18, 2017

54:53
AI, ML & Data Engineering

Data Preparation for Data Science: A Field Guide

Casey Stella presents a utility written with Apache Spark that automates data preparation by discovering missing values, values with skewed distributions, and likely errors within data.

Casey Stella
on Apr 23, 2017

45:00
AI, ML & Data Engineering

Real-Time Recommendations Using Spark Streaming

Elliot Chow discusses the data pipeline that they built with Kafka, Spark Streaming, and Cassandra to process Netflix user activities in real time for the Trending Now row.

Elliot Chow
on Mar 30, 2017

47:03
AI, ML & Data Engineering

Data Science in the Cloud @StitchFix

Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.

Stefan Krawczyk
on Feb 17, 2017

40:48
AI, ML & Data Engineering

Machine Learning and End-to-End Data Analysis Processes in Spark Using Python and R

Debraj GuhaThakurta discusses ML and data analysis processes in Spark using examples written in Python and R.

Debraj GuhaThakurta
on Feb 05, 2017

32:49
AI, ML & Data Engineering

MLeap: Release Spark ML Models

Hollin Wilkins discusses the reasons behind MLeap, outlines the programming time saved by using it, shows benchmarks of several online models, and provides a demo and examples of using it in practice.

Hollin Wilkins
on Dec 04, 2016

34:13
AI, ML & Data Engineering

Hydrator: Open Source, Code-Free Data Pipelines

Jonathan Gray introduces Hydrator, an open source framework and user interface for creating data lakes and for building and managing data pipelines on Spark, MapReduce, Spark Streaming and Tigon.

Jonathan Gray
on Oct 23, 2016

41:39
AI, ML & Data Engineering

Exploring Wikipedia with Apache Spark: A Live Coding Demo

Sameer Farooqui demos connecting to the live stream of Wikipedia edits, building a dashboard showing what’s happening with Wikipedia datasets and how people are using them in real time.

Sameer Farooqui
on Aug 23, 2016

59:07
AI, ML & Data Engineering

Ingest & Stream Processing - What Will You Choose?

Pat Patterson and Ted Malaska talk about current and emerging data processing technologies, and the various ways of achieving "at least once" and "exactly once" timely data processing.

Ted Malaska, Pat Patterson
on Aug 14, 2016

50:44
AI, ML & Data Engineering

Monitoring and Troubleshooting Real-Time Data Pipelines

Alan Ngai and Premal Shah discuss best practices on monitoring distributed real-time data processing frameworks and how DevOps can gain control and visibility over these data pipelines.

Premal Shah, Alan Ngai
on Jul 20, 2016

30:44
