Big Data Content on InfoQ

Articles

AI, ML & Data Engineering

Stream Processing Anomaly Detection Using Yurita Framework

In this article, author Guy Gerson discusses Yurita, the stream processing anomaly detection framework developed by PayPal. The framework is based on Spark Structured Streaming.

Guy Gerson
on Jul 10, 2019
AI, ML & Data Engineering

Real-Time Data Processing Using Redis Streams and Apache Spark Structured Streaming

Structured Streaming, introduced with Apache Spark 2.0, delivers a SQL-like interface for streaming data. Redis Streams enables Redis to consume, hold and distribute streaming data between multiple producers and consumers. In this article, author Roshan Kumar walks us through how to process streaming data in real time using Redis and Apache Spark Streaming technologies.

Roshan Kumar
on May 13, 2019
AI, ML & Data Engineering

Conquering the Challenges of Data Preparation for Predictive Maintenance

Predictive maintenance (PdM) applications aim to apply machine learning (ML) to IIoT datasets in order to reduce occupational hazards, machine downtime, and other costs. In this article, the author addresses some of the data preparation challenges faced by industrial practitioners of ML and the solutions for data ingestion and feature engineering related to PdM.

Ian Downard
on Jan 04, 2019
AI, ML & Data Engineering

Analytics Zoo: Unified Analytics + AI Platform for Distributed Tensorflow, and BigDL on Apache Spark

This article describes how Analytics Zoo can help real-world users build end-to-end deep learning pipelines for big data, including unified pipelines for distributed TensorFlow and Keras on Apache Spark, easy-to-use abstractions such as transfer learning and Spark ML pipeline support, and built-in deep learning models and reference use cases.

Jason Dai
on Dec 11, 2018
AI, ML & Data Engineering

Sentiment Analysis: What's with the Tone?

Sentiment analysis is widely applied in voice of the customer (VOC) applications. In this article, the authors discuss NLP-based sentiment analysis using machine learning (ML) and lexicon-based approaches with KNIME data analysis tools.

Rosaria Silipo, Kathrin Melcher
on Nov 27, 2018
AI, ML & Data Engineering

Spark Application Performance Monitoring Using Uber JVM Profiler, InfluxDB and Grafana

In this article, author Amit Baghel discusses how to monitor the performance of Apache Spark-based applications using Uber JVM Profiler, the InfluxDB database, and the Grafana data visualization tool.

Amit Baghel
on Nov 18, 2018
AI, ML & Data Engineering

Natural Language Processing with Java - Second Edition: Book Review and Interview

The book Natural Language Processing with Java - Second Edition covers natural language processing (NLP) and various tools developers can use in their applications. Technologies discussed in the book include Apache OpenNLP and Stanford NLP. InfoQ spoke with co-author Richard Reese about the book and how NLP can be used in enterprise applications.

Srini Penchikala
on Oct 10, 2018
AI, ML & Data Engineering

Democratizing Stream Processing with Apache Kafka® and KSQL - Part 2

In this article, author Robin Moffatt shows how to use Apache Kafka and KSQL to build data integration and processing applications with the help of an e-commerce sample application. Three use cases are discussed: customer operations, operational dashboard, and ad-hoc analytics.

Robin Moffatt
on Sep 07, 2018
AI, ML & Data Engineering

How to Choose a Stream Processor for Your App

Choosing a stream processor for your app can be challenging given the many options available. The best choice depends on the individual use case. In this article, the authors discuss a stream processor reference architecture, key features required by most streaming applications, and optional features that can be selected based on specific use cases.

Miyuru Dayarathna, Srinath Perera
on Aug 21, 2018
AI, ML & Data Engineering

Analyzing and Preventing Unconscious Bias in Machine Learning

This article is based on Rachel Thomas’s keynote presentation, “Analyzing & Preventing Unconscious Bias in Machine Learning,” at QCon.ai 2018. Thomas talks about the pitfalls and risks that bias in machine learning brings to the decision-making process. She discusses three use cases of machine learning bias.

Srini Penchikala
on Aug 14, 2018
Culture & Methods

Q&A on the Book Testing in the Digital Age

The book Testing in the Digital Age, by Tom van de Ven, Rik Marselis, and Humayun Shaukat, explains the impact that developments like robotics, artificial intelligence, the internet of things, and big data are having on testing. It explores the challenges and possibilities that the digital age brings when it comes to testing software systems.

Tom van de Ven, Ben Linders
on Jul 19, 2018
AI, ML & Data Engineering

Democratizing Stream Processing with Apache Kafka and KSQL - Part 1

In this article, author Michael Noll discusses stream processing with KSQL, the streaming SQL engine for Apache Kafka. Topics covered include the challenges of stateful stream processing and how KSQL addresses them, and how KSQL helps bridge the worlds of streams and databases through streams and tables.

Michael Noll
on Jun 15, 2018
