Big Data Content on InfoQ

Articles

Emerging Technologies

The Brain is Neither a Neural Network Nor a Computer: Book Review of The Biological Mind

Underlying much of artificial intelligence research is the idea that the essence of an individual resides in the brain. This is contrary to neuroscience, which has discovered that a brain cannot work independently of the body and its environment. Understanding this enables us to see what is reasonable to expect from artificial intelligence, as well as from technology designed to improve human life.

Michael Stiefel
on Jan 29, 2021
AI, ML & Data Engineering

Overcoming Data Scarcity and Privacy Challenges with Synthetic Data

In this article, the author discusses the importance of using synthetic data in data analytics projects, especially in financial institutions, to solve the problems of data scarcity and, more importantly, data privacy.

Dawn Li
on Dec 25, 2020
AI, ML & Data Engineering

Beyond the Database, and beyond the Stream Processor: What's the Next Step for Data Management?

Databases have been around forever with the same shape: you make a request to your data and then you receive an answer. Then stream processors came along with a different approach: data isn’t locked up, it is in motion. Understand how stream processors and databases relate, and why there is an emerging category of databases that focus on data that stays in place as well as data that moves.

Ben Stopford
on Nov 16, 2020
Cloud

The End of the Privacy Shield Agreement Could Lead to Disaster for Hyperscale Cloud Providers

The recent invalidation of the Privacy Shield agreement by the European Court of Justice (ECJ) might impact cloud adoption. This article looks at the demise of this agreement and at possible solutions.

Nahla Davies
on Oct 08, 2020
AI, ML & Data Engineering

COVID-19 and Mining Social Media - Enabling Machine Learning Workloads with Big Data

In this article, author Adi Polak discusses how to enable machine learning workloads with big data to query and analyze COVID-19 tweets and understand social sentiment towards COVID-19.

Adi Polak
on Oct 02, 2020
Cloud

From Cloud to Cloudlets: a New Approach to Data Processing?

The growing popularity of small, distributed clouds, or “cloudlets”, is an implicit recognition of the limitations of the “traditional” cloud model, and could signal a major shift in the way that data is collected, stored, and processed.

Sam Bocetta
on Oct 01, 2020
Cloud

Combining DataOps and DevOps: Scale at Speed

DataOps is an extension of DevOps standards and processes into the data analytics world. It's about streamlining the processes involved in processing, analyzing and deriving value from big data.

Sam Bocetta
on Aug 14, 2020
AI, ML & Data Engineering

Data Leadership Book Review and Interview

The book Data Leadership, by Anthony Algmin, covers how data leaders should manage and govern the data management programs in their organizations. Data leadership is how organizations choose to apply their energy and resources toward creating data capabilities to influence their business.

Srini Penchikala, Anthony Algmin
on Jul 25, 2020
Java

Apache Arrow and Java: Lightning Speed Big Data Transfer

Apache Arrow puts forward a cross-language, cross-platform, columnar in-memory data format. It is designed to eliminate the need for data serialization and reduce the overhead of copying.

Joris Gillis
on May 23, 2020
Culture & Methods

Data Analytics in the World of Agility

Is it all about customer-centric business, or is there any data left? Can we integrate data analytics and customer empathy? This article explores how we can move towards a more customer-centric business and what information we require in order to understand the most valuable thing we have: our customer.

Almudena Rodriguez Pardo
on Sep 06, 2019
AI, ML & Data Engineering

Stream Processing Anomaly Detection Using Yurita Framework

In this article, author Guy Gerson discusses Yurita, the stream processing anomaly detection framework developed by PayPal. The framework is based on Spark Structured Streaming.

Guy Gerson
on Jul 10, 2019
AI, ML & Data Engineering

Real-Time Data Processing Using Redis Streams and Apache Spark Structured Streaming

Structured Streaming, introduced with Apache Spark 2.0, delivers a SQL-like interface for streaming data. Redis Streams enables Redis to consume, hold and distribute streaming data between multiple producers and consumers. In this article, author Roshan Kumar walks us through how to process streaming data in real time using Redis and Apache Spark Streaming technologies.

Roshan Kumar
on May 13, 2019
