Data Analytics Content on InfoQ

News

Architecture & Design

Solving Fragmented Mobile Analytics: Uber’s Platform-Led Approach

Uber Engineering outlines its platform-led mobile analytics redesign, standardizing event instrumentation across iOS and Android to improve cross-platform consistency, reduce engineering effort, and provide reliable insights for product and data teams.

Leela Kumili
on Jan 13, 2026
AI, ML & Data Engineering

DuckDB's WebAssembly Client Allows Querying Iceberg Datasets in the Browser

DuckDB has recently introduced end-to-end interaction with Iceberg REST Catalogs directly within a browser tab, requiring no infrastructure setup. The new feature leverages DuckDB-Wasm, a WebAssembly port of DuckDB that runs in the browser, allowing users to query, read, and write Iceberg tables in a serverless manner.

Renato Losio
on Jan 04, 2026
Architecture & Design

Inside Uber’s Query Architecture: Simplifying Layers and Improving Observability

Uber rebuilt its Apache Pinot query architecture, replacing the Presto-based Neutrino system with a lightweight proxy called Cellar and Pinot’s Multi-Stage Engine Lite Mode. The redesign simplifies SQL execution, improves resource management, and ensures predictable performance for large-scale analytics workloads.

Leela Kumili
on Nov 06, 2025
Cloud

Cloudflare Introduces Data Platform with Zero Egress Fees

Cloudflare has recently announced the open beta of Cloudflare Data Platform, a managed solution for ingesting, storing, and querying analytical data tables using open standards such as Apache Iceberg.

Renato Losio
on Nov 01, 2025
AI, ML & Data Engineering

Cloudflare Chooses PostgreSQL Extension over Specialized OLAP for 100K Row/Second Analytics

In a recent article from the engineering team behind the Zero Trust product suite, Cloudflare explains why it chose TimescaleDB over ClickHouse to add analytics and reporting capabilities to its internal platform. The author highlights the “phenomenal balance” between the simplicity of storing analytical data alongside configuration data and the performance of a specialized OLAP system.

Renato Losio
on Jul 31, 2025
Cloud

Amazon S3 Adds Sort and Z-Order Compaction to Improve Apache Iceberg Query Performance

AWS has recently announced that Amazon S3 now supports sort and z-order compaction for Apache Iceberg tables. The new features reduce scan times and engine costs, and are available for both S3 Tables and traditional S3 buckets using AWS Glue Data Catalog optimization.

Renato Losio
on Jul 16, 2025
AI, ML & Data Engineering

HTAP: the Rise and Fall of Unified Database Systems?

A recent article by Zhou Sun sparked a debate in the data community about the future of HTAP systems. Hybrid transaction/analytical processing was meant to help integrate historical and online data at scale, supporting more flexible query methods and reducing business complexity.

Renato Losio
on Jun 15, 2025
AI, ML & Data Engineering

The Open-Source Version of InfluxDB 3 Reaches GA

Two years after InfluxData released the GA version of its enterprise edition, the open-source version has reached the same level of maturity. Designed for real-time workloads and ease of operation, the core version leaves out features such as long-term storage optimisations, compaction, high availability (HA), read replicas, and fine-grained access controls.

Olimpiu Pop
on Apr 16, 2025
AI, ML & Data Engineering

Google Enhances Data Privacy with Confidential Federated Analytics

Google has announced Confidential Federated Analytics (CFA), a technique designed to increase transparency in data processing while maintaining privacy. Building on federated analytics, CFA leverages confidential computing to ensure that only predefined and inspectable computations are performed on user data without exposing raw data to servers or engineers.

Robert Krzaczyński
on Mar 11, 2025
AI, ML & Data Engineering

Apache Hudi 1.0 Now Generally Available

The Apache Software Foundation has recently announced the general availability of Apache Hudi 1.0, the transactional data lake platform with support for near real-time analytics. Initially introduced in 2017, Apache Hudi provides an open table format optimized for efficient writes in incremental data pipelines and fast query performance.

Renato Losio
on Jan 18, 2025
AI, ML & Data Engineering

How Uber Sped up SQL-based Data Analytics with Presto and Express Queries

Uber uses Presto, an open-source distributed SQL query engine, to provide analytics across several data sources, including Apache Hive, Apache Pinot, MySQL, and Apache Kafka. To improve its performance, Uber engineers explored handling quick queries, a.k.a. express queries, differently from other queries and found they could improve both Presto utilization and response times.

Sergio De Simone
on Nov 18, 2024
Development

Elastic Returns to Open Source: Will the Community Follow?

In a surprising move for both the open-source and Elastic communities, Shay Banon, founder and CEO of Elastic, recently announced that Elasticsearch and Kibana will once again be open source. The two products will soon be licensed under the AGPL, an OSI-approved license.

Renato Losio
on Sep 05, 2024
Architecture & Design

Canva Opts for Amazon KDS over SNS+SQS to Save 85% with 25 Billion Events per Day

Canva evaluated different data messaging solutions for its Product Analytics Platform, including the combination of AWS SNS and SQS, Amazon MSK, and Amazon KDS, and eventually chose KDS, primarily for its much lower cost. The company compared many aspects of these solutions, such as performance, maintenance effort, and cost.

Rafal Gancarz
on Aug 07, 2024
Cloud

Data Solutions Framework: an Open Source Project for Building Data Solutions on AWS

AWS recently released the Data Solutions Framework (DSF), an opinionated open-source framework designed to accelerate the creation of data solutions on AWS. Built using the AWS CDK, the framework exposes abstractions and patterns as building blocks for constructing data solutions and is available in TypeScript (npm) and Python (PyPI).

Renato Losio
on Mar 02, 2024
Cloud

Amazon Q Data Integration in AWS Glue Simplifies Data Transformation on AWS

Recently, AWS announced the preview of a new feature for AWS Glue, enabling customers to use natural language for authoring and troubleshooting data integration jobs. With Amazon Q data integration in AWS Glue, developers can provide a description of their data integration workload, and the service will generate an ETL script.

Renato Losio
on Feb 25, 2024
