AI, ML & Data Engineering Content on InfoQ
-
Change Data Capture for Microservices
Gunnar Morling discusses how change data capture (CDC) and stream processing can help developers with typical challenges they often face when working on microservices.
-
Back to Basics: Scalable, Portable ML in Pure SQL
Evan Miller walks through the architecture of Eppo's portable, performant, privacy-preserving, multi-warehouse regression engine, and discusses the challenges with implementation.
-
Malignant Intelligence?
Alasdair Allan discusses the potential ethical dilemmas, new security concerns, and open questions about the future of software development in the era of machine learning.
-
What is Derived Data? (and Do You Already Have Any?)
Felix GV explains what derived data is and dives into four major use cases that fit in the derived data bucket: graphs, search, OLAP, and ML feature storage.
-
Speed of Apache Pinot at the Cost of Cloud Object Storage with Tiered Storage
Neha Pawar discusses how to query data directly on the cloud with sub-second latencies, diving into data fetch and optimization strategies, challenges faced, and lessons learned.
-
Real-Time Machine Learning: Architecture and Challenges
Chip Huyen discusses the value of fresh data as well as different types of architecture and challenges of online prediction.
-
AI Bias and Sustainability
Leslie Miley discusses how the road to ubiquitous AI is clouded by the dangers of the inherent bias in Large Language Models and the increased CO2 emissions that come with deployment at scale.
-
A Bicycle for the (AI) Mind: GPT-4 + Tools
Sherwin Wu and Atty Eleti discuss how to use the OpenAI API to integrate large language models into your application, and extend GPT’s capabilities by connecting it to the external world via APIs.
-
A New Era for Database Design with TigerBeetle
Joran Dirk Greef discusses pivotal moments in database design and how they influenced the design decisions for TigerBeetle, a distributed financial accounting database.
-
Streaming from Apache Iceberg - Building Low-Latency and Cost-Effective Data Pipelines
Steven Wu discusses the design of the Flink Iceberg source, compares the Kafka and Iceberg sources for streaming, and shows how the Iceberg streaming source can power many common stream processing use cases.
-
Azure Cosmos DB: Low Latency and High Availability at Planet Scale
Mei-Chin Tsai and Vinod Sridharan discuss the internal architecture of Azure Cosmos DB and how it achieves high availability, low latency, and scalability.
-
Amazon DynamoDB: Evolution of a Hyperscale Cloud Database Service
Akshat Vig presents Amazon’s experience operating DynamoDB at scale and how the architecture continues to evolve to meet the ever-increasing demands of customer workloads.