AI, ML & Data Engineering Content on InfoQ
-
Performance and Scale - Domain-Oriented Objects vs Tabular Data Structures
Donald Raab and Rustam Mehmandarov discuss three library solutions for managing data based on an example of high-performance CSV processing.
-
Building High-Fidelity Data Streams
Sid Anand discusses how they built a lossless streaming data system that guarantees sub-second (p95) event delivery at scale with better than three-nines availability.
-
Fabricator: End-to-End Declarative Feature Engineering Platform
Kunal Shah discusses how their ML platform team designed Fabricator by integrating various open source and enterprise solutions to deliver a declarative end-to-end feature engineering framework.
-
Ray: the Next Generation Compute Runtime for ML Applications
Zhe Zhang introduces the basic API and architectural concepts of Ray, and dives deeper into some of its innovative ML use cases.
-
Change Data Capture for Microservices
Gunnar Morling discusses how change data capture (CDC) and stream processing can help developers with typical challenges they often face when working on microservices.
-
Back to Basics: Scalable, Portable ML in Pure SQL
Evan Miller walks through the architecture of Eppo's portable, performant, privacy-preserving, multi-warehouse regression engine, and discusses the challenges with implementation.
-
Malignant Intelligence?
Alasdair Allan discusses the potential ethical dilemmas, new security concerns, and open questions about the future of software development in the era of machine learning.
-
What is Derived Data? (and Do You Already Have Any?)
Felix GV explains what derived data is, and dives into four major use cases that fit in the derived data bucket: graphs, search, OLAP, and ML feature storage.
-
Speed of Apache Pinot at the Cost of Cloud Object Storage with Tiered Storage
Neha Pawar discusses how to query data on the cloud directly with sub-second latencies, diving into data fetch and optimization strategies, challenges faced, and lessons learned.
-
Real-Time Machine Learning: Architecture and Challenges
Chip Huyen discusses the value of fresh data, as well as the different architectures for online prediction and the challenges they present.
-
AI Bias and Sustainability
Leslie Miley discusses how the road to ubiquitous AI is clouded by the dangers of the inherent bias in Large Language Models and the increased CO2 emissions that come with deployment at scale.
-
A Bicycle for the (AI) Mind: GPT-4 + Tools
Sherwin Wu and Atty Eleti discuss how to use the OpenAI API to integrate large language models into your application, and extend GPT’s capabilities by connecting it to the external world via APIs.