AI, ML & Data Engineering Content on InfoQ
-
Modern Compute Stack for Scaling Large AI/ML/LLM Workloads
Jules Damji discusses which infrastructure should be used for distributed fine-tuning and training, how to scale ML workloads, how to accommodate large models, and how CPUs and GPUs can be utilized.
-
Building Guardrails for Enterprise AI Applications W/ LLMs
Shreya Rajpal introduces Guardrails AI, an open-source platform designed to mitigate risks and enhance the safety and efficiency of LLMs.
-
Combating AI-Generated Fake Images with JavaScript Libraries
Kate Sills discusses JavaScript libraries to use for cryptographic hashes, digital signatures and timestamping, the traditional archival process, and how cryptographic hashes can prevent tampering.
-
Generative AI: Shaping a New Future for Fraud Prevention
Neha Narkhede discusses a vision for fraud and risk management that leverages the advancements in generative AI.
-
Platform and Features MLEs, a Scalable and Product-Centric Approach for High Performing Data Products
Massimo Belloni discusses the lessons learnt in the last couple of years around organizing a Data Science Team and the Machine Learning Engineering efforts at Bumble Inc.
-
Relational Data at the Edge
Justin Kwan and Vignesh Ravichandran discuss Cloudflare’s edge database architecture, unique challenges and practices for data replication, failover and recovery, and custom performance techniques.
-
Redesigning OLTP for a New Order of Magnitude
Joran Greef discusses TigerBeetle, a new database, and explains why OLTP has a growing impedance mismatch, why the OLTP workload is becoming more contentious, and why row locks, storage faults, and write stalls matter.
-
Enabling Remote Query Execution through DuckDB Extensions
Stephanie Wang focuses on DuckDB’s extension model and on query execution and planning as a use case of that model.
-
Multi-Region Data Streaming with Redpanda
Michał Maślanka introduces the design of Redpanda’s Multi-Region feature, and describes how they leveraged Raft’s properties, a constraint solver, automatic data balancing, and tiered storage.
-
In-Process Analytical Data Management with DuckDB
Hannes Mühleisen discusses DuckDB, an analytical data management system that is built for an in-process use case. DuckDB speaks SQL, is integrated as a library, and uses query processing techniques.
-
Going beyond the Case of Black Box AutoML
Kiran Kate covers the basics of AutoML and then presents Lale (https://github.com/IBM/lale), an open-source scikit-learn compatible AutoML library which implements Gradual AutoML.
-
Graph Learning at the Scale of Modern Data Warehouses
Subramanya Dulloor outlines an approach to addressing the challenges of warehouses and shows how to build an efficient and scalable end-to-end system for graph learning in data warehouses.