AI, ML & Data Engineering Content on InfoQ
-
Understanding Architectures for Multi-Region Data Residency
Alex Strachan discusses the challenges of building multi-region data storage: why and when a business needs it, who the real stakeholders are, and who owns what.
-
Large Language Models for Code: Exploring the Landscape, Opportunities, and Challenges
Loubna Ben Allal discusses Large Language Models (LLMs), exploring the current developments of these models, how they are trained, and how they can be leveraged with custom codebases.
-
Streaming Databases: Embracing the Convergence of Stream Processing and Databases
Yingjun Wu discusses the evolution of streaming databases, and the features and design principles that set streaming databases apart from conventional database systems and stream processing engines.
-
Modern Compute Stack for Scaling Large AI/ML/LLM Workloads
Jules Damji discusses which infrastructure should be used for distributed fine-tuning and training, how to scale ML workloads, how to accommodate large models, and how CPUs and GPUs can be utilized.
-
Building Guardrails for Enterprise AI Applications W/ LLMs
Shreya Rajpal introduces Guardrails AI, an open-source platform designed to mitigate risks and enhance the safety and efficiency of LLMs.
-
Combating AI-Generated Fake Images with JavaScript Libraries
Kate Sills discusses JavaScript libraries to use for cryptographic hashes, digital signatures and timestamping, the traditional archival process, and how cryptographic hashes can prevent tampering.
-
Generative AI: Shaping a New Future for Fraud Prevention
Neha Narkhede discusses a vision for fraud and risk management that leverages the advancements in generative AI.
-
Platform and Features MLEs, a Scalable and Product-Centric Approach for High Performing Data Products
Massimo Belloni discusses the lessons learnt in the last couple of years around organizing a Data Science Team and the Machine Learning Engineering efforts at Bumble Inc.
-
Relational Data at the Edge
Justin Kwan and Vignesh Ravichandran discuss Cloudflare’s edge database architecture, unique challenges and practices for data replication, failover and recovery, and custom performance techniques.
-
Redesigning OLTP for a New Order of Magnitude
Joran Greef discusses TigerBeetle, a new database, and explains why OLTP has a growing impedance mismatch, why the OLTP workload is becoming more contentious, and why row locks, storage faults, and write stalls matter.
-
Enabling Remote Query Execution through DuckDB Extensions
Stephanie Wang focuses on DuckDB’s extension model and on query execution and planning as a use case of that model.
-
Multi-Region Data Streaming with Redpanda
Michał Maślanka introduces the design of Redpanda’s Multi-Region feature, and describes how they leveraged Raft’s properties, a constraint solver, automatic data balancing, and tiered storage.