AI, ML & Data Engineering Content on InfoQ
-
A New Era for Database Design with TigerBeetle
Joran Dirk Greef discusses pivotal moments in database design and how they influenced the design decisions for TigerBeetle, a distributed financial accounting database.
-
Streaming from Apache Iceberg - Building Low-Latency and Cost-Effective Data Pipelines
Steven Wu discusses the design of the Flink Iceberg source, compares the Kafka and Iceberg sources for streaming, and shows how the Iceberg streaming source can power many common stream processing use cases.
-
Azure Cosmos DB: Low Latency and High Availability at Planet Scale
Mei-Chin Tsai and Vinod Sridharan discuss the internal architecture of Azure Cosmos DB and how it achieves high availability, low latency, and scalability.
-
Amazon DynamoDB: Evolution of a Hyperscale Cloud Database Service
Akshat Vig presents Amazon’s experience operating DynamoDB at scale and how the architecture continues to evolve to meet the ever-increasing demands of customer workloads.
-
Unraveling Techno-Solutionism: How I Fell out of Love with “Ethical” Machine Learning
Katharine Jarmul confronts techno-solutionism, exploring ethical machine learning, which eventually led her to specialize in data privacy.
-
Operationalizing Responsible AI in Practice
Mehrnoosh Sameki discusses approaches to responsible AI and demonstrates how open-source and cloud-integrated ML help data scientists and developers better understand and improve ML models.
-
Orchestrating Hybrid Workflows with Apache Airflow
Ricardo Sueiras discusses how to leverage Apache Airflow to orchestrate a workflow using data sources inside and outside the cloud.
-
Open Machine Learning: ML Trends in Open Science and Open Source
Omar Sanseviero discusses trends in the ML ecosystem for Open Science and Open Source, the power of creating interactive demos using Open Source libraries, and BigScience.
-
Taming the Data Mess, How Not to Be Overwhelmed by the Data Landscape
Ismaël Mejía reviews the current data landscape and discusses both technical and organizational ideas for avoiding being overwhelmed by the current lack of consolidation in the data engineering world.
-
Modern API Development and Deployment, from API Gateways to Sidecars
Matt Turner shows a modern approach to designing, implementing, and documenting APIs using dedicated tooling in a decentralised environment that retains all the good parts of an API gateway solution.
-
Data Versioning at Scale: Chaos and Chaos Management
Einat Orr discusses several technologies that version large data sets, the use cases they support, and the technology developed to best support those use cases.
-
Resilient Real-Time Data Streaming across the Edge and Hybrid Cloud
Kai Waehner explores different architectures and their trade-offs for transactional and analytical workloads. Real-world examples include financial services, retail, and the automotive industry.