Database Content on InfoQ
-
PostgresML: Leveraging Postgres as a Vector Database for AI
Montana Low provides an understanding of how Postgres can be used as a vector database for AI and how it can be integrated into your existing application stack.
-
LLMs in the Real World: Structuring Text with Declarative NLP
Adam Azzam discusses why building machine learning pipelines to extract structured data from unstructured text is a popular problem within an unpopular development lifecycle.
-
Performance and Scale - Domain-Oriented Objects vs Tabular Data Structures
Donald Raab and Rustam Mehmandarov discuss three library solutions for managing data based on an example of high-performance CSV processing.
-
What is Derived Data? (and Do You Already Have Any?)
Felix GV explains what derived data is and dives into four major use cases that fit in the derived data bucket: graphs, search, OLAP, and ML feature storage.
-
Speed of Apache Pinot at the Cost of Cloud Object Storage with Tiered Storage
Neha Pawar discusses how to query data directly on cloud object storage with sub-second latencies, diving into data fetch and optimization strategies, the challenges faced, and lessons learned.
-
A New Era for Database Design with TigerBeetle
Joran Dirk Greef discusses pivotal moments in database design and how they influenced the design decisions for TigerBeetle, a distributed financial accounting database.
-
Azure Cosmos DB: Low Latency and High Availability at Planet Scale
Mei-Chin Tsai and Vinod Sridharan discuss the internal architecture of Azure Cosmos DB and how it achieves high availability, low latency, and scalability.
-
Amazon DynamoDB: Evolution of a Hyperscale Cloud Database Service
Akshat Vig presents Amazon’s experience operating DynamoDB at scale and how the architecture continues to evolve to meet the ever-increasing demands of customer workloads.
-
How Do You Distribute Your Database over Hundreds of Edge Locations?
Erwin van der Koogh explains a new model that Cloudflare has developed to distribute a database over hundreds of locations, and where it could go next.
-
Robust Foundation for Data Pipelines at Scale - Lessons from Netflix
Jun He and Harrington Joseph share their experiences of building and operating the orchestration platform for Netflix’s big data ecosystem.
-
Evolving Analytics in the Data Platform
Blanca Garcia-Gil discusses the BBC’s analytics platform architecture, the failure modes they designed for, and how they investigated new unknowns and automated them away.
-
Change Data Capture for Distributed Databases @Netflix
Raghuram Onti Srinivasan covers the challenges associated with capturing CDC events from Cassandra, discussing the Flink ecosystem and the use of RocksDB.