BigTable Content on InfoQ
-
Reddit Migrates Media Metadata from S3 and Other Systems into AWS Aurora Postgres
Reddit consolidated its media metadata storage into a new architecture using AWS Aurora Postgres. Previously, the company sourced media metadata from various systems, including directly from AWS S3. The new solution simplifies media metadata retrieval and handles 100k+ requests per second with latency below 5ms (p90).
-
Expedia Speeds up Flights Search with Micro Frontends and GraphQL Optimizations
Expedia made flight search faster by up to 52% (page usable time) by applying a range of optimizations to web and mobile applications. To support these improvements, the company improved the observability of its applications. The Expedia Flights web application has been migrated to a Micro Frontend Architecture (MFA) to allow flexibility, reusability, and better optimization.
-
Managing 238 Million Memberships of Netflix: Surabhi Diwan at QCon San Francisco
During the first day of QCon San Francisco 2023, Surabhi Diwan, a senior software engineer at Netflix, presented on managing 238 million memberships at Netflix. The talk was part of the “Architectures You’ve Always Wondered About” track. Diwan's work at Netflix covers the backend side of membership engineering, which is critical for both signups and streaming.
-
Yelp Rebuilds Corrupted Cassandra Cluster Using Its Data Streaming Architecture
Yelp created a solution to sanitize data from a corrupted Apache Cassandra cluster using its data streaming architecture. The team explored many potential options to address the data corruption issue but ultimately had to move the data into a new cluster, removing corrupted records in the process.
-
Azure Brings Vertical Scaling, Monitor Alerts and More for Apache Cassandra Managed Instance
Microsoft has recently released new features for Azure Managed Instance for Apache Cassandra, including an upgrade to Apache Cassandra 4.0 (GA), Azure Monitor alerts and insights, deallocation of cluster resources to reduce costs, vertical scaling, and more.
-
Discord Migrates Trillions of Messages from Cassandra to ScyllaDB
Discord has migrated trillions of message records from Apache Cassandra to ScyllaDB, reducing the size of the largest cluster from 177 Cassandra nodes to 72 ScyllaDB nodes and reducing tail latencies for reads and writes. The move has unlocked new product use cases because of the improved database stability and performance.
-
Netflix Built a Scalable Annotation Service Using Cassandra, Elasticsearch and Iceberg
Netflix recently described how it built Marken, a scalable annotation service using Cassandra, Elasticsearch and Iceberg. Marken allows storing and querying annotations, or tags, on arbitrary entities. Users define versioned schemas for their annotations, which include out-of-the-box support for temporal and spatial objects.
-
Google Introduces Zero-ETL Approach to Analytics on Bigtable Data Using BigQuery
Recently, Google announced the general availability of Bigtable federated queries with BigQuery, allowing customers to query data residing in Bigtable via BigQuery faster. Moreover, the queries run without moving or copying the data, are available in all Google Cloud regions, and come with increased federated query concurrency limits, closing the longstanding gap between operational data and analytics.
-
Google Introduces Autoscaling for Cloud Bigtable for Optimizing Costs
Cloud Bigtable is a fully managed, scalable NoSQL database service for large operational and analytical workloads on the Google Cloud Platform (GCP). The public cloud provider recently announced the general availability of Bigtable Autoscaling, which automatically adds or removes capacity in response to changing application demand, allowing cost optimizations.
-
Google Cloud Improves SLA for Bigtable and Adds New Security Features
Google Cloud has recently raised the availability SLA for Bigtable instances up to 99.999%, matching the SLA for Firestore and Cloud Spanner. The data storage service also introduced two new security features for enterprise workloads: customer-managed encryption keys (CMEK) and data access audit logs.
-
Google Provides a Peek into the Architecture of Colossus - Its Storage Foundation
In a recent post, Google provided a glimpse into the architecture of Colossus. Colossus underpins Google's scalable storage system, which serves both its Google Cloud offerings and Google's own globally available services such as YouTube, Google Drive, and Gmail. Five separate components compose Colossus: the client library, curators, metadata database, file servers, and custodians.
-
Microsoft Announces Azure Managed Instance for Apache Cassandra
At this year’s Ignite conference, Microsoft announced the public preview of Azure Managed Instance for Apache Cassandra, a NoSQL database product for managing Cassandra-based workloads in the Azure cloud.
-
K8ssandra: Production-Ready Platform for Running Apache Cassandra on Kubernetes
DataStax recently released K8ssandra, an open-source distribution of Apache Cassandra for Kubernetes. K8ssandra aims to provide a “production-ready platform”, and this includes automation for operational tasks such as repairs, backups, and monitoring.
-
DataStax Announces Cloud Native Database as a Service and AIOps Tools
DataStax announced last month the release of Astra, a cloud-native Database-as-a-Service (DBaaS) built on Apache Cassandra. They also recently announced an AIOps product called Vector that proactively monitors the health of Apache Cassandra clusters.
-
Amazon Announces the Open Preview of a Managed Apache Cassandra Service (MCS) on AWS
At the recent AWS re:Invent, Amazon announced a new way of managing Cassandra databases on AWS. With Amazon Managed Apache Cassandra Service (MCS), the public cloud vendor can offer Cassandra directly to customers instead of through third-party vendors.