S3 Content on InfoQ
-
Instacart Creates Real-Time Item Availability Architecture with ML and Event Processing
Instacart combined machine learning with event-based processing to create an architecture that provides customers with an indication of item availability in near real-time. The new solution helped to improve user satisfaction and retention by reducing order cancellations due to out-of-stock items. The team also created a multi-model experimentation framework to help enhance model quality.
-
Amazon OpenSearch Zero ETL with S3 and New OR1 Instances
Amazon has announced the preview of the Amazon OpenSearch Service's zero-extraction, transformation, and loading (ETL) integration with Amazon S3, offering a novel method to analyze operational logs in Amazon S3 and S3-based data lakes without the need to switch between services. Amazon also announced the new OR1 instances for Amazon OpenSearch Service.
-
Zendesk Moves from DynamoDB to MySQL and S3 to Save over 80% in Costs
Zendesk reduced its data storage costs by over 80% by migrating from DynamoDB to a tiered storage solution using MySQL and S3. The company considered different storage technologies and decided to combine the relational database and the object store to strike a balance between queryability and scalability while keeping costs down.
-
Amazon S3 Introduces High-Performance Storage Class
During the recent re:Invent conference, AWS announced the general availability of S3 Express One Zone, a high-performance, single-AZ storage class that provides single-digit millisecond data access. The new storage class also reduces request costs and is designed for processing data in AI/ML training and financial modeling.
-
Recap of AWS re:Invent 2023: Amazon Q, Frugal Architectures, Database Upgrades
The 12th edition of re:Invent has just ended in Las Vegas. As expected, artificial intelligence was a key topic of the conference, with Amazon Bedrock and Amazon Q, a new type of generative AI-powered assistant, the main focus of Adam Selipsky’s keynote.
-
Goldsky’s Streaming-First Architecture for Blockchain Data with Flink, Redpanda and Kubernetes
Goldsky created a platform for the real-time processing of blockchain data. The platform allows clients to extract data from blockchains into their own databases to support product features, but without running the data pipeline infrastructure. The event-driven architecture (EDA) of Goldsky leverages Apache Flink, Redpanda, Kubernetes, and cloud provider services.
-
Reddit Unveils REV2: Modernised Rule-Execution with Kubernetes, Kafka, and Flink Stateful Functions
Reddit's Safety Engineering team recently published how it modernised its Rule-Execution system, which detects and acts on policy-violating content in real time. The new architecture includes improvements like transitioning from legacy EC2-based systems to Kubernetes, better rule version control with GitHub and S3 storage, and the capability to scale more efficiently with Flink Stateful Functions.
-
Cloudflare Sippy: Incrementally Migrate Data from Amazon S3 to Reduce Egress Fees
Cloudflare recently announced the open beta of Sippy, an incremental data migration service that copies data from Amazon S3 to Cloudflare R2 only the first time the data is requested. Sippy is designed to minimize migration-specific egress fees by leveraging requests within existing application flows while simultaneously copying objects to R2.
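The copy-on-first-request idea described above can be sketched as a read-through store. This is a minimal illustration only, not Cloudflare's implementation or API: the class name `LazyMigratingStore`, the `source_reads` counter, and the use of plain dictionaries in place of S3 and R2 buckets are all assumptions made for the example.

```python
class LazyMigratingStore:
    """Read-through store: serve from the destination, fall back to the source once.

    Hypothetical sketch; dictionaries stand in for the two object stores.
    """

    def __init__(self, source, destination):
        self.source = source            # stands in for the original (S3) bucket
        self.destination = destination  # stands in for the new (R2) bucket
        self.source_reads = 0           # reads that would incur egress from the source

    def get(self, key):
        # Already migrated: serve from the destination, no source read.
        if key in self.destination:
            return self.destination[key]
        # First request for this key: fetch from the source and keep a copy.
        self.source_reads += 1
        value = self.source[key]
        self.destination[key] = value
        return value


source = {"logo.png": b"png-bytes", "report.csv": b"a,b\n1,2\n"}
store = LazyMigratingStore(source, destination={})

first = store.get("logo.png")   # read from the source, copied to the destination
second = store.get("logo.png")  # served from the destination
```

In this sketch, an object that is never requested is never copied, and each requested object is read from the source only once, which is where the egress savings in such a scheme would come from.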
-
Mountpoint for Amazon S3 Now GA to Access Bucket Like Local File System
During the latest AWS Storage Day event, Amazon announced the general availability of Mountpoint for Amazon S3. The new open-source file client exposes the elastic storage and throughput of Amazon S3 through a file interface, supporting data transfer at up to 100 Gb/second between each EC2 instance and the object storage.
-
Running Apache Flink Applications on AWS KDA: Lessons Learnt at Deliveroo
Deliveroo introduced Apache Flink into its technology stack for enriching and merging events consumed from Apache Kafka or Kinesis Streams. The company opted to use AWS Kinesis Data Analytics (KDA) service to manage Apache Flink clusters on AWS and shared its experiences from running Flink applications on KDA.
-
Inside InfluxDB 3.0: Exploring InfluxDB’s Scalable and Decoupled Architecture
InfluxData recently unveiled the system architecture for InfluxDB 3.0, its newest time-series DB. The architecture encompasses four major components responsible for data ingestion, querying, compaction, and garbage collection, and includes two main storage types. It caters to operating the DB on-premises and natively on major cloud providers.
-
Amazon Introduces AWS HealthImaging to Store and Analyze Medical Imaging Data
At the recent AWS Summit in New York, Amazon announced AWS HealthImaging. The new HIPAA-eligible service helps healthcare providers to store, analyze, and share medical imaging data at scale.
-
Pfizer Uses Serverless Architecture on AWS to Scale Processing of Digital Biomarkers
Pfizer upgraded the serverless architecture for processing digital biomarker data at scale to make it more flexible and configurable. They created a framework that uses a file processing pipeline built with AWS Step Functions and other serverless services, as well as a custom Python package for data ingestion and processing.
-
AWS Launches Amazon S3 Dual-Layer Server-Side Encryption with Keys Stored in AWS KMS
AWS recently launched Amazon S3 dual-layer server-side encryption with keys stored in AWS Key Management Service (DSSE-KMS), a new encryption option that applies two layers of encryption to objects when they are uploaded to an Amazon Simple Storage Service (Amazon S3) bucket.
-
Datadog Creates Scalable Data Ingestion Architecture
Datadog created a dedicated data ingestion architecture offering exactly-once semantics for their third-generation event store, Husky. The event-driven architecture (EDA) can accommodate bursts in traffic in the multi-tenant platform with reasonable ingestion latency and acceptable operational costs.