Scaling Content on InfoQ

News

Architecture & Design

OpenAI Scales Single Primary PostgreSQL Instance to Millions of Queries per Second for ChatGPT

OpenAI described how it scaled PostgreSQL to support ChatGPT and its API platform, handling millions of queries per second for hundreds of millions of users. By running a single-primary PostgreSQL deployment on Azure with nearly 50 read replicas, optimizing query patterns, and offloading write-heavy workloads to sharded systems, OpenAI maintained low-latency reads while managing write pressure.

Leela Kumili
on Feb 12, 2026
DevOps

Enhancing Reliability Using Service-Level Prioritized Load Shedding: Netflix at QCon SF 2025

At QCon San Francisco, Netflix engineers presented their service-level prioritized load-shedding strategy for keeping services reliable during traffic spikes. By prioritizing high-value requests and automating management across microservices, they protect user experience and system stability. The key takeaways are prioritization, automation, and structured load shedding.

Steef-Jan Wiggers
on Nov 20, 2025
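
The summary above does not describe Netflix's implementation, so the following is only a minimal sketch of the general idea of priority-based load shedding: as utilization rises, lower-priority requests are rejected first so that high-value requests keep being served. The tier names, the `SHED_AT` thresholds, and the `should_admit` helper are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Priority(IntEnum):
    """Request tiers; a lower value means more important (hypothetical names)."""
    CRITICAL = 0
    DEGRADED = 1
    BEST_EFFORT = 2
    BULK = 3


@dataclass(frozen=True)
class Request:
    name: str
    priority: Priority


# Utilization (0.0 to 1.0) at or above which each tier is shed.
# These numbers are assumptions for the sketch, not values from the talk.
SHED_AT = {
    Priority.BULK: 0.60,
    Priority.BEST_EFFORT: 0.75,
    Priority.DEGRADED: 0.90,
    # CRITICAL has no entry: it is never shed in this sketch.
}


def should_admit(request: Request, utilization: float) -> bool:
    """Return True if the request should be served at the given utilization."""
    limit: Optional[float] = SHED_AT.get(request.priority)
    return limit is None or utilization < limit


if __name__ == "__main__":
    for req in (Request("playback", Priority.CRITICAL),
                Request("prefetch", Priority.BULK)):
        print(req.name, should_admit(req, utilization=0.80))
```

With these assumed thresholds, a service at 80% utilization would still serve critical traffic while rejecting bulk work such as prefetching.
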
DevOps

Advanced Autoscaling Helps Companies Reduce AWS Costs by 70%

The next generation of Kubernetes autoscaling techniques and tools is enabling organisations to make substantial cost savings in their cloud infrastructure. Svetlana Burninova recently used Karpenter to build a multi-architecture EKS cluster and managed a 70% reduction in cost whilst also improving performance.

Matt Saunders
on Aug 31, 2025
Cloud

Amazon DocumentDB Serverless: Auto-Scaling Database Solution for Variable Workloads

AWS has launched Amazon DocumentDB Serverless, an auto-scaling database solution compatible with MongoDB, tailored for variable workloads. While marketed as "serverless," it functions more like auto-scaling, charging from $30/month. Ideal for enterprises and SaaS vendors, it adeptly handles spikes in demand, particularly for AI-driven applications.

Steef-Jan Wiggers
on Aug 07, 2025
Culture & Methods

Inflection Points in Engineering Productivity for Improving Productivity and Operational Excellence

As companies grow, investing in custom developer tools may become necessary. Initially, standard tools suffice, but as companies scale in engineers, maturity, and complexity, industry tools may no longer meet needs. Inflection points, such as a crisis, hyper-growth, or reaching a new market, often trigger investments, providing opportunities for improving productivity and operational excellence.

Ben Linders
on Apr 24, 2025
Culture & Methods

Lessons Learned from Growing an Engineering Organization

As their organization grew, Thiago Ghisi's work as director of engineering shifted from being hands-on in emergencies to designing frameworks and delegating decisions. He suggested treating changes as experiments, documenting reorganizations, and using a wave-based communication approach to gather feedback, ensuring people feel heard and invested.

Ben Linders
on Apr 09, 2025
DevOps

Optimizing Amazon ECS with Predictive Scaling

Amazon Web Services (AWS) recently released Predictive Scaling for Amazon ECS, an advanced scaling policy that employs machine learning (ML) algorithms to anticipate demand surges, ensuring applications remain highly available and responsive while minimizing resource overprovisioning.

Claudio Masolo
on Dec 06, 2024
Culture & Methods

Staying Innovative on a Journey from Start-Up to Scale-Up

As ClearBank grew, it faced the challenge of maintaining its innovative culture while integrating more structured processes to manage its expanding operations and ensure regulatory compliance. Within boundaries of accountability and responsibility, teams were given space to evolve their own areas, innovate a little, experiment, and continuously improve, to remain innovative.

Ben Linders
on Oct 31, 2024
DevOps

Deezer Optimizes Kubernetes Autoscaling with Custom Metrics

Popular music streaming service Deezer has written about using custom metrics to enable auto-scaling in its Kubernetes infrastructure. Server utilisation and performance issues made scaling applications to an appropriate size and number of replicas challenging, and Kubernetes' HPA scaling alone didn't solve these issues. So Deezer turned to custom metrics.

Matt Saunders
on Oct 07, 2024
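
For context on how a custom metric drives scaling, the sketch below reproduces the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count scaled by the ratio of the observed metric to its target, rounded up, with no change when the ratio is within a tolerance (0.1 by default). The function name, the replica bounds, and the example metric values are assumptions; the summary does not say which metrics Deezer uses.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Sketch of the HPA formula: ceil(currentReplicas * current / target)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)

    # Clamp to the bounds configured on the autoscaler (assumed values here).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical custom metric: requests per second per pod, target 100.
    print(desired_replicas(current_replicas=4, current_metric=200, target_metric=100))
```

Any per-pod signal exposed through the custom metrics API (queue depth, requests per second, and so on) can be plugged in as `current_metric` in this calculation.
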
DevOps

Kubernetes Autoscaler Karpenter Reaches 1.0 Milestone

Amazon Web Services (AWS) has released version 1.0 of Karpenter, an open-source Kubernetes cluster auto-scaling tool. This release marks Karpenter's graduation from beta status and introduces stable APIs and several new features. Karpenter, initially launched in November 2021, has evolved into a comprehensive Kubernetes-native node lifecycle manager.

Matt Saunders
on Sep 16, 2024
Culture & Methods

How Tech-Enabled Networks of Software Teams Work

To maintain agility at scale, software teams can use technological and organizational solutions to reduce dependencies and work autonomously. According to Fabrice Bernhard, collaboration technology can be leveraged to create a distributed network of teams. To empower their teams, leaders can support them with a systematic problem-solving culture aimed at delivering good products to customers.

Ben Linders
on Aug 22, 2024
Culture & Methods

How to Build Large Scale Cyber-Physical Systems

To build large-scale safety-critical systems, we need to decompose the system into smaller solvable problems, resolve what is known, and resolve unknowns through experiments, Robin Yeman argued. She suggested investing in test environments for both software and hardware early to enable being test-driven early to increase the safety, security, reliability, and availability of the systems.

Ben Linders
on May 16, 2024
DevOps

Expedia Open-Sources Container-Startup-Autoscaler (CSA) for Scaling Kubernetes Workloads

Expedia's Performance and Reliability team has recently open-sourced its container-startup-autoscaler (CSA). It is a Kubernetes controller leveraging the In-Place Update of Pod Resources feature to dynamically adjust CPU and/or memory resources of containers during startup based on user-defined startup/post-startup configurations.

Claudio Masolo
on Apr 19, 2024
DevOps

DigitalOcean Introduces CPU-Based Autoscaling for its App Platform

DigitalOcean has launched automatic horizontal scaling for its App Platform PaaS, aiming to free developers from manually scaling services up or down in response to CPU load.

Sergio De Simone
on Mar 27, 2024
Culture & Methods

How to Create a UI That's Both Robust and User Friendly

The key challenge in building UIs is balancing ease of use and maintainability, with scale and complexity. It requires thoughtful component design and an understanding of common usage paths to create a UI that's both robust and user-friendly. Automation can be a game-changer when it comes to improving efficiency and consistency in your codebase.

Ben Linders
on Sep 21, 2023
