DevOps Content on InfoQ
-
Taking Advantage of Cell-Based Architectures to Build Resilient and Fault-Tolerant Systems
Cell-based architectures offer a robust approach to building resilient systems. They achieve this through the core principles of isolation, autonomy, and replication. Each cell manages its resources and makes decisions autonomously. Observability for cell-based architecture requires a tailored approach to address the unique challenges and opportunities presented by this distributed system design.
-
Optimizing Wellhub Autocomplete Service Latency: a Multi-Region Architecture
Every company wants fast, reliable, and low-latency services. Achieving these goals requires significant investment and effort. In this article, I will share how Wellhub invested in a multi-region architecture to achieve a low-latency autocomplete service.
-
Proactive Approaches to Securing Linux Systems and Engineering Applications
Maintaining a strong security posture is challenging, especially with Linux. An effective approach is proactive and includes patch management, optimized resource allocation, and effective alerting.
-
How to Minimize Latency and Cost in Distributed Systems
Explore the benefits and challenges of microservices architecture in cloud environments, focusing on achieving resilience and high availability while managing costs and performance issues.
-
Building Better Platforms with Empathy: Case Studies and Counter-Examples
Scaling platform development often means absorbing cognitive burdens, but empathy is key. Understanding users beyond their immediate issues leads to better solutions. Platforms help manage growth's complexity, but a product mindset with user-centricity is vital. In his talk at QCon San Francisco 2023, David Stenglein expanded on cultivating empathy through open communication.
-
Curating Developer Experience: Practical Insights from Building a Platform Team
As a platform engineer, how do you help your customers move faster, which aspects of developer experience should you care about, and what do you actually do to curate an experience for them? This article is about curating a developer experience. It shares lessons learned from implementing DevEx and ideas on what platform engineers can do for the development teams that use their platforms.
-
Mastering Impact Analysis and Optimizing Change Release Processes
Dynamic IT professional with a proven track record in optimizing production processes and analyzing outages in complex systems handling millions of TPS. The recent CrowdStrike outage highlights the importance of continuous improvement and adherence to best practices. Passionate about elevating operational excellence through strategic reviews and effective process enhancements.
-
Efficient DevSecOps Workflows with a Little Help from AI
Michael Friedrich explores how teams face varying levels of inefficiency in their DevSecOps processes, which hinders progress and innovation. He highlights common issues such as excessive debugging time and inefficient workflows, and demonstrates how Artificial Intelligence (AI) can be a powerful tool to streamline these processes and boost efficiency.
-
Cloud Waste Management: How to Optimize Your Cloud Resources
The FinOps Foundation's 2024 "State of FinOps" survey reported that organizations' top priorities have shifted to reducing cloud waste or unused resources. This article explains how to manage cloud waste.
-
Uber's Blueprint for Zero-Downtime Migration of Complex Trip Fulfillment Platform
In large-scale distributed systems, migrating critical systems from one architecture to another is technically challenging and demands a delicate process. Uber operates one of the most intricate real-time fulfillment systems globally. This article covers the techniques used to migrate such a workload from on-prem to a hybrid cloud architecture with zero downtime and no business impact.
-
The Set Piece Strategy: Tackling Complexity in Serverless Applications
In this article, senior engineering manager and AWS Serverless hero Sheen Brisals examines how the characteristics of serverless, such as optimization, robust availability, and scalability, push us to think in a new way about architecting and evolving modern applications as set pieces, a concept from moviemaking. The contents of this article were presented during QCon London 2024.
-
Platform as a Runtime - the Next Step in Platform Engineering
As systems become larger and more complex we need to take the concepts of platform engineering to a higher level – to the code level – by creating platforms and abstractions that will reduce cognitive load, help simplify and accelerate software development, and allow for easy maintenance and upgrades to the platform. Let’s move from “platform” to “Platform as a Runtime”.