Resilience Content on InfoQ
-
Building Highly-resilient Systems at Pinterest
Yongsheng Wu talks about how to build highly resilient systems at scale. Wu also presents the failure cases that prompted engineers at Pinterest to build such systems, and explains how they test them.
-
Fail Better: Radical Ideas from the Practice of Cloud Computing
Tom Limoncelli discusses creating resiliency at the most economical level, doing risky procedures often, and creating a blameless culture to encourage communication and improve system reliability.
-
Resilience, Service Discovery and Zero Downtime Deployment in Microservice Architectures
York Xyander and Bodo Junglas discuss strategies for service discoverability and transparent failover in a microservices architecture, and how to achieve zero downtime and an auto-scaling architecture.
-
Responding Rapidly When You Have 100GB+ Data Sets in Java
Peter Lawrey discusses data-driven reactive systems, profiling latency distribution in such an environment, finding rare bugs, implementing resilience and monitoring.
-
Opportunities to Improve System Reliability and Resilience
Donald Belcham explains how to improve a system’s reliability by using appropriate code patterns.
-
You Won't Believe How the Biggest Sites Build Scalable and Resilient Systems!
The authors discuss lessons learned from the biggest sites on the internet about how to build scalable and resilient architectures.
-
Going Reactive: Event-Driven, Scalable, Resilient & Responsive Systems
Jonas Bonér discusses four key traits of Reactive Apps: Event-Driven, Scalable, Resilient and Responsive, how they impact application design, how they interact, related technologies and techniques.
-
Building Resilience: How Outages Shaped Etsy's Systems
Avleen Vig presents some of the most unexpected, confusing, hilarious and face-palming events during Etsy's outages to show what can be learnt from their problems to build more resilient systems.
-
Fault Tolerance Made Easy
Uwe Friedrichsen discusses implementing resilient software design patterns (code included), improving those patterns to achieve robustness, and becoming a resilient software developer.
-
From Instability to Resilience: The Story of a Web Site
Richard Campbell shares his experiences evolving a web site from ordinary to resilient, the triage process, the quick-and-dirty solutions as well as the work to bring the site to true resiliency.
-
Principles of Reliable Communication & Shared State
Andy Piper describes some fundamentals of communicating reliably in an unreliable world and communication techniques used to build distributed data structures that can tolerate failures.
-
Going Reactive: Event-Driven, Scalable, Resilient & Responsive Systems
Jonas Bonér discusses how the four traits of reactive apps (Event-Driven, Scalable, Resilient and Responsive) impact app design, how they interact, and their supporting technologies and techniques.