Fault Tolerance Content on InfoQ
-
Building Highly Available Systems in Erlang
Joe Armstrong discusses highly available (HA) systems, introducing different types of HA systems and data, HA architecture and algorithms, 6 rules of HA, and how HA is done with Erlang.
-
Storm: Distributed and Fault-tolerant Real-time Computation
Nathan Marz explains Storm, a distributed, fault-tolerant, real-time computation system currently used by Twitter to keep statistics on user clicks for every URL and domain.
-
Above the Clouds: Introducing Akka
Jonas Bonér introduces Akka, a JVM platform that wants to address the complex problems of concurrency, scalability and fault tolerance using Actors, STM and self-healing from crashes.
-
Things Break, Riak Bends
Justin Sheehy talks about failure and the need to prepare for it, giving real-life examples along with techniques implemented in Riak to make it resilient to faults.
-
Message Passing Concurrency in Erlang
Joe Armstrong explains through Erlang examples that message-passing concurrency is the foundation of scalable fault-tolerant systems.
-
Failure Comes in Flavors - Stability Anti-patterns
Michael Nygard encourages us to have a failure-oriented mindset. He presents many anti-patterns that lead to system instability and failure, accompanied by design patterns that should be used instead.
-
Multicore Programming in Erlang
Ulf Wiger shows typical Erlang programs, patterns that scale well on multicore and patterns that don't, profiling and debugging parallel applications and ensuring correct behaviour with QuickCheck.
-
CouchDB and Me
In this talk from RubyFringe, Damien Katz explains what drove him to create CouchDB, why he chose Erlang and more.
-
Architectures of extraordinarily large, self-sustaining systems
Can a system that is so large it cannot be comprehended be "designed" in a conventional sense? The foundations of computing are about to change. In this talk, Richard P. Gabriel explores why and how.