Reliability Content on InfoQ
-
My Mobile App Only Works on My Phone? How to Scale Enterprise Mobile Apps
The authors discuss the patterns and technologies needed to scale large enterprise mobile systems, covering network connectivity handling, data reliability and real-time communication.
-
ZooKeeper for the Skeptical Architect
Camille Fournier explains which projects ZooKeeper is useful for and the common challenges of running it as a service, and offers advice to consider when architecting a system that uses it.
-
The Game of Big Data: Scalable, Reliable Analytics Infrastructure at KIXEYE
Randy Shoup describes KIXEYE's analytics infrastructure from Kafka queues through Hadoop 2 to Hive and Redshift, built for flexibility, experimentation, iteration, testability, and reliability.
-
How Requirements from the Old World Make Erlang Fit into the New World
Robert Virding describes how Erlang was developed to meet the concurrency and reliability requirements of telecommunications, dealing with challenges similar to those of cloud computing.
-
The Magic Behind Enterprise Apps: How to Expose Reliable, Scalable and Secure Enterprise APIs?
Blake Dournaee covers the often-forgotten back-end architecture for mobile apps, which should expose cross-platform APIs to mitigate some of the effects of mobile OS fragmentation.
-
Lessons from Erlang: Principles of Building Reliable Systems
Garrett Smith discusses building reliable systems starting with lessons from Erlang, then outlining a set of principles and the practices for applying them in languages such as Ruby, Python, and Java.
-
What Can DevOps Learn from Formula 1?
Stephen Burton discusses how the people, processes, collaboration and tools employed in Formula 1 can be applied in DevOps to manage performance and reliability and ultimately achieve success.
-
Reliability Engineering Matters, Except When It Doesn't
Michael Nygard shares essential Reliability Engineering techniques that can keep systems from falling apart, and notes the limitations of the discipline that should be considered.
-
On Distributed Failures (and handling them with Doozer)
Blake Mizerany presents various ways in which distributed systems can fail and how to recover using Doozer, a highly available, consistent data store.
-
Let It Crash ... Except When You Shouldn't
Steve Vinoski explains how to avoid some of the Erlang errors that can bring down a system, starting from the premise that not all crashes are welcome, as the “Let It Crash” philosophy might suggest.
-
Building Reliable Systems from Unreliable Components
Arnon Rotem-Gal-Oz discusses creating a SOA implementation that maintains good overall reliability despite using a larger number of smaller components.
-
Rapid and Reliable Releases
Rolf Russell & Andy Duncan discuss rapid and reliable releases from the build/release/devops perspective, considering relationships, metrics, required skills, and the need to cut waste and bottlenecks.