Disaster Recovery Content on InfoQ
Articles
-
Mastering Impact Analysis and Optimizing Change Release Processes
Dynamic IT professional with a proven track record in optimizing production processes and analyzing outages in complex systems handling millions of TPS. The recent CrowdStrike outage highlights the importance of continuous improvement and adherence to best practices. Passionate about elevating operational excellence through strategic reviews and effective process enhancements.
-
Understanding Architectures for Multi-Region Data Residency
This article focuses on implementing data residency strategies for a positive stakeholder experience. It underscores the need to diversify data locations, driven by motivations like disaster recovery and geo-redundancy. The core principle is data distribution, ensuring specific sets reside in distinct regions without overlap - a practice termed data residency.
-
SaaS DR/BC: If You Think Cloud Data is Forever, Think Again
SaaS is quickly becoming the default tool for how we build and scale businesses. It’s cheaper and faster than ever before. However, this reliance on SaaS carries disaster recovery risk. The “Shared Responsibility Model” doesn’t just govern your relationship with the cloud; it affects all of cloud computing. Even for SaaS, users are on the hook for protecting their own data.
-
Leverage the Cloud to Help Consolidate On-Prem Systems
A cloud model can be used to architecturally validate the possibility of consolidating multiple application servers into a smaller number of physical resources that will ultimately remain on-prem.
-
Failover Conf Q&A on Building Reliable Systems: People, Process, and Practice
One of the biggest engineering challenges associated with maintaining or increasing the reliability of a system is knowing where to invest time and energy. InfoQ recently sat down with several engineers and technical leaders who are involved with the upcoming Failover Conf virtual event, and asked their opinion on the best practices for building and running reliable systems.
-
Book Review: A Leader's Guide to Cybersecurity
A Leader's Guide to Cybersecurity educates readers about how to prevent a crisis and how to take leadership when one occurs. With a focus on clear communication, the book provides details, examples, and guidance on mapping security against what a business actually does. The book describes ways to align security with the motivations of others who may be security-agnostic in pursuit of their own goals.
-
Designing Chaos Experiments, Running Game Days, and Building a Learning Organization: Chaos Conf Q&A
The second Chaos Conf event is taking place in San Francisco over 25-26 September. In preparation for the conference, InfoQ sat down with a number of the presenters, and discussed topics such as the evolution and adoption of chaos engineering, key people and process learning from running chaos experiments, and what the biggest blockers are for mainstream adoption.
-
Crafting a Resilient Culture: Or, How to Survive an Accidental Mid-Day Production Incident
While working at Etsy, Ryn Daniels accidentally upgraded Apache on every single server that was running it, which caused a production incident. Explore lessons learned in this article, including that although automation and orchestration can be great, you should make sure you understand what’s happening under the hood and what to do if your automation goes awry.
-
Chaos Conf Q&A: The Benefits, Challenges and Practices of Chaos Engineering
This Q&A, from the upcoming Chaos Conf event that is running in San Francisco in September, examines the benefits and challenges of chaos engineering. The article also provides emerging good practice, and contains prerequisites, recommendations, and tips for getting started.
-
The Holistic Approach: Preventing Software Disasters
Olivier Bonsignour on what "X-Raying" software means, how it can help prevent software disasters and why CIOs should care.