Two years after entering public preview, reliability experimentation service Azure Chaos Studio is now generally available. Among its most recent features are experiment templates, dynamic targets, load testing faults, and more.
Chaos Studio is a fully managed service that lets users apply chaos engineering techniques, using controlled fault injection to assess the reliability of their apps.
Chaos Studio enables users to assess how applications respond to real-world disruptions like network delays, unexpected storage failures, expired secrets, or datacenter outages. Using Chaos Studio, customers can design and conduct experiments with a wide range of agent-based and service-direct faults to better understand how to proactively improve the resilience of their application.
A chaos experiment defines the actions to execute against your target resources. Actions are organized into steps, which run in sequence, and each step contains one or more branches, which run in parallel with one another.
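To make the ordering concrete, here is a minimal Python sketch of that structure. It assumes that steps run one after another, that branches within a step run in parallel, and that actions within a branch run in sequence. The class names, fault labels, and durations are all hypothetical and are not the Chaos Studio API or its fault identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    name: str          # hypothetical fault label, not a real fault identifier
    duration_min: int  # how long the fault stays active, in minutes


@dataclass
class Branch:
    name: str
    actions: List[Action] = field(default_factory=list)  # assumed to run one after another


@dataclass
class Step:
    name: str
    branches: List[Branch] = field(default_factory=list)  # assumed to run in parallel


def step_duration(step: Step) -> int:
    """A step lasts as long as its slowest branch, because branches overlap."""
    return max(
        (sum(a.duration_min for a in b.actions) for b in step.branches),
        default=0,
    )


def experiment_duration(steps: List[Step]) -> int:
    """Steps run sequentially, so their durations add up."""
    return sum(step_duration(s) for s in steps)


# Illustrative experiment: two parallel branches, then a single follow-up step.
experiment = [
    Step("step-1", [
        Branch("network", [Action("network-latency", 10)]),
        Branch("compute", [Action("vm-shutdown", 5), Action("cpu-pressure", 10)]),
    ]),
    Step("step-2", [
        Branch("secrets", [Action("expire-secret", 5)]),
    ]),
]

print(experiment_duration(experiment))
```

Under these assumptions the first step takes as long as its longer branch, and the second step only starts once the first has finished.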
Since it became available in preview, Azure Chaos Studio has been extended with several new capabilities, including experiment templates, dynamic targets, load testing faults, and improved identity management.
Experiment templates simplify experiment creation by providing pre-filled definitions. For example, a template may describe an Azure Active Directory outage, an availability zone going down, or all targets in a zone going down. Each template defines a number of settings specific to the fault as well as more generic ones, such as the experiment duration, allowing you to quickly run common experiments.
When selecting the targets an experiment affects, you can either list them manually or use the new query-based dynamic targets feature, which lets you filter targets by Azure resource parameters such as type, region, and name.
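The difference between a manual list and a query-based selection can be sketched in a few lines of Python. This is only a local illustration of the idea: the `Resource` class, the `select_targets` helper, and the sample inventory are hypothetical, and the filter is not the query language Chaos Studio itself uses.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Resource:
    name: str
    resource_type: str
    region: str


def select_targets(
    resources: Iterable[Resource],
    resource_type: Optional[str] = None,
    region: Optional[str] = None,
    name_prefix: Optional[str] = None,
) -> List[Resource]:
    """Return the resources matching every filter that was supplied."""
    selected = []
    for r in resources:
        if resource_type is not None and r.resource_type != resource_type:
            continue
        if region is not None and r.region != region:
            continue
        if name_prefix is not None and not r.name.startswith(name_prefix):
            continue
        selected.append(r)
    return selected


# Hypothetical inventory, for illustration only.
inventory = [
    Resource("web-vm-1", "virtualMachine", "westeurope"),
    Resource("web-vm-2", "virtualMachine", "westeurope"),
    Resource("db-vm-1", "virtualMachine", "westeurope"),
    Resource("web-vm-3", "virtualMachine", "eastus"),
    Resource("web-cache", "redisCache", "westeurope"),
]

selected = select_targets(
    inventory,
    resource_type="virtualMachine",
    region="westeurope",
    name_prefix="web-",
)
print([r.name for r in selected])
```

A query of this kind is evaluated against whatever resources exist at the time, so newly created resources that match are picked up without editing a hand-maintained list.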
Load testing faults make it possible to start and stop Azure Load Testing, a service that generates high-scale load and simulates traffic for your applications. Azure Load Testing uses Apache JMeter to run load tests and simulate a large number of virtual users accessing your application endpoints at the same time.
To give you better control over who can inject faults into your systems, Azure Chaos Studio has also improved identity management by introducing user-assigned managed identities and custom role assignment. You can now create a user-assigned managed identity and explicitly grant it the permissions required to run an experiment ahead of time. When creating an experiment, you assign that identity to it and verify whether it has sufficient permissions.
If you want to start practicing chaos experiments with Azure Chaos Studio, you can head to its official documentation and have a look at the Azure Chaos Studio fault and action library.