Observability can be programmed and automated with observability as code. A maturity model can be used to measure and improve the adoption of observability as code.
Yury Niño Roa, cloud infrastructure engineer at Google, spoke about programming observability at InfoQ Live August 2022.
Niño Roa referred to Thoughtworks, which in 2019 recommended treating observability ecosystem configurations as code and adopting infrastructure as code for monitoring and alerting infrastructure:
They motivated us to choose observability products that support configuration through version-controlled code and execution of APIs or commands via infrastructure CD pipelines.
She suggested treating observability as code as a technique for automating the configuration of the observability tools, in a consistent and controlled way, specifically using infrastructure as code and configuration as code.
In her talk Observability is Also Programmed: Observability as Code, Niño Roa presented an Observability as Code Maturity Model that can be used as a way to benchmark and measure its adoption.
The model can be used as a reference for knowing at which level you are:
The model provides criteria to determine the status of an organisation on two axes: sophistication and adoption. For sophistication, it uses four stages: Elementary, Simple, Sophisticated and Advanced. For adoption, there are another four levels: In the Shadows, In Investment, In Adoption and In Cultural Expectation.
For example, an organisation is in an advanced stage if it has an automation workflow for observability as code implemented and it’s running in production. The idea is that you identify which stage you are in by reviewing the criteria of each stage and questioning yourself about your implementations and achievements, Niño Roa mentioned.
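As an illustration only, a team's position on the two axes could be recorded as a small data structure. The sketch below is not part of Niño Roa's model: the class names, the numeric ordering of the levels and the team name are assumptions made for the example, and the criteria for each stage are not encoded.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sophistication(IntEnum):
    # Stage names come from the model; the numeric ordering is an assumption.
    ELEMENTARY = 1
    SIMPLE = 2
    SOPHISTICATED = 3
    ADVANCED = 4


class Adoption(IntEnum):
    # Level names come from the model; the numeric ordering is an assumption.
    IN_THE_SHADOWS = 1
    IN_INVESTMENT = 2
    IN_ADOPTION = 3
    IN_CULTURAL_EXPECTATION = 4


@dataclass(frozen=True)
class MaturityAssessment:
    """Hypothetical record of where a team sits on both axes."""
    team: str
    sophistication: Sophistication
    adoption: Adoption

    def summary(self) -> str:
        return (
            f"{self.team}: "
            f"sophistication={self.sophistication.name.lower()}, "
            f"adoption={self.adoption.name.lower()}"
        )


# Example: a hypothetical team that has self-assessed against the criteria.
assessment = MaturityAssessment(
    team="payments",
    sophistication=Sophistication.SIMPLE,
    adoption=Adoption.IN_INVESTMENT,
)
print(assessment.summary())
```

Keeping such a record in version control alongside the observability configuration would let a team track how its self-assessment changes over time.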
InfoQ interviewed Yury Niño Roa about observability as code.
InfoQ: How would you define observability?
Yury Niño Roa: Observability is a broad concept whose definition has been controversial between industry and academia. Some vendors in the industry insist that observability does not have a special meaning, using the term interchangeably with telemetry or monitoring. I think the proponents of this definition relegate observability when they use it as another generic term for understanding how the software operates. Monitoring is a part of observability since it allows us to anticipate the system’s health based on the data it generates (logs, metrics, traces).
InfoQ: Why should we do observability as code? What benefits can it bring?
Niño Roa: That is an excellent question since I think the benefits have not been fully unlocked. Some of them include:
- Reducing the toil required for provisioning dashboards for monitoring.
- Having repeatable, replicable and reusable configurations for dashboards, alerts and SLOs.
- Documenting and generating context using infrastructure as code to configure monitoring platforms.
- Making it easier to audit the history of changes, since teams generally store the code for observability in repositories.
- Providing security, because observability as code allows us to have stricter controls while we use continuous integration and deployment.
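To make the idea of repeatable and auditable configurations concrete, here is a minimal sketch of alert rules defined in code and rendered to a deterministic JSON document that could be committed to a repository. It does not target any real monitoring product: the `AlertRule` fields, the metric naming scheme, the service names and the thresholds are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical alert definition; the fields are illustrative only."""
    name: str
    metric: str
    threshold: float
    window_minutes: int


def latency_alert(service: str, threshold_ms: float) -> AlertRule:
    # One reusable template yields a consistent rule for any service.
    return AlertRule(
        name=f"{service}-high-latency",
        metric=f"{service}.request.latency_ms.p99",  # assumed naming scheme
        threshold=threshold_ms,
        window_minutes=5,
    )


def render(rules: list[AlertRule]) -> str:
    # Sorted keys and fixed indentation keep the output stable, so a diff
    # in version control shows only real changes to the rules.
    return json.dumps([asdict(rule) for rule in rules], indent=2, sort_keys=True)


rules = [latency_alert("checkout", 500.0), latency_alert("search", 300.0)]
rendered = render(rules)
print(rendered)
```

In a real setup, a CI/CD pipeline would apply the rendered document to the monitoring platform through that platform's own API or infrastructure-as-code provider, which is where the review and access controls mentioned above would come in.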
InfoQ: What are the challenges of observability as code?
Niño Roa: I think they are aligned with the challenges of other practices such as Infrastructure as Code and Configuration as Code. Specifically:
- Achieving real adoption by engineering teams, leveraging automation wherever possible to accelerate observability delivery across environments.
- Defining clear KPIs to measure the impact of observability-as-code maturity in the organisations.
- Establishing and communicating the current state before implementing new observability-as-code capabilities.
- Documenting, which is a big challenge in any field, since generating documentation automatically requires sophisticated techniques such as machine learning and processing of unstructured text.
InfoQ: What’s your advice for starting with observability as code?
Niño Roa: My first piece of advice is to learn about Infrastructure as Code, and specifically about Observability as Code. After that, it is very important to get sponsorship for its implementation. I talk about this in the first stages of the Observability as Code Maturity Model. In these early stages, organisations have decided to implement Observability as Code, so they have started to collect metrics and officially have practitioners who are dedicating resources to the practice.