Martin Thwaites, an observability evangelist, developer, and developer advocate at honeycomb.io, presented "Production Comes First - An Outside-In Approach to Building Microservices" at QCon London. The session was part of the "Connecting Systems: APIs, Protocols, Observability" track.
In this talk, Thwaites delved into the broad concept of testing, which encompasses everything from the immediate feedback of linters to the comprehensive evaluation of end-to-end (E2E) testing and beyond into production telemetry and customer feedback. He first discussed testing from the "inside out."
At the most foundational level, linters represent the first line of defense, operating in real time to ensure that developers adhere to proper conventions as they type. This immediate feedback loop is crucial for maintaining code quality from the beginning of the development process.
The compiler comes next and acts as a stricter examiner, offering more in-depth feedback on the code. This stage is still relatively fast, but it provides a deeper analysis than linters.
Developer tests follow, including unit and integration tests, which might take several seconds to run depending on their extent. These tests are still localized, allowing developers to assess the impacts of their code changes quickly.
The next tier is E2E testing, which involves multiple services and can stretch the feedback loop to minutes or even hours. This more comprehensive testing stage assesses how well different system components work together.
At the far end of the feedback spectrum lie production telemetry and customer complaints, which offer insights into how the software operates in real-world conditions. These feedback mechanisms provide valuable data but have the longest latency in the feedback loop.
According to Thwaites, a testing philosophy should cover not just the internal components of a system (such as methods, classes, or units) but also the connection points between different parts of the system, such as APIs, messages, and events. This approach recognizes that while unit tests are essential, the functionality of the broader system, meaning how its parts connect and operate together, matters more.
Thwaites went on to say that his talk is more about Test-Driven Development (TDD), where the workflow revolves around writing tests before writing the code itself. This methodology keeps development closely aligned with business or architectural requirements. TDD emphasizes an "outside-in" approach, starting with requirements, which then inform the tests to be written and, subsequently, the code that passes these tests. This process helps avoid writing redundant code, keeping the codebase lean and efficient.
The concept extends to testing based on production behaviors, suggesting that tests should mimic what happens in production as closely as possible. This approach aims to validate the code and expected behaviors in the live environment, ensuring that the software delivers the proper outcomes when deployed.
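As a minimal sketch of what an outside-in test at a connection point might look like, the example below exercises a service only through its HTTP API, the way a production client would. The order service, its `/orders` endpoint, the "at least one item" requirement, and all function names are hypothetical illustrations, not taken from the talk.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class OrderHandler(BaseHTTPRequestHandler):
    """Hypothetical order service; the test only knows its HTTP contract."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        order = json.loads(self.rfile.read(length))
        # Assumed requirement: an order must contain at least one item.
        if self.path == "/orders" and order.get("items"):
            status = 201
            body = {"status": "accepted", "item_count": len(order["items"])}
        else:
            status, body = 400, {"status": "rejected"}
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep test output quiet


def start_service():
    """Run the service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), OrderHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"


def post_order(base_url, order):
    """Call the API as an external client would; return (status, parsed JSON)."""
    request = urllib.request.Request(
        base_url + "/orders",
        data=json.dumps(order).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.loads(response.read())
    except urllib.error.HTTPError as error:
        return error.code, json.loads(error.read())


def test_order_api_contract():
    """Written from the requirement, without touching any internal class."""
    server, base_url = start_service()
    try:
        status, body = post_order(base_url, {"items": ["book", "pen"]})
        assert status == 201
        assert body == {"status": "accepted", "item_count": 2}

        status, body = post_order(base_url, {"items": []})
        assert status == 400
        assert body == {"status": "rejected"}
    finally:
        server.shutdown()
        server.server_close()
```

Because the test depends only on the request and response shapes, the handler's internals can be restructured freely without the test changing, which is one reading of why the connection points are treated as the primary thing to test.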
Lastly, observability emerges as a critical concept in understanding and testing software systems, especially those aspects that are not easily examined from the outside. Techniques like tracing provide insights into the system's internal state, aiding developers in understanding how components interact internally. Observability in testing offers a way to monitor and verify system behavior in a manner akin to its operation in production.
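One way to picture observability inside a test is to assert on the telemetry the code emits, not only on its return value. The sketch below uses a hand-rolled `InMemoryTracer` as a stand-in for a real tracing SDK with an in-memory exporter. The tracer, the `price_order` function, the attribute names, and the discount rule are all assumptions made for illustration.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    attributes: dict = field(default_factory=dict)


class InMemoryTracer:
    """Hypothetical stand-in for a tracing SDK that keeps spans in memory."""

    def __init__(self):
        self.finished_spans = []

    @contextmanager
    def span(self, name):
        current = Span(name)
        try:
            yield current
        finally:
            # Record the span when the block exits, even on error.
            self.finished_spans.append(current)


def price_order(tracer, item_prices_cents, loyalty_member):
    """Hypothetical logic whose discount decision is invisible in the result."""
    with tracer.span("price_order") as span:
        subtotal = sum(item_prices_cents)
        # Assumed rule: members get 10% off orders of 5000 cents or more.
        discount_percent = 10 if loyalty_member and subtotal >= 5000 else 0
        span.attributes["order.subtotal_cents"] = subtotal
        span.attributes["order.discount_percent"] = discount_percent
        return subtotal * (100 - discount_percent) // 100


def test_discount_decision_is_visible_in_telemetry():
    tracer = InMemoryTracer()
    total = price_order(tracer, [4000, 2000], loyalty_member=True)
    assert total == 5400

    # The span explains why the total is what it is.
    (span,) = tracer.finished_spans
    assert span.name == "price_order"
    assert span.attributes["order.discount_percent"] == 10
```

The same attributes would be what an engineer queries in production, so the test and the production investigation look at the system through the same lens.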
Thwaites advocates for an "outside-in" testing strategy, which prioritizes understanding the interconnections among system components. He emphasizes Test-Driven Development (TDD) principles, advocating for writing tests before code to ensure alignment with business and architectural requirements. Furthermore, he stresses the importance of testing against production behaviors, urging tests to simulate real-world conditions closely to validate expected outcomes upon deployment.
After the session, InfoQ interviewed Martin Thwaites about the topic of testing.
InfoQ: Could you elaborate on the specific observability techniques or tools that are most effective when integrated into the local development phase?
Martin Thwaites: The key to integrating Observability with the local development experience is rapid ingest and fast querying. You need to be able to send the telemetry to a visualization tool and then see it immediately. You can't have a 30-60-second delay, or it can't be your primary debugging tool. You can achieve this by configuring the SDKs in OpenTelemetry to use shorter delays and even post to an InMemory collector for use in local developer tests.
InfoQ: How do these techniques help ensure that applications are easier to debug locally and primed for production environments from the outset?
Thwaites: Debugging by stepping through code is really powerful. However, it can be a pain when you're dealing with large call stacks because you have to keep so much in your head. Using tracing and general telemetry data to enable debugging is sometimes easier to understand and reason about. It then has the side benefit that you do not have to context switch between tools in local development and production, making supporting production a less scary experience!
InfoQ: When developing microservices that interact with external systems (which the development team might not own), how do you recommend balancing the need for rapid development and iteration with the complexities introduced by these dependencies?
Thwaites: The rule is "Mock what you don't own." This means that you stub or mock the third-party services' endpoints or inbound messages. You can control the interactions and model any of those flows by doing that. If there's complexity in how these things interact, you can't get around it, but you can model that complexity locally to understand it better.
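The "mock what you don't own" rule from the last answer can be illustrated with a small stub. This is a sketch under stated assumptions: the payment gateway, its `charge` method, the retry policy, and the result shapes are hypothetical and do not describe any real provider's API.

```python
class StubPaymentGateway:
    """Hypothetical stub for a third-party payment API the team does not own.

    Each entry in `responses` is either a value to return or an exception
    to raise, so a test can script the exact flow it wants to model.
    """

    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = []

    def charge(self, order_id, amount_cents):
        self.calls.append((order_id, amount_cents))
        outcome = self._responses.pop(0)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome


def checkout(gateway, order_id, amount_cents, max_attempts=2):
    """Code the team owns: retry the charge when the provider times out."""
    for attempt in range(1, max_attempts + 1):
        try:
            receipt = gateway.charge(order_id, amount_cents)
            return {"status": "paid", "receipt": receipt, "attempts": attempt}
        except TimeoutError:
            continue  # model a flaky provider: try again
    return {"status": "payment_unavailable", "attempts": max_attempts}


def test_checkout_retries_after_provider_timeout():
    gateway = StubPaymentGateway([TimeoutError("timed out"), "receipt-123"])
    result = checkout(gateway, "order-1", 5400)
    assert result == {"status": "paid", "receipt": "receipt-123", "attempts": 2}
    assert gateway.calls == [("order-1", 5400), ("order-1", 5400)]


def test_checkout_reports_unavailable_when_provider_stays_down():
    gateway = StubPaymentGateway([TimeoutError(), TimeoutError()])
    result = checkout(gateway, "order-2", 1000)
    assert result == {"status": "payment_unavailable", "attempts": 2}
```

Scripting the stub's responses makes awkward flows, such as a timeout followed by success, repeatable on a developer machine without contacting the real dependency.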