Observability Content on InfoQ

  • CircleCI Adds New Sumo Logic Integration to Provide Build Pipeline Analytics

    CircleCI and Sumo Logic have released an integration to allow developers to view analytical data about CircleCI jobs from within a Sumo Logic dashboard. This integration is packaged using the CircleCI package management solution, Orbs. The integration includes real-time pipeline data such as number of failed builds, average run time, and job status.

  • HashiCorp Releases Consul 1.5.0 with Layer 7 Observability and Centralized Configuration

    HashiCorp released version 1.5.0 of Consul, their service mesh application and key-value store. These are the first features released on their new roadmap for Consul, including support for L7 observability and load balancing via Envoy, centralized configuration, and ACL authentication support for trusted third-party applications.

  • Observability in Testing with ElasTest

    In a distributed application, it is difficult to use the debugging techniques common in developing non-distributed applications. Bringing production observability to your testing environment helps to find bugs, argued Francisco Gortázar at the European Testing Conference 2019. He presented ElasTest, a tool for developers to test and validate complex distributed systems using observability.

  • Recommendations When Starting with Microservices: Ben Sigelman at QCon London

    During the years Ben Sigelman worked at Google, the company was creating what we today call a microservices architecture. Some mistakes were made during this adoption, which he believes are being repeated today by the rest of the industry. In his presentation at QCon London 2019, Sigelman described his recommendations for avoiding these mistakes when starting with microservices.

  • Chaos Engineering Observability: Q&A with Russ Miles

    In a new O’Reilly report, “Chaos Engineering Observability: Bringing Chaos Experiments into System Observability”, the author, Russ Miles, explores why he believes the topics of observability and chaos engineering “go hand in hand”. He argues that as engineers begin to run chaos experiments, they will need to be able to ask many questions about the underlying system being experimented on.

  • Three Pillars with Zero Answers: Rethinking Observability with Ben Sigelman

    At KubeCon NA, held in Seattle, USA, in December 2018, Ben Sigelman presented “Three Pillars, Zero Answers: We Need to Rethink Observability” and argued that many organisations may need to rethink their approach to metrics, logging and distributed tracing.

  • Testing Complex Distributed Systems at FT.com: Sarah Wells Shares Lessons Learned

    The complexity in complex distributed systems isn’t in the code; it’s between the services or functions. Testing involves balancing finding problems against delivering value, said Sarah Wells at the European Testing Conference. Testers often have the best understanding of what the system does; they have a good hypothesis about what went wrong, and are able to validate it pretty quickly.

  • Adopting Envoy as a Service-to-Service Proxy at Reddit

    Reddit introduced Envoy into their backend framework as a service-to-service proxy to support their ongoing architectural improvements. By adopting Envoy as a service-to-service Layer 4/Layer 7 proxy, they discovered significant improvements in observability, ease of adoption, and performance.

  • The Evolution of Full Cycle Developers at Netflix: Greg Burrell at QCon SF

    At QCon San Francisco, Greg Burrell talked about the journey towards “full cycle developers” within the Netflix edge engineering team. Following the principle of “operate what you build”, developers within this team chose to take on more operational responsibility for their services, and were backed by comprehensive tooling, training and management support.

  • Shipping More Safely by Encouraging Ownership of Deployments

    Many incidents happen during or right after a release, argues Charity Majors, CEO at Honeycomb. She believes that stronger ownership of the deployment process by developers will ensure it is executed regularly and reduce risk. She argues for investment in tooling, high observability during and after release, and small, frequent releases to minimize the impact caused by shipping new code.

  • Scaling Observability at Uber: Building In-House Solutions, uMonitor and Neris

    Uber’s infrastructure consists of thousands of microservices supporting mobile applications, infrastructure, and internal services. To provide high observability of these services, Uber’s Observability team built two in-house monitoring solutions: uMonitor for time-series metrics-based alerting, and Neris for host-level checks and metrics.

  • O11ycon Discusses Benefits and Challenges of Observability

    The first o11ycon provides a comprehensive look at the emerging concept of observability in software and systems, which allows people to understand whether things are working as expected, to diagnose problems, and to identify solutions.

  • Observability and Microservices: The Need for Effective Tracing and Metrics

    Zach Jory has written an article discussing how microservices and service mesh implementations need observability so that developers can build cloud-native applications which scale and can be more easily managed. This ties into a number of articles and interviews we have covered in recent months.

  • Building Observable Distributed Systems

    Today's systems are increasingly complex: microservices are distributed over the network and scale dynamically, resulting in many more ways to fail, some of which we can't predict. Investing in observability gives us the ability to ask questions of our systems, including questions we never thought about before. Some of the tools that can be used for this are metrics, tracing, and structured and correlated logging.

  • How Observability Impacts Testing: Q&A with Amy Phillips at QCon London

    Observability gives you a picture of the system’s current health and can replace certain types of testing. For low-risk application areas you can rely on observability instead of testing, provided you have continuous delivery that provides fast feedback and allows you to release changes quickly.
