Monitoring Content on InfoQ

News

DevOps

Instana Pipeline Feedback for Release Performance

Application performance management service provider Instana launched Pipeline Feedback for release performance tracking and analysis. Pipeline Feedback provides automatic tracking of application releases, feedback on release performance, and integration with Jenkins.

K Jonas
on Aug 22, 2019
Cloud

Microsoft Releases a Preview of the Integration of Prometheus with Azure Monitor for Containers

Microsoft recently announced the integration of Prometheus, a popular open-source metrics monitoring solution and part of the Cloud Native Computing Foundation, with Azure Monitor for containers. The integration is currently available in preview for testing.

Steef-Jan Wiggers
on Aug 02, 2019
Cloud

Oliver Gould on Linkerd Service Mesh and Traffic Management

Oliver Gould, Linkerd product lead and CTO of Buoyant, spoke at the QCon New York 2019 Conference last week about the Linkerd service mesh, with a focus on its traffic management capabilities.

Srini Penchikala
on Jul 07, 2019
DevOps

Athena: Automated Build Health Monitoring at Dropbox Engineering

Dropbox’s engineering team runs ~35,000 builds and millions of automated tests, many of which can fail either due to bad commits or due to environmental conditions. The team created a build monitoring system to minimize the manual intervention necessary to detect and quarantine flaky tests, and notify code authors.

Hrishikesh Barua
on Jun 15, 2019
AI, ML & Data Engineering

Expo: Real Time A/B Testing and Monitoring with Spark Streaming and Kafka at Walmart Labs

The WalmartLabs engineering team developed a real-time A/B testing tool called Expo that collects and analyzes user engagement metrics. It uses Spark Structured Streaming to process the incoming data and stores the metrics in KairosDB.

Hrishikesh Barua
on May 24, 2019
DevOps

HashiCorp Releases Consul 1.5.0 with Layer 7 Observability and Centralized Configuration

HashiCorp released version 1.5.0 of Consul, their service mesh application and key-value store. The release delivers the first features on their new roadmap for Consul, including support for L7 observability and load balancing via Envoy, centralized configuration, and ACL authentication support for trusted third-party applications.

Matt Campbell
on May 24, 2019
DevOps

Merging OpenTracing and OpenCensus into a Single Distributed Tracing Framework

The OpenTracing and OpenCensus projects have announced that they will merge into a single, unified project. The goals of the merge include creating a single instrumentation standard, maintaining essential functionality without including every feature from both projects, keeping the architecture loosely coupled to enable pluggability, and covering traces, metrics and logs within its scope.

Hrishikesh Barua
on Apr 13, 2019
DevOps

Scaling Graphite at Booking.com

Booking.com's engineering team scaled their Graphite deployment from a small cluster to one that handles millions of metrics per second. Along the way, they modified and optimized Graphite's core components - the carbon-relay and carbon-cache, and the rendering API.

Hrishikesh Barua
on Mar 30, 2019
DevOps

Vector Performance Monitoring Tool Adds eBPF, Unified Host-Container Metrics Support

Vector, the open source performance monitoring tool from Netflix, added support for eBPF-based tools using a PCP daemon, a unified view of container and host metrics, and UI improvements.

Hrishikesh Barua
on Mar 23, 2019
Culture & Methods

Observability in Testing with ElasTest

In a distributed application, it is difficult to use the debugging techniques common in developing non-distributed applications. Bringing production observability to your testing environment helps to find bugs, argued Francisco Gortázar at the European Testing Conference 2019. He presented ElasTest, a tool for developers to test and validate complex distributed systems using observability.

Ben Linders
on Mar 14, 2019
Architecture & Design

Recommendations When Starting with Microservices: Ben Sigelman at QCon London

During the years Ben Sigelman worked at Google, they were creating what we today call a microservices architecture. Some mistakes were made during this adoption, which he believes are being repeated today by the rest of the industry. In his presentation at QCon London 2019, Sigelman described his recommendations to avoid making these mistakes when starting with microservices.

Jan Stenberg
on Mar 10, 2019
DevOps

Chaos Engineering Observability: Q&A with Russ Miles

In a new O’Reilly report, “Chaos Engineering Observability: Bringing Chaos Experiments into System Observability”, the author, Russ Miles, explores why he believes the topics of observability and chaos engineering “go hand in hand”. He argues that as engineers begin to run chaos experiments, they will need to be able to ask many questions about the underlying system being experimented on.

Daniel Bryant
on Mar 04, 2019
DevOps

Scaling, Incident Management and Collaboration at New York Times Engineering

The New York Times Engineering Team wrote about their approach to scaling and incident management against the backdrop of increased traffic during the November 2018 US midterm elections.

Hrishikesh Barua
on Mar 02, 2019
DevOps

Three Pillars with Zero Answers: Rethinking Observability with Ben Sigelman

At KubeCon NA, held in Seattle, USA, in December 2018, Ben Sigelman presented “Three Pillars, Zero Answers: We Need to Rethink Observability” and argued that many organisations may need to rethink their approach to metrics, logging and distributed tracing.

Daniel Bryant
on Feb 17, 2019
DevOps

Evolution of Metrics Collection and Log Aggregation at Coinbase

Luke Demi, software engineer at Coinbase, writes about the changes in monitoring and logging that have taken place at Coinbase since mid-2018. Coinbase moved from a self-managed Elasticsearch cluster that served the dual purpose of log analysis and metrics visualization, to Datadog for metrics collection and managed Elasticsearch on AWS for log aggregation.

Hrishikesh Barua
on Feb 17, 2019
