At the recent devopsdays Amsterdam 2015, Patrick Roelke contended that monitoring still has lots of issues. Roelke believes that data science can help by eliminating static thresholds and coalescing information from various data sources into a single metric. The talk included a quick overview of monitoring tools that leverage data science: Kale, Bosun and AnomalyDetection.
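To illustrate what replacing a static threshold can look like, here is a minimal sketch that flags a data point when it deviates strongly from the recent history of the same metric. It is not the method used by Kale, Bosun or AnomalyDetection, nor something taken from the talk; the function name `rolling_anomalies`, the window size, the multiplier `k` and the sample latencies are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, stdev


def rolling_anomalies(values, window=5, k=3.0):
    """Return indices of points that lie more than k standard deviations
    away from the mean of the preceding `window` points.

    Hypothetical helper for illustration only.
    """
    history = deque(maxlen=window)  # keeps only the most recent `window` points
    flagged = []
    for i, value in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # Skip flat windows: with zero spread there is no meaningful band.
            if sigma > 0 and abs(value - mu) > k * sigma:
                flagged.append(i)
        history.append(value)
    return flagged


# Made-up request latencies in milliseconds, with one obvious spike.
latencies = [100, 102, 98, 101, 99, 103, 100, 180, 101, 99]
print(rolling_anomalies(latencies))  # [7]
```

The alerting band here follows the metric's own recent behaviour, so nobody has to pick a fixed number such as "alert above 150 ms" and revisit it whenever the baseline shifts.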
David Mytton, CEO at Server Density, shared with the devopsdays Amsterdam 2015 crowd how they handle incidents and outages. The process is grounded in a set of key principles: frequent public updates; exhaustive logging of the response activities; team effort and effective escalation. Server Density draws a lot of inspiration from the aviation industry, renowned for its safety procedures.
At GlueCon 2015, Adrian Cockcroft presented a list of rules for monitoring microservice and container-based applications. In addition to these guidelines, Cockcroft also highlighted a series of challenges for monitoring cloud-native container-based systems, and introduced his ‘Spigo/simianviz’ microservice simulation and visualisation tool.
New Relic has released a set of new features for its Software Analytics Platform. Service Maps is a real-time visual map focused on services. Together with a tool for Docker monitoring, a dashboard for NoSQL databases and a unified alerts platform, it is part of the company's effort to reduce complexity in modern software architecture.
Weaveworks, creators of the Weave Docker virtual networking solution, have released a pre-alpha version of 'Weave Scope', an open source developer-focused container monitoring tool. Scope automatically generates a map of containers, enabling developers to visualise, monitor, and control applications by using the information exposed to drive deployment and operational decisions.
At QCon London 2015, Phil Calcado shared lessons learnt from SoundCloud’s move from a monolithic to a microservices architecture, and stated that the core requirements for building a microservice platform include developing capabilities for rapid provisioning, basic monitoring and rapid application deployment.
Google Cloud Monitoring is now available for free whilst in beta to all Google Cloud Platform customers. The service provides dashboards and alerts for cloud-powered applications, giving developers and operations staff insight into, and metrics for, their services.
James Turnbull, VP of engineering at Kickstarter and author of The Docker Book, presented at both FOSDEM and Config Management Camp about monitoring, sharing his views on modern, scalable, business-oriented monitoring that is provided as a service with self-service APIs and integrated into project development.
Shortly after releasing the AWS CloudTrail Processing Library (CPL), Amazon Web Services has also integrated AWS CloudTrail with Amazon CloudWatch Logs to enable alarms and the corresponding "notifications from CloudWatch, triggered by specific API activity captured by CloudTrail". The support this implies for monitoring JSON-formatted logs has recently been officially released as well.
Netflix has open sourced Atlas, part of the next-generation monitoring platform it has been working on since early 2012. The company developed Atlas to store time series data in order to provide near real-time operational insight to teams.
VictorOps published the results of its survey on the state of on-call activities, which it claims to be the first of its kind. The survey includes data about the challenges of being on-call, the way those who are on-call get notified, the tools they use to support incident resolution, the prevalence of false alarms, the average time of each incident resolution and more.
To thoroughly remove waste from a process, you need flow to deliver just in time, plus mindfulness and situational awareness in the organization to handle process problems with built-in human intelligence. Organizations apply flow concepts to develop what is needed when it is needed, and use pull to prevent inventories. What they also need is “Jidoka”: mindfulness and situational awareness.
Kanban is often used to manage work, but the concepts of kanban can also be used to guide a journey of change in an organization. This is a case study of an insurance company that used kanban to get change done, improving visibility and predictability and engaging its people.
Amazon CloudWatch recently gained log file monitoring and storage for application, operating system and custom logs, and has since enhanced its support for Microsoft Windows Server to cover a wider variety of log sources.
Lindsay Holmwood gave a retrospective on metrics and monitoring in his DevOps Days Belgium talk, outlined his typical metrics and monitoring pipeline, exposed some flaws in monitoring systems, and shared his view of what the future may bring in the field.