AIOps platform vendor Moogsoft has announced the release of Moogsoft Enterprise 8.0, featuring a capability for technology teams to build a virtual Network Operations Centre (NOC). Moogsoft Enterprise consolidates monitoring tools with the intention of helping technology teams reduce noise, prioritise incidents, reduce escalations and ensure uptime.
Working remotely, the IT operator can analyse alerts, logs, metrics and traces with the aim of finding and resolving the root cause of incidents before they become outages. With version 8.0, a virtual NOC can be created using the Moogsoft Situation Room for teams to collaborate during the incident management process, and diagnose and resolve problems through a single view, rather than multiple screens each dedicated to different monitoring tools.
Other new features also enable teams to identify issues via improved noise reduction analysis and automated topology visualisation for root cause analysis. Version 8.0 also expands the platform's integrated workflow engine with advanced collaboration add-ons to help customers connect with existing systems of record or escalation processes.
The workflow engine modules provide integrations with external data sources, including CMDB data directly from ServiceNow, Ansible and Puppet; bi-directional ticket integration with tools such as ServiceNow, Remedy and Cherwell; and collaboration integration with PagerDuty, xMatters and Slack, so users can communicate directly with Situation Room team members. Other new integrations include AWS Firelens, to ingest EC2 log files, and Opsgenie, for on-call management.
A new alert analyser provides users with an interface to visually configure, identify anomalies within, and tune the platform’s alert processing through its noise reduction AI, known as Entropy. A new dynamic topology builder provides the virtual NOC with visibility into the correlation process, based on logical, virtual and physical topological relationships. This feature allows users to gain insights into incidents, and visualise the probable root cause and potential impact associated with current and neighbouring services.
There is also new platform functionality for the deployment of ML algorithms; new versioning, rollback and history capabilities allow customers to audit new deployments, enabling them to add, track, delete and change ML algorithms.
InfoQ took the opportunity to ask Will Cappelli, CTO EMEA and global VP of product strategy at Moogsoft, some questions about the new release:
InfoQ: What is the driver behind needing a virtual NOC?
Will Cappelli: Before the pandemic hit, workforces and IT operations teams in particular were already becoming more distributed and changing in make-up more rapidly. At the same time, the environments being managed were themselves becoming increasingly virtualised. Also, the tools that were required were multiplying and their respective scopes becoming more fragmented. A virtual or software-defined NOC has emerged as a way of coping with these challenges. Of course, with the pandemic, many of these trends have undergone warp drive acceleration.
InfoQ: What does an operator need to learn in order to adjust an ML algorithm?
Cappelli: It depends on the ML algorithm. A deep learning neural network, for example, requires a fair amount of knowledge of both how neural networks work and of the domain space and even then, you can never be sure exactly how the algorithm came to its conclusion. While Moogsoft uses DLNNs for some of the problems we tackle, most of the algorithms require no user intervention or previous knowledge whatsoever and they yield results instantaneously. Of course, some general knowledge of statistics and graph theory can deepen one's appreciation of what's going on, but that is just icing on the cake.
InfoQ: How does Moogsoft automate topology visualisation?
Cappelli: Visualisation is based on the graph structure of the topologies being analysed. It is the mathematical relationships between nodes and links that serve as input into the visualisation process. This also allows the platform to represent in the same visualisation the results of the analyses our algorithms perform. For example, we have an algorithm developed recently by our CEO and co-founder, Phil Tee, that determines which nodes in a topology are likely to be the places from which an outage arose. This can be visualised via the node's colour and size in the representation.
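As an illustration of the general idea Cappelli describes (a per-node score derived from the graph, then rendered as colour and size), the following Python sketch may help. It is not Moogsoft's algorithm, which the interview does not detail: the score used here is plain normalised node degree, chosen only as a stand-in, and the names `degree_scores`, `node_style`, `style_topology` and the example topology are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical topology: node name -> list of directly linked neighbours.
Topology = Dict[str, List[str]]


def degree_scores(topology: Topology) -> Dict[str, float]:
    """Stand-in score (an assumption): node degree divided by the largest degree."""
    degrees = {node: len(set(neighbours)) for node, neighbours in topology.items()}
    max_degree = max(degrees.values(), default=0)
    if max_degree == 0:
        return {node: 0.0 for node in degrees}
    return {node: degree / max_degree for node, degree in degrees.items()}


def node_style(score: float, min_size: float = 10.0,
               max_size: float = 40.0) -> Tuple[float, str]:
    """Map a score in [0, 1] to a marker size and a green-to-red hex colour."""
    size = min_size + score * (max_size - min_size)
    red = round(255 * score)
    green = round(255 * (1 - score))
    return size, f"#{red:02x}{green:02x}00"


def style_topology(topology: Topology) -> Dict[str, Tuple[float, str]]:
    """Compute a (size, colour) pair for every node in the topology."""
    return {node: node_style(score)
            for node, score in degree_scores(topology).items()}


# Hypothetical four-node topology with one hub.
EXAMPLE: Topology = {
    "core-switch": ["web-1", "web-2", "db-1"],
    "web-1": ["core-switch"],
    "web-2": ["core-switch"],
    "db-1": ["core-switch"],
}

if __name__ == "__main__":
    for name, (size, colour) in style_topology(EXAMPLE).items():
        print(f"{name}: size={size:.1f} colour={colour}")
```

In this sketch the hub node is drawn largest and reddest; a real root-cause score would replace `degree_scores` while the mapping to colour and size could stay the same.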
InfoQ: How does the AWS Firelens integration work?
Cappelli: Firelens is all about annotating and distributing log data. The Moogsoft platform can take input from various log management systems (e.g., Elastic) and use its data selection, pattern discovery, causal inference, collaboration, and automation algorithms to interpret and act on information contained in the logs about problems both actual and anticipated occurring within an enterprise IT environment.
For more information, please consult the Moogsoft Enterprise 8.0 documentation.