Available for Enterprise and Enterprise MSP customers, the new Header Graph (Beta) feature is being rolled out in the v148 release. This time-series graph allows for easy alert grouping to cut down troubleshooting time and quickly identify the resources that are causing an alert storm.
The Alerts page features a brand new time-series header graph that provides immediate visual insight into an alert storm or other events requiring investigation.
Representing the alerts currently displayed in the alert table, this graph’s aggregated alert counts can be grouped by a number of dimensions, including alert severity, associated alert triggers (resource, LogicModule, instance, datapoint), matching alert rules, or the escalation chains used to deliver alert notifications.
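To make the idea of grouped, aggregated alert counts concrete, here is a minimal sketch of how alerts might be bucketed over time and split by a chosen dimension. It is an illustration only, not LogicMonitor's implementation: the record layout, the field names, the 5-minute bucket size, and the `bucket_counts` helper are all assumptions.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical alert records; the field names are illustrative only.
ALERTS = [
    {"start": datetime(2021, 1, 1, 12, 1), "severity": "critical", "resource": "db-01"},
    {"start": datetime(2021, 1, 1, 12, 3), "severity": "error", "resource": "db-01"},
    {"start": datetime(2021, 1, 1, 12, 4), "severity": "critical", "resource": "web-01"},
    {"start": datetime(2021, 1, 1, 12, 7), "severity": "warning", "resource": "web-02"},
]


def bucket_counts(alerts, dimension, bucket=timedelta(minutes=5)):
    """Return {bucket_start: Counter({group_value: alert_count})}."""
    series = defaultdict(Counter)
    for alert in alerts:
        ts = alert["start"]
        # Floor the timestamp to the start of its time bucket.
        bucket_start = ts - (ts - datetime.min) % bucket
        series[bucket_start][alert[dimension]] += 1
    return dict(series)


# The same alerts, grouped by two different dimensions.
by_severity = bucket_counts(ALERTS, "severity")
by_resource = bucket_counts(ALERTS, "resource")
```

Each time bucket maps to one count per group value, which is roughly the shape of data a stacked time-series graph needs. Switching the dimension changes only the grouping key, not the underlying set of alerts.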
In just a few clicks, you can group alerts by highly relevant dimensions. This can greatly speed up troubleshooting during an alert storm as well as aid in the identification of recurring issues that require remediation or adjustments to alert thresholds.
The header graph is available by clicking the more options icon located in the upper right corner of the Alerts page. It is highly interactive, allowing you to zoom in on a time range or use the graph’s legend to quickly include or exclude a group of alerts. As you change the graph’s filters, the alert table automatically updates to remain in sync.
Use the graph’s legend to quickly home in on a group of alerts. Here, we’ve grouped the past hour of alerts by their associated datapoint and then focused in on a particular datapoint.
The header graph helps shave time off MTTR by letting users quickly group alerts by alert properties, so that in an alert storm the triggering resource, module, instance, and so on can be identified quickly for further troubleshooting and remediation.
In just a few clicks, you can use the header graph to do the following:
Immediately get a visualization for a spike in alerts.
Zoom in on the spike in the graph and add a dimension for Datapoint to see that the alerts are all related to the same metric.
Add a dimension for Resource/Website to quickly see the affected resources.
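The steps above can be sketched in a few lines of code. This is a hedged illustration of the workflow only, not the product's API: the sample alerts, the datapoint and resource names, and the `zoom` and `group_by` helpers are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical alerts: one earlier alert plus a spike shortly after 12:00.
ALERTS = [
    {"start": datetime(2021, 1, 1, 11, 20), "resource": "web-01", "datapoint": "PingLossPercent"},
    {"start": datetime(2021, 1, 1, 12, 1), "resource": "db-01", "datapoint": "CPUBusyPercent"},
    {"start": datetime(2021, 1, 1, 12, 2), "resource": "db-02", "datapoint": "CPUBusyPercent"},
    {"start": datetime(2021, 1, 1, 12, 2), "resource": "db-01", "datapoint": "CPUBusyPercent"},
    {"start": datetime(2021, 1, 1, 12, 4), "resource": "db-03", "datapoint": "CPUBusyPercent"},
]


def zoom(alerts, start, end):
    """Keep only alerts that started inside the half-open range [start, end)."""
    return [a for a in alerts if start <= a["start"] < end]


def group_by(alerts, dimension):
    """Count alerts per distinct value of the given dimension."""
    return Counter(a[dimension] for a in alerts)


# Step 1: zoom in on the spike.
spike = zoom(ALERTS, datetime(2021, 1, 1, 12, 0), datetime(2021, 1, 1, 12, 5))
# Step 2: group by datapoint to check whether a single metric is responsible.
datapoints = group_by(spike, "datapoint")
# Step 3: group by resource to see which resources are affected.
resources = group_by(spike, "resource")
```

In this made-up data, every alert in the spike shares one datapoint, and the resource grouping shows which hosts contribute the most alerts.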
Alert grouping is part of our AIOps Early Warning System. First introduced in late 2019, the Early Warning System consists of several advanced features aimed at reducing alert noise and improving troubleshooting via anomaly detection. Since that release, we’ve vastly improved the Dynamic Thresholds feature. Root Cause Analysis (RCA) has expanded its coverage along with topology, but its ability to address additional dependency use cases remains limited. LogicMonitor also has Service Insights, Cluster Alerts, host down, and Collector down logic to help reduce alert noise. Even taken together, these features do not cover the majority of use cases, especially as monitoring expands into cloud and microservices environments.
We have learned a great deal from your feedback on all of these features, especially RCA, and are embarking on a path to provide a universal alert grouping and notification suppression solution in LogicMonitor. We are starting with a fairly basic cornerstone: easy alert grouping via the new Alerts page header graph. This is only the first step toward expanding our Early Warning System so that it not only aids in troubleshooting issues after the fact, but also provides additional context to remedy the issue and prevent it from recurring.
With an influx of new data from recent enhancements such as Logs and Netflow, as well as new features coming out this year for Synthetics and Tracing, LM will provide access to large amounts of data for Unified Observability. This feature is only the first step towards providing a universal approach for alert grouping and contextual analysis no matter the source or data type.
Throughout the next year we will continue to iterate on this feature with performance improvements and functional enhancements such as:
Two-dimensional treemap graph for sub-grouping
Zoom in on a map section to get further contextual information on the sub-grouped values and counts
Easy alert grouping in the UI is only the beginning: we plan to build on this initial step to allow for automated grouping of alerts and suppression of notifications. Stay tuned to our release notes for further updates. Reach out to your Customer Success Manager if you would like to get more involved with development, prototyping, and beta testing.
Chris is an analyst at heart who knows data is king and loves solving puzzles. He’s been involved in numerous projects in this medium from permitting workflow, inventory and accounting, and identity and asset management for relational, mainframe, and directory data systems, in terminal, on-premise, and SaaS environments. His goal is to further his understanding of day to day workflow for monitoring to remove roadblocks and solve problems via intelligent data analysis and automation. When he’s not working, you’ll find Chris reading a sci-fi/fantasy book, riding or tinkering on his motorcycles, or finding respite from the Texas heat in the Austin greenbelts with his dogs.