No one wants to get paged in the middle of the night for an issue or failure within their infrastructure. When it does happen, IT operations engineers need to quickly and confidently identify where the fire is and how to put it out, so they can minimize the negative impact. The root cause analysis (RCA) feature within LogicMonitor’s new AIOps Early Warning System makes this easier than ever. RCA intelligently identifies the root cause when an issue occurs, enabling IT operations engineers to focus on solving the issue quickly instead of searching for it. Because LogicMonitor can monitor pretty much anything (cloud, containers, networks, servers, storage, virtualization, and more), RCA can help reduce downtime even for complex hybrid infrastructures.
RCA uses automatically discovered topology relationships between monitored resources to establish dependencies between those resources. When a monitored resource becomes unreachable, those dependencies are used to identify the root cause and the impacted dependent resources. Alert notifications routed to IT operations engineers are limited to the root cause alert and include information about the impacted dependent resources, which prevents the alert storm these engineers would normally receive in such a scenario. By preventing alert storms that obscure the root cause, RCA helps shorten time to resolution and minimize downtime.
Normally, an outage like this would result in dozens of alerts, one for each unreachable device. With RCA, the originating cause alert is identified, dependent alerts are grouped, and alert notifications for dependents are disabled.
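To make the idea concrete, here is a minimal Python sketch of dependency-based grouping. It is an illustration only, not LogicMonitor’s implementation: the `find_root_causes` function, the `topology` mapping, and the resource names are all hypothetical, and the sketch assumes each resource’s upstream dependencies are already known.

```python
from collections import defaultdict


def find_root_causes(depends_on, unreachable):
    """Group unreachable resources under their originating cause.

    depends_on:  hypothetical mapping of resource -> set of upstream
                 resources it depends on (e.g. server -> switch).
    unreachable: resources that are currently unreachable.
    Returns a dict of root cause -> sorted list of impacted dependents.
    """
    unreachable = set(unreachable)

    def root_of(resource, seen):
        # Follow the first unreachable upstream dependency, if any.
        # `seen` only guards against infinite recursion on cyclic input.
        for parent in sorted(depends_on.get(resource, ())):
            if parent in unreachable and parent not in seen:
                return root_of(parent, seen | {parent})
        # No unreachable upstream dependency: this is an originating cause.
        return resource

    groups = defaultdict(list)
    for resource in sorted(unreachable):
        root = root_of(resource, {resource})
        if root == resource:
            groups.setdefault(root, [])
        else:
            groups[root].append(resource)
    return dict(groups)


# Hypothetical topology: two servers behind a switch behind a router.
topology = {
    "switch": {"router"},
    "server1": {"switch"},
    "server2": {"switch"},
}

groups = find_root_causes(topology, {"router", "switch", "server1", "server2"})
print(groups)  # {'router': ['server1', 'server2', 'switch']}

# Only the originating cause produces a notification; dependents are listed in it.
for root, dependents in groups.items():
    print(f"ALERT: {root} unreachable; impacted dependents: {', '.join(dependents) or 'none'}")
```

In this sketch, four unreachable devices collapse into a single notification for the router, with the switch and both servers listed as impacted dependents.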
While the initial release relies on the reachability of monitored resources, future enhancements will allow resource dependencies to be configured more granularly, beyond the LogicMonitor-provided defaults.
So what does RCA have to do with automated remediation? Identifying the root cause is only the beginning; we want to provide the ability to automate actions that remediate the root cause issue. This will close the loop from intelligently identifying and predicting issues to automatically fixing and preventing them. It will not only further reduce downtime but also save IT operations teams valuable time, enabling them to spend more time innovating and less time reacting to problems. RCA is just one step in this direction, but with this first step, we aim to cut through the noise and give you more nights of uninterrupted sleep. Reach out to the LogicMonitor team for more information or to get started on a free trial!
Chris Sternberg, Product Manager
Chris is an analyst at heart who knows data is king and loves solving puzzles. He’s been involved in numerous projects in this medium, including permitting workflow, inventory and accounting, and identity and asset management for relational, mainframe, and directory data systems, in terminal, on-premises, and SaaS environments. His goal is to further his understanding of day-to-day monitoring workflows in order to remove roadblocks and solve problems via intelligent data analysis and automation. When he’s not working, you’ll find Chris reading a sci-fi/fantasy book, riding or tinkering on his motorcycles, or finding respite from the Texas heat in the Austin greenbelts with his dogs.