Guidelines for Responding to Alert Notifications
Identifying the Issue
Before you take steps to resolve and respond to an alert notification, you must first identify the issue causing the alert. Alert notifications don’t always provide the whole picture. They should be used as an indication that further investigation is needed into the component that generated the alert.
Dashboards are a great place to start investigating alerts. If they are strategically set up, they can help you quickly pinpoint the root cause.
Strategies for Responding to Alert Notifications
As discussed in the Where to Respond to Alert Notifications section, there are several places from which you can respond to an alert notification. While the response method is a matter of personal preference, the type of response you should provide depends on whether you can resolve the issue at hand.
LogicMonitor supports three response types: acknowledging the alert, scheduling downtime (SDT), and escalating the alert. Guidelines for the appropriate use of each are discussed in the following sections.
Acknowledging an Alert
You should acknowledge an alert when you believe that you can resolve the problem. Resolving the problem includes fixing whatever is causing the problem and then taking action to ensure that the alert does not recur, if necessary. Actions to suppress further alerts can include:
- Adjusting alert thresholds.
- Disabling alerting.
- Scheduling downtime, discussed in the next section, to cover periods of expected recurring maintenance.
- Eliminating the alerting DataSource instance from discovery, or creating a cloned DataSource to discover/classify it with different alert thresholds. If there is a set of instances that should not be discovered (e.g. NFS mounted filesystems on Linux hosts) or for which you require different thresholds than other instances (e.g. QA VIPs on load balancers), you can achieve that with Active Discovery filtering.
Note: Acknowledgement of alerts applies to the same event if the severity drops but does not clear. For example, if you acknowledge a disk usage alert of level error and free up some space so that the alert level drops to warning, the warning alert will already be considered acknowledged. However, acknowledgements of alerts at a lower level do not affect the escalation of alerts if severity increases. For example, acknowledging a disk usage error alert will not affect the escalation of the critical alert if the drive continues to fill.
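The rule in the note above can be sketched as a small model. This is only an illustration of the described behavior, not LogicMonitor code: the AlertEvent class, its fields, and the numeric severity ranks are hypothetical names chosen for the example.

```python
# Hypothetical model of the acknowledgement rule described in the note above.
# Severity names follow the note; the class and its fields are illustrative only.
SEVERITY_RANK = {"warning": 1, "error": 2, "critical": 3}


class AlertEvent:
    def __init__(self, severity):
        self.severity = severity
        self.acked_rank = 0  # highest severity rank acknowledged so far

    def acknowledge(self):
        # Record the severity at which the acknowledgement was given.
        self.acked_rank = max(self.acked_rank, SEVERITY_RANK[self.severity])

    def change_severity(self, severity):
        self.severity = severity

    def is_acknowledged(self):
        # An acknowledgement covers the acknowledged severity and anything
        # lower, but not a later increase in severity.
        return SEVERITY_RANK[self.severity] <= self.acked_rank


disk = AlertEvent("error")
disk.acknowledge()
disk.change_severity("warning")   # space freed, severity drops
print(disk.is_acknowledged())     # True
disk.change_severity("critical")  # drive keeps filling
print(disk.is_acknowledged())     # False
```

In this sketch the critical alert is unacknowledged and would therefore continue through its escalation chain, which matches the disk usage example in the note.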
Scheduling Downtime (SDT) for an Alert
When you schedule downtime, you are suppressing all alert notifications for the designated instance, DataSource, device, or Website for the duration of the configured SDT period. This is in contrast to acknowledging an alert, which only suppresses further notifications of that particular alert.
You should respond to an alert by placing the resource in SDT if:
- You forgot to proactively schedule downtime.
- A solution is in the works, but you don't want to continue receiving notifications while the issue is being addressed. For example, if you are receiving an alert that a server is out of memory, and you have more memory currently on order, you may initiate an SDT response type to avoid being repeatedly notified of the alert.
For more information on scheduling downtime for instances, DataSources or devices, see SDT Tab; for scheduling downtime for Websites, see Website SDTs; for scheduling downtime for Collectors, see Collector SDT.
Note: When responding with SDT to an EventSource alert, the entire EventSource is placed into SDT, not just the individual event ID associated with the alert condition.
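The difference in scope between SDT and acknowledgement can be sketched as follows. This is a simplified model under stated assumptions, not the platform's implementation: the Alert and SDT classes, their fields, and the should_notify function are hypothetical, and Websites and EventSources are left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Alert:
    alert_id: str
    device: str
    datasource: str
    instance: str


@dataclass(frozen=True)
class SDT:
    # A scope field left as None matches anything, so SDT(device="web01")
    # covers every DataSource and instance on that device.
    start: datetime
    end: datetime
    device: Optional[str] = None
    datasource: Optional[str] = None
    instance: Optional[str] = None


def sdt_covers(sdt, alert, now):
    """True if the SDT window is active and its scope includes the alert."""
    if not (sdt.start <= now < sdt.end):
        return False
    pairs = (
        (sdt.device, alert.device),
        (sdt.datasource, alert.datasource),
        (sdt.instance, alert.instance),
    )
    return all(wanted is None or wanted == actual for wanted, actual in pairs)


def should_notify(alert, now, sdts, acked_ids):
    # Acknowledgement silences one specific alert; SDT silences a whole scope.
    if alert.alert_id in acked_ids:
        return False
    return not any(sdt_covers(s, alert, now) for s in sdts)


noon = datetime(2024, 1, 1, 12, 0)
device_sdt = SDT(start=noon, end=noon + timedelta(hours=2), device="web01")
cpu = Alert("A1", "web01", "CPU", "cpu0")
disk = Alert("A2", "web01", "Disk", "/var")
other = Alert("A3", "db01", "Disk", "/var")
one_pm = noon + timedelta(hours=1)

print(should_notify(disk, one_pm, [device_sdt], set()))   # False: device is in SDT
print(should_notify(other, one_pm, [device_sdt], set()))  # True: different device
print(should_notify(cpu, one_pm, [], {"A1"}))             # False: this alert was acknowledged
print(should_notify(disk, one_pm, [], {"A1"}))            # True: acknowledging A1 does not cover A2
```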
Escalating an Alert
You should escalate an alert to the next step in the escalation chain if an SDT isn't appropriate and you either don't know what the issue is, don't have time to identify and/or resolve the issue, or don't know how to resolve the issue. Note that even if the escalation interval for the matching alert rule is set to zero, the alert will still escalate.
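The guidelines for the three response types can be summarized as a small decision helper. This is a sketch only: the function name, its parameters, and the order in which the conditions are checked are assumptions made for illustration, and real situations may call for judgment that a function like this cannot capture.

```python
def choose_response(can_resolve, forgot_planned_sdt=False, fix_in_progress=False):
    """Return a suggested response type based on the guidelines above.

    All parameter names are hypothetical. Checking can_resolve first is an
    assumption; the guidelines do not rank the response types.
    """
    if can_resolve:
        # You believe you can fix the cause and prevent recurrence.
        return "acknowledge"
    if forgot_planned_sdt or fix_in_progress:
        # Downtime should have been scheduled, or a solution is in the works.
        return "sdt"
    # SDT isn't appropriate and you can't identify or resolve the issue.
    return "escalate"


print(choose_response(can_resolve=True))                         # acknowledge
print(choose_response(can_resolve=False, fix_in_progress=True))  # sdt
print(choose_response(can_resolve=False))                        # escalate
```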
Where to Respond to Alert Notifications
There are several places from which you can acknowledge, SDT, or escalate alert notifications:
- From the Alerts page, found in the LogicMonitor platform. See Alerts Page Overview for details.
- From the Alerts tab (available from the Resources or Websites page), found in the LogicMonitor platform. See Alerts Tab for details.
- If the alert notification is received via email, from the email notification (as a reply). See Responding to Alert Notifications via Email or SMS Email for details.
- If the alert notification is received via text, from the text thread (as a reply). See Responding to Native SMS Alert Notifications for details.