HostStatus DataSource

The following support article details the expected behavior and function of the HostStatus DataSource. 

What is the purpose of the HostStatus DataSource? 

The HostStatus DataSource is a critical component of effectively monitoring your infrastructure. We use this DataSource to determine whether your host (any device you have added to your account) is responsive. Specifically, the idleinterval datapoint within the HostStatus DataSource measures the time, in seconds, since the LogicMonitor Collector was last able to collect any data (SNMP, Ping, WMI, ESX, etc.) from your host. A host is formally declared down only when all data is absent. Data collection failures for an individual protocol can result from credential errors, such as a changed SNMP community string or a new WMI username/password, while the device is still pingable and therefore not actually down.
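
As a rough illustration of the idea behind idleinterval, the sketch below computes the seconds elapsed since the most recent successful collection over any protocol. This is not LogicMonitor's implementation; the function name `idle_interval` and the `last_success` mapping are hypothetical and exist only for this example.

```python
from typing import Mapping, Optional


def idle_interval(now: float, last_success: Mapping[str, Optional[float]]) -> Optional[float]:
    """Return seconds since the most recent successful collection over ANY protocol.

    `last_success` maps a protocol name to the timestamp (in seconds) of its last
    successful collection, or None if that protocol has never succeeded.
    Both the name and the data shape are assumptions made for this sketch.
    """
    timestamps = [t for t in last_success.values() if t is not None]
    if not timestamps:
        return None  # no data has ever been collected from the host
    # Only the newest success matters: one working protocol keeps the host "alive".
    return now - max(timestamps)


# SNMP has been failing for a long time (e.g. a changed community string),
# but ping succeeded 30 seconds ago, so the host is not considered idle.
example = idle_interval(1000.0, {"snmp": 100.0, "ping": 970.0, "wmi": None})
print(example)  # 30.0
```

The point of taking the maximum is that a single failing protocol does not raise the value; only the absence of all data does.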

[Image: sample configuration for the idleinterval datapoint]


How does the HostStatus DataSource work? 

When the associated Collector is unable to contact your host for six minutes or more, all alerts emanating from that host are suppressed (the six-minute period is not configurable). This alert suppression does not itself trigger an alert indicating that the host has been declared down. That is the job of the HostStatus DataSource, which triggers a critical alert declaring the host down once the period of time designated in the idleinterval datapoint's alert threshold has elapsed.

Please note that users should not customize the HostStatus DataSource. Changes such as increasing the idleinterval alert threshold can result in alerts from the host being suppressed without any Host Down notification being sent.
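
The interaction between the fixed six-minute suppression and the idleinterval alert threshold can be sketched as follows. This is a simplified model, not LogicMonitor's code: the function names are hypothetical, the threshold is assumed to fire when the idle time exceeds it, and the threshold values used are arbitrary examples rather than product defaults.

```python
SUPPRESSION_AFTER_SECONDS = 6 * 60  # fixed six-minute period described in the article


def host_state(idle_seconds: float, alert_threshold_seconds: float) -> tuple[bool, bool]:
    """Return (alerts_suppressed, host_down_alert) for a given idle time.

    Assumption: the host-down alert fires when the idle time is strictly greater
    than the configured idleinterval threshold.
    """
    alerts_suppressed = idle_seconds >= SUPPRESSION_AFTER_SECONDS
    host_down_alert = idle_seconds > alert_threshold_seconds
    return alerts_suppressed, host_down_alert


def silent_window(alert_threshold_seconds: float) -> float:
    """Seconds during which alerts are suppressed but no host-down alert has fired."""
    return max(0.0, alert_threshold_seconds - SUPPRESSION_AFTER_SECONDS)


# With an (arbitrary) raised threshold of 900 seconds, a host idle for 400 seconds
# already has its alerts suppressed, yet nobody has been told it is down.
print(host_state(400, 900))   # (True, False)
print(silent_window(900))     # 540.0
```

Under these assumptions, any threshold above 360 seconds opens a window in which other alerts are silenced with no Host Down notification, which is why raising the threshold is discouraged.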


What impact does a Host Down alert have on my monitoring? 

When a host goes down, all other alerts emanating from the host will be suppressed. This means they will not trigger notifications to their respective escalation chains. This is meant to reduce noise caused by cascading effects of your host being down.   

I received a Host Down alert, but my host is not down. How do I proceed? 

There are two primary courses of action one should take when this occurs: 

  1. Check network connectivity from the Collector to the host. Issue the !ping command from the Collector debug interface. If the ping command reports 0 packets returned, get the IP address of the Collector by issuing the !ipaddress command. If you can manually ping the host from the command line using the same addresses or names, this indicates a bug.
  2. If ping works, there may be a problem with task scheduling or execution. To check, use the !tlist and !tdetail commands. !tlist lists all of the Collector's collection tasks along with some status information for each one. !tdetail gives a more detailed view of a single collection task. Rare problems with communication between the application and the Collector can cause tasks not to be scheduled properly.
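
The two steps above can be summarized as a small decision helper. This is only a sketch of the troubleshooting flow: `next_step` is a hypothetical function, it does not run any Collector debug command itself, and the branch where both pings fail is an inference rather than something the steps state explicitly.

```python
def next_step(debug_ping_ok: bool, manual_ping_ok: bool) -> str:
    """Suggest the next troubleshooting action for an unexpected Host Down alert.

    debug_ping_ok:  True if !ping from the Collector debug interface got replies.
    manual_ping_ok: True if a manual ping from the Collector's command line,
                    using the same addresses or names, got replies.
    """
    if debug_ping_ok:
        # Connectivity is fine, so look at task scheduling and execution.
        return "Inspect collection tasks with !tlist and !tdetail."
    if manual_ping_ok:
        # The debug ping fails but a manual ping succeeds: indicative of a bug.
        return "Likely a bug: the debug ping fails but a manual ping succeeds."
    # Inferred case (not stated in the steps): neither ping succeeds.
    return "Investigate network connectivity between the Collector and the host."


print(next_step(debug_ping_ok=False, manual_ping_ok=True))
```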