Monitoring your Collector
The LogicMonitor Collector is the heart of your monitoring system. As such, you should monitor your Collector to ensure that it can keep up with data collection. As a best practice, we recommend that you set up the following monitoring for your Collector:
- Add the device your Collector is installed on into monitoring. This ensures that CPU, disk space, and other basic metrics of the Collector's operating system are monitored.
- Enable LogicMonitor's Collector DataSources on that same device. This ensures that the performance of your Collector and its data collection tasks is monitored.
- The Collector DataSources monitor only the preferred Collector of the device to which they are applied. When choosing a preferred Collector, ensure that you select the Collector installed on that device. Otherwise, the Collector's metrics will display on the wrong host. For instance, if you attempt to monitor Collector A using Collector B (installed on a separate host), then Collector B's metrics will display in place of Collector A's on Collector A's host.
Once you've set up comprehensive monitoring for your Collector, you can use LogicMonitor alerts to know when your Collector is overloaded and unable to keep up with data collection.
- LogicMonitor has published a new series of Collector DataSources. We recommend that customers disable the old DataSources after verifying that the new ones function properly in their environment. Disabling them preserves their historical data and prevents duplicate alerts. The following list pairs each old DataSource (left) with its new version (right).
- Collector Active Discovery Task | LogicMonitor_Collector_ActiveDiscoveryTasks
- Collector Data Collecting Task | LogicMonitor_Collector_DataCollectingTasks
- Collector Event Collecting Task | LogicMonitor_Collector_EventSourceCollectionTasks
- Collector Heartbeat | LogicMonitor_Collector_Heartbeat
- Collector JVM Garbage Collection | LogicMonitor_Collector_JVMGarbageCollection
- Collector JVM Memory Pools | LogicMonitor_Collector_JVMMemoryPools
- Collector JVM status | LogicMonitor_Collector_JVMStatus
- Collector Netflow Metrics | LogicMonitor_Collector_NetflowMetrics
- Collector Reporter Task | LogicMonitor_Collector_ReporterTask
- Collector Status | LogicMonitor_Collector_GlobalStats
- Throttler | LogicMonitor_Collector_Throttler
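The pairing above can also be kept as a lookup table when checking which old DataSources are ready to be disabled. The following is a minimal sketch, not a LogicMonitor API: the function name and the `present_names` input (a list of DataSource names you have gathered from your account by whatever means) are hypothetical, and only the names in the mapping come from the list above.

```python
# Old DataSource name -> new DataSource name, copied from the list above.
OLD_TO_NEW = {
    "Collector Active Discovery Task": "LogicMonitor_Collector_ActiveDiscoveryTasks",
    "Collector Data Collecting Task": "LogicMonitor_Collector_DataCollectingTasks",
    "Collector Event Collecting Task": "LogicMonitor_Collector_EventSourceCollectionTasks",
    "Collector Heartbeat": "LogicMonitor_Collector_Heartbeat",
    "Collector JVM Garbage Collection": "LogicMonitor_Collector_JVMGarbageCollection",
    "Collector JVM Memory Pools": "LogicMonitor_Collector_JVMMemoryPools",
    "Collector JVM status": "LogicMonitor_Collector_JVMStatus",
    "Collector Netflow Metrics": "LogicMonitor_Collector_NetflowMetrics",
    "Collector Reporter Task": "LogicMonitor_Collector_ReporterTask",
    "Collector Status": "LogicMonitor_Collector_GlobalStats",
    "Throttler": "LogicMonitor_Collector_Throttler",
}


def old_datasources_with_replacement(present_names):
    """Return old DataSources that are present alongside their new version.

    `present_names` is a hypothetical input: any iterable of DataSource
    names found in your account. These are candidates for disabling once
    you have verified that the new versions work in your environment.
    """
    present = set(present_names)
    return sorted(
        old for old, new in OLD_TO_NEW.items()
        if old in present and new in present
    )
```

An old DataSource is reported only when its replacement is also present, so nothing is suggested for disabling before the new version has been imported.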
Enabling LogicMonitor's Collector Datasources:
To enable LogicMonitor's Collector DataSources on a device with a Collector installed, navigate to that device in your LogicMonitor account and perform the following steps:
- Click the device Manage button in the upper right
- Locate the system.categories property (or add a new system.categories property)
- Add Collector to the system.categories property values (you can append it to existing values after a comma)
- Save the property and save the Manage device dialog
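The property edit in the steps above amounts to appending a value to a comma-separated string. The following is a minimal sketch of that string handling only; it does not call LogicMonitor, and the function name is hypothetical. The case-insensitive duplicate check is an assumption made here to avoid adding the value twice.

```python
def add_category(current: str, category: str = "Collector") -> str:
    """Return a system.categories value with `category` appended if missing.

    `current` is the existing comma-separated property value (may be empty).
    Assumption: values are compared case-insensitively to avoid duplicates.
    """
    # Split on commas and drop empty entries and surrounding whitespace.
    values = [v.strip() for v in current.split(",") if v.strip()]
    if category.lower() not in (v.lower() for v in values):
        values.append(category)
    return ",".join(values)


print(add_category("snmp,Linux"))  # snmp,Linux,Collector
```

An empty property yields just `Collector`, and a value that already contains it is returned without a second copy.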
After a minute or two (you may need to refresh the page), you should see a group of Collector DataSources applied to your device.
The Overview of the Data Collecting Task DataSource can give good insight into how busy your Collector is. Gaps in your graphs, or alerts on the Unavailable Task Rate, may indicate a need for Collector tuning or workload adjustment.
Note: If the Collector DataSources do not appear, check whether they are present in your account under Settings | LogicModules | Datasources. If they are not, import them from that same page by clicking Add | From LogicMonitor Repository.
Collectors have a built-in "Data Collecting Task" DataSource that reports the number of requests for each collection method (e.g. batchscript, ESX, SNMP, ping). The information is reported in tasks per second.
Data Collecting Task is found within the "Collector" DataSource group, located in your Device Tree under the Collector's host. It includes an overview graph that displays the Top 10 tasks contributing to your Collector's load. This is useful for identifying the sources of CPU or memory usage.
Behind the numbers:
Let's take a look at the JMX tasks represented in the above graph. If a Collector is associated with 14 devices, each with 18 instances, at a 1 minute polling interval, we would calculate the rate of performed tasks per second by dividing the total number of instances by the polling interval in seconds: (14 * 18) / (1 * 60) = 4.2
Given this example, there are 4.2 JMX tasks performed each second on this Collector.
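The same arithmetic can be sketched as a small helper for other device counts and polling intervals. This is an illustration only: the function and parameter names are hypothetical, and it assumes, as the example above does, that each instance produces one task per polling interval.

```python
def tasks_per_second(devices: int, instances_per_device: int,
                     polling_interval_minutes: float) -> float:
    """Estimate the task rate for one collection method on a Collector.

    Assumption: every instance generates exactly one task per polling
    interval, matching the worked example above.
    """
    interval_seconds = polling_interval_minutes * 60
    return devices * instances_per_device / interval_seconds


# The example above: 14 devices, 18 instances each, 1 minute interval.
rate = tasks_per_second(14, 18, 1)
print(f"{rate:.1f} tasks/second")  # 4.2 tasks/second
```

Doubling the polling interval to 2 minutes halves the rate to 2.1 tasks per second, which shows how interval changes affect Collector load.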