LogicMonitor’s collectors are configured to work well in most environments, but some environments require tuning.
Performance Overview
There is a trade-off between the collector’s resource consumption (CPU and memory) and performance. The collector by default does not consume many resources, so tuning of the collector may be required in large environments, environments where a collector is not doing a variety of work (e.g. a collector doing almost all JMX collection, instead of a mix of SNMP, JMX and JDBC), or environments where many devices are not responding. Tuning may involve adjusting the collector’s configuration, or it may involve redistributing workloads.
A common reason a collector can no longer keep up with devices it has been monitoring is that some of those devices have stopped responding. For example, if a collector that was monitoring 100 devices with no queueing starts showing task queueing, or is unable to schedule tasks, this may well be because it can no longer collect data from some of the devices. If it was talking to all those devices via JMX, and each device normally responded to a JMX query in 200 ms, it could cycle through all the devices easily. However, if the JMX credentials now mismatch on 10 of the hosts, so that they do not respond to LogicMonitor queries, the collector will keep a thread open for each of those queries until the configured JMX timeout expires. It will now be keeping several threads open, waiting for responses that will never come. Tuning can help in this situation.
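The effect described above can be made concrete with rough arithmetic. The sketch below is a simplified model, not the collector's actual scheduler: it adds up the thread time that one polling cycle consumes. The 30-second timeout is an assumed value chosen for illustration, and `thread_seconds_per_cycle` is a hypothetical helper.

```python
def thread_seconds_per_cycle(devices, failing, response_s, timeout_s):
    """Total thread time one polling cycle consumes (simplified model).

    Healthy devices hold a thread for response_s; failing devices hold
    one for the full timeout_s before the task gives up.
    """
    healthy = devices - failing
    return healthy * response_s + failing * timeout_s


# Figures from the example above: 100 JMX devices answering in 200 ms.
all_healthy = thread_seconds_per_cycle(100, 0, 0.2, 30)
# Same collector after 10 devices stop responding (the 30 s timeout is assumed).
ten_failing = thread_seconds_per_cycle(100, 10, 0.2, 30)

print(f"all healthy: {all_healthy:.0f} thread-seconds per cycle")
print(f"10 failing:  {ten_failing:.0f} thread-seconds per cycle")
```

Under these assumptions the ten silent devices account for almost all of the thread time in a cycle, which is why a thread pool that was comfortable before can suddenly start queueing tasks.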
How do I know if a Collector needs tuning?
Assuming you’ve set up Collector monitoring, you will be alerted by the Collector Data Collecting Task datasource if the collector is unable to schedule tasks. This is a clear indication that the workload of a collector needs tuning, as data is not being collected in accordance with the datasource schedule. This may result in gaps in graphs. Another metric to watch is the presence of elements in the Task Queue. This indicates that the collector is having to wait to schedule tasks, but that they are still completing in the appropriate time – so it’s a leading indicator of a collector approaching its configured capacity.
In the graphs below, the Collector datasources clearly show an overloaded collector: there are many tasks that cannot be scheduled, and the task queue is very high. After tuning (Aug 26), the number of successful tasks increases, and both the unscheduled tasks and the task queue drop to zero.
A good proactive behavior is to create a collector dashboard, and create a Custom Graph showing the top 10 collectors by the datapoint UnavailableScheduleTaskRate for all instances of the Data Collecting Task datasource on all devices, and another showing the top 10 collectors by TasksCountInQueue. Given each collector has many instances of these datasources (one for each collection method), you may have to specify specific collection methods as instances – snmp, jmx, etc – in order to not exceed the instance limit on a custom graph. Otherwise set instances to a star (*) to see all methods on one graph.
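The ranking that such a "top 10" graph performs can be sketched in a few lines. This is only an illustration of the selection logic, assuming you already have the latest datapoint value per collector from some export; the collector names and values below are hypothetical sample data, and `top_collectors` is not a LogicMonitor function.

```python
import heapq


def top_collectors(values_by_collector, n=10):
    """Return the n (collector, value) pairs with the highest values."""
    return heapq.nlargest(n, values_by_collector.items(), key=lambda kv: kv[1])


# Hypothetical sample: latest UnavailableScheduleTaskRate per collector.
sample = {"collector-a": 0.0, "collector-b": 4.2, "collector-c": 0.7}

for name, rate in top_collectors(sample, n=2):
    print(name, rate)
```

The same function works for TasksCountInQueue; only the input values change.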
Collector Tuning
The easiest way to tune your Collector is simply to increase the Collector size. The small Collector only uses 2GB of memory, but can perform more work if upgraded to a larger size, provided the server running the Collector has the memory available. The Collector’s configurations can also be modified manually, as discussed in Editing the Collector Config Files.
In general, there are two cases that could require Collector tuning:
- when devices are not responding
- when the Collector cannot keep up with the workload
Both are often addressed by increasing Collector Size, which should be your first step. However, if you’ve already tried increasing the size and still see performance issues, you may find it helpful to do a little fine tuning.
Devices not responding
If devices are failing to respond to queries from the collector (because their credentials have changed, the device is offline, the LogicMonitor credentials were set incorrectly, or for other reasons), you should get alerts about the protocol not responding on the device. The best approach in this situation is to correct the underlying issue (set the credentials, etc.), so that monitoring can resume on the devices. However, this is not always possible. You can validate from the Collector debug window (under Settings…Collectors…Manage Collector…Support…Run Debug Command) whether this issue is impacting your collectors. If you run the command !tlist c=METHOD, where METHOD is the data collection method at issue (jmx, snmp, WMI, etc.), you will get a list of all the tasks of that type the collector has scheduled.
If you see many tasks that failed due to timeout or non-response, those tasks are keeping a thread busy for the timeout period of that protocol. In this situation, it may be appropriate to reduce the configured timeout, to stop threads from blocking for so long. The default for JMX timeouts was 30 seconds at one point, which is a very long time to wait for a computer to respond. Setting that to 5 seconds (the current default) means that, for a non-responsive device, 6 times as many tasks can be processed in the same time. Care should be taken when setting timeouts to ensure they are reasonable for your environment. While it may be appropriate to set the JMX timeout to 5 seconds, the webpage collector may be left at 30 seconds, as you may have web pages that take that long to render. Setting a timeout to a shorter period than it takes devices to respond will adversely affect monitoring.
To change the timeout for a protocol, you must edit the collector configuration manually from the Collector Configuration window. Edit the collector.*.timeout parameter to change the timeout for the protocol you want (ex: change collector.jmx.timeout=30 to collector.jmx.timeout=5).
You may also need to increase the number of threads, as well as reducing the timeout period – see the section below.
Collector cannot keep up with workload
If the Collector is still reporting that tasks cannot be scheduled, it may be appropriate to increase the number of threads for a collection method. This will allow the collector to perform more work simultaneously (especially if some threads are just waiting for timeouts), but will also increase the collector’s CPU usage.
To increase the threads available to a collection method, you must edit the collector configuration manually from the Collector Configuration window. Edit the collector.*.threadpool parameter to change the threadpool allotment for the protocol you want (ex: change collector.jmx.threadpool=50 to collector.jmx.threadpool=150).
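The edit itself is a one-line change to a key=value entry. As a minimal sketch of that change, assuming you are working on a copy of the configuration text (the supported route remains the Collector Configuration window), the hypothetical helper `set_property` below replaces the value of an existing key or appends the key if it is missing.

```python
import re


def set_property(conf_text, key, value):
    """Return conf_text with `key=value`, replacing the existing line if any."""
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", re.MULTILINE)
    new_line = f"{key}={value}"
    if pattern.search(conf_text):
        # A function replacement avoids backslash escapes in the value
        # being interpreted by the regex engine.
        return pattern.sub(lambda _match: new_line, conf_text, count=1)
    separator = "" if not conf_text or conf_text.endswith("\n") else "\n"
    return f"{conf_text}{separator}{new_line}\n"


# Values taken from the example above.
conf = "collector.jmx.timeout=5\ncollector.jmx.threadpool=50\n"
conf = set_property(conf, "collector.jmx.threadpool", 150)
print(conf)
```

Only the targeted line changes; every other setting is passed through untouched, which keeps the risk of an accidental edit low.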
It is recommended to increase the threadpool setting gradually: try doubling the current setting, then observe the behavior. Note changes in the collector’s CPU utilization and heap utilization, because more threads will use more CPU and place more demands on the JVM heap. If the collector heap usage (shown under the Collector datasource Collector JVM Status) is approaching the limit, the heap size may need increasing too.
If a collector has had its threads increased and its heap increased, and still cannot keep up with the workload (or is hitting CPU capacity) – it is time to add another collector and split the workload amongst the collectors.
The LogicMonitor Collector is an application that runs on a Linux or Windows server within your infrastructure and uses standard monitoring protocols to intelligently monitor devices within your infrastructure.
LogicMonitor Collectors are not agents and do not have to be installed on every resource within your infrastructure that you would like monitored. Rather, you should install a Collector on a host in each location of your infrastructure. For more information, see Installing Collectors.
The Collector retrieves data from all the devices assigned to it, then encrypts the data and sends it back to the LogicMonitor servers over an outgoing SSL connection.
One Collector can typically monitor hundreds of devices; however, this capacity depends on how many metrics are being monitored for each device, as well as the available resources of the server on which the Collector is installed. For more information on capacity, see Collector Capacity.
How Collectors Determine What Metrics to Monitor for Devices
When you add a device into monitoring, LogicMonitor applies built-in intelligence to recognize what kind of device it is. Based on the information discovered about the device, LogicMonitor DataSources are applied.
DataSources are templates that tell the Collector how to monitor the device, what metrics to collect for the device, how to display those metrics as graphs, and what values indicate issues that need attention. LogicMonitor installs with hundreds of pre-built DataSources that will automatically apply when you add devices into your account.
Collector Data Storage
All of the data from your Collectors is consolidated in a LogicMonitor data center, and this data is accessible in your LogicMonitor portal from anywhere with an internet connection. This necessitates that the server your Collector is installed on can make an outgoing HTTPS connection to LogicMonitor’s data centers (note, however, that Collectors can be installed on proxy servers).
Ports Used by Collectors
The server on which a Collector is installed must be able to make an outgoing HTTPS connection to the LogicMonitor servers (proxies are supported). In addition, the ports for the monitoring protocols you intend to use (e.g. SNMP, WMI, JDBC, etc.) must be unrestricted between your Collector machine and the resources you want to monitor.
The following tables document how the Collector communicates outbound traffic so that firewall rules can be configured accordingly. They also highlight the use cases in which the Collector listens for inbound traffic and, when applicable, the configuration settings that can be used to change these inbound ports.
Inbound communication
Port | Protocol | Use Case | Configuration Setting |
---|---|---|---|
162 | UDP | SNMP traps received from target devices | eventcollector.snmptrap.address |
514 | UDP | Syslog messages received from target devices | eventcollector.syslog.port |
2055 | UDP | NetFlow data received from target devices | netflow.ports |
6343 | UDP | sFlow data received from target devices | netflow.sflow.ports |
7214 | HTTP/ Proprietary | Communication from custom JobMonitors to Collector service | httpd.port |
Outbound communication
Port | Protocol | Use Case | Configuration Setting |
---|---|---|---|
135 | TCP | The RPC Endpoint Mapper uses port 135 to support WMI and PerfMon DataSources to help the Collector communicate with monitored devices. It enables the Collector to locate a temporary port which the device can use to send performance information. | N/A |
443 | HTTP/TLS | Communication between the Collector and the LogicMonitor data center. Port 443 must be permitted to access LogicMonitor’s public IP addresses. If your environment does not allow the Collector to directly connect with the LogicMonitor data centers, you can configure the Collector to communicate through a proxy. | N/A |
445 | TCP | For PerfMon datasources, the Collector connects to the Windows system over port 445 using the SMB protocol. The PerfMon datasource uses the special IPC$ share to initiate communication and interact with the system services to collect performance data such as CPU, memory, and disk usage. | N/A |
Other non-privileged | SNMP, WMI, HTTP, SSH, JMX, etc. | Communication between Collector and target resources assigned for monitoring | N/A |
Internal communication
Port | Protocol | Use Case | Configuration Setting |
---|---|---|---|
7211 | Proprietary | Communication between Watchdog and Collector services to OS Proxy service (sbwinproxy/sblinuxproxy) | sbproxy.port |
7212 | Proprietary | Communication from Watchdog service to Collector service | agent.status.port |
7213 | Proprietary | Communication from Collector service to Watchdog service | watchdog.status.port |
15003 | Proprietary | Communication between Collector service and its service wrapper | N/A |
15004 | Proprietary | Communication between Collector service and its service wrapper | N/A |
For instructions on editing a Collector’s configurations, see Editing the Collector Config Files.
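Before opening a support case about connectivity, it can help to confirm that the TCP ports listed above are actually reachable from the Collector host. The following is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper, it tests direct TCP connections only (it does not go through a proxy and cannot check the UDP ports), and the demo targets a throwaway local listener so that it runs without network access.

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Demo against a temporary local listener instead of a real endpoint.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    local_port = listener.getsockname()[1]
    print(can_connect("127.0.0.1", local_port))
```

In practice you would call `can_connect` from the Collector host with your own portal hostname and port 443, or with a monitored Windows device and port 135 or 445.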
Collector Security
The LogicMonitor Collector has been carefully designed and developed with high security in mind. For details on Collector security measures and recommended best practices, see LogicMonitor Security Best Practices.
Note: Windows Defender Credential Guard is not supported and should not be enabled on Windows Collectors. The security platform has application requirements, such as blocking specific authentication capabilities, that may interfere with Collector operation.
Anti-malware Considerations
The LogicMonitor Collector undergoes rigorous security testing and is digitally signed using a DigiCert code signing certificate to ensure the authenticity and integrity of each release. This guarantees that the code has not been altered or tampered with after publication, providing users with a secure and trusted experience. Despite this, the Collector’s network traffic patterns may look suspicious to anti-malware tools such as heuristic antivirus or intelligent endpoint detection and response services. If you choose to run such software on collector systems, be aware that it may interfere with the collector’s operations. Frequent collector service restarts and process crashes are common indicators of anti-malware interference.
LogicMonitor recommends a targeted and balanced approach that addresses potential threats without compromising the system’s overall protection. Follow these guidelines to tune anti-malware alerts:
- Understand the nature of anti-malware alerts to make informed decisions. First assess the alert details to determine whether the alert indicates a genuine security threat that requires your attention and action, or a false positive that you can ignore.
- Instead of immediately adding full exclusions to the software’s directory path, you may consider adjusting the settings to permit specific components or files flagged by the alert.
- Stay updated on the security practices of the anti-malware software and regularly review the configuration settings to manage these alerts effectively.
For more information on setting exclusions in common anti-malware packages, see the following resources:
- Symantec Endpoint Protection: Excluding a file or a folder from scans
- ESET: Exclude files or folders from scanning in ESET Windows home products
- Sophos: Global Exclusions
- FortiClient: Managing the AntiVirus exclusion list
Open Source Software (OSS) List in Collector Installer
LogicMonitor has automated the OSS license report generation process. With every Collector release, including Early Access (EA), Optional General Releases (GD), Required General Releases (MGD), and patch releases, a report of the OSS licenses used by the Collector is generated and bundled with the Collector installer. You can access the report file at the following locations:
- Linux – <AGENT_ROOT>/lib/THIRD-PARTY-NOTICES.txt
- Windows – <AGENT_ROOT>\lib\THIRD-PARTY-NOTICES.txt
Note: The AGENT_ROOT is the install path. The default value for Linux is – /usr/local/logicmonitor/agent and for Windows it is – C:\Program Files\LogicMonitor\agent.
Windows Collector Installation Directory Components
The AGENT_ROOT is the collector install path. The default AGENT_ROOT value for Linux and Windows is:
- AGENT_ROOT for Linux—/usr/local/logicmonitor/agent
- AGENT_ROOT for Windows—C:\Program Files\LogicMonitor\Agent
A summary of the components used in the Windows collector installation directory is given in the following table:
Windows Collector Directory | Description |
---|---|
<AGENT_ROOT>\SNMP-MIB-Copyrights.txt | This file contains copyrights of the out-of-the-box MIB files used for translating SNMP traps which are ingested as LM logs. |
<AGENT_ROOT>\bin | The folder bin contains executables and DLL files that are required to start, stop, and uninstall the Agent and Watchdog services. |
<AGENT_ROOT>\bin\queues | This consists of persistent queues for data reporting, and files for converting collector users to non-root or non-admin. |
<AGENT_ROOT>\conf\agent.conf | This configuration file controls the business behavior of collector. It consists of all data collection, active discovery, auto property, and other business logic configurations. |
<AGENT_ROOT>\conf\sbproxy.conf | This configuration file controls the internal behaviour of collector sbwinproxy process. It is recommended that you do not change this configuration. |
<AGENT_ROOT>\conf\watchdog.conf | This configuration file controls the internal behaviour of collector Watchdog service. It is recommended that you do not change this configuration. |
<AGENT_ROOT>\conf\wrapper.conf | This configuration file controls the internal behaviour of collector Wrapper service. It is recommended that you do not change this configuration. However, in exceptional cases, to enlarge the memory that collector can use or the Java Classpath, you must additionally load a collector. |
<AGENT_ROOT>\diagnosetool | This utility contains a number of predefined checks related to configurations, memory, network, processes, systems, and more. It also contains some SNMP commands such as snmpbulkget , snmpbulkwalk , snmpget , and snmpwalk . |
<AGENT_ROOT>\lib | The lib folder contains libraries created by collector and third-party libraries on which the collector code depends. |
<AGENT_ROOT>\logs | This folder contains multiple logs such as logs related to collector installation, diagnose utility logs, agent logs, sbProxy logs, watchdog logs, and more. |
<AGENT_ROOT>\tmp | This folder contains downloaded files used for upgrading and downgrading collectors. It also stores temporary files for monitoring. |
<AGENT_ROOT>\configure.sh | (Only for Linux directory) When a collector is installed using the install.sh , the configure.sh file is run to configure the collector settings. |
Note:
- For the Linux installation directory, you can use the same installation components with path / . For example, /tmp.
- You can use the !DecryptFileSHA debug command to obtain the SHA of files that you want to exclude or allow while installing collector. For more information, see Collector Debug Facility.
If your environment does not allow the Collector to directly connect with the LogicMonitor data centers, you can configure the Collector to communicate through an HTTP proxy.
Updating SSL and Proxy Settings
By default, collectors are not configured to use proxies. To communicate with HTTP proxies, you need to make updates to several proxy settings located in the collector’s agent.conf file. For detailed instructions on editing the agent.conf file, see Editing the Collector Config Files.
Once updated, the new settings should look similar to these:
# SSL & Proxy settings
ssl.enable=true
proxy.enable=true
proxy.host=10.0.0.54
proxy.port=8080
proxy.user=domain\username
proxy.pass=password
proxy.exclude=
proxy.global=false
proxy.pass.isencrypted=false
These new settings designate the following:
- ssl.enable=true: Indicates that the collector will make outbound connection using SSL.
- proxy.enable=true: Indicates that the collector will use these settings.
- proxy.host=: Indicates the IP address of the proxy server.
- proxy.port=: Indicates the port the proxy server uses.
- proxy.user=: Indicates the username the collector uses when connecting to the proxy.
- proxy.pass=: Indicates the password the collector uses when connecting to the proxy.
- proxy.pass.isencrypted=: Indicates if the proxy password is encrypted or not.
Note: The settings specified above reflect a Windows-based proxy requiring authentication. Linux collectors support only basic authentication. Windows collectors support NTLM and other native windows authentication methods.
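A quick sanity check of these settings before restarting the collector can catch simple mistakes such as an empty host or a malformed port. The sketch below parses key=value text and applies a few checks; `parse_properties` and `proxy_problems` are hypothetical helpers, and the validation rules are assumptions made for illustration rather than the checks the collector itself performs.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def proxy_problems(props):
    """Return a list of human-readable issues with the proxy settings."""
    problems = []
    if props.get("proxy.enable", "false").lower() != "true":
        return problems  # proxy not in use, nothing to check
    if not props.get("proxy.host"):
        problems.append("proxy.host is empty")
    port = props.get("proxy.port", "")
    if not port.isdecimal() or not 0 < int(port) < 65536:
        problems.append("proxy.port is not a valid TCP port")
    return problems


# The sample settings shown earlier in this section.
SAMPLE = r"""
# SSL & Proxy settings
ssl.enable=true
proxy.enable=true
proxy.host=10.0.0.54
proxy.port=8080
proxy.user=domain\username
proxy.pass=password
proxy.exclude=
proxy.global=false
proxy.pass.isencrypted=false
"""

props = parse_properties(SAMPLE)
print(proxy_problems(props))  # an empty list means no issues were found
```

Returning a list of messages rather than raising on the first error lets you report every problem in one pass.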
Changing Proxy Password
If a proxy server has password-based authentication, its credentials are stored in the proxy.user and proxy.pass fields. The proxy password is encrypted. To indicate the encryption, the proxy.pass.isencrypted is set to true. You can set proxy.pass.isencrypted=false if you want to change the proxy password.
Note: This setting is available in collector version 30.104 or later.
- Navigate to Settings > Collectors.
- Under the Collectors tab, select the collector you want to configure.
- Select the More option and then select Collector Configuration. On the Collector Configuration page, settings under the Agent Config tab are displayed.
- Scroll to locate the SSL and Proxy settings.
- Enter a new password in plain text in the proxy.pass field.
- Set the proxy.pass.isencrypted value to false.
- Select Save and Restart. After the restart, observe that the password is encrypted and the proxy.pass.isencrypted field is set to true.
Troubleshooting Collector Proxy Configuration
The following are some common issues encountered when configuring collectors to use HTTP proxies, along with how to resolve them.
Issue: Proxy Authentication Required
When the collector is configured to use a proxy that requires basic authentication, the collector may throw the following exception:
[MSG] [WARN] [main::controller:main] [Controller2._initConfiguration:461] Unexpected status encountered from server. Will retry., CONTEXT=retry=30s, statusCode= 500, errMsg=Unable to tunnel through proxy. Proxy returns "HTTP/1.1 407 Proxy Authentication Required"
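In this case, add the following configuration to the collector’s wrapper.conf. The empty value clears the JDK’s list of authentication schemes that are disabled for HTTPS tunneling (recent Java versions disable Basic by default), which allows the collector to authenticate with the proxy: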
wrapper.java.additional.16=-Djdk.http.auth.tunneling.disabledSchemes=
Issue: Invalid SSL Certificate
If a collector does not get a valid SSL certificate issued directly from LogicMonitor, it will fail to start properly. In the example below, all SSL certificates in the client environment were being intercepted and reissued by special security software (for example, Blue Coat Proxy).
[03-26 15:53:03.222 EDT] [MSG] [INFO] [statusmonitor:::] [StatusListener$1.run:106] Receive peer request, CONTEXT=command=keepalive, charset=windows-1252, peer=/***.***.***.***:******
[03-26 15:53:03.268 EDT] [MSG] [WARN] [statusmonitor::scheduler:] [PropertyFilePersistentHandler._load:94] task file not found, CONTEXT=filename=C:\Program Files (x86)\LogicMonitor\Agent\conf\persistent_task.conf, EXCEPTION=C:\Program Files (x86)\LogicMonitor\Agent\conf\persistent_task.conf (The system cannot find the file specified)
java.io.FileNotFoundException: C:\Program Files (x86)\LogicMonitor\Agent\conf\persistent_task.conf (The system cannot find the file specified)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at com.santaba.common.util.scheduler.impl.PropertyFilePersistentHandler._load(PropertyFilePersistentHandler.java:88)
at com.santaba.common.util.scheduler.impl.PropertyFilePersistentHandler.<init>(PropertyFilePersistentHandler.java:30)
at com.santaba.common.util.scheduler.Schedulers.newPersistentScheduler(Schedulers.java:17)
at com.santaba.agent.collector3.CollectorDb._newScheduler(CollectorDb.java:172)
at com.santaba.agent.collector3.CollectorDb.<init>(CollectorDb.java:68)
at com.santaba.agent.collector3.CollectorDb.<clinit>(CollectorDb.java:65)
at com.santaba.agent.agentmonitor.StatusListener._getAgentStatusResponse(StatusListener.java:279)
at com.santaba.agent.agentmonitor.StatusListener$1.run(StatusListener.java:117)
[03-26 15:53:03.947 EDT] [INFO] [1] [default] [controller] [Controller2._initHttpService:469] Agent starting with ID - 00baae57-3971-4239-9610-b512aae9c21csbagent
[03-26 15:53:04.232 EDT] [MSG] [INFO] [main::controller:main] [SSLUtilities.checkCertificates:160] Invalid or wrong SSL Certificates found, CONTEXT=info=Found total 2 certificates:
Subject: CN=*.logicmonitor.com, OU=Domain Control Validated
Issuer: CN=SSLInterception87
Type: X.509
SHA1: 9a:a6:ff:33:85:cc:13:4c:3a:13:11:77:5c:ef:5e:a7:74:65:6b:de
MD5: 61:35:08:b5:ec:71:a2:ae:05:c4:7f:54:f1:aa:6f:ad
Valid from: 2017-04-19 10:02:01 -0400
Valid to: 2020-06-18 17:33:09 -0400
Subject: CN=SSLInterception3
Issuer: CN=BillyBob's-CA, DC=slhn, DC=org
Type: X.509
SHA1: 6b:a8:1f:61:7b:5d:f0:e4:ee:7e:6a:1b:bb:18:de:67:be:5c:44:1d
MD5: d0:fc:64:da:6f:9b:1f:8d:1a:52:64:dc:41:da:e7:1c
Valid from: 2017-08-09 15:08:18 -0400
Valid to: 2021-10-03 08:53:12 -0400
[03-26 15:53:04.232 EDT] [MSG] [WARN] [main::controller:main] [Controller2._initConfiguration:322] SANTABA SERVER ceriticates not trusted, CONTEXT=Host=generic-customer.logicmonitor.com, port=443
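To see which certificate chain the Collector host is actually being served, you can perform a TLS handshake yourself and look at the issuer. The sketch below uses Python's standard `ssl` module with its default trust store, which is an assumption and may differ from what the collector checks; `issuer_common_name` and `check_server_certificate` are hypothetical helpers, and the connection is made directly rather than through a configured proxy.

```python
import socket
import ssl


def issuer_common_name(cert):
    """Extract the issuer CN from a dict as returned by SSLSocket.getpeercert()."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def check_server_certificate(host, port=443, timeout_s=10.0):
    """Handshake with host:port and report whether the chain is trusted locally.

    Returns ("trusted", issuer_cn) or ("untrusted", error_text).
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return "trusted", issuer_common_name(tls.getpeercert())
    except ssl.SSLCertVerificationError as exc:
        return "untrusted", exc.verify_message


# Shape of getpeercert() output, filled with the issuer seen in the log above.
example = {"issuer": ((("commonName", "SSLInterception87"),),)}
print(issuer_common_name(example))
```

Running `check_server_certificate` against your portal hostname from the Collector host gives two useful signals: an "untrusted" result, or a "trusted" result whose issuer is an internal interception CA rather than a public certificate authority.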
Solution A (Preferred)
Have the local administrator add the SSL certificate to your allow list so that it comes into the network unmodified by a proxy/firewall. This is the preferred option because it preserves security.
Solution B
Change the collector configuration setting from:
EnforceLogicMonitorSSL=true
to:
EnforceLogicMonitorSSL=false
Removing SSL enforcement lowers the security of the connection between your collector and LogicMonitor and, for this reason, should be carefully considered before implementing.