Linux (via SSH) Monitoring

Overview

Secure Shell (SSH) is a cryptographic network protocol for operating network services securely over an unsecured network. SSH clients are distributed with most Linux-based machines. Typical SSH applications include remote command-line, login, and remote command execution, but any network service can be secured via SSH.

LogicMonitor offers monitoring for Linux systems that uses the SSH protocol to collect metrics such as CPU, memory, and filesystem utilization, uptime, and throughput. However, this monitoring is designed to take effect only if SNMP isn't configured for the system. Generally, if SNMP is configured, more robust out-of-the-box monitoring activates and there is no need to configure the SSH monitoring provided by this Linux SSH package.

Setup Requirements

Add Resource Into Monitoring

Add your Linux host into monitoring. For more information on adding resources into monitoring, see Adding Devices.

Enable SSH

SSH must be configured on the Linux host in order for the DataSources to apply.

Generate SSH Keys

If you will authenticate the Collector's access to the device using an SSH key (rather than a password), you'll need to generate the SSH public-private key pair and copy the public key from the Collector host that is assigned to the device in LogicMonitor to the device itself. For instructions on generating the SSH key pair and copying the public key, see Generate a New SSH Key.

Assign Properties to Resources

SSH credentials must be set as properties on the Linux resource within LogicMonitor. These properties allow LogicMonitor to pass the appropriate credentials onto the Linux host for authentication. For more information on setting SSH authentication credentials as properties, see Defining Authentication Credentials.
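
As a rough illustration of how these properties are consumed, the discovery scripts later in this article read ssh.user, ssh.pass, ssh.port, and ssh.cert, falling back to port 22 and ~/.ssh/id_rsa when the port or key path is unset. The Groovy sketch below mirrors that lookup with a plain map standing in for hostProps. The host name and account are placeholders, and resolveSshSettings is a hypothetical helper, not part of LogicMonitor.

```groovy
// Hypothetical stand-in for the hostProps object that LogicMonitor supplies to scripts.
def hostProps = [
    'system.hostname': 'linux01.example.com', // placeholder host
    'ssh.user'       : 'monitor',             // placeholder account
    'ssh.port'       : '2222'                 // property values arrive as strings
]

// Resolve connection settings with the same defaults the discovery scripts use.
def resolveSshSettings(Map props) {
    return [
        host: props.get('system.hostname'),
        user: props.get('ssh.user'),
        pass: props.get('ssh.pass'),
        port: props.get('ssh.port')?.toInteger() ?: 22,
        cert: props.get('ssh.cert') ?: '~/.ssh/id_rsa'
    ]
}

def settings = resolveSshSettings(hostProps)

// The scripts attempt key-based authentication only when a user is set and no password is.
boolean useKey = settings.user && !settings.pass

println "port=${settings.port} cert=${settings.cert} keyAuth=${useKey}"
```

With the placeholder values above, no password is set, so the sketch selects key-based authentication and keeps the default key path.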

Import LogicModules

From the LogicMonitor public repository, import all Linux SSH LogicModules, which are listed in the LogicModules in Package section of this support article. If these LogicModules are already present, ensure you have the most recent versions.

Once the LogicModules are imported (assuming all previous setup requirements have been met), the suite of DataSources will automatically begin collecting data.

Adding Instances for Linux SSH Control Groups and Service Status

The Control Groups and Service Status DataSources do not have Active Discovery enabled in their DataSource definitions. They require either the manual addition of instances or the enabling of Active Discovery via the scripts provided below. These DataSources are configured this way because automatically discovering all cgroups and services on a given host can produce too many instances, causing alert flooding in the LogicMonitor platform or an unmanageable list of instances.

For this reason, we recommend manually adding selected cgroups and services as monitored instances, as outlined in the following two sections.

Note: The following instructions assume Linux kernel 2.6.24 or later. LogicMonitor’s Control Groups and Service Status DataSources are verified to be compatible with the following Linux distros:

  • CentOS
  • Debian
  • Oracle Linux
  • RHEL
  • Ubuntu

Finding and Manually Adding Control Groups as Instances

The following instructions use the atd.service control group as an example of a cgroup to monitor.

  1. From the command line, run: systemd-cgtop -n1 -b

    This command displays the cgroups that are using the most resources. The -n1 flag (shorthand for --iterations=1) runs only one iteration of the command. The -b flag (shorthand for --batch) runs the command in “batch” mode: it does not accept input and runs until the iteration limit is exhausted or the process is killed.

  2. From the resulting output, copy the name of the control group you want to monitor, excluding the parent container. In this example, we add atd.service and exclude the system.slice/ prefix.

    Note: Unless “CPUAccounting=1” and “MemoryAccounting=1” are enabled for the services in question, no resource accounting will be available and the data shown by systemd-cgtop will be incomplete.

  3. Navigate to the Linux host on the Resources page and select ‘Add Monitored Instance’ from the dropdown menu located next to the Manage menu.

    Note: For more information on manually adding instances, see Adding Instances.

  4. In the Add Monitored Instance dialog, enter “Control Groups” in the DataSource field and enter “atd.service” in the Wildcard Value field. In the Name field, enter a name for the instance (you may use the wildcard value if you like). Optionally, you may also add a description.

  5. After completing the dialog, click Save. If the action was successful, you’ll be able to see the instance under the DataSource on the Resources page.

  6. To verify data collection is successful, navigate to the Raw Data tab for this instance and click Poll Now.
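
To make step 2 concrete, the Groovy sketch below applies the same line pattern used by the Linux_SSH_Cgroups discovery script (shown later in this article) to a few lines of sample output and keeps only the part after the parent container. The sample output is hypothetical; real systemd-cgtop output differs by host and systemd version.

```groovy
// Same line pattern the Linux_SSH_Cgroups discovery script uses.
def linePattern = ~/^\/?([^\/]+)\/(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/

// Hypothetical systemd-cgtop output, for illustration only.
def sampleOutput = '''\
Control Group                 Tasks   %CPU   Memory  Input/s Output/s
/                                95    1.2   512.0M        -        -
/system.slice/atd.service         1      -     1.1M        -        -
/system.slice/sshd.service        1    0.1     4.3M        -        -
'''

def wildValues = []
sampleOutput.eachLine { line ->
    def m = linePattern.matcher(line)
    if (m.matches()) {
        // group(1) is the parent (for example system.slice);
        // group(2) is the name to enter in the Wildcard Value field.
        wildValues << m.group(2)
    }
}

println wildValues
```

The header row and the root cgroup line do not match the pattern, so only the two service cgroups are collected.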

Finding and Manually Adding Services as Instances

  1. From the command line, run: systemctl list-units -a --type=service

    This command displays all units that systemd loaded or attempted to load, regardless of their current state on the system. The -a flag (shorthand for --all) ensures all units are listed including those which are inactive. The --type=service flag returns only service units.

  2. From the resulting output, copy the name of the service you want to monitor from the UNIT column.

  3. Follow the steps listed for control groups above to manually create an instance for a service, using “Service Status” in the DataSource field and the copied service name in the Wildcard Value field.
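
As a sketch of what the UNIT column looks like when parsed, the Groovy snippet below runs the line pattern from the Linux_SSH_ServiceStatus discovery script (shown later in this article) over sample output and maps each unit name to its ACTIVE state. The sample lines are hypothetical, and the exact layout of systemctl output depends on the systemd version.

```groovy
// Same line pattern the Linux_SSH_ServiceStatus discovery script uses.
// It requires at least one whitespace character before the unit name.
def linePattern = ~/^\/?\s+?(\S+)\s+(\w*loaded|not-found|masked\w*)\s+(\w*active|inactive|failed\w*)\s+(\S+)\s+(.*)$/

// Hypothetical systemctl list-units output, for illustration only.
def sampleOutput = '''\
  UNIT            LOAD      ACTIVE   SUB     DESCRIPTION
  atd.service     loaded    active   running Job spooling tools
  kdump.service   loaded    failed   failed  Crash recovery kernel arming
  ntpd.service    not-found inactive dead    ntpd.service
'''

def services = [:]
sampleOutput.eachLine { line ->
    def m = linePattern.matcher(line)
    if (m.matches()) {
        // group(1) is the UNIT column (the wildcard value); group(3) is the ACTIVE column.
        services[m.group(1)] = m.group(3)
    }
}

services.each { unit, state -> println "${unit} -> ${state}" }
```

The header row is skipped because its LOAD and ACTIVE columns do not contain one of the states the pattern expects.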

Enabling Active Discovery for Automatic Instance Adding

If you are confident that your system will not be overwhelmed with an unmanageable number of instances, you may enable Active Discovery from the DataSource definition and embed the corresponding Groovy script as a parameter of the “SCRIPT” discovery method. (See What Is Active Discovery? for more information on configuring Active Discovery for a DataSource.)
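
The discovery scripts in the following sections print one line per discovered instance, with fields separated by ##. The Groovy sketch below reproduces the line shape printed by the cgroups script, including its replacement of characters that are invalid in a wildvalue. The helper names sanitize and discoveryLine and the sample cgroup names are hypothetical and used only for illustration.

```groovy
// Replace characters that are invalid in a wildvalue with underscores,
// using the same expression as the Linux_SSH_Cgroups discovery script.
def sanitize(String value) {
    return value.replaceAll(/[:|\\|\s|=]+/, '_')
}

// Build one discovery line in the shape the cgroups script prints:
// wildvalue##wildvalue######auto.cgroup.path=<parent>
def discoveryLine(String parent, String name) {
    def wildvalue = sanitize(name)
    return "${wildvalue}##${wildvalue}######auto.cgroup.path=${sanitize(parent)}".toString()
}

println discoveryLine('system.slice', 'atd.service')
println discoveryLine('system.slice', 'my app:1.service')
```

A run of consecutive invalid characters collapses into a single underscore, so the second sample name becomes my_app_1.service.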

Active Discovery Script for the Linux_SSH_Cgroups DataSource Definition
/*******************************************************************************
 *  © 2007-2020 - LogicMonitor, Inc. All rights reserved.
 ******************************************************************************/

import com.jcraft.jsch.JSch
import com.santaba.agent.util.Settings

host = hostProps.get("system.hostname")
user = hostProps.get("ssh.user")
pass = hostProps.get("ssh.pass")
port = hostProps.get("ssh.port")?.toInteger() ?: 22
cert = hostProps.get("ssh.cert") ?: '~/.ssh/id_rsa'
timeout = 15000 // timeout in milliseconds


// Expected output pattern capturing cgroup name, tasks, CPU, memory, input, output.
// Unless "CPUAccounting=1" and "MemoryAccounting=1" are enabled for the services in question, 
// no resource accounting will be available and the data shown by systemd-cgtop will be incomplete.
def line_pattern = ~/^\/?([^\/]+)\/(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/

def command = 'systemd-cgtop -n1 -b --cpu=percentage'
def command_output = getCommandOutput(command)

command_output.eachLine { line ->
    def matcher = line_pattern.matcher(line) ?: [:]
    // Process lines that contain a match, except cgroups under user.slice or docker
    if (matcher.size() > 0 && matcher[0][1] != "user.slice" && matcher[0][1] != "docker") {
        // Replace invalid wildvalue characters with underscores
        def wildvalue   = matcher[0][2].replaceAll(/[:|\\|\s|=]+/,"_")
        def cgroupPath  = matcher[0][1].replaceAll(/[:|\\|\s|=]+/,"_")
        println "${wildvalue}##${wildvalue}######" +
                "auto.cgroup.path=${cgroupPath}"
    }
}
return 0


// Helper function for SSH connection and command passing
def getCommandOutput(String input_command) {
    try {
        // instantiate JSCH object.
        jsch = new JSch()

        // do we have a user and no pass?
        if (user && !pass) {
            // Yes, so let's try connecting via cert.
            jsch.addIdentity(cert)
        }

        // create session.
        session = jsch.getSession(user, host, port)

        // given we are running non-interactively, we will automatically accept new host keys.
        session.setConfig("StrictHostKeyChecking", "no");
        String authMethod = Settings.getSetting(Settings.SSH_PREFEREDAUTHENTICATION, Settings.DEFAULT_SSH_PREFEREDAUTHENTICATION);
        session.setConfig("PreferredAuthentications", authMethod);

        // set session timeout, in milliseconds.
        session.setTimeout(timeout)

        // is host configured with a user & password?
        if (pass) {
            // set password.
            session.setPassword(pass);
        }

        // connect
        session.connect()

        // execute command.
        channel = session.openChannel("exec")
        channel.setCommand(input_command)

        // collect command output.
        def commandOutput = channel.getInputStream()
        channel.connect()

        def output = commandOutput.text;

        // disconnect
        channel.disconnect()

        return output
    }
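    // ensure we disconnect the session, if one was created.
    finally {
        // 'session' is a script binding variable; it is unset if getSession() threw.
        if (binding.hasVariable('session')) {
            session.disconnect()
        }
    }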
}
Active Discovery Script for the Linux_SSH_ServiceStatus DataSource Definition

Note: The Service Status DataSource has default alerts that you may want to adjust when monitoring all services.

/*******************************************************************************
 *  © 2007-2020 - LogicMonitor, Inc. All rights reserved.
 ******************************************************************************/

import com.jcraft.jsch.JSch
import com.santaba.agent.util.Settings

host = hostProps.get("system.hostname")
user = hostProps.get("ssh.user")
pass = hostProps.get("ssh.pass")
port = hostProps.get("ssh.port")?.toInteger() ?: 22
cert = hostProps.get("ssh.cert") ?: '~/.ssh/id_rsa'
timeout = 15000 // timeout in milliseconds


// Expected pattern of output lines with data
def line_pattern = ~/^\/?\s+?(\S+)\s+(\w*loaded|not-found|masked\w*)\s+(\w*active|inactive|failed\w*)\s+(\S+)\s+(.*)$/

// Run command to show any unit that systemd loaded or attempted to load, regardless of its current state on the system.
def command = 'systemctl list-units --all --type=service --plain'
def command_output = getCommandOutput(command)

command_output.eachLine { line ->
    def matcher = line_pattern.matcher(line) ?: [:]
    // Process lines that contain a match
    if (matcher.size() > 0) {
        def wildvalue   = matcher[0][1]
        def description = matcher[0][5]?.trim() ?: ""
        println "${wildvalue}##${wildvalue}####${description}##"
    }
}
return 0


// Helper function for SSH connection and command passing
def getCommandOutput(String input_command) {
    try {
        // instantiate JSCH object.
        jsch = new JSch()

        // do we have a user and no pass?
        if (user && !pass) {
            // Yes, so let's try connecting via cert.
            jsch.addIdentity(cert)
        }

        // create session.
        session = jsch.getSession(user, host, port)

        // given we are running non-interactively, we will automatically accept new host keys.
        session.setConfig("StrictHostKeyChecking", "no");
        String authMethod = Settings.getSetting(Settings.SSH_PREFEREDAUTHENTICATION, Settings.DEFAULT_SSH_PREFEREDAUTHENTICATION);
        session.setConfig("PreferredAuthentications", authMethod);

        // set session timeout, in milliseconds.
        session.setTimeout(timeout)

        // is host configured with a user & password?
        if (pass) {
            // set password.
            session.setPassword(pass);
        }

        // connect
        session.connect()

        // execute command.
        channel = session.openChannel("exec")
        channel.setCommand(input_command)

        // collect command output.
        def commandOutput = channel.getInputStream()
        channel.connect()

        def output = commandOutput.text;

        // disconnect
        channel.disconnect()

        return output
    }
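    // ensure we disconnect the session, if one was created.
    finally {
        // 'session' is a script binding variable; it is unset if getSession() threw.
        if (binding.hasVariable('session')) {
            session.disconnect()
        }
    }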
}

LogicModules in Package

LogicMonitor's package for monitoring Linux via SSH consists of the following LogicModules. For full coverage, please ensure that all of these LogicModules are imported into your LogicMonitor platform.

| Display Name | Type | Description |
| --- | --- | --- |
| addCategory_Linux_SSH | PropertySource | Assigns a value of "Linux_SSH" to the system.categories property for hosts (excluding AWS and Azure) which have not been properly identified due to unconfigured SNMP, and attempts to connect via SSH using the properties set on the resource/Collector. |
| Linux_SSH_Info | PropertySource | Gathers Linux system information such as kernel name, kernel release, kernel version, hardware name, hardware platform, node name, processor type, and operating system. |
| Block Device Performance | DataSource | Monitors I/O for disks and partitions on Linux systems via SSH. |
| Control Groups | DataSource | Monitors Linux Control Groups resource and task usage via the systemd-cgtop command. |
| Control Group Status | DataSource | (DEPRECATED August 2020) Monitors Linux Control Groups status via the systemd-cgtop command. |
| CPU Cores | DataSource | Monitors CPU usage per core via SSH. |
| CPU / Memory | DataSource | Monitors Linux CPU and memory statistics via SSH. |
| Filesystems | DataSource | Monitors Linux filesystem utilization metrics. |
| Network Interfaces | DataSource | Monitors Linux network interface metrics such as throughput, packet transmission, errors, packet drops, collisions, and operating status. |
| Service Status | DataSource | Monitors Linux systemd services via the systemctl command. |
| TCP / UDP Stats | DataSource | Retrieves TCP and UDP statistics from netstat. |
| Uptime | DataSource | Monitors the Linux host's uptime via SSH. |

When setting static datapoint thresholds on the various metrics tracked by this package's DataSources, LogicMonitor follows the technology owner's best practice KPI recommendations. If necessary, we encourage you to adjust these predefined thresholds to meet the unique needs of your environment. For more information on tuning datapoint thresholds, see Tuning Static Thresholds for Datapoints.
