Linux (SSH) Monitoring

Last updated on 12 February, 2024

LogicMonitor monitors Linux systems over the SSH protocol, collecting metrics such as CPU, memory, and filesystem utilization; uptime; and throughput. This monitoring is intended only for systems where SNMP is not configured. If SNMP is configured, more robust out-of-the-box monitoring activates automatically and there is no need to configure the SSH monitoring provided by this Linux SSH package.

Requirements

  • Add your Linux host into monitoring
    For more information on adding resources into monitoring, see Adding Devices.
  • SSH must be configured on the Linux host for the DataSources to apply.
    You can authenticate the Collector’s access to the device using an SSH public-private key pair instead of a password. To do so, generate the SSH key pair and copy the public key to both the device and the host of the Collector that is assigned to the device in LogicMonitor.
    To use an SSH key pair, you need the following:
    • A valid home directory. The SSH server checks ~/.ssh/authorized_keys when authenticating incoming connections.
    • The key pair in classic OpenSSH format (.pem)

Note: Use the How to Use ssh-keygen to Generate a New SSH Key? documentation from SSH and the following command to generate the key pair in classic OpenSSH format:

ssh-keygen -m PEM

  • From the LogicMonitor public repository, import all Linux SSH LogicModules, which are listed in the LogicModules in Package section of this support article. If these LogicModules are already present, ensure you have the most recent versions. After the LogicModules are imported (assuming all previous setup requirements have been met), the suite of DataSources will automatically begin collecting data.
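
As a minimal sketch of the key-pair requirement above, the following shell function wraps the ssh-keygen command from the note. The function name generate_pem_keypair, the file name lm_collector_key, the RSA key type and size, and the empty passphrase are all assumptions made for illustration; none of them is prescribed by LogicMonitor, so adjust them to your own security policy.

```shell
# Sketch only: function name, file name, key type/size, and the empty
# passphrase are illustrative assumptions.
generate_pem_keypair() {
    key_dir="$1"                          # directory that will hold the key pair
    key_file="$key_dir/lm_collector_key"  # hypothetical file name

    mkdir -p "$key_dir" && chmod 700 "$key_dir" || return 1

    # -m PEM  : write the private key in classic OpenSSH (PEM) format
    # -t rsa  : RSA key type; -b 3072 sets the key length in bits
    # -N ""   : empty passphrase (assumed here for unattended use)
    # -f      : output file; the public key is written next to it as <file>.pub
    # -q      : suppress progress output
    ssh-keygen -q -m PEM -t rsa -b 3072 -N "" -f "$key_file" || return 1

    # Print the private key path so the caller can reference it.
    printf '%s\n' "$key_file"
}
```

Calling generate_pem_keypair with a new, empty directory prints the path of the private key. The matching .pub file is the part that belongs in ~/.ssh/authorized_keys for the monitoring user, for example by way of ssh-copy-id -i where that utility is available.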

Assign Properties to Resources

SSH credentials must be set as properties on the Linux resource within LogicMonitor. These properties allow LogicMonitor to pass the appropriate credentials onto the Linux host for authentication. It is strongly recommended that you do not provide a privileged user.

For more information on setting SSH authentication credentials as properties, see Defining Authentication Credentials.

The LogicModules in this package do not require any specialized permissions to monitor Linux via SSH. LogicModules in other monitoring packages may require additional permissions; these will be specified in their respective support articles. The lack of a need for special permissions has been confirmed through testing against clean installations of Debian, Ubuntu Server and CentOS, but does not account for additional hardening steps that may have been applied to a system.
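
Before setting the properties, it can help to confirm by hand that the chosen account can log in and is not privileged. The sketch below only prints an ssh command to run from the Collector host; it does not execute it. The helper name build_ssh_check, the address 192.0.2.10, and the user lm-monitor are hypothetical. The property names (ssh.user, ssh.port, ssh.cert) and the defaults (port 22, ~/.ssh/id_rsa) are taken from the Active Discovery scripts shown later in this article.

```shell
# Sketch only: build_ssh_check is a hypothetical helper. It prints (does not
# run) an ssh command mirroring the ssh.user, ssh.port, and ssh.cert properties.
build_ssh_check() {
    host="$1"                          # address of the Linux host
    user="$2"                          # value intended for ssh.user
    port="${3:-22}"                    # value intended for ssh.port (default 22)
    cert="${4:-$HOME/.ssh/id_rsa}"     # value intended for ssh.cert

    # "id -u" prints the numeric user ID on the remote host; 0 means root.
    printf 'ssh -i %s -p %s %s@%s id -u\n' "$cert" "$port" "$user" "$host"
}

# Hypothetical host and user, default port and key path.
build_ssh_check 192.0.2.10 lm-monitor
```

If the printed command succeeds and returns a non-zero user ID, the same values are reasonable candidates for the resource properties.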

Adding Instances for Linux SSH Control Groups and Service Status

The Control Groups and Service Status DataSources do not have Active Discovery enabled from their DataSource definitions and will require either the manual addition of instances or the enabling of Active Discovery via the scripts provided below. These DataSources have been configured in this way because the automatic enabling of Active Discovery for all cgroups and services on a given host has the potential to produce too many instances, causing rapid alert flooding in the LogicMonitor platform or an unmanageable list of instances.

For this reason, we recommend manually adding selected cgroups and services as monitored instances, as outlined in the following two sections.

The following instructions assume Linux kernel 2.6.24 or later. LogicMonitor’s Control Groups and Service Status DataSources are verified to be compatible with the following Linux distros:

  • CentOS
  • Debian
  • Oracle Linux
  • RHEL
  • Ubuntu
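
One way to check these prerequisites on a host is sketched below. The helper kernel_at_least is hypothetical: it compares only the first three numeric components of the kernel release and ignores any distribution suffix, which is an assumption that should hold for typical uname -r output but is not guaranteed for every vendor kernel.

```shell
# Sketch only: kernel_at_least is a hypothetical helper.
# $1 = required version (e.g. 2.6.24), $2 = version to test (e.g. `uname -r`).
kernel_at_least() {
    # "${2%%-*}" drops a suffix such as "-91-generic" before comparing.
    printf '%s %s\n' "$1" "${2%%-*}" | awk '{
        split($1, want, "."); split($2, have, ".")
        for (i = 1; i <= 3; i++) {
            if (have[i] + 0 > want[i] + 0) exit 0   # newer: requirement met
            if (have[i] + 0 < want[i] + 0) exit 1   # older: requirement not met
        }
        exit 0                                      # identical versions
    }'
}

# Check the kernel version and that systemd-cgtop is on the PATH.
if kernel_at_least 2.6.24 "$(uname -r)" && command -v systemd-cgtop >/dev/null 2>&1; then
    echo "prerequisites look satisfied"
else
    echo "kernel too old or systemd-cgtop not found"
fi
```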

Finding and Manually Adding Control Groups as Instances

The following set of instructions uses a Docker container as an example of a cgroup we would like to monitor.

  1. From the command line, run: systemd-cgtop -n1 -b

    This command displays the cgroups that are using the most resources. The -n1 flag (shorthand for --iterations=1) runs only one iteration of the command. The -b flag (shorthand for --batch) forces the command to run in “batch” mode: it does not accept input and runs until the iteration limit is exhausted or it is killed.

  2. From the resulting output, copy the name of the control group you want to monitor, excluding the parent container. For example, to monitor system.slice/atd.service, copy atd.service and leave out system.slice/.

Note: Unless “CPUAccounting=1” and “MemoryAccounting=1” are enabled for the services in question, no resource accounting will be available and the data shown by systemd-cgtop will be incomplete.

  3. Navigate to the Linux host on the Resources page and select ‘Add Monitored Instance’ from the dropdown menu located next to the Manage menu. For more information on manually adding instances, see Adding Instances.
  4. In the Add Monitored Instance dialog, enter “Control Groups” in the DataSource field and enter “atd.service” in the Wildcard Value field. Enter whatever you would like this instance to be called in the Name field (you may use the wildcard value if you like). Optionally, you may also add a description.
  5. After completing the dialog, click Save. If the action was successful, you’ll be able to see the instance under the DataSource on the Resources page.
  6. To verify data collection is successful, navigate to the Raw Data tab for this instance and click Poll Now.
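
Step 2 above asks for the control group name without its parent. A minimal shell sketch of that trimming is shown here; cgroup_wildcard is a hypothetical helper name, and the logic assumes the parent is the first path component, as in system.slice/atd.service.

```shell
# Sketch only: cgroup_wildcard is a hypothetical helper name.
cgroup_wildcard() {
    path="${1#/}"                           # drop a leading slash, if present
    case "$path" in
        */*) printf '%s\n' "${path#*/}" ;;  # remove the parent, e.g. "system.slice/"
        *)   return 1 ;;                    # top-level group: no parent to remove
    esac
}

cgroup_wildcard system.slice/atd.service   # prints: atd.service
```

The printed value is what goes into the Wildcard Value field. The helper returns a non-zero status for top-level groups such as system.slice, which have no parent to strip.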

Finding and Manually Adding Services as Instances

  1. From the command line, run: systemctl list-units -a --type=service

    This command displays all units that systemd loaded or attempted to load, regardless of their current state on the system. The -a flag (shorthand for --all) ensures all units are listed including those which are inactive. The --type=service flag returns only service units.

  2. From the resulting output, copy the name of the service you want to monitor from the UNIT column.

  3. Follow the steps listed for control groups above to manually create an instance for a service, using “Service Status” in the DataSource field and the copied service name in the Wildcard Value field.

Note: Services can also be added by specifying the linux.ssh.services property on the resource as a comma-separated list of service unit names.
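
The sketch below shows one way to collect unit names and build a value for the linux.ssh.services property. Both helper names are hypothetical, and the sample lines are illustrative rather than output from a real host. The value is joined with commas and no spaces because the Service Status Active Discovery script in this article splits the property on commas and compares each entry exactly against the unit name.

```shell
# Sketch only: both helper names are hypothetical.

# Read `systemctl list-units -a --type=service --plain` output on stdin and
# print the UNIT column for every line that names a service unit.
service_units() {
    awk '$1 ~ /\.service$/ { print $1 }'
}

# Join the given unit names with commas and no surrounding spaces.
services_property() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Illustrative sample lines (not real output from any host).
sample='UNIT           LOAD   ACTIVE   SUB     DESCRIPTION
atd.service    loaded active   running Deferred execution scheduler
cron.service   loaded active   running Regular background program processing daemon
old.service    loaded inactive dead    Example inactive unit'

printf '%s\n' "$sample" | service_units      # one unit name per line
services_property atd.service cron.service   # prints: atd.service,cron.service
```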

Enabling Active Discovery for Automatic Instance Adding

If you are confident that your system will not be overwhelmed with an unmanageable number of instances, you may enable Active Discovery from the DataSource definition and embed the corresponding Groovy script as a parameter of the “SCRIPT” discovery method. (See What Is Active Discovery? for more information on configuring Active Discovery for a DataSource.)

Active Discovery Script for the Linux_SSH_Cgroups DataSource Definition

/*******************************************************************************
 *  © 2007-2020 - LogicMonitor, Inc. All rights reserved.
 ******************************************************************************/

import com.jcraft.jsch.JSch
import com.santaba.agent.util.Settings

host = hostProps.get("system.hostname")
user = hostProps.get("ssh.user")
pass = hostProps.get("ssh.pass")
port = hostProps.get("ssh.port")?.toInteger() ?: 22
cert = hostProps.get("ssh.cert") ?: '~/.ssh/id_rsa'
timeout = 15000 // timeout in milliseconds


// Expected output pattern capturing cgroup name, tasks, CPU, memory, input, output.
// Unless "CPUAccounting=1" and "MemoryAccounting=1" are enabled for the services in question, 
// no resource accounting will be available and the data shown by systemd-cgtop will be incomplete.
def line_pattern = ~/^\/?([^\/]+)\/(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/

def command = 'systemd-cgtop -n1 -b --cpu=percentage'
def command_output = getCommandOutput(command)

command_output.eachLine { line ->
    def matcher = line_pattern.matcher(line) ?: [:]
    // Process lines that contain a match, except cgroups under user.slice or docker
    if (matcher.size() > 0 && matcher[0][1] != "user.slice" && matcher[0][1] != "docker") {
        // Replace invalid wildvalue characters with underscores
        def wildvalue   = matcher[0][2].replaceAll(/[:|\\|\s|=]+/,"_")
        def cgroupPath  = matcher[0][1].replaceAll(/[:|\\|\s|=]+/,"_")
        println "${wildvalue}##${wildvalue}######" +
                "auto.cgroup.path=${cgroupPath}"
    }
}
return 0


// Helper function for SSH connection and command passing
def getCommandOutput(String input_command) {
    try {
        // instantiate JSCH object.
        jsch = new JSch()

        // do we have a user and no pass?
        if (user && !pass) {
            // Yes, so let's try connecting via cert.
            jsch.addIdentity(cert)
        }

        // create session.
        session = jsch.getSession(user, host, port)

        // given we are running non-interactively, we will automatically accept new host keys.
        session.setConfig("StrictHostKeyChecking", "no");
        String authMethod = Settings.getSetting(Settings.SSH_PREFEREDAUTHENTICATION, Settings.DEFAULT_SSH_PREFEREDAUTHENTICATION);
        session.setConfig("PreferredAuthentications", authMethod);

        // set session timeout, in milliseconds.
        session.setTimeout(timeout)

        // is host configured with a user & password?
        if (pass) {
            // set password.
            session.setPassword(pass);
        }

        // connect
        session.connect()

        // execute command.
        channel = session.openChannel("exec")
        channel.setCommand(input_command)

        // collect command output.
        def commandOutput = channel.getInputStream()
        channel.connect()

        def output = commandOutput.text;

        // disconnect
        channel.disconnect()

        return output
    }
    // ensure we disconnect the session.
    finally {
        session.disconnect()
    }
}

Active Discovery Script for the Linux_SSH_ServiceStatus DataSource Definition

The Service Status DataSource has default alerts that you may want to adjust when monitoring all services.

/*******************************************************************************
 *  © 2007-2022 - LogicMonitor, Inc. All rights reserved.
 ******************************************************************************/

import com.jcraft.jsch.JSch
import com.santaba.agent.util.Settings

host = hostProps.get("system.hostname")
user = hostProps.get("ssh.user")
pass = hostProps.get("ssh.pass")
port = hostProps.get("ssh.port")?.toInteger() ?: 22
cert = hostProps.get("ssh.cert") ?: '~/.ssh/id_rsa'
timeout = 15000 // timeout in milliseconds

def azureHost = hostProps.get("system.azure.privateIpAddress")
if (azureHost && hostProps.get("auto.network.resolves") == "false") host = azureHost

// To run in debug mode, set to true
def debug = false

//Pull in the list of services to monitor from the device properties
ArrayList<String> services = hostProps.get("linux.ssh.services")?.split(",") ?: []

// Expected pattern of output lines with data
def line_pattern = ~/^\/?\s*(\S+)\s+(\w*loaded|not-found|masked\w*)\s+(\w*active|inactive|failed\w*)\s+(\S+)\s+(.*)$/

// Run command to show any unit that systemd loaded or attempted to load, regardless of its current state on the system.
def command = 'systemctl list-units --all --type=service --plain'
def command_output = getCommandOutput(command)

// Turn on debug mode to get the following info about services running on this device
if (debug) {
    println "DEBUG MODE -- LIST OF AVAILABLE SERVICES"
    command_output.eachLine { line ->
        def matcher = line_pattern.matcher(line) ?: [:]
        // Process lines that contain a match
        if (matcher.size() > 0) {
            def service = matcher[0][1]
            def description = matcher[0][5]
            println String.format("%-30s %-30s", service, description)
            println "----------------------------------------"
        }
    }
}

if (services.size() > 0) {
    command_output.eachLine { line ->
        def match= line_pattern.matcher(line) ?: [:]
        if (match.size() > 0 && services.contains(match[0][1])) {
            def service = match[0][1]
            def description = match[0][5].trim() ?: ""
            println "${service}##${service}##${description}####"
        }
    }
}

return 0

// Helper function for SSH connection and command passing
def getCommandOutput(String input_command) {
    try {
        // instantiate JSCH object.
        jsch = new JSch()

        // do we have a user and no pass?
        if (user && !pass) {
            // Yes, so let's try connecting via cert.
            jsch.addIdentity(cert)
        }

        // create session.
        session = jsch.getSession(user, host, port)

        // given we are running non-interactively, we will automatically accept new host keys.
        session.setConfig("StrictHostKeyChecking", "no");
        String authMethod = Settings.getSetting(Settings.SSH_PREFEREDAUTHENTICATION, Settings.DEFAULT_SSH_PREFEREDAUTHENTICATION);
        session.setConfig("PreferredAuthentications", authMethod);

        // set session timeout, in milliseconds.
        session.setTimeout(timeout)

        // is host configured with a user & password?
        if (pass) {
            // set password.
            session.setPassword(pass);
        }

        // connect
        session.connect()

        // execute command.
        channel = session.openChannel("exec")
        channel.setCommand(input_command)

        // collect command output.
        def commandOutput = channel.getInputStream()
        channel.connect()

        def output = commandOutput.text;

        // disconnect
        channel.disconnect()

        return output
    }
    // ensure we disconnect the session.
    finally {
        session.disconnect()
    }
}

LogicModules in Package

LogicMonitor’s package for monitoring Linux via SSH consists of the following LogicModules. For full coverage, please ensure that all of these LogicModules are imported into your LogicMonitor platform.

Display Name | Type | Description
addCategory_Linux_SSH | PropertySource | Assigns a value of “Linux_SSH” to the system.categories property for hosts (excluding AWS and Azure) which have not been properly identified due to unconfigured SNMP, and attempts to connect via SSH using the properties set on the resource/Collector.
Linux_SSH_Info | PropertySource | Gathers Linux system information such as kernel name, kernel release, kernel version, hardware name, hardware platform, node name, processor type, and operating system.
Block Device Performance | DataSource | Monitors I/O for disks and partitions on Linux systems via SSH.
Control Groups | DataSource | Monitors Linux Control Groups resource and task usage via the systemd-cgtop command.
Control Group Status | DataSource | (DEPRECATED August 2020) Linux Control Groups status monitoring via the systemd-cgtop command.
CPU Cores | DataSource | Monitors CPU usage per core via SSH.
CPU / Memory | DataSource | Monitors Linux CPU and Memory statistics via SSH.
Filesystems | DataSource | Monitors the Linux filesystem utilization metrics.
Network Interfaces | DataSource | Monitors Linux network interface metrics such as throughput, packet transmission, errors, packet drops, collisions, and operating status.
Service Status | DataSource | Monitors Linux systemd services via the systemctl command.
TCP / UDP Stats | DataSource | Retrieves TCP and UDP statistics from netstat.
Uptime | DataSource | Monitors the Linux host’s uptime via SSH.