Argus discovers Kubernetes resources and adds them to LogicMonitor for monitoring. In addition, you can use Argus to exclude resources from monitoring.

You can add resource filters in the Argus configuration file under the filters parameter. The rule engine evaluates the rules sequentially; as soon as a rule evaluates to true, the rule engine stops evaluating further rules and the resource is excluded.

Note: LM Container v8.2.0 and later disables the filtering of LM Container resources from monitoring. This ensures effective deployment management.

The following are sample filter rules:

Filters | Description
'type in ("pod", "deployment") && name =~ "nginx"' | Excludes pods and deployments whose names contain nginx.
'!(type == "pod" && namespace == "kube-system")' | A negated rule that acts as a whitelist; only pods in the kube-system namespace are monitored, and all other resources are excluded.
'namespace == "default"' | Excludes all resources in the default namespace.
'!(type != "replicaset" || ns != "logicmonitor")' | Excludes replicasets in the logicmonitor namespace; the remaining resources are included. This rule is equivalent to 'type == "replicaset" && ns == "logicmonitor"'.

Note: If you want to exclude only Helm chart secrets, the rule is as follows:

argus:
  filters:
    - 'type == "secret" && jsonGet(object, "type") == "helm.sh/release.v1"'

If you want to exclude all resources of a specific type, then instead of adding a rule with the wildcard notation 'name =~ ".*"', you must add the resource type to the monitoring disable list:

Example:

argus:
  monitoring:
    disable:
      - "replicasets"

LogicMonitor uses the open-source Govaluate library to evaluate the filter expressions. For more information on the syntax of the filter expressions, see the Govaluate expressions manual.

Rule Engine Variables

You can write filter rules using the following variables that are made available to the rule engine:

Variable Name | Value | Value Datatype | Comments
type | Resource type | String | The following operators work on the type variable: "==", "!=", "in". Note (known issue): the in operator on the type variable does not work when the array has only one element.
name | Resource name | String |
namespace | Resource namespace | String | Empty if the resource is not namespace scoped.
Resource labels, with label keys as variable names | The label value for the key | String | Note: As per the Govaluate documentation, variable names containing special characters such as dot (.) or hyphen (-) must be escaped by surrounding them with square brackets.
Resource annotations, with annotation keys as variable names | The annotation value for the key | String | Note: As per the Govaluate documentation, variable names containing special characters such as dot (.) or hyphen (-) must be escaped by surrounding them with square brackets.

Note: If the same key is used for annotations and labels, the label value gets higher priority and is used for evaluation.
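For example, the following is a minimal sketch of a rule that excludes pods based on a label; the label key app.kubernetes.io/name and the value nginx are hypothetical, and the square brackets escape the special characters in the variable name:

argus:
  filters:
    # Hypothetical label key; the brackets keep the dots and slash from breaking parsing
    - 'type == "pod" && [app.kubernetes.io/name] == "nginx"'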

Rule Writing Guidelines

  1. Rules must be written in single-quoted strings to avoid parsing errors across YAML conversions.
  2. There is no distinction between include rules and exclude rules. If a rule evaluates to true, the resource is excluded.

Note: In some cases, if you want to simplify the rule, you can write the include rule and invert its value to make it an exclude rule.

Example 1:

If you want to monitor only the resources of a single namespace, frontend, the rule is:

'!(ns == "frontend")'

Note: As per the Govaluate documentation, you must escape variable names containing special characters, such as dot (.) or hyphen (-), by surrounding them with square brackets.

Example 2

You have added a web-service label on resources, each with its respective value. If you want to monitor only the user web-service resources while excluding the resources of all remaining services, you can write the rule as '!([web-service] == "user")'. Here, the square brackets define everything within them as a variable name while parsing the rule. If you omit the brackets around the web-service variable, Govaluate parses it as the mathematical expression web - (minus) service, which will not exclude the resources as expected.

Example 3

The following example presents a few possible configurations you can use to selectively monitor resources in your cluster:

filters:
# Remove NGINX pods and deployments from monitoring
- 'type in ("pod", "deployment") && name =~ "nginx"'
# Remove pods in the kube-system namespace from monitoring
- '(type == "pod" && namespace == "kube-system")'
# Remove resources in the default namespace from monitoring
- 'namespace == "default"'
# Remove replicasets in the logicmonitor namespace from monitoring
- '(type == "replicaset" && ns == "logicmonitor")'

Available Operators to Write Rules

Operator | Description | Comments | Examples
== | Equality | Exact string match | ns == "default"
!= | Inequality | Is not equal to the exact string | name != "nginx"
=~ | Regex match | A regex containing a dot or hyphen may not work in some cases | name =~ "nginx" excludes resources whose names contain nginx
!~ | Inverted regex match | Equivalent to !(<regex>) | name !~ "nginx" is equivalent to !(name =~ "nginx") and excludes resources whose names do not contain nginx
&& | Logical AND | Short-circuits if the left-side expression is false | ns == "default" && name =~ "nginx" excludes resources in the default namespace whose names contain nginx
|| | Logical OR | Short-circuits if the left side evaluates to true. Although the operator is available, write a separate rule if the left and right sides are not logically connected, because the set of rules is OR'ed. |
in | Membership in an array | Performs equality (==) to check membership | ns in ("kube-system", "kube-public") excludes resources in the listed namespaces
() | Parentheses to group expressions | |
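As a sketch of the guidance for ||, two conditions that are not logically connected can simply be written as separate rules, because the rules in the filters list are OR'ed with each other:

argus:
  filters:
    # Either rule evaluating to true excludes the resource
    - 'namespace == "default"'
    - 'type == "pod" && name =~ "nginx"'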

Disclaimer: Argus and Collectorset-Controller Helm Charts are being phased out. For more information about switching to the new LM Container Helm Chart for a simpler install and upgrade experience, see Migrating Existing Kubernetes Clusters Using LM Container Helm Chart.

Argus is a tool for monitoring Kubernetes with LogicMonitor. Argus runs as a Pod in your cluster and uses LogicMonitor’s API to add Nodes, Pods, Services, and other resources into monitoring. Once in monitoring, data collection starts automatically. Argus uses the Kubernetes API to collect data for all monitored resources. Additionally, you can leverage LogicMonitor DataSources to monitor your applications running within the cluster.

Argus offers the following key features for Kubernetes monitoring:

You can install Argus either by using the command-line interface (CLI) or from your LogicMonitor portal.

Requirements

To install Argus, you need the following:

Installing Collectorset-Controller

The Collectorset-Controller automatically installs and manages LogicMonitor Collectors based on the requirements specified in the configuration file.

Requirements:

1. Navigate to Resources > Add and select “Kubernetes Cluster”.

2. On the Add Kubernetes Cluster page, enter the required details and click Next.

3. In the Install Argus section, select “Edit Configurations” to enter the required configuration details.

Note: You can click Download File to edit the configuration file from your LogicMonitor portal, or you can create the file using the template.

4. Update the configuration parameters in the configuration file.

5. Export the configuration file path and enter the following helm command:

$ export COLLECTORSET_CONTROLLER_CONF_FILE=<collectorset-controller-configuration-file-path>
$ helm upgrade \
  --install \
  --debug \
  --wait \
  --namespace="$NAMESPACE" \
  -f "$COLLECTORSET_CONTROLLER_CONF_FILE" \
  collectorset-controller logicmonitor/collectorset-controller

For more information on the list of values and their descriptions used in the collectorset-controller configuration yaml file, see Default value from ArtifactHUB.

Installing Argus

The setup wizard provides the configuration and installation commands for the applications needed for monitoring: the Collectorset-Controller and Argus.

  1. From the setup wizard, select “Edit Configuration” to customize the YAML configuration files for the CollectorSet-Controller and Argus. For the complete list of configuration options, see Configurations.

Note: You can click Download File to edit the configuration files and install them using the Kubernetes CLI.

2. Select “Install” to see the Helm commands for installing LogicMonitor’s Collectorset-Controller and Argus Helm Charts. You can copy and paste the commands to install the integration into your cluster.

For more information on the list of values and their descriptions used in the argus configuration yaml file, see Default value from ArtifactHUB.

3. Click Verify Connection to ensure that the Collectors and the cluster resources were properly added into monitoring.

Note: The connection verification process may take up to a minute.

Disclaimer: Argus and Collectorset-Controller Helm Charts are being phased out. For more information about switching to the new LM Container Helm Chart for a simpler install and upgrade experience, see Migrating Existing Kubernetes Clusters Using LM Container Helm Chart.

Configuring Collectorset-Controller using Helm Charts

The Collectorset-Controller Helm chart supports the following values:
Required Values

Parameters | Description
accessID (default: "") | The LogicMonitor API key ID
accessKey (default: "") | The LogicMonitor API key
account (default: "") | The LogicMonitor account name
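For reference, the following is a minimal sketch of a collectorset-controller configuration file containing only these required values; the credential values shown are placeholders for your own API token and account:

# collectorset-controller-config.yaml (sketch)
accessID: "lm_access_id"
accessKey: "lm_access_key"
account: "mycompany"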

Optional Values

Parameters Settings

Parameters | Description
imageRepository (default: "logicmonitor/collectorset-controller") | The image repository of the Collectorset-Controller container
imageTag | The image tag of the Collectorset-Controller container
proxyURL (default: "") | HTTP/S proxy URL
proxyUser (default: "") | HTTP/S proxy username
proxyPass (default: "") | HTTP/S proxy password

Advanced Parameters Settings

Parameters | Description
enableRBAC (default: true) | If your cluster does not have RBAC enabled, set this value to false
etcdDiscoveryToken (default: "") | The public etcd discovery token used to add etcd hosts to the cluster device group
imagePullPolicy (default: "Always") | The image pull policy of the Collectorset-Controller container
nodeSelector (default: {}) | Provides the simplest way to run the Pod on particular nodes based on node labels
affinity (default: {}) | Allows you to constrain which nodes your Pod is eligible to be scheduled on
priorityClassName (default: "") | The priority class name for Pod priority. If this parameter is set, the PriorityClass resource must already exist; otherwise, the Pod is rejected
tolerations (default: []) | Tolerations are applied to Pods and allow the Pods to be scheduled onto nodes with matching taints
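For example, the following is a sketch of the scheduling-related values in the collectorset-controller configuration file; the node label and the taint shown are hypothetical:

# Scheduling settings (sketch); adjust to match your own node labels and taints
nodeSelector:
  kubernetes.io/os: linux
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "monitoring"
    effect: "NoSchedule"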

Configuring Argus using Helm Chart

The Argus Helm chart supports the following values:

Required Values

Parameters | Description
accessID (default: "") | The LogicMonitor API key ID
accessKey (default: "") | The LogicMonitor API key
account (default: "") | The LogicMonitor account name
clusterName (default: "") | A unique name given to the cluster's device group
logLevel (default: "info") | Sets the Argus log level
collector.replicas (default: 1) | The number of Collectors to create and use with Argus
collector.size (default: "") | The Collector size to install. Can be nano, small, medium, or large
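For example, the following is a minimal argus-config.yaml sketch that sets only these required values; all values shown are placeholders, and the dotted parameters such as collector.replicas map to nested YAML keys:

# argus-config.yaml (sketch)
accessID: "lm_access_id"
accessKey: "lm_access_key"
account: "mycompany"
clusterName: "production-cluster"
logLevel: "info"
collector:
  replicas: 1
  size: "small"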

Optional Values

Parameters | Description
clusterGroupID (default: 1) | A user-defined static or dynamic group under which the Kubernetes cluster's dynamic group is created. A cluster group is a logical organization of Kubernetes clusters and is uniquely identified by its numeric ID. Every Kubernetes cluster must be associated with a cluster group by specifying the group's ID in this field. If not specified, the default value corresponding to the root group (1) is used. Note: The dynamic groups reference resources according to AppliesTo function evaluation.
resourceGroupID (default: 1) | The ID of the primary resource group where the Argus resources are created. If not specified, the resourceGroupID takes the default value corresponding to the root group (1). Note: The root group contains special logic to display only groups and not resources, so we recommend setting resourceGroupID to another value if you need to narrow the scope of the API tokens beyond the root group.
imageRepository (default: "logicmonitor/argus") | The image repository of the Argus container
imageTag | The image tag of the Argus container
proxyURL (default: "") | The HTTP/S proxy URL
proxyUser (default: "") | The HTTP/S proxy username
proxyPass (default: "") | The HTTP/S proxy password
collector.groupID (default: 0) | The ID of the Collector group
collector.escalationChainID (default: 0) | The ID of the escalation chain of the Collectors
collector.collectorVersion (default: 0) | The version of the Collectors
collector.useEA (default: false) | On a Collector download event, download either the latest EA version or the latest GD version
collector.proxyURL (default: "") | The HTTP/S proxy URL of the Collectors
collector.proxyUser (default: "") | The HTTP/S proxy username of the Collectors
collector.proxyPass (default: "") | The HTTP/S proxy password of the Collectors
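For example, the following is a sketch of argus-config.yaml lines that place the cluster and its resources under existing groups instead of the root group; the numeric IDs shown are hypothetical:

# Group IDs from your portal (hypothetical values)
clusterGroupID: 124
resourceGroupID: 3621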

Advanced Parameter Settings

Parameters | Description
enableRBAC (default: true) | Enables RBAC. If your cluster does not have RBAC enabled, set this value to false
etcdDiscoveryToken (default: "") | The public etcd discovery token used to add etcd hosts to the cluster resource group
imagePullPolicy (default: "Always") | The image pull policy of the Argus container
nodeSelector (default: {}) | Provides the simplest way to run the Pod on particular nodes based on node labels
affinity (default: {}) | Allows you to constrain which nodes your Pod is eligible to be scheduled on
priorityClassName (default: "") | The priority class name for Pod priority. If this parameter is set, the PriorityClass resource must already exist; otherwise, the Pod is rejected
tolerations (default: []) | Tolerations are applied to Pods and allow the Pods to be scheduled onto nodes with matching taints
filters.pod (default: "") | The filter expression for the Pod resource type. Based on this parameter, Pods are added to or removed from discovery in LogicMonitor
filters.service (default: "") | The filter expression for the Service resource type. Based on this parameter, Services are added to or removed from discovery in LogicMonitor
filters.node (default: "") | The filter expression for the Node resource type. Based on this parameter, Nodes are added to or removed from discovery in LogicMonitor
filters.deployment (default: "") | The filter expression for the Deployment resource type. Based on this parameter, Deployments are added to or removed from discovery in LogicMonitor
collector.statefulsetspec | Holds the Collector Pod's StatefulSet specification in the Kubernetes StatefulSet object's spec format. For more information, see StatefulSet basics.
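For instance, the following is a sketch of how collector.statefulsetspec can be used in argus-config.yaml to set Collector resource requests; the values mirror the GKE troubleshooting example later in this article and are illustrative only:

collector:
  statefulsetspec:
    template:
      spec:
        containers:
          - name: collector
            resources:
              requests:
                cpu: 1000m
                memory: 2Gi
                ephemeral-storage: 5Gi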

With Argus v5 and the previous resource tree, dynamic groups were created for each type of resource. With Argus v7, the new resource tree is enabled by default to optimize performance. In the new resource tree, all namespace-scoped resources such as Pods, Deployments, and Services are grouped under Namespaces, and cluster-scoped resources such as etcd are grouped under the ClusterScoped group.

Note: The new resource tree is automatically enabled with a fresh installation of Argus Helm chart version 2.0.0 and Argus v7. However, if you are upgrading Argus, you must follow the instructions in the Updating Resource Tree section.

Updating Resource Tree

To enable the new resource tree structure, set enableNewResourceTree: true in the argus-config.yaml file. The following are samples of the Argus v5 and v6 resource tree structures:

Argus v5 resource tree sample:

Argus v6 resource tree sample:

Note: Enabling the new resource tree structure might affect a few components, such as reports, alerts, and services, which might not be displayed on the Dashboard.

You must update the Item field as per the new resource tree structure for the resources to be displayed in the Dashboard.

Once you have successfully made the changes to the resources in the new structure, you may delete the old resource tree structure.

Note: Ensure that you select the Delete default option when deleting the old resource tree structure.

Disclaimer: Argus and Collectorset-Controller Helm Charts are being phased out. For more information about switching to the new LM Container Helm Chart for a simpler install and upgrade experience, see Migrating Existing Kubernetes Clusters Using LM Container Helm Chart.

When you add your Kubernetes cluster into monitoring, dynamic groups are used to group the cluster resources. For more information on adding a Kubernetes cluster into monitoring, see Adding Kubernetes Cluster into Monitoring.

Previously, adding dynamic groups required the ‘manage all resources’ permission. Now, you can use API keys that have access to at least one resource group to add clusters for monitoring. The dynamic groups are linked to the resource groups to which the API keys have view permissions.

Administrator: Steps to Assign Permission to a Non-Admin User

  1. Create different device groups for non-admin users. For more information, see Adding Device Groups.
  2. Navigate to Settings > User Access > User and Roles > Roles tab.
  3. Select the required user group and click the Gear icon.
  4. On the Manage Role dialog box, provide access to the respective static groups.
    Note: You can create multiple users with specific roles from the Manage Role dialog box.

    Once the required permissions are provided, the non-admin users can add and monitor Kubernetes clusters within their respective static groups.


  5. Create different Collector groups and Dashboard groups for the non-admin users and provide access to the users for their respective groups.
  6. Select the User Profile setting and grant access to the non-admin users to create API tokens and manage their profiles.

Note: Ensure that you select the View checkbox for Collectors.

Non-admin user: Adding a Kubernetes Cluster

Once the administrator has completed all the prerequisites and allocated a resource group, complete the following steps to add the Kubernetes cluster:

Note: You must have access to at least one collector before adding a Kubernetes cluster.

1. Navigate to Resources > Devices and select the allocated resource group for adding the cluster.

2. From the Add drop-down list, select Kubernetes Cluster.

3. On the Add Kubernetes Cluster page, add the following information:

a. In the Cluster Name field, enter the cluster name.

b. In the API Token field, select the API token for the allocated resource group and click Save.
The other API token fields are populated automatically.

c. In the Resource Group field, select the allocated resource group name.

Note: If you select the root group in the Resource Group field, an error message “Insufficient permissions” will occur.

d. In the Collector Group and Dashboard Group fields, select the groups allocated to you.

4. Click Next.

5. In the Install Instruction section, select the Argus tab.

6. Select the resourceGroupID parameter and replace the default value with the system.deviceGroupId property value of the allocated resource group.

Note: To view the system.deviceGroupId value, navigate to Resources > Devices > select the allocated resource group, and click the Info tab.

7. Click Verify Connection. Once the connection is successful, your cluster is added.

Note: Currently, you cannot add services to the Kubernetes cluster. You must contact the administrator to add the required services.

Google Anthos enables you to run and manage Kubernetes clusters running anywhere, whether natively via Google Kubernetes Engine (GKE) or externally in another cloud service or on-premises.

With LM Container, you can ensure central and consistent monitoring for clusters managed through Anthos. Specifically, LM Container provides a central view within LogicMonitor for you to monitor the health and performance of your cluster components (such as nodes, pods, containers, services, deployments, and so on) as well as the performance of the applications running within the clusters.

To get started, add each of your clusters into monitoring.

Once added, you should see the Anthos-managed clusters within LogicMonitor. Container applications will be automatically detected and monitored.

Alternatively, you can develop your own LogicModules to monitor custom applications.

To get additional metrics for Google Cloud Platform (GCP) resources, such as GKE nodes, you can add your GCP projects into monitoring.

LogicMonitor’s Kubernetes monitoring involves two semantically versioned applications: Argus and the Collectorset-Controller. New versions of these applications provide bug fixes, improvements, and new features. You can see the latest available versions of these applications (and the enhancements they provide) on their respective GitHub host pages.

Release Tagging

Argus images are tagged with semantic versioning in addition to a tag for the major version (for example, v5) which will match all images that get published with that major version (including minor improvement and bug fix increments). This ensures that new images with minor improvements and bug fixes are used automatically.

For example, for Argus version 5.0.0, the following tags are applied to the image: “v5” and “5.0.0”. If you reference “v5” in your Helm deployment at the time 5.0.0 is released, that tag would also match subsequent Argus versions 5.1.0, 5.1.1, and so on upon their release.
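For example, the following is a sketch of the image-related values in argus-config.yaml that pin the deployment to the v5 major tag so that minor and patch releases are picked up automatically; the parameter names come from the Configurations section:

# Pin to the v5 major tag (sketch); patch and minor releases within v5 are pulled automatically
imageRepository: logicmonitor/argus
imageTag: v5
imagePullPolicy: Always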

Opt-in is still required when a new major version is released. This means that if your Helm deployment for Argus and Collectorset-Controller currently references “latest” or hard-coded version image tags, you will need to opt into the upgrade using the instructions provided in the next section.

Opting Into Major Version Changes

Prerequisites

Ensure the following requirements are met before upgrading:

Upgrade Instructions

To upgrade the Argus chart from an older version to the latest version, you need an Argus configuration file to run the helm upgrade command.

Creating Argus configuration file

To create the .yaml configuration files for Argus and the Collectorset-Controller, complete the following steps:

1. Download the configuration file template from here.

2. Get existing values using the command:

helm get values argus 

3. Save the output as a backup configuration file.

4. Put all existing values in the downloaded configuration file in their appropriate places, and leave the remaining values at their defaults.

Note: If you are using Argus v6.0 or later and the existing filtering configuration contains “*” to exclude all resources from monitoring, you must remove “*” from the filtering configuration.

With Argus v6.0, a new parameter, disableResourceMonitoring, is added for excluding all resources of a specific type.
You must specify the resource types in the disableResourceMonitoring parameter list, as follows:

disableResourceMonitoring:
  - nodes
  - services
  - deployments

Upgrading Argus using Helm deployment

To upgrade to the latest version of Argus, complete the following steps:

  1. Run helm repo update.
  2. To upgrade the Argus helm chart, run the following command:

    helm upgrade -f argus-config.yaml argus logicmonitor/argus

Note: You may need to recreate the Argus pods if they are not recreated automatically.
Helm does not recreate pods if there is no change in their definitions.

Updating Configuration Parameters in the Configuration File

Export the configuration file path and enter the following helm command:

$ export ARGUS_CONF_FILE=<argus-configuration-file-path>
$ helm upgrade \
  --install \
  --debug \
  --wait \
  --namespace="$NAMESPACE" \
  -f "$ARGUS_CONF_FILE" \
  argus logicmonitor/argus

See the Configurations section for a list of values the Argus helm chart supports and their descriptions.

Configuration templates of Helm charts values.yaml

Collectorset Controller

Latest Collectorset-controller-config.yaml

Chart Version | Collectorset-controller Version | Configuration
Chart 1.0.0 | Collectorset-controller 3.0.0 | collectorset-controller-config-1.0.0.yaml
Chart 0.11.0 | Collectorset-controller 2.2.0 | collectorset-controller-config-0.11.0.yaml
Chart 0.10.0 | Collectorset-controller 2.2.0 | collectorset-controller-config-0.10.0.yaml
Chart 0.9.0 | Collectorset-controller 2.2.0 | collectorset-controller-config-0.9.0.yaml
Chart 0.8.0 | Collectorset-controller 2.1.0 | collectorset-controller-config-0.8.0.yaml

Argus

Latest argus-config.yaml

Chart Version | Argus Version | Configuration
Chart 2.0.0 | Argus 7.0.0 | argus-config-2.0.0.yaml
Chart 1.2.0 | Argus 6.1.2 | argus-config-1.2.0.yaml
Chart 1.1.0 | Argus 6.1.0 | argus-config-1.1.0.yaml
Chart 1.0.0 | Argus 6.0.0 | argus-config-1.0.0.yaml
Chart 0.18.0 | Argus 5.1.1 | argus-config-0.18.0.yaml
Chart 0.17.0 | Argus 5.1.0 | argus-config-0.17.0.yaml
Chart 0.16.1 | Argus 5.0.0 | argus-config-0.16.1.yaml
Chart 0.16.0 | Argus 5.0.0 | argus-config-0.16.0.yaml
Chart 0.15.0 | Argus 4.2.0 | argus-config-0.15.0.yaml
Chart 0.14.0 | Argus 4.1.0 | argus-config-0.14.0.yaml

Note: The Helm chart version must match the corresponding Argus and Collectorset-Controller versions.

For example, if you are upgrading Argus from v5.0.0 to v6.1.0, then the Helm chart version must be v1.1.0.

Similarly, if you are upgrading Collectorset controller v2.2.0 to v2.4.0, then the Helm chart version must be v0.12.0.

Overview

​Istio is a service mesh that provides traffic management, policy enforcement, and telemetry collection for microservices. Using LogicMonitor’s Istio package for Kubernetes, you can gather metrics from the backing Prometheus time-series database (TSDB) that comes bundled with Istio. LogicMonitor’s Istio Kubernetes package supports Kubernetes only.

​The LogicModules in the Istio Kubernetes package utilize the “/metrics” endpoint from the Prometheus pod on port 9090. By default, these LogicModules apply to the pod in the “istio-system” namespace labeled with “app=prometheus”.

Compatibility

LogicMonitor’s Istio Kubernetes package supports Kubernetes only. As Istio releases support for other platforms, LogicMonitor will test and extend coverage as necessary.

Setup Requirements

Import LogicModules

From the LogicMonitor Repository, import all Istio Kubernetes LogicModules, which are listed in the LogicModules in Package section of this support article.

Add Devices Into Monitoring

It is recommended that your Kubernetes cluster already be added into LogicMonitor for monitoring, as many of the AppliesTo properties set by Argus, LogicMonitor’s open-source Kubernetes monitoring solution, are necessary for seamless operation. For instructions on adding your Kubernetes cluster into monitoring, see Adding your Kubernetes Cluster into Monitoring.

Port Access

The Istio Kubernetes package gathers metrics from the backing Prometheus TSDB that comes bundled with Istio. Istio must be installed on the Kubernetes cluster and port 9090, which is used to access the Prometheus “/metrics” endpoint, must be open to the Collector.​

LogicModules in Package

​LogicMonitor’s Istio Kubernetes package consists of the following LogicModules. For full coverage, please ensure that all of these LogicModules are imported into your LogicMonitor platform.

Display Name | Type | Description
Istio Kubernetes Connections | DataSource | Collects net_conntrack_dialer_con* Istio metrics from Prometheus, such as attempted, closed, established, refused, and failed connections.
Istio Kubernetes Go Performance | DataSource | Collects go_* Istio metrics from Prometheus, such as Go memory, CPU, routines, and threads.
Istio Kubernetes Process Performance | DataSource | Collects process_* Istio metrics from Prometheus, such as CPU time usage, file descriptors, memory, and flaps.
Istio Kubernetes Prometheus Performance | DataSource | Collects prometheus_tsdb* Istio metrics from Prometheus, such as transactions, blocks, checkpoints, compactions, fsync, and garbage collection.
Istio Kubernetes Prometheus Queries | DataSource | Collects prometheus_engine* Istio metrics from Prometheus, such as query counts and latencies across API, engine, prepare, and queue.
Istio Kubernetes Prometheus Scrape Performance | DataSource | Collects prometheus_target* Istio metrics from Prometheus, such as scrape attempts, failures, and reloads.
Istio Kubernetes Prometheus Scrape Pools | DataSource | Collects prometheus_target* Istio metrics from Prometheus, such as scrape counts and latency.
Istio Kubernetes Prometheus Service Discovery | DataSource | Collects prometheus_sd* Istio metrics from Prometheus, such as latency and failures for Azure, Consul, EC2, GCE, and Kubernetes.
Istio Kubernetes Prometheus Traffic | DataSource | Collects promhttp_metric_handler_requests_* Istio metrics, such as HTTP responses, response status codes, size, health, and in-flight requests.

When setting static datapoint thresholds on the various metrics tracked by this integration package, LogicMonitor follows the technology owner’s best practice KPI recommendations. If necessary, we encourage you to adjust these predefined thresholds to meet the unique needs of your environment. For more information on tuning datapoint thresholds, see Tuning Static Thresholds for Datapoints.

When you initially add your cluster into monitoring via the ‘Add Kubernetes Cluster’ wizard, you configure various aspects of monitoring (Collector size, whether Kubernetes RBAC is enabled, how many Collector replicas you want, and so on), and LogicMonitor provides you with Helm commands that enable an installation matching your configuration.

This wizard is not available after adding your cluster into monitoring. If you need to change one of these configuration items later, the most straightforward way to do so is via Helm directly in your cluster. Specifically, you’ll need to upgrade your Helm release for Argus and/or the Collectorset-Controller and change the desired options.

Note: If Collector pods attempt to recreate with a Collector version that is no longer available, the newest Collector version will be used automatically to ensure successful pod start-up.

Configuration Options

You can see all available configuration options for Argus here, and for the Collectorset-Controller here.

We recommend using the --reuse-values flag during the upgrade to ensure previous options that aren’t changed remain as-is. For example, to update the number of Collector replicas to ‘2’, you might run the following in your cluster:

helm upgrade --reuse-values --recreate-pods --set collector.replicas="2" argus logicmonitor/argus

You can also use Helm to view the current configuration for the Argus and Collectorset-Controller releases, which may be helpful before making changes:

helm get values argus

Note: LogicMonitor updates the Helm Charts repository for certain releases. To ensure you have the latest charts in your cluster, run the following command before making changes:

helm repo update

Deleting LM Container

You may want to start with a fresh install. To remove existing Argus and Collectorset-Controller from a Kubernetes cluster, run the following Helm commands:

helm uninstall collectorset-controller
helm uninstall argus

Kubernetes Resource Visibility Limit

Adding excessive Kubernetes resources can cause service disruptions and degrade portal performance. To support platform performance and optimize the user experience, LogicMonitor enforces a limit on the number of Kubernetes resources.

LogicMonitor enforces this limit starting with v.225. 

Requirements for Resolving Kubernetes Resource Visibility Limit

If the platform-wide limit of 25,000 resources is reached, additional Kubernetes resources are excluded by the system and do not display in the platform.

Before troubleshooting, review the following requirements:

Resolving the Kubernetes Resource Visibility Limit

If the resource limit is reached and additional Kubernetes resources do not display, do the following:

  1. Review the Resource Retention Policy
    Check the globalDeleteAfterDuration property in the lm-container-configuration.yaml file to see your current retention configuration, or review the setting in the LogicMonitor portal. You can also check the kubernetes.resourcedeleteafterduration property in the Kubernetes root folder.

Note: Shorter retention periods can help reduce your total resource count.

  2. Adjust the Retention Period
    If the currently set retention limit is too long, adjust it to a more suitable value. Shortening the retention period reduces the number of retained (deleted) resources counted toward the limit.
  3. Contact Your Customer Success Manager
    If your operational needs require a higher threshold than 25,000 resources, contact your Customer Success Manager to discuss available options.

If the Helm install commands fail:

If Helm install commands succeed, but your cluster is not added to monitoring:

If Helm install commands succeed, and your cluster was added into monitoring but data collection does not work:

Kubernetes Pod modal

If you are unable to set up Argus and Collectorset-Controller pods in GKE due to memory, CPU, or ephemeral storage constraints, use the following configuration:

statefulsetspec:
  template:
    spec:
      containers:
        - name: collector
          resources:
            requests:
              cpu: 1000m
              ephemeral-storage: 5Gi
              memory: 2Gi

If the following gRPC connection failure occurs while installing Argus:

level=warning msg="Error while creating gRPC connection. Error: context deadline exceeded" argus_pod_id=<pod-id> debug_id=<id> goroutine=1 method=pkg/connection.createGRPCConnection watch=init
  1. Run the following command to log in to the Argus Pod shell:

    kubectl exec -it <argus_Pod_name> /bin/sh
  2. Check the communication between the Argus and Collectorset-Controller pods:

    curl http://collectorset-controller:50000
  3. If the communication fails and an error occurs, check the parameters of the restrictions set in the internal network policies.

If Collector pods restart frequently on an OpenShift v4.x cluster and cause monitoring gaps, do the following:

The Docker Collector runs all the collection jobs; however, OpenShift’s default container PID limit of 1024 restricts the number of processes to 1024, which is insufficient for large-scale clusters.

You can modify the settings by using a ContainerRuntimeConfig custom resource.

For example, if you have labeled the MachineConfigPool on which you want to increase the PID limit with the key custom-crio and the value custom-pidslimit, modify the configuration file as follows:

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
 name: custom-pidslimit
spec:
 machineConfigPoolSelector:
   matchLabels:
     custom-crio: custom-pidslimit
 containerRuntimeConfig:
   pidsLimit: 4096

Note: The appropriate PID limit may vary based on the collector size and the number of Kubernetes resources that are monitored. However, the default PID limit set for the small-size collector is a minimum of 4096.

You can also verify if the PID limit is effective by entering the following command in the pod shell:
cat /sys/fs/cgroup/pids/pids.current
