Windows Server Failover Cluster Monitoring

Overview

LogicMonitor’s Windows Server Failover Cluster (WSFC) monitoring is performed on a virtual device called a virtual network name (VNN). Each VNN is assigned a virtual IP address and, in some cases, multiple virtual IP addresses for specific services. There is no physical device for the cluster; nodes are added as physical devices or virtual machines.

All cluster alerting comes from the VNN. The nodes are monitored the same as any standard device. If a clustered service fails on a node and successfully fails over to another node, no alerts are generated on the cluster unless redundancy is at a critical level.

As of March 2021, LogicMonitor has added support for Storage Spaces Direct (S2D). These modules can be used to track the status and performance of Storage Spaces Direct and alert for failure conditions.

Note: If Microsoft SQL Server is running on a node as a stand-alone server, LogicMonitor will not automatically find it. See Windows Server Failover Cluster (on SQL Server) Monitoring for information on monitoring WSFCs on SQL Server.

Compatibility

As of December 2020, LogicMonitor’s WSFC monitoring package is known to be compatible with Windows Server 2012 through 2019. Clusters have been tested with PowerShell version 5 or greater and may not work with earlier PowerShell versions.

As of March 2021, the Storage Spaces Direct modules are known to be compatible with Windows Server 2016 and 2019.

As Microsoft releases newer versions of Windows Server, LogicMonitor will test and extend coverage as necessary.

Setup Requirements

Satisfy Dependencies

  • A Windows Collector is required
    • We recommend installing the Collector in the same domain as the cluster, or setting the WMI user as a local account on the domain
  • The following PropertySources must be present in your portal:
    • addCategory_WindowsFailoverCluster
    • Powershell_Info (not part of this monitoring package)
  • Remote PowerShell must be enabled on all nodes in the cluster (see Using PowerShell Scripts in LogicMonitor)
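Before adding the cluster, it can help to confirm that each node is at least reachable on the WinRM port that remote PowerShell uses. The sketch below is an illustrative pre-check only: the function name is hypothetical, it assumes the default WinRM HTTP listener port (5985; HTTPS listeners default to 5986), and an open port does not prove that remoting is enabled or that credentials will work.

```python
import socket

WINRM_HTTP_PORT = 5985  # default WinRM HTTP listener port; adjust if your nodes use 5986 (HTTPS)


def unreachable_nodes(nodes, port=WINRM_HTTP_PORT, timeout=3.0,
                      connect=socket.create_connection):
    """Return the nodes where a TCP connection to the WinRM port fails.

    `connect` is injectable so the function can be exercised without a network.
    """
    failed = []
    for node in nodes:
        try:
            conn = connect((node, port), timeout)
        except OSError:
            # Covers refused connections, timeouts, and name-resolution errors.
            failed.append(node)
        else:
            conn.close()
    return failed
```

Run from the Collector host, a call such as `unreachable_nodes(["node1.example.com", "node2.example.com"])` returns the subset of nodes that could not be reached; the node names here are placeholders.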

Add Resources Into Monitoring

Add your cluster nodes into monitoring. For more information on adding resources into monitoring, see Adding Devices.

After all nodes are discovered by LogicMonitor, manually add a resource for the cluster. This resource must use the cluster VNN as its hostname (which is entered into the IP Address/DNS name field). Be sure to use the fully qualified domain name (FQDN) as the hostname; do not use the IP address.
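Because the cluster resource must be added by FQDN rather than by IP address, a quick sanity check on the value you plan to enter can catch mistakes early. This is a heuristic sketch with hypothetical function names; it only checks the shape of the string and does not verify that the name resolves or belongs to a cluster.

```python
import ipaddress


def is_ip_address(value: str) -> bool:
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def looks_like_fqdn(value: str) -> bool:
    """Heuristic: not an IP address, and at least two non-empty dot-separated labels."""
    if is_ip_address(value):
        return False
    labels = value.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)
```

With this check, `cluster1.example.com` passes, while `192.168.0.100` and the short name `cluster1` are rejected.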

Note: Once the nodes are configured and the LogicModules are imported, the addCategory_WindowsFailoverCluster PropertySource automatically adds the name of the cluster, its IP addresses, and the names of all nodes in the cluster as properties on the nodes, as discussed in the following section. You can then use these values to manually add the resource for the cluster VNN.

Assign Properties to Resources

If the Collector is running as a domain account with local admin privileges on the host to be monitored, it is not required that you set the following custom properties. However, if the remote host requires credentials be specified, you’ll need to manually set several credentials as custom properties on the resource. See Credentials for Accessing Remote Windows Computers.

The addCategory_WindowsFailoverCluster PropertySource automatically finds all cluster virtual resources and nodes and sets their system.categories properties to “WSFC_VNN” or “WSFC_Node” respectively.

The PropertySource will also add the following properties to the node and virtual cluster resources.

Property Name | Example Value
auto.wsfc.active_node | node1.example.com
auto.wsfc.fqdn | cluster1.example.com
auto.wsfc.ip | 192.168.0.100
auto.wsfc.name | cluster1
auto.wsfc.nodes | node1,node2,node3
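As an illustration of how these properties might be consumed, the sketch below works on a plain dictionary of property values such as you might copy from a resource. The function names are hypothetical. It assumes that system.categories is a comma-separated string, and, following the example values above, that auto.wsfc.active_node may be an FQDN while auto.wsfc.nodes holds short names, so nodes are compared on their first label.

```python
def wsfc_role(system_categories: str):
    """Return "vnn", "node", or None based on the categories the PropertySource sets."""
    categories = {c.strip() for c in system_categories.split(",")}
    if "WSFC_VNN" in categories:
        return "vnn"
    if "WSFC_Node" in categories:
        return "node"
    return None


def standby_nodes(props: dict) -> list:
    """Nodes listed in auto.wsfc.nodes other than the active node.

    Assumption: names are matched on their first DNS label, case-insensitively.
    """
    raw = props.get("auto.wsfc.nodes", "")
    nodes = [n.strip() for n in raw.split(",") if n.strip()]
    active_short = props.get("auto.wsfc.active_node", "").split(".")[0].lower()
    return [n for n in nodes if n.split(".")[0].lower() != active_short]
```

With the example values from the table, `standby_nodes` returns `["node2", "node3"]`.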

Import LogicModules

From the LogicMonitor public repository, import all Windows Server Failover Cluster LogicModules, which are listed in the LogicModules in Package section of this support article. If these LogicModules are already present, ensure you have the most recent versions.

Once the LogicModules are imported (assuming all previous setup requirements have been met), data collection and property assignment will automatically commence.

Migration from Legacy LogicModules

In December of 2020, LogicMonitor released a new suite of WSFC LogicModules. The DataSources in the new suite run on only the cluster virtual device, whereas the prior suite’s DataSources ran on each node in the cluster. Alerting on only the cluster virtual device reduces the number of duplicate alerts generated.

The release of the new suite serves to deprecate the following legacy DataSources:

  • Microsoft_Windows_Cluster_Disks
  • WinClusterGroupToNode-
  • WinClusterNodes-
  • WinClusterResourceGroup-
  • WinClusterResources-
  • Windows_Cluster_DiskPartitions
  • Windows_Cluster_NodeState
  • Windows_Cluster_ResourceState

If you are currently monitoring WSFC devices using any of these legacy LogicModules, you will not experience any immediate data loss when you import the new suite, because LogicMonitor deliberately gives the new LogicModules different names. However, data collection will diverge between the deprecated and new LogicModules, and you may collect duplicate data and receive duplicate alerts for as long as both are active.

For this reason, we recommend that you disable monitoring of the legacy DataSource instances at the resource or resource group level after you have imported the replacements. When DataSource monitoring is disabled in this way, the DataSource stops querying the host and generating alerts, but all historical data is retained. You may eventually want to delete the legacy DataSources altogether, but consider this carefully because all historical data is lost upon deletion. For more information on disabling DataSource monitoring, see Disabling Monitoring for a DataSource or Instance.
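To decide where monitoring needs to be disabled, it can be useful to check which of the deprecated DataSources are still applied to a resource. The following sketch assumes you already have a list of DataSource names for a resource (for example, copied from the portal); how that list is obtained is not shown, and the function name is hypothetical.

```python
# Legacy DataSource names as listed in this article (trailing hyphens are part of the names).
LEGACY_WSFC_DATASOURCES = {
    "Microsoft_Windows_Cluster_Disks",
    "WinClusterGroupToNode-",
    "WinClusterNodes-",
    "WinClusterResourceGroup-",
    "WinClusterResources-",
    "Windows_Cluster_DiskPartitions",
    "Windows_Cluster_NodeState",
    "Windows_Cluster_ResourceState",
}


def legacy_in_use(applied_names):
    """Return the legacy WSFC DataSource names found in applied_names, sorted."""
    return sorted(LEGACY_WSFC_DATASOURCES.intersection(applied_names))
```

An empty result suggests that none of the deprecated DataSources are applied to that resource, so no duplicate collection is expected from them.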

LogicModules in Package

LogicMonitor’s package for WSFC and Storage Spaces Direct monitoring consists of the following LogicModules. For full coverage, please ensure that all of these LogicModules are imported into your LogicMonitor platform.

Display Name | Type | Description

LogicModules for WSFC Monitoring

addCategory_WindowsFailoverCluster | PropertySource | Sets a category of “WSFC_VNN” for the cluster virtual resource and “WSFC_Node” for each node in a cluster in the system.categories property.
Windows Cluster Disks | DataSource | Monitors disks and volumes associated with the cluster, gathering metrics such as storage details, number of partitions, and overall utilization.
Windows Cluster MulticastMessages | DataSource | Monitors the Multicast Request-Response (MRR) messages throughout the cluster network, including the multiple recipients and their responses.
Windows Cluster Network | DataSource | Monitors the cluster network throughput, message transmission, number of reconnections, and message queue depth.
Windows Cluster NetworkInterfaces | DataSource | Monitors the operating state of the interfaces associated with the cluster.
Windows Cluster Nodes | DataSource | Monitors the individual nodes that comprise the cluster, including their operating and drain states.
Windows Cluster NodeStatus | DataSource | Provides a summary status of all nodes in a cluster.
Windows Cluster PrintServer | DataSource | Monitors the cluster print server spooler metrics such as job rate, total jobs, page rate, total pages, job spooling, job data, page references, and errors.
Windows Cluster ResourceControlManager | DataSource | Monitors the Resource Control Manager (RCM) resource states and handling of resource failures, and collects metrics such as the number of groups currently online and Resource Host Monitor (RHS) processes and restarts.
Windows Cluster ResourceGroups | DataSource | Monitors the cluster resource group state. The current owner node is an instance property on the resource.
Windows Cluster Resources | DataSource | Lists all cluster resources and their current state.
Windows Cluster SharedVolumes | DataSource | Monitors Cluster Shared Volume operating state, backup state, fault state, cache state, storage utilization, storage capacity details, throughput, IOPS, latency, queue depth, flushes, cache IOPS, cache throughput, cache storage details, and current LRU cache size.

LogicModules for Storage Spaces Direct Monitoring

Windows Cluster S2D StoragePoolStatus | DataSource | Monitors the status of storage pools in the S2D cluster.
Windows Cluster S2D Statistics | DataSource | Provides statistics from the S2D Health Report.
Windows Cluster S2D VirtualDisks | DataSource | Monitors S2D virtual disk names and status.

When setting static datapoint thresholds on the various metrics tracked by this package’s DataSources, LogicMonitor follows the technology owner’s best practice KPI recommendations. If necessary, we encourage you to adjust these predefined thresholds to meet the unique needs of your environment. For more information on tuning datapoint thresholds, see Tuning Static Thresholds for Datapoints.
