NVIDIA SMI Monitoring
Last updated - 28 February, 2025
LogicMonitor provides support for monitoring NVIDIA GPUs. LogicMonitor’s NVIDIA SMI monitoring is optimized for fast, efficient parsing, which lets a single collector scale to larger setups with minimal impact on overall collector performance.
LogicMonitor’s GPU Module enhances monitoring for NVIDIA GPUs in the following ways:
- Increases efficiency–Optimized request filters minimize SSH data transfer and the impact on device performance, so larger setups can be monitored with minimal impact on a collector.
- Provides secure access–Supports key-based SSH authentication for both Linux and Windows environments.
- Ensures seamless data collection–Harnesses SSH’s exec functionality to make connections reliable, short, and robust.
- Identifies performance bottlenecks–Provides metrics covering utilization, VRAM usage, and clock speeds, and helps identify which models make the most of your hardware.
- Streamlines device management–Uses a dedicated PropertySource (@category NVIDIA-SMI) to simplify monitoring.
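To illustrate how a filtered request keeps the parsed payload small, the following is a minimal Python sketch that parses one CSV line per GPU and derives a VRAM usage percentage. It is not the DataSource's actual implementation. The sample text mimics the output of nvidia-smi's --query-gpu option with --format=csv,noheader,nounits; the field names, the sample values, and the parse_gpu_csv helper are assumptions made for illustration.

```python
import csv
import io

# Hypothetical field names for six queried columns (not the module's datapoint names).
FIELDS = ["index", "name", "utilization_gpu_pct",
          "memory_used_mib", "memory_total_mib", "sm_clock_mhz"]

# Illustrative sample only; not captured from a real device.
SAMPLE = (
    "0, NVIDIA A100-SXM4-40GB, 87, 30720, 40960, 1410\n"
    "1, NVIDIA A100-SXM4-40GB, 12, 2048, 40960, 210\n"
)

def parse_gpu_csv(text):
    """Parse headerless, unitless CSV rows into one dict per GPU."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        rec = dict(zip(FIELDS, (cell.strip() for cell in row)))
        try:
            used = float(rec["memory_used_mib"])
            total = float(rec["memory_total_mib"])
            gpu = {
                "index": int(rec["index"]),
                "name": rec["name"],
                "utilization_gpu_pct": float(rec["utilization_gpu_pct"]),
                "memory_used_mib": used,
                "memory_total_mib": total,
                "sm_clock_mhz": float(rec["sm_clock_mhz"]),
            }
        except ValueError:
            continue  # skip rows with non-numeric values such as "[N/A]"
        # Derived metric: share of VRAM in use (guard against a zero total).
        gpu["memory_used_pct"] = (used / total * 100.0) if total else 0.0
        gpus.append(gpu)
    return gpus

if __name__ == "__main__":
    for gpu in parse_gpu_csv(SAMPLE):
        print(gpu["index"], gpu["name"], gpu["memory_used_pct"])
```

Requesting only the needed columns without headers or units means each GPU contributes a single short line, which is one plausible way to limit SSH data transfer.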
Requirements for NVIDIA SMI Compatibility
LogicMonitor’s GPU Modules are compatible with resources that can run the NVIDIA SMI command (which offers broad GPU support) over SSH EXEC.
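One way to check this requirement by hand is to run nvidia-smi as a single SSH exec request with key-based authentication. The Python sketch below assumes an OpenSSH client is installed locally. The host, user, and key path are hypothetical placeholders, and the helper names are invented for illustration; this is not how the LogicModules themselves connect.

```python
import subprocess

# Columns to request; a narrow query keeps the response short.
QUERY_FIELDS = ["index", "name", "utilization.gpu",
                "memory.used", "memory.total", "clocks.sm"]

def build_ssh_command(host, user, key_path):
    """Build an ssh argv that runs one nvidia-smi query and exits."""
    remote = "nvidia-smi --query-gpu={} --format=csv,noheader,nounits".format(
        ",".join(QUERY_FIELDS))
    return [
        "ssh",
        "-i", key_path,               # key-based authentication
        "-o", "BatchMode=yes",        # fail instead of prompting for a password
        "-o", "ConnectTimeout=10",
        "{}@{}".format(user, host),
        remote,                       # executed remotely as a single command
    ]

def check_nvidia_smi(host, user, key_path, timeout=30):
    """Return True if nvidia-smi ran remotely and printed at least one line."""
    try:
        result = subprocess.run(build_ssh_command(host, user, key_path),
                                capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and bool(result.stdout.strip())

if __name__ == "__main__":
    # Hypothetical values; replace with a real resource to try it.
    print(build_ssh_command("gpu01.example.com", "monitor",
                            "/home/monitor/.ssh/id_ed25519"))
```

A False result from check_nvidia_smi can mean that SSH access failed, that the key was rejected, or that nvidia-smi is not on the remote user's path, so the SSH error output is worth inspecting when troubleshooting.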
Adding Resources into Monitoring
Add your NVIDIA GPU resources into monitoring. For more information on adding resources into monitoring, see Adding Resources.
Import LogicModules
From the LogicMonitor Module Exchange, import all NVIDIA GPU LogicModules. If these LogicModules are already present, ensure you have the most recent versions.
After the LogicModules are imported, data collection begins automatically.
LogicModules in Package
| Display Name | Type | Description | 
| Nvidia_SMI_SSH | DataSource | GPU metrics from Nvidia-SMI. | 
| addCategory_Nvidia_SMI | PropertySource | Attempts to identify Windows and Linux hosts using an Nvidia GPU with support for Nvidia-SMI. | 
When setting static datapoint thresholds on the various metrics tracked by this package’s DataSources, LogicMonitor follows the technology owner’s best practice KPI recommendations.
Recommendation: If necessary, adjust these predefined thresholds to meet the unique needs of your environment. For more information on tuning datapoint thresholds, see Tuning Static Thresholds for Datapoints.
