Last week we posted about monitoring best practices for MSP/CSPs (managed and cloud service providers). This week’s entry is a back-to-basics best practice, “managing host groups”, which matters for MSP/CSPs that want performance monitoring to fit the way they run their business. Whether you’re a CSP that wants to know which customers are taxing certain devices, or a more traditional MSP that wants forewarning of service degradation on clients’ equipment at their own locations, host groups give you the organized structure to do it. Giving your support team visibility by device type, location, or product/service level agreement lets you optimize your customers’ experience. Is the service up? Down? Degrading? As an MSP/CSP you want those answers while also maximizing the bottom line.
When using LogicMonitor, here are the specific steps to implement this best practice:
- Use a client-based group folder structure to keep things neat. Organizing hosts into named groups lets you edit and share host data at the group level, and gives you a visual representation of your architecture. It also lets you use Role Based Access Control to share LogicMonitor performance information with your clients simply, by granting each client access to its own top-level group, and makes per-client performance easier to manage.
- Underneath the client level, decide on the subgroup schema that works best for your business and your clients. You may want to separate a client’s systems by function (creating subgroups for ESX, Network, Databases, Storage, etc.), by client site, or by business unit. The important factors here are consistency between clients where possible, and remembering that hosts can be members of many groups or subgroups, so the same host may well be a member of both a location-based group (say, Goleta) and a functional group (Storage).
- When designing your subgroup schema, balance simplicity (always a good thing) against the fact that grouping by site gives you easier visibility into sitewide alarms. Also, if you do elect to have hosts be members of multiple groups, define which kind of group will be used to adjust thresholds or alerting. (Adjusting the same threshold on different groups at the same level that may contain the same hosts will lead to confusion.)
- As elsewhere, you should look to automation to simplify group allocation as much as possible. If you use consistent host names for customer equipment that include attributes such as customer name and location, you can use dynamic groups to allocate hosts into the correct customer groups and subgroups automatically. (For example, to collect all Windows systems from the Goleta site of the customer LogicMonitor into one group automatically, define the dynamic group’s Applies To expression as: isWindows() && system.displayname =~ "Goleta" && system.displayname =~ "LogicMonitor".) If you’d like more information on the importance of host naming, read this.
- Tag the name of each site’s collector to reflect whether it is a primary agent, a backup agent, or in some cases a solo collector. This helps when tracing problems and collector issues. Collector Management is accessed from the Settings tab.
- Deploy Collectors to customer sites, and don’t monitor remote sites over a VPN or WAN. Indirect monitoring is more likely to produce inaccurate results.
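
To make the dynamic group step above more concrete, here is a minimal Python sketch of the matching logic behind the example Applies To expression. It does not call LogicMonitor or use its API; it only imitates the filter locally so you can check a naming convention before relying on it. The `Host` class, the `dynamic_group_members` function, and the host names are all hypothetical, and the sketch assumes `=~` behaves like a case-sensitive "name contains this text" match.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for a monitored host (not a LogicMonitor type)."""
    display_name: str
    is_windows: bool


def dynamic_group_members(hosts, customer, site):
    """Imitate: isWindows() && displayname =~ site && displayname =~ customer.

    Assumption: '=~' is treated as a case-sensitive search for the given
    text anywhere in the display name.
    """
    members = []
    for host in hosts:
        if not host.is_windows:
            continue  # corresponds to the isWindows() condition
        name = host.display_name
        if re.search(re.escape(site), name) and re.search(re.escape(customer), name):
            members.append(host)
    return members


# Hypothetical host names following a customer-site-role convention.
hosts = [
    Host("LogicMonitor-Goleta-win01", True),
    Host("LogicMonitor-Goleta-esx01", False),  # right site, but not Windows
    Host("LogicMonitor-Austin-win02", True),   # right customer, wrong site
    Host("Acme-Goleta-win03", True),           # right site, wrong customer
]

members = dynamic_group_members(hosts, customer="LogicMonitor", site="Goleta")
print([h.display_name for h in members])
```

Running a check like this against your host inventory shows how much depends on consistent naming: a host whose name omits the site or customer simply never lands in the group.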