The generation of insights in LM Dexda is based on the use of machine learning to group related alerts into clusters. Using a set of specialized algorithms, LM Dexda identifies hidden patterns within the text features of alert data. LM Dexda analyses the alerts to dynamically manage their clustering. This grouping into clusters is controlled by correlation models.
The following describes the concept of correlation models. For information on how to create models, see Creating Models.
How Clustering Works
Alerts are forwarded for analysis by LM Dexda’s ML processor when any of the following criteria are true:
- The alert is a new alert.
- Any of the following fields have changed:
  - Alert state
  - Workflow state
  - First or Last event timestamp
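The forwarding criteria above can be sketched as a simple predicate. This is an illustration only: the field names (`alert_state`, `workflow_state`, `first_event_ts`, `last_event_ts`) and the function `should_forward` are hypothetical and are not taken from the LM Dexda schema or API.

```python
# Hypothetical field names; the real alert schema may differ.
WATCHED_FIELDS = ("alert_state", "workflow_state", "first_event_ts", "last_event_ts")


def should_forward(previous, current):
    """Return True if the alert should be sent to cluster processing.

    previous: the last known version of the alert (dict), or None if new.
    current:  the incoming version of the alert (dict).
    """
    if previous is None:  # no earlier version exists, so this is a new alert
        return True
    # Forward only when one of the watched fields has changed.
    return any(previous.get(f) != current.get(f) for f in WATCHED_FIELDS)
```

In this sketch, a change to any other field (for example a description update) does not trigger reprocessing.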
Alerts forwarded to cluster processing are compared to each other and to previous alerts and clusters to find new clusters or update existing ones. The specified models determine which alerts are compared (model filter) and how they are compared (model groups). Each alert is compared to each of the other available alerts or active clusters.
When comparing an alert to another alert, the fields specified in each correlation model are compared, and their textual similarity is calculated. If the alerts are deemed similar based on the model thresholds, a link between the alerts is created.
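As a rough illustration of this comparison, the sketch below scores two alerts field by field and links them when the scores meet the thresholds. It makes two assumptions that the text does not confirm: Python's `difflib.SequenceMatcher` stands in for whatever similarity measure LM Dexda actually uses, and every group-by field must meet its threshold for a link to be created. The field names and threshold values are examples.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Textual similarity in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def is_linked(alert_a: dict, alert_b: dict, thresholds: dict) -> bool:
    """Link two alerts if every group-by field meets its threshold (assumed rule)."""
    return all(
        similarity(str(alert_a.get(field, "")), str(alert_b.get(field, ""))) >= minimum
        for field, minimum in thresholds.items()
    )


# Example thresholds: resource must be identical, description 80% similar.
thresholds = {"resource": 1.0, "description": 0.8}
a = {"resource": "sw-01", "description": "Interface Gi0/1 is down"}
b = {"resource": "sw-01", "description": "Interface Gi0/2 is down"}
c = {"resource": "sw-02", "description": "Interface Gi0/1 is down"}

linked_ab = is_linked(a, b, thresholds)  # same resource, near-identical text
linked_ac = is_linked(a, c, thresholds)  # resource differs, so the 1.0 threshold fails
```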
When comparing an alert to a cluster, the alert's fields are compared to those of each alert in the cluster. The alert must pass the similarity threshold for all alerts in the cluster to create a link between them. When all links are determined, these links are analyzed to find the clusters.
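The rule that an alert must pass the threshold against every member of a cluster can be sketched as follows. As before, `difflib.SequenceMatcher` is only a stand-in for the real similarity measure, and the function and field names are hypothetical.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def links_to_cluster(alert: dict, cluster_alerts: list, field: str, threshold: float) -> bool:
    """True only if the alert meets the threshold against every cluster member."""
    if not cluster_alerts:  # an empty cluster cannot be linked to
        return False
    return all(
        field_similarity(alert[field], member[field]) >= threshold
        for member in cluster_alerts
    )
```

A single dissimilar member is enough to prevent the link, which keeps a cluster from drifting as alerts are added.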
If there is only one model, then all linked alerts form a cluster. An alert can only belong to one cluster and cannot move. If the links include a cluster, then all alerts connecting to that cluster will be added to it.
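For the single-model case, one way to picture how links become clusters is to treat each connected set of linked alerts as one cluster. The sketch below does this with a small union-find structure. Treating links as transitive is an assumption made for illustration, and the minimum cluster density check is deliberately left out.

```python
def clusters_from_links(alert_ids, links):
    """Group alerts so that each connected set of links forms one cluster."""
    parent = {a: a for a in alert_ids}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for a in alert_ids:
        groups.setdefault(find(a), set()).add(a)
    # Alerts without any link stay unclustered, so keep groups of two or more.
    return [g for g in groups.values() if len(g) > 1]
```

Because every alert ends up under exactly one root, each alert belongs to at most one cluster in this sketch.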
If there is more than one model, then there can be more than one link between an alert and another alert or cluster. If there is more than one link, the cluster is determined based on the following criteria:
- The number of alerts in the potential cluster.
- The highest average similarity between all the alerts.
- If a cluster already exists.
- The greater number of model group by fields (over all models).
- The greater number of models that have matched.
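The selection between competing links can be sketched as a ranking over candidate clusters. The sketch assumes that the criteria are applied in the order listed, that a larger alert count is preferred, and that each later criterion only breaks ties left by the earlier ones. None of this is confirmed by the text, and the `Candidate` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    alert_count: int        # number of alerts in the potential cluster
    avg_similarity: float   # average similarity between all the alerts
    already_exists: bool    # True if the cluster already exists
    group_by_fields: int    # number of group-by fields over all models
    models_matched: int     # number of models that matched


def rank_key(c: Candidate):
    # Tuples compare element by element, so earlier criteria take priority.
    return (c.alert_count, c.avg_similarity, c.already_exists,
            c.group_by_fields, c.models_matched)


def pick_cluster(candidates):
    """Return the candidate that ranks highest under the assumed ordering."""
    return max(candidates, key=rank_key)
```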
What is a Correlation Model?
A correlation model specifies:
- A filter controlling which alerts should be analyzed with the model, for example only alerts relating to “Cisco Meraki Wireless Access Points”.
- One or more “group by” fields for computing textual similarity, together with correlation sensitivity levels.
- The required minimum density (number of alerts) that must exhibit the same feature in order to form a cluster.
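The three parts of a model can be pictured as a small data structure. This is a conceptual sketch, not the LM Dexda model format: the class names, attribute names, the `resource_type` field, and the threshold values are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GroupBy:
    field_name: str
    threshold: float  # correlation sensitivity; 1.0 means identical


@dataclass
class CorrelationModel:
    name: str
    alert_filter: Callable[[dict], bool]  # which alerts the model analyzes
    group_by: List[GroupBy]               # fields compared for similarity
    min_cluster_density: int = 2          # alerts needed to form a cluster


# Hypothetical model that targets only Meraki access point alerts.
meraki = CorrelationModel(
    name="meraki-ap",
    alert_filter=lambda alert: alert.get("resource_type")
    == "Cisco Meraki Wireless Access Points",
    group_by=[GroupBy("resource", 1.0), GroupBy("description", 0.8)],
)
```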
Through correlation models you can control the number of insights generated, and ensure that they are actionable. Models let you specifically target business scenarios for which you want to generate actionable insights to be managed in your workflow.
Correlation Model Graph
The treeview chart in the Create Correlation Model page uses a similarity threshold of 1.0 (identical) to show how many cluster groups there would be, and the size of each group for a selected group in the model and a given time. Use the information to simulate what the correlation would look like and the possible number of generated insights.
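With a similarity threshold of 1.0, grouping reduces to counting identical values, which is roughly what such a preview needs to compute. The sketch below is an approximation of that idea only. The function name and the `resource` field are hypothetical, and the time range selection is omitted.

```python
from collections import Counter


def preview_groups(alerts, field):
    """Count alerts per identical field value, largest group first."""
    return Counter(alert.get(field) for alert in alerts).most_common()


alerts = [
    {"resource": "sw-01"},
    {"resource": "sw-01"},
    {"resource": "sw-01"},
    {"resource": "sw-02"},
]
groups = preview_groups(alerts, "resource")
```

Each entry in `groups` is a pair of a field value and the number of alerts sharing it, which corresponds to one cluster group and its size.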
For example, you can have two groups, one for correlation by resource (CI, configuration item) and one for correlation by description. For the CI, the matching correlation score has to be 100% (1), meaning that the resource has to be identical to fulfill the grouping criteria. For the description, an 80% (0.8) match is enough.
Parameters in the grouping drop-down are fields that are available for alerts. You can choose from any core or enriched field for the alert. For more information on available fields, see About Filters.
You can add filters to narrow the grouping and the resulting insight creation. For example, to run this correlation logic only against alerts from LogicMonitor, add a filter where “Source” equals “LogicMonitor”. As you add grouping parameters and filters, the table view on the right-hand side updates to reflect the current settings. For more information, see About Filters.
You also have the following additional settings to make your correlation model more specific.
- Timeout—The maximum time between the first time of the earliest alert and the first time of the latest alert in a cluster. Default is 15 minutes (900000 msec). Once 15 minutes have passed since the first incoming alert, no further alerts are correlated into that insight. This time span prevents clusters from overconnecting unrelated alerts. Alerts that arrive later and correlate are added to a new insight. With the default setting, a cluster therefore never spans more than 15 minutes.
- MinClusterDensity—Defines how many alerts have to meet the specified grouping criteria to form a cluster. Default value is 2. This means that at least two alerts need to match the grouping criteria within the given timeout period to create an insight. Depending on the type of resource, the cluster density can be narrow or broad.
For example, you can use a broad correlation for a geographic location like a datacenter, since a datacenter that is down may result in a massive number of alerts. For this scenario, a broad correlation, such as requiring at least 200 alerts to generate an insight, is recommended. The broader the group, the larger the number of cluster points.
- Stopwords—Stopwords are strings, partial or whole, that are removed from the alert message text before scoring analysis. Typical examples for a CI type of source are parts of URLs like “company.com” and “company.net”.
These are removed to prevent overmatching, ensuring that the matching logic is applied to the relevant information. Without removal of stopwords, the strings would appear more similar than they actually are, and the correlation scoring would be biased.
- RemoveNumbers—Removes numbers from a string, so the correlation only looks at alphabetic characters.
- CaseSensitive—Indicates whether the correlation should distinguish between uppercase and lowercase characters in a string.
- Trim—Trims off white space at the end of a string before the correlation comparison.
- Locale—Sets the locale used for case sensitivity. Default is “UK”. For a list of locales, see this documentation.
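The text-related settings above can be pictured as a normalization step applied before scoring, and the timeout as a simple window check. The sketch below is a rough approximation under stated assumptions: the function names and parameter names are hypothetical, the order in which the steps are applied is a guess, and locale handling is ignored.

```python
import re

TIMEOUT_MS = 15 * 60 * 1000  # 15 minutes expressed in milliseconds


def normalize(text, stopwords=(), remove_numbers=False, case_sensitive=True, trim=True):
    """Prepare alert text for similarity scoring (assumed order of steps)."""
    for word in stopwords:
        text = text.replace(word, "")      # drop stopword strings, partial or whole
    if remove_numbers:
        text = re.sub(r"\d+", "", text)    # strip digits
    if not case_sensitive:
        text = text.lower()                # ignore case differences
    if trim:
        text = text.rstrip()               # remove trailing white space
    return text


def within_timeout(cluster_first_ms, alert_first_ms, timeout_ms=TIMEOUT_MS):
    """True if the alert's first time falls inside the cluster's time window."""
    return 0 <= alert_first_ms - cluster_first_ms <= timeout_ms
```

For instance, with “.company.com” as a stopword, “web01.company.com” and “web02.company.com” are reduced to “web01” and “web02”, so the shared domain no longer inflates their similarity.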
Models are locked and read-only once submitted. Except for the name and description, you cannot edit or delete an existing model. This is due to the referencing between models and insights. An insight contains a reference to the model that was used to generate it. This reference is an important part of the problem-solving process, because it helps you understand why the insight was generated, and therefore a model is always preserved.
You can do the following when managing models:
- Edit—Update the Name and Description for an existing model that has been used. These changes do not affect the model correlation behavior.
- Clone—Copy an existing model and use it as a foundation to modify when creating a new model.
- Activate—Start using a model after you have created and submitted it.
- Deactivate—Stop the model from being used.
- Archive—Store a model that you no longer want to use.
- Unarchive—Make a model available again for use (must be activated).
The following states exist for models:
- Ready—The model has been submitted, and is ready to be used (activated).
- Running—The model is currently in use.
- Archived—The model should no longer be used.
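The actions and states can be read as a small state machine. The sketch below encodes one plausible reading, with several assumptions: deactivating a running model returns it to Ready, archiving is only allowed from Ready, and unarchiving returns the model to Ready (consistent with the note that it must then be activated). The exact transitions permitted by LM Dexda may differ.

```python
# Assumed transitions; only the state and action names come from the text.
TRANSITIONS = {
    ("Ready", "activate"): "Running",
    ("Running", "deactivate"): "Ready",
    ("Ready", "archive"): "Archived",
    ("Archived", "unarchive"): "Ready",
}


def apply_action(state: str, action: str) -> str:
    """Return the new model state, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a model in state {state}") from None
```

Edit and Clone are left out because they do not change the state of the model they are applied to.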
Model Creation Guidelines
The prebuilt correlation by configuration item (CI) is a good start, but eventually you will want to customize this based on detailed knowledge about your specific business. Common customization scenarios include cluster grouping based on resources, applications, environments, and geographical locations like datacenters.
When working with models, it is important to know your data and consider which fields hold valuable data. The more specific you get with your correlations, the more specific your insights will be. Reducing the number of insights is important, but it is also important that extracted insights are valuable and actionable. An insight has little value if you don’t know how to fix the associated issue.
CI (resource, configuration item) is an example of a common and very specific correlation item, compared to for example correlation by geographic region. Other examples of correlation items are applications and business services. Correlating on application is quite specific, but less specific than CI. These items can have a grouping of 3 or 5 for example, depending on how you want your operation flow to work.
The more specific the grouping, the lower the minimum cluster density number should be. The “Description” field usually varies a lot and requires more groups, whereas the CI field is similar and requires fewer groups. Correlation based on “Source” will always match on the same value, so this only requires one group per timeout block.
When testing a model, you can clone an existing model, make adjustments to it, and run it in your environment. To investigate the effectiveness of the model, you can look at the number of insights generated. You can also go to the Explore page and filter the insights by Model ID List to see which model generated a specific insight. Select the insight to access the correlation details and the model that generated the insight.
Grouping by Tenant and Domain Separation
LM Dexda supports multi-tenant processing and domain separation. This allows the logical separation of an instance into separate domains, so that a single instance can support multiple organizations. If the tenant field is populated with a value, all processing, including correlation, is done in the context of a tenant ID, allowing MSPs (Managed Service Providers) to segregate their events, alerts, and insights by tenant.
The tenant.identifier property is set on the resource in LogicMonitor. It is automatically passed to LM Dexda with the event, and mapped to the Tenant ID field of the event record in LM Dexda. The tenant here is usually an MSP customer on a resource or resource group dedicated to a customer.
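Conceptually, tenant separation means that alerts are partitioned by tenant before any correlation takes place, so alerts from different tenants are never compared. A minimal sketch of that partitioning is shown below. The `tenant_id` key and the function name are hypothetical, and alerts without a tenant are grouped under `None` here purely for illustration.

```python
from collections import defaultdict


def partition_by_tenant(alerts):
    """Split alerts into per-tenant lists so each tenant is correlated separately."""
    by_tenant = defaultdict(list)
    for alert in alerts:
        by_tenant[alert.get("tenant_id")].append(alert)
    return dict(by_tenant)
```

Each resulting list would then go through correlation on its own, which keeps one customer's alerts out of another customer's insights.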