Get Better Observability With Machine Learning Anomaly Detection.
DevOps teams today are challenged with the rapid growth and complexity of infrastructure. Managing those environments only through static thresholds becomes insufficient, so to address this issue, modern DevOps teams rely on advanced ML/AI algorithms.
LogicMonitor’s Anomaly Detection solution is part of our AIOps Early Warning System that provides context, meaningful alerts, illuminates patterns, and enables foresight and automation. All of this is done automatically, without exposure to ML/AI algorithms and parameters.
In this article we will cover:
- The main use cases for DevOps teams
- The right Anomaly Detection algorithm
- If DevOps engineers should act as data scientists
A Little Bit of History
The first step toward artificial Neural Networks came in 1943 when Warren McCulloch and Walter Pitts wrote a paper on how neurons might work. They even modeled a simple neural network with electrical circuits.
Other algorithms used to describe changes in time series using a mathematical approach (e.g. the ARIMA model by George Box and Gwilym Jenkins) were developed in the 1970s.
With that, only in the last few years has the computational power (in 2009 when GPUs started getting used for training Neural Network) and a huge amount of data available played a key role for efficient and accurate learning of the model.
What Are the Main Use Cases for DevOps Teams?
At LogicMonitor we identify three main use cases for anomaly detection:
- Predict issues before they occur (prevent severe issues).
- Suppress alerts on issues that don’t need action to be taken (noise reduction)
- Troubleshoot issues as they occur, answering a question such as
- Is this issue abnormal?
- How different is the signal compared to the last day/week?
- What is the change in the environment?
Comparing original time-series with one day-of-set (orange) and the expected range
Automatically Choose the Right Anomaly Detection Algorithm
Several anomaly detection techniques have been proposed in literature. Some of the popular techniques are Forests, Tensor-based, correlation-based, Neural Networks, Bayesian Networks, and deviations from association rules and frequent itemsets.
At LogicMonitor, our platform processes data in a stream, keeping the system agile so it can quickly adjust and use the right algorithm. We believe that a DevOps engineer should not need to become a data-scientist (we’ve hired a few). Our platform should do the hard work for you
Many competing monitoring platforms require DevOps teams to fill in certain parameters, but LogicMonitor takes away this burden and automatically answers the following questions:
- Which resample algorithm to use?
- What is the right standard deviation?
- How to tune for daily, weekly, or seasonality?
- Should I use ARIMA/SARIMA?
- Should we be aware of daylight-savings or time-zones?
Workloads are classified automatically, and for each datapoint the Model is learning and adjusting. Transformers are used to handle seasonalities, shifts, etc. Once implemented, the Model warm-up times are as follows:
- 12.5 hours – Automatically identifies asymmetric deviations.
- 12.5 hours -Automatically identifies rate of change based on shelf and amplitude transformers.
- 2.5 days – Discovers daily seasonality.
- 9 days – Discovers weekly seasonality.
note: the daily seasonality is kicking in automatically in after 2.5 days
What Users Should Know When Setting up Anomaly Detection
When setting up anomaly detection, users should not be exposed to ML/AI algorithms parameters. Adjusting algorithm sensitivity should be described in simple English.
In rare cases where tuning may be required, it is possible, but our philosophy is that burden should not be on the user by default and avoided in 99.9% of the scenarios.