Anomaly Detection

Anomaly Detection and Automated Root Cause Analysis are two AppDynamics features designed to reduce the Mean Time To Resolution (MTTR) for application performance problems.

Anomaly Detection automatically determines whether every Business Transaction in your application is performing normally. Automated Root Cause Analysis helps you quickly determine the root cause of problems revealed by Anomaly Detection.

How Does Anomaly Detection Work?

Anomaly Detection uses machine learning capabilities to reduce the Mean Time to Detect (MTTD) when an anomaly occurs in a business transaction. It uses a specially designed algorithm that does not require you to configure anything. The Anomaly Detection algorithm works as follows:

It detects whether any abnormal reading is reported for the Errors per Minute (EPM) metric.
It detects whether any abnormal reading is reported for the Average Response Time (ART) metric.
It then combines the data it learned from these metric readings using heuristics that are designed to reduce alert noise.
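
The combining step can be pictured with a minimal sketch. The actual heuristics are not described here, so the rule below (alert only when at least one metric has been abnormal for several consecutive minutes) is purely an assumption, and the names `combine_flags` and `min_consecutive` are hypothetical.

```python
from typing import List


def combine_flags(epm_abnormal: List[bool], art_abnormal: List[bool],
                  min_consecutive: int = 3) -> List[bool]:
    """Return one alert flag per minute.

    Hypothetical noise-reduction heuristic: a minute raises an alert only
    when EPM or ART has been abnormal for `min_consecutive` minutes in a row.
    """
    if len(epm_abnormal) != len(art_abnormal):
        raise ValueError("both series must cover the same minutes")
    alerts = []
    streak = 0  # consecutive minutes with at least one abnormal metric
    for epm_flag, art_flag in zip(epm_abnormal, art_abnormal):
        streak = streak + 1 if (epm_flag or art_flag) else 0
        alerts.append(streak >= min_consecutive)
    return alerts


epm = [False, True, False, False, True, True, True, False]
art = [False, False, False, True, False, False, True, False]
print(combine_flags(epm, art))
# [False, False, False, False, False, True, True, False]
```

In this sketch the isolated abnormal reading in the second minute never raises an alert, while the sustained run starting in the fourth minute does once it reaches three minutes.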

Anomaly Detection employs multiple techniques to ensure that the metric data it collects is accurate:

It disregards temporary spikes and periods with no data.
It normalizes the metric data. For example, a spike in EPM may not indicate a real problem if Calls per Minute (CPM) increased correspondingly. Because EPM alone can be misleading, Anomaly Detection uses the error rate (EPM/CPM).
It does not apply traditional seasonal baselines. Instead, it correlates the variance of EPM and ART with CPM to obtain reliable results.
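
The error rate normalization is simple arithmetic, illustrated below as a sketch. The function names and the sample numbers are made up for illustration, and returning `None` for a minute without calls is an assumed way to represent a period of no data, not the product's documented behavior.

```python
from typing import List, Optional


def error_rate(epm: float, cpm: float) -> Optional[float]:
    """Errors per call for one minute, or None when there were no calls."""
    if cpm <= 0:
        return None  # no-data minute: skip it rather than report zero errors
    return epm / cpm


def error_rates(epm_series: List[float],
                cpm_series: List[float]) -> List[Optional[float]]:
    """Apply error_rate minute by minute to two aligned series."""
    return [error_rate(e, c) for e, c in zip(epm_series, cpm_series)]


# Minute 3: EPM jumps tenfold, but so does CPM, so the error rate is unchanged.
# Minute 4: no traffic at all.
# Minute 5: EPM rises while CPM stays flat, so the error rate rises.
epm = [5, 5, 50, 0, 50]
cpm = [500, 500, 5000, 0, 500]
print(error_rates(epm, cpm))
# [0.01, 0.01, 0.01, None, 0.1]
```

Only the last minute stands out once EPM is divided by CPM, even though the raw EPM values in minutes 3 and 5 are identical.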

Figure: Correlation of EPM and CPM Variance

What is Root Cause Analysis (RCA)?

When an entity in your application has an anomaly, you will want to know why. Anomaly Detection uses AI capabilities to power Automated Root Cause Analysis, which monitors the health of all entities in your application and shows you the suspected causes of every anomaly. You can confirm or rule out the suspected causes at a glance and drill down into deviating metrics and snapshots as needed, so you can quickly determine the root cause of any anomaly in the application.

How Does RCA Work?

RCA reduces the Mean Time to Identification (MTTI) by automatically pointing to the source of the problem. RCA uses metrics to identify the fault domain, and uses snapshots and events from the entire application to find and surface suspected problems. This holistic approach has two parts:

Fault domain isolation: identification of the exact location of the problem in the system
Impacted component analysis: analysis of logs, snapshots, traces, infrastructure data, and so on to determine the affected components

How is Anomaly Detection Different from Health Rules?

While both Anomaly Detection and health rules alert you to performance problems in your application, Anomaly Detection provides powerful insights that would be difficult to obtain using health rules.

Anomaly Detection	Health Rules
Anomaly Detection uses machine learning to discover the normal ranges of key business transaction metrics and alerts you when these metrics deviate significantly from expected values. This enables Anomaly Detection to identify a wider range of problems than a person could capture in Health Rules.	Health rules are manually created to apply logical conditions that one or more metrics must satisfy. For example, you could monitor the Average Response Time (ART) to check if this metric deviates from the configured baseline.
Anomaly Detection requires no configuration except when you want to limit anomaly alerting.	AppDynamics provides a default set of health rules and you create additional health rules manually as required, configuring time periods, trends, and schedules.
Anomalies are associated with business transactions.	Health rules apply to any entity, such as business transactions or service endpoints.
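
To make the contrast concrete, the kind of logical condition a health rule applies can be sketched as a single comparison. This is not AppDynamics code or configuration: the function name, the parameters, and the threshold of two standard deviations are assumptions chosen for illustration.

```python
def art_health_rule_violated(art_ms: float, baseline_ms: float,
                             baseline_stddev_ms: float,
                             max_stddevs: float = 2.0) -> bool:
    """True when ART exceeds the baseline by more than max_stddevs deviations.

    Hypothetical condition: every value here is configured by a person,
    which is the manual work that Anomaly Detection avoids.
    """
    return art_ms > baseline_ms + max_stddevs * baseline_stddev_ms


# With a 300 ms baseline and a 50 ms standard deviation, the limit is 400 ms.
print(art_health_rule_violated(art_ms=450, baseline_ms=300, baseline_stddev_ms=50))  # True
print(art_health_rule_violated(art_ms=380, baseline_ms=300, baseline_stddev_ms=50))  # False
```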