Monitor Anomalies

Anomaly Detection is enabled by default for the entity types in the application. Machine learning models take 48 hours to train on the entity types in your application. See Model Training.

After the models are trained for the entity types, you can monitor the anomalies associated with an application and determine the details of the violating metrics.

To monitor anomalies, perform the following:

Click Observe.
Go to one of the following domains:
- Application Performance Monitoring
- Infrastructure

Select an entity type:

Domain	Entity Types
Application Performance Monitoring	Business Transactions, Services, Service Endpoints, or Service Instances
Infrastructure	Hosts (Anomaly Detection is supported only for AWS EC2, AWS Application Load Balancer, and AWS Classic Load Balancer.)

For example, from the Application Performance Monitoring domain, select the Services entity type. The service details page lists the relationship map, the health violations (including anomalies), and the violating metrics.

Click an entity name to examine the anomaly.

For the Application Performance Monitoring domain, click List to view the list of entity names.

Examine the Anomaly

You can view the alert related to the anomaly in the Entity Health Timeline and in the VIOLATION DETAILS section in the right panel. This data helps you identify the exact metric in violation and provides associated details that help you take corrective actions.

To view the anomaly details related to an entity:

On the Entity Health Timeline, click the anomaly event to view the details and the Suspected Causes in the VIOLATION DETAILS section on the right. The VIOLATION DETAILS section indicates the overall status of the anomalies for a given entity.
On the Flow map, view the unhealthy entities that are highlighted in red along with the call paths.

The VIOLATION DETAILS section displays the following information:

Service name
Violating metric name
Average value of the violating metric
Start date and time of the violation along with the duration
Status (Open or Closed) of the violation
Suspected causes (lists a maximum of top three suspected causes)

Each suspected cause lists the service name, the ID, the affected entities path, the deviating metric, and an associated call path.
Call path
Violating metric and suspected cause metric graph with the timeline plotted on the X-axis
Other properties like service name and service namespaces

View Alert Details

To view the details of the alert triggered by the anomaly:

In the Alerts section, click the name of an alert of type Anomaly to open the Alert Details pane.
Click Go to Alert Details to open the Diagnostics tab. The tab displays the various stages of the anomaly alert and the details of the violating metrics and suspected cause metrics in the form of graphs.
On the Alert Details pane, click Remediation to view the Suspected Causes paths and links to the corresponding logs and traces for detailed analysis.

The Diagnostics tab in the Alert Details pane displays the severity stages (Critical, Warning, Normal, and Unknown) that the alert has transitioned through and the time of each transition. Red indicates Critical status, yellow indicates Warning status, green indicates Normal status, and grey indicates Unknown status. If the anomaly is in an open state, the end time of the anomaly violation is the current time. If the anomaly is in a closed state, the start and end times show when the anomaly violation occurred.
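
The colour legend and the open versus closed end-time rule described above can be sketched in a few lines of Python. This is only an illustration of the behaviour, not the product's API: the `AnomalyViolation` record and the `displayed_end_time` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Colour legend as described for the Diagnostics tab.
SEVERITY_COLORS = {
    "Critical": "red",
    "Warning": "yellow",
    "Normal": "green",
    "Unknown": "grey",
}


@dataclass
class AnomalyViolation:
    """Hypothetical record of one anomaly violation (not a product type)."""
    start: datetime
    end: Optional[datetime]  # None while the violation is still open

    @property
    def is_open(self) -> bool:
        return self.end is None


def displayed_end_time(violation: AnomalyViolation, now: datetime) -> datetime:
    """Return the end time to display for a violation.

    An open violation runs up to the current time; a closed violation
    keeps its recorded (historical) end time.
    """
    return now if violation.is_open else violation.end
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the sketch deterministic and easy to check.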

An anomaly cannot be evaluated if data is not available or is not flowing from a source. During a no-data period, the anomaly status is Unknown. The Anomaly Detection timeline in the Entity Health Timeline, the Violating Metrics graph, and the Suspected Cause metrics graph display the no-data period in grey.
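
As a rough illustration of how no-data periods map to the grey segments, the sketch below finds runs of missing samples in a metric series. It assumes the series is a list of evenly spaced samples in which a missing sample is represented as `None`; that representation and the `no_data_periods` helper are assumptions made for this example, not part of the product.

```python
from typing import List, Optional, Sequence, Tuple


def no_data_periods(values: Sequence[Optional[float]]) -> List[Tuple[int, int]]:
    """Return inclusive (start_index, end_index) pairs for each run of missing samples.

    Each returned run corresponds to a period in which the anomaly status
    would be Unknown and the graphs would be shaded grey.
    """
    periods: List[Tuple[int, int]] = []
    run_start: Optional[int] = None
    for i, value in enumerate(values):
        if value is None:
            if run_start is None:
                run_start = i  # a new no-data run begins here
        elif run_start is not None:
            periods.append((run_start, i - 1))  # the run ended at the previous sample
            run_start = None
    if run_start is not None:
        # The series ends while data is still missing.
        periods.append((run_start, len(values) - 1))
    return periods
```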

You can view the details of the violating metrics, such as the source and the value at violation, as well as the suspected cause metrics. Examine the graphs to correlate the deviating metric data with that of the violating metric. You can view, scroll through, and hover over the graphs to determine the deviation. The metric value is shown as a thin blue line. Hover over a time point to view the metric value in numerical form. In the violating metric graph, you can view the upper and lower bounds (thresholds) of a metric, which define its normal range. By observing the metric in the context of the normal range, you can identify the abnormal metric values.
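
The idea of reading a metric against its normal range can be sketched as follows. In the product, the upper and lower bounds come from the trained models; here they are simply given as input. The `MetricPoint` type and the `abnormal_points` function are hypothetical, and treating a value exactly equal to a bound as normal is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MetricPoint:
    """Hypothetical sample of a metric with its normal range at that time."""
    timestamp: int   # for example, epoch seconds
    value: float
    lower: float     # lower bound of the normal range
    upper: float     # upper bound of the normal range


def abnormal_points(points: Sequence[MetricPoint]) -> List[MetricPoint]:
    """Return the samples whose value falls outside [lower, upper].

    Values exactly on a bound are treated as normal (an assumption).
    """
    return [p for p in points if p.value < p.lower or p.value > p.upper]
```

For example, given samples with a normal range of 10 to 20, a value of 25 would be returned as abnormal, while values of 10, 15, and 20 would not.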

The Remediation tab displays the root cause of the anomaly found in the entity. These details help determine the source of the anomaly and deduce the affected path. See Determine the Root Cause of an Anomaly.

You can view the logs and traces related to the alert by clicking the respective links.