You can configure Anomaly Detection for the entity types of your applications and infrastructure to detect performance issues automatically. This feature lets you detect performance issues without prior experience in writing complex evaluation conditions, such as those used in health rules. Once configured, Anomaly Detection uses machine learning to determine automatically whether the specified entities in your application perform within acceptable limits.

With this feature, you can:

  • Filter, by tags and attributes, the specific entities for which you want to configure Anomaly Detection.
  • Link HTTP request actions of your choice and get an automated response when performance deviates from the acceptable limits.
  • Choose a sensitivity level (High, Medium, or Low) of the Anomaly Detection algorithm based on your business needs.
  • Test the Anomaly Detection configuration for the entities that are in your development or staging environments.

How to Create a Configuration

To configure Anomaly Detection:

  1. Click Configure > Anomaly Detection.
  2. Click Create configuration to open the configuration wizard.

Alternatively, you can configure Anomaly Detection for an entity type from the Observe page. Perform the following:

  1. Click Observe.
  2. Go to one of the following domains:
    • Application Performance Monitoring
    • Infrastructure
    • Kubernetes
  3. Click an entity type of the domain:

    Domain | Entity Type
    Application Performance Monitoring | Services, Service Instances, Service Endpoints, or Business Transactions
    Infrastructure | Hosts (Anomaly Detection is supported only for AWS hosts.)
    Kubernetes | Cluster, Namespace, Pods, or Workloads
  4. Click an Entity Name to view its details.

    For the Application Performance Monitoring domain, click List to view the list of entity names.
  5. Under the HEALTH AND ALERTING section, click Anomaly Detection.
  6. Click Create configuration to open the configuration wizard corresponding to the selected entity type.

The configuration process involves the following three steps:

  1. Select entities and detection sensitivity
  2. Link actions
  3. Review the settings

Select Entities and Detection Sensitivity 

You can configure Anomaly Detection for the following domains and their entity types:

Domain: Application Performance Monitoring

Entity types:
  • Business Transactions
  • Services
  • Service Endpoints
  • Service Instances

Monitored metrics:

Metric | Description
Average Response Time | The time each request must wait to be granted a global resource, summed over all requests and then divided by the total number of requests. Values are converted from nanoseconds to milliseconds.
Calls Per Minute | The number of calls reported during one minute.
Errors Per Minute | The number of errors reported in one minute.

Domain: Infrastructure

Entity type: AWS EC2

Metric | Description
CPU Used Utilization | The percentage of time the CPU was busy processing system or user requests.
Disk Avg IO Utilization | The average time spent processing read and write requests on all disks and partitions as a percentage of the total reported time period. Databases often report high disk I/O utilization due to frequent read/write requests.
Memory Used Utilization | The amount of memory used by applications.
Network Incoming Errors/min | The number of incoming packet errors the network incurs every minute.
Network Incoming Packets Dropped | The number of incoming data packets per second dropped by all monitored network devices.
Network Outgoing Errors/min | The number of outgoing packet errors the network incurs every minute.
Network Outgoing Packets Dropped | The number of outgoing data packets per second dropped by all monitored network devices.
Page Faults/sec | The number of page faults per second for the system.

Entity type: AWS Application Load Balancer

Metric | Description
Target Response Time | The time elapsed after a request leaves the load balancer until it receives a response from the target.
Target Connection Errors | The number of connections that were not successfully established between the load balancer and target.

Entity type: AWS Classic Load Balancer

Metric | Description
Backend Response Time | The total time elapsed from the time the load balancer sent the request to a registered instance until the instance started to send the response headers.
Backend Connection Errors | The number of failed connections between the load balancer and the registered instances.
Surge Queue Length | The total number of requests (HTTP listener) or connections (TCP listener) that are pending routing to a healthy instance.

Domain: Kubernetes

Entity type: Cluster

Metric | Description
CPU Used | The total CPUs used in a cluster.
CPU Requests | The total CPU requests from the pods in a cluster.
Memory Used | The total memory used by the pods in a cluster.
Memory Requests | The total memory requested by the pods in a cluster.
Memory Pressure | The amount of memory pressure experienced on nodes due to a decrease in available memory.
Disk Pressure | The amount of disk pressure experienced on nodes due to a decrease in available disk space.
Pods in Pending State | Pods in the cluster that are in the pending state because they cannot be scheduled to a node due to a shortage of resources.
Pods in Failed State | Pods in the cluster that are in the failed state because of errors.
Pods in Unknown State | Pods in the cluster that are in the unknown state because the node on which they are running has become unresponsive, is disconnected, or is experiencing other issues.

Entity type: Namespace

Metric | Description
CPU Used | The total CPUs used in a namespace.
CPU Requests | The total CPU requests from the pods in a namespace.
Memory Used | The total memory used by the pods in a namespace.
Memory Requests | The total memory requested by the pods in a namespace.
Pods in Pending State | Pods in the namespace that are in the pending state because they cannot be scheduled to a node due to a shortage of resources.
Pods in Failed State | Pods in the namespace that are in the failed state because of errors.
Pods in Unknown State | Pods in the namespace that are in the unknown state because the node on which they are running has become unresponsive, is disconnected, or is experiencing other issues.

Entity type: Workload

Metric | Description
CPU Used | The total CPUs used in a workload.
CPU Requests | The total CPU requests from a workload.
Memory Used | The total memory used by a workload.
Memory Requests | The total memory requested by a workload.

Entity type: Pod

Metric | Description
CPU Used | The total CPUs used in a pod.
CPU Requests | The total CPU requests from a pod.
Memory Used | The total memory used by a pod.
Memory Requests | The total memory requested by a pod.
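
As a rough illustration of how the three Application Performance Monitoring metrics listed above relate to raw request data, the following Python sketch computes them for a one-minute window. It treats each request's measured time as a plain duration in nanoseconds. The `Request` record, its field names, and the sample values are assumptions made for this example, not the product's data model.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_ns: int  # hypothetical field: measured time for the request, in nanoseconds
    is_error: bool    # hypothetical field: whether the request ended in an error


def summarize_minute(requests: list[Request]) -> dict[str, float]:
    """Compute the three APM metrics for requests observed in one minute."""
    calls = len(requests)
    errors = sum(1 for r in requests if r.is_error)
    total_ns = sum(r.duration_ns for r in requests)
    # Average = sum of all times / number of requests, converted from ns to ms.
    avg_ms = (total_ns / calls) / 1_000_000 if calls else 0.0
    return {
        "average_response_time_ms": avg_ms,
        "calls_per_minute": calls,
        "errors_per_minute": errors,
    }


# Made-up sample: three requests seen in the same minute, one of them failed.
window = [
    Request(120_000_000, False),
    Request(80_000_000, False),
    Request(400_000_000, True),
]
summary = summarize_minute(window)
print(summary)
```

For the sample window, the average is 200 ms over 3 calls with 1 error. Anomaly Detection evaluates metrics like these against learned behavior rather than fixed thresholds, so this sketch covers only the metric arithmetic.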

In Step 1 of the wizard, perform the following:

  1. Select a domain:
    • Application Performance Monitoring
    • Infrastructure
    • Kubernetes
  2. In Selected Entities, select an entity type.

    The entity type is preselected if you have already chosen it from the Observe page.

  3. In the Filter section, enter a filter expression that uses tags and attributes to narrow the selection to specific entities of that type.
    The attributes and tags are auto-populated based on the entity type that you have selected. You can use them to configure Anomaly Detection for specific entity names, entity types, and so on. For example, you can select the entity type Service and enter the following filter expression to configure Anomaly Detection only for the entities that match it:

    attributes(service.name) = 'test' && attributes(status) IN [Normal]
    For more information about the supported filter operations, see Filters.

  4. In the Detection Sensitivity section, select one of the following sensitivity levels:

    Sensitivity Level | Description
    High | Use this level for business-critical services to ensure that no issue goes undetected in your environment. It triggers more alerts but with lower statistical confidence.
    Medium | Use this level for services that are important to your business but not critical. This level is selected by default.
    Low | Use this level for services that have low business impact, to avoid too many alerts.
  5. Click Next to link HTTP request actions.
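
To make the effect of the example filter in step 3 concrete, the sketch below mimics it in Python against a few made-up entities. It is only an emulation of the expression `attributes(service.name) = 'test' && attributes(status) IN [Normal]`. The entity dictionaries, the `Critical` status value, and the `matches` helper are hypothetical; the real filter is evaluated by the platform.

```python
# Hypothetical entities, each represented by its attribute map.
entities = [
    {"service.name": "test", "status": "Normal"},
    {"service.name": "test", "status": "Critical"},
    {"service.name": "checkout", "status": "Normal"},
]


def matches(attributes: dict[str, str]) -> bool:
    """Emulate: attributes(service.name) = 'test' && attributes(status) IN [Normal]."""
    name_ok = attributes.get("service.name") == "test"
    status_ok = attributes.get("status") in {"Normal"}
    return name_ok and status_ok  # both conditions must hold (&&)


selected = [e for e in entities if matches(e)]
print(selected)
```

Only the first entity satisfies both conditions, so the configuration would apply to it alone.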

Link Actions

In Step 2 of the wizard, you can view the HTTP request actions available in your Cisco Cloud Observability Tenant and link them to the Anomaly Detection configuration. To link a new HTTP request action, you must first create it. See Create HTTP Request Action.

To link an HTTP request action:

  1. Click +Add.
  2. In the HTTP Action section:
    1. Select an action from the list.
    2. Select a trigger from the list. You can select multiple trigger events that cause the action to run.

      The Preview pane on the right displays mock data that the HTTP request contains. It does not display the request header; however, the actual request includes the header.

  3. Click +Add and repeat steps 2.a and 2.b to link multiple actions.
  4. Click Next to review the settings.
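
Because a linked action sends an HTTP request when a trigger fires, the receiving side needs an endpoint that accepts it. The following is a minimal sketch of such a receiver using Python's standard library. The payload is assumed to be a JSON object, and the port number and the `summarize` helper are illustrative choices. Check the Preview pane for the fields that your action actually sends.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(body: bytes) -> str:
    """Return a one-line summary of a request body assumed to be a JSON object."""
    payload = json.loads(body or b"{}")
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    # List only the top-level keys, since the real field names are not assumed here.
    return "received fields: " + ", ".join(sorted(payload))


class ActionHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            print(summarize(self.rfile.read(length)))
            self.send_response(204)  # accepted, nothing to return
        except ValueError:
            # Covers malformed JSON as well as non-object payloads.
            self.send_response(400)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Block and handle incoming action requests on the given (arbitrary) port."""
    HTTPServer(("", port), ActionHandler).serve_forever()
```

Calling `serve()` starts the listener. In practice you would replace the `print` call with whatever automated response you need, such as opening a ticket or paging an on-call engineer.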

Review the Settings

In Step 3 of the wizard, specify the following details to complete the configuration:

  1. Enter a name for the Anomaly Detection configuration.
  2. (Optional) Deselect Turn on this configuration if you want the configuration to be disabled after it is created. By default, this option is selected. It is recommended to keep it selected so that you receive an automated response when performance issues are detected in the monitored metrics.
  3. Select one of the following options to evaluate the health of the entity when no metric data is available for evaluation:
    • Unknown: When no data is available, the anomaly detection algorithm treats the health of the entity as unknown, and the health status of the entity is shown as Grey.
    • Healthy: When no data is available, the anomaly detection algorithm treats the health of the entity as healthy, and the health status of the entity is shown as Green.
  4. (Optional) If you want to test the configuration, select Yes, turn on test mode.

    Test mode allows you to assess anomaly detection capabilities in non-production environments, such as development or staging. In this mode, anomaly detection can detect performance issues even when little metric data has been collected.

  5. Click Submit to save the configuration.

The configuration applies to all the monitored entities of the specified entity type unless you have defined filter criteria in Step 1 of the wizard.
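
The choices made across the three wizard steps can be pictured as a single record. The dataclass below is a hypothetical summary for illustration only: its class and field names are assumptions and do not correspond to a documented API or file format. The defaults mirror those stated above, namely Medium sensitivity and the configuration turned on.

```python
from dataclasses import dataclass, field
from enum import Enum


class Sensitivity(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class NoDataHealth(Enum):
    UNKNOWN = "Unknown"  # health status shown as grey
    HEALTHY = "Healthy"  # health status shown as green


@dataclass
class AnomalyDetectionConfig:
    # Step 3: name and no-data behavior (the wizard asks you to pick one option).
    name: str
    no_data_health: NoDataHealth
    # Step 1: domain, entity type, optional filter, sensitivity.
    domain: str
    entity_type: str
    filter_expression: str = ""  # empty means all entities of the type
    sensitivity: Sensitivity = Sensitivity.MEDIUM  # selected by default
    # Step 2: names of linked HTTP request actions (hypothetical representation).
    linked_actions: list[str] = field(default_factory=list)
    # Step 3: optional switches.
    enabled: bool = True     # "Turn on this configuration"
    test_mode: bool = False  # assumed off unless you opt in


# Made-up example values.
config = AnomalyDetectionConfig(
    name="test-services",
    no_data_health=NoDataHealth.UNKNOWN,
    domain="Application Performance Monitoring",
    entity_type="Services",
    filter_expression="attributes(service.name) = 'test'",
)
print(config)
```

Leaving `filter_expression` empty corresponds to applying the configuration to every monitored entity of the chosen type.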

View the Configurations

The Configure > Anomaly Detection page displays the list of Anomaly Detection configurations available in your Cloud Tenant. The list contains both the default set of configurations and the user-defined configurations. You can update, delete, or disable any configuration as required.

Disabling or deleting the Anomaly Detection configurations for the entities affects the root cause analysis functionality. To use the root cause analysis functionality, always keep Anomaly Detection enabled for all the entities in the call path.
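
One way to reason about this requirement is to check, for a given call path, which entities lack an enabled configuration. The sketch below does this over plain Python collections. The entity names and both inputs are invented for the example, and the code does not query the product.

```python
def missing_coverage(call_path: list[str], enabled: set[str]) -> list[str]:
    """Return entities in the call path that have no enabled Anomaly Detection."""
    return [entity for entity in call_path if entity not in enabled]


# Hypothetical call path and set of entities with an enabled configuration.
call_path = ["web-frontend", "checkout", "payments", "orders-db"]
enabled = {"web-frontend", "checkout", "orders-db"}

gaps = missing_coverage(call_path, enabled)
print(gaps)
```

Here `payments` is the only gap, so root cause analysis along this path could be affected until a configuration covering it is enabled.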