You can define a condition or a set of conditions to evaluate the performance metrics of your application.
To evaluate multiple conditions together, combine them with AND and OR operators in a boolean expression.
You must specify the evaluation scope in the Critical Criteria and Warning Criteria panels.
You can use Alert Sensitivity Tuning to configure health rule conditions for a business transaction, a service endpoint or a remote service only. You must set the health rule evaluation criteria to 'average of all nodes'. For more information, see Alert Sensitivity Tuning.
If you select the percentage of nodes, enter the percentage. If you select the number of nodes, enter an absolute number.
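As a rough illustration of the two scope options above, the sketch below decides whether a condition counts as violated given per-node results. The function name, the scope labels, and the use of "at least" comparisons are assumptions made for this example, not documented product behavior.

```python
def scope_violated(node_violations, scope, threshold):
    """Decide whether enough nodes violate a condition.

    node_violations: list of booleans, one per node (True = node violates).
    scope: "percentage" or "number" (hypothetical labels for this sketch).
    threshold: a percentage (0-100) or an absolute node count.
    """
    total = len(node_violations)
    if total == 0:
        return False  # assumption: no nodes means nothing to violate
    violating = sum(node_violations)
    if scope == "percentage":
        return 100.0 * violating / total >= threshold
    if scope == "number":
        return violating >= threshold
    raise ValueError(f"unsupported scope: {scope}")


nodes = [True, True, False, False]
print(scope_violated(nodes, "percentage", 50))  # 2 of 4 nodes is 50%
print(scope_violated(nodes, "number", 3))       # only 2 nodes violate
```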
When you monitor serverless entities comprising tiers for AWS Lambda, the health rules are evaluated as described below.
| Health Rule Type | Affected Entities | Condition Evaluation Criteria | Evaluation |
|---|---|---|---|
| | serverless tier(s) | The BT Average | Metrics are aggregated at the tier level. |
| | serverless node(s) | | Metrics for serverless tiers are aggregated at the tier level, while the metrics for other tiers are evaluated as per the defined criteria. |
| Tier/Node Health (Transaction Performance) | serverless tier(s) | | Metrics for serverless tiers are aggregated at the tier level, regardless of the evaluation criteria defined. |
| | serverless node(s) | | The performance of serverless tiers is not evaluated for Tier/Node Health (Hardware) health rules. AWS does not offer node-level dashboards or metrics because the serverless platform runtime instances spin up and down on-demand. |
| Tier/Node Health (Hardware) | | - | The performance of serverless tiers is not evaluated for Tier/Node Health (Hardware) health rules. AWS does not offer node-level dashboards or metrics because the serverless platform runtime instances spin up and down on-demand. |
In the first field of the condition row, enter a name for the condition.
This name is used in the generated notification text and in the console to identify the violation.
Ensure that you enter a unique name for each condition you define.
a. From the Value drop-down list, select a qualifier for the metric from the following options:
b. To specify a simple metric, click Select a Metric. The Metric Selection window is displayed. The metric browser in the Metric Selection window displays metrics appropriate to the health rule type. Alternatively, you can define a relative metric path.
c. Select a metric to monitor and click Select Metric.
You can use Alert Sensitivity Tuning to fine-tune metric evaluation for a health rule that monitors a business transaction, service endpoint, or remote service. You must select a single metric to evaluate the condition. See Create a Health Rule and Fine-tune Metric Evaluation for more information.
From the drop-down list after the metric, select the type of comparison to evaluate the metric.
To limit the effect of the health rule to conditions during which the metric is within a defined range—standard deviations or percentages—from the baseline, select Within Baseline from the menu. To limit the effect of the health rule to when the metric is not within that defined range, select Not Within Baseline. Then select the baseline to use, the numeric qualifier of the unit of evaluation and the unit of evaluation. For example:
Within Baseline of the Default Baseline by 3 Baseline Standard Deviations
To compare the metric with a static literal value, select < Specific Value, > Specific Value, = Specific Value, or != Specific Value. Then enter the specific value in the text field. For example:
Value of Errors per Minute > 100
To compare the metric with a baseline, select < Baseline or > Baseline from the drop-down list, and then select the baseline to use, the numeric qualifier of the unit of evaluation and the unit of evaluation. You can use the Baseline Standard Deviation or Baseline Percentage as the unit of evaluation. For example:
Maximum of Average Response Time is > Baseline of the Daily Trend by 3 Baseline Standard Deviations
See Dynamic Baselines for information about the baseline options.
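The comparison types described above can be sketched as small predicates. This is a minimal illustration, not the controller's implementation: the function names are hypothetical, and reading "> Baseline by N standard deviations" as "more than N deviations above the baseline" is an assumption.

```python
import operator

# Map the comparison labels used in the UI to Python operators.
SPECIFIC_VALUE_OPS = {
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
    "!=": operator.ne,
}


def compare_specific(value, op, literal):
    """Compare a metric value with a static literal, e.g. Errors per Minute > 100."""
    return SPECIFIC_VALUE_OPS[op](value, literal)


def within_baseline(value, baseline, stddev, n_deviations):
    """True when the value lies within n_deviations of the baseline (assumed inclusive)."""
    return abs(value - baseline) <= n_deviations * stddev


def above_baseline(value, baseline, stddev, n_deviations):
    """True when the value is more than n_deviations above the baseline."""
    return value > baseline + n_deviations * stddev


print(compare_specific(120, ">", 100))    # Value of Errors per Minute > 100
print(within_baseline(110, 100, 5, 3))    # 10 away, allowed band is 15
print(above_baseline(120, 100, 5, 3))     # threshold is 115
```

"Not Within Baseline" is simply the negation of `within_baseline` in this sketch.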
The baseline percentage is the percentage above or below the established baseline at which the condition will trigger. For example, if you have a baseline value of 850 and you have defined a baseline percentage of > 1%, the condition is true if the value is greater than 850 + (850 × 0.01), that is, 858.5.
To prevent health rule violations from being triggered when the sample sets are too small, these rules are not evaluated if the load (the number of times the value has been measured) is less than 1000. For example, if a very brief time slice is specified, the rule may not violate even if the conditions are met, because the load is not large enough.
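The baseline percentage arithmetic and the minimum-load rule can be worked through in a few lines. This is a sketch under stated assumptions: the function name and the choice to return `None` for "not evaluated" are illustrative only.

```python
MIN_LOAD = 1000  # rules are not evaluated below this number of measurements


def exceeds_baseline_percentage(value, baseline, percent, load):
    """Check a '> Baseline by percent%' condition.

    Returns None when the load is too small for the rule to be evaluated.
    """
    if load < MIN_LOAD:
        return None  # not evaluated: sample set too small
    threshold = baseline + baseline * percent / 100.0
    return value > threshold


# Baseline of 850 with a baseline percentage of > 1% gives a threshold of 858.5.
print(exceeds_baseline_percentage(860, 850, 1, load=5000))
print(exceeds_baseline_percentage(858, 850, 1, load=5000))
print(exceeds_baseline_percentage(900, 850, 1, load=10))
```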
If you want to define a 'Persistence Threshold' for the condition to reduce false alerts:
Select 'Trigger only when a violation occurs __ times in the last __ min(s)'.
Define the number of times metric performance data should exceed the defined threshold to constitute a violation.
If required, adjust the evaluation time frame by setting an alternate evaluation time frame.
You can define a persistence threshold for a condition only if you have defined an evaluation time frame of 30 minutes or less.
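The persistence threshold logic can be illustrated as follows. The sketch assumes one metric sample per minute and a "greater than" threshold; the function and parameter names are hypothetical.

```python
MAX_WINDOW_MINUTES = 30  # persistence thresholds require a time frame of 30 minutes or less


def persistent_violation(samples, threshold, min_occurrences, window_minutes):
    """Return True when the threshold is exceeded at least min_occurrences
    times within the last window_minutes samples (one sample per minute)."""
    if not 1 <= window_minutes <= MAX_WINDOW_MINUTES:
        raise ValueError("evaluation time frame must be between 1 and 30 minutes")
    recent = samples[-window_minutes:]
    occurrences = sum(1 for value in recent if value > threshold)
    return occurrences >= min_occurrences


# Three of the last five one-minute samples exceed the threshold of 100.
samples = [90, 120, 130, 80, 150]
print(persistent_violation(samples, 100, min_occurrences=3, window_minutes=5))
print(persistent_violation(samples, 100, min_occurrences=4, window_minutes=5))
```

Requiring several occurrences means a single spike does not raise an alert, which is how the threshold reduces false alerts.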
Click Save when done.
Using Health Rule Conditions to evaluate agent availability metrics can result in false positives. For example:
Set your condition to be the Sum of the availability metric < Specific Value of three.
This configuration generates a violation when the agent is down for more than two minutes during the last five minutes.
The purpose of the availability metrics is to check if the applications monitored by agents are available.
If an agent goes down for any reason, the controller does not get the status of the corresponding application.
If a health rule is created for an availability metric, the health rule violates when the agent is down, and an alert is generated. This is because the situation is treated as no data received from the source, and the health rule is still evaluated. In such a case, the health rule violation does not indicate that your application is down. You can ignore such alerts and disable the health rule.
The option Evaluate to True on No Data does not have any impact on such health rules.
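To see why the example condition fires, consider the sketch below. It assumes the availability metric reports 1 for each minute the agent is reporting and contributes 0 for each minute it is down; that encoding and the function name are assumptions for illustration.

```python
def availability_violation(per_minute_availability, threshold=3):
    """Sum of the availability values over the window < threshold."""
    return sum(per_minute_availability) < threshold


# Five-minute window; agent down for three minutes, so the sum is 2.
print(availability_violation([1, 1, 0, 0, 0]))
# Agent down for only two minutes, so the sum is 3 and the condition is false.
print(availability_violation([1, 1, 1, 0, 0]))
```

The violation reflects missing agent data rather than the state of the application itself, which is why it can be a false positive.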
To access the expression builder and create a complex expression as the basis of a condition, select the Metric Expression option from the drop-down list and click Add Expression. The Metric Expression window is displayed, where you can construct a mathematical expression to use as a metric.
For example, the following expression is created to measure the percent of slow business transactions. See the screenshot that follows for the UI location where each step is performed.
From the drop-down list, select the qualifier for the metric from the following options:
| Qualifier Type | Description |
|---|---|
| Minimum | The minimum value reported across the configured evaluation time length. This qualifier is not available for all the metrics. |
| Maximum | The maximum value reported across the configured evaluation time length. This qualifier is not available for all the metrics. |
| Value | The arithmetic average of all metric values reported across the configured evaluation time length. This value is based on the type of the metric. |
| Sum | The sum of all the metric values reported across the configured evaluation time length. |
| Count | The number of times the metric value has been measured across the configured evaluation time length. |
| Group Count | The number of nodes contributing to a metric value, generally relevant for application or tier level metrics. |
| Current | The value for the current minute. |
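The qualifiers can be pictured as simple aggregations over the values collected during the evaluation time length. The sketch below assumes one measurement per list entry, with the most recent minute last; Group Count is omitted because it depends on node-level data. The function name is hypothetical.

```python
def qualifiers(samples):
    """Compute qualifier values for a non-empty list of metric samples.

    Assumption: each entry is a single measurement, ordered oldest to newest.
    """
    return {
        "Minimum": min(samples),
        "Maximum": max(samples),
        "Value": sum(samples) / len(samples),  # arithmetic average
        "Sum": sum(samples),
        "Count": len(samples),
        "Current": samples[-1],  # value for the most recent minute
    }


result = qualifiers([10, 20, 30])
for name, value in result.items():
    print(f"{name}: {value}")
```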
Click Select a metric to open an embedded metric browser.
A health rule is not evaluated if any metric in the expression has a null value. This avoids erroneous evaluations caused by missing data.
When the expression is built, click Save.
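A metric expression for the percentage of slow business transactions might look like the sketch below. The formula, the variable names, and the handling of a zero call count are assumptions for illustration; the point is that a null input means the expression is not evaluated at all.

```python
def percent_slow(slow_calls, total_calls):
    """Percentage of slow calls, or None when the expression cannot be evaluated."""
    if slow_calls is None or total_calls is None:
        return None  # a null metric value: the health rule is not evaluated
    if total_calls == 0:
        return None  # assumption: treat division by zero as not evaluable
    return 100.0 * slow_calls / total_calls


print(percent_slow(5, 200))     # 5 slow calls out of 200
print(percent_slow(None, 200))  # missing metric, so no evaluation
```

Returning `None` instead of substituting 0 for the missing metric avoids reporting a misleading "0% slow" result.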
Once you define all the conditions required for the health rule, you can create a custom boolean expression to evaluate the health rule.
Enter a combination of conditions using AND and/or OR operators. For example, (A OR B) AND C.
Ensure that you enter a valid combination of conditions using AND and OR operators.
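One way to picture how such an expression resolves is the sketch below, which evaluates a string like `(A OR B) AND C` against per-condition results. It is an illustration only: the function name is hypothetical, and it relies on Python's `ast` module, so AND binds more tightly than OR when parentheses are omitted, which may differ from the product's rules.

```python
import ast
import re


def evaluate_expression(expression, results):
    """Evaluate a boolean expression over condition labels.

    results maps each condition label (e.g. "A") to True or False.
    """
    # Translate the uppercase operators into Python's lowercase keywords.
    python_expr = re.sub(r"\bAND\b", "and", expression)
    python_expr = re.sub(r"\bOR\b", "or", python_expr)
    tree = ast.parse(python_expr, mode="eval")  # raises SyntaxError if invalid

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BoolOp):
            values = [walk(child) for child in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Name):
            if node.id not in results:
                raise ValueError(f"unknown condition: {node.id}")
            return results[node.id]
        raise ValueError("only condition labels, AND, OR, and parentheses are allowed")

    return walk(tree)


print(evaluate_expression("(A OR B) AND C", {"A": False, "B": True, "C": True}))
print(evaluate_expression("(A OR B) AND C", {"A": True, "B": True, "C": False}))
```

In this sketch, an expression that refers to a condition missing from `results` raises `ValueError`, mirroring the need to keep the expression in sync with the defined conditions.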
Delete a condition component by clicking the delete (X) icon.
If you delete a condition, update the boolean expression accordingly.