Define Health Rule Evaluation Conditions
You can define a condition or a set of conditions to evaluate the performance metrics of your application in Step 2 of the Create New Health Rule wizard. Use the following options to evaluate complex conditions:
- Expression builder embedded in the health rules wizard to create a condition based on a complex expression comprising multiple interdependent metrics.
- Custom boolean expression to evaluate multiple conditions within a health rule. You can use the AND and OR operators in a boolean expression.
Create a Condition
Click the dropdown list Use data from last <> min(s) and select a number between 1 and 120 minutes. The value you specify is the latest time interval during which data is collected to determine if there is a health violation.
- If you plan to define a persistence threshold for the health rule condition, ensure that you define an evaluation time frame of 30 mins or less.
- If you have already configured a condition in the Warning Criteria tab, you can copy the configuration from the Warning Criteria to Critical Criteria using the Copy from Warning Criteria option. Similarly, you can copy the configuration from Critical Criteria to Warning Criteria using the Copy from Critical Criteria option.
- In the Critical Criteria tab, select one of the following from the dropdown list:
- All: if all of the conditions must evaluate as true to constitute a health violation.
- Any: if any of the conditions must evaluate as true to constitute a health violation.
- Custom: if a combination of conditions defined in a boolean expression must evaluate as true to constitute a health violation. See Create a Custom Boolean Expression.
- Click + Add Condition to add a new condition component.
A condition row listing the configuration details appears.
- Configure the Condition as required.
- Repeat steps 3 and 4 to add more conditions. You can add a maximum of 8 conditions. Conditions are designated as A, B, C, and so on.
- If you have defined multiple conditions and want the health rule to evaluate a combination of conditions, define a Custom Boolean expression.
- If you want to define a condition in Warning Criteria, click Warning Criteria and repeat steps 2 to 6. Alternatively, you can copy the configuration from Critical Criteria using the Copy from Critical Criteria option.
- Click Next.
If you have completed Step 2 in the Create New Health Rule wizard, proceed to Step 3, Configure Health Rollup.
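The All and Any options described above can be illustrated with a minimal Python sketch. The function names and the list-of-booleans input are hypothetical and only model the documented behavior; they are not part of the product.

```python
from string import ascii_uppercase

MAX_CONDITIONS = 8  # limit stated in the steps above


def label_conditions(results):
    """Pair each boolean result with its default label: A, B, C, and so on."""
    if len(results) > MAX_CONDITIONS:
        raise ValueError(f"at most {MAX_CONDITIONS} conditions are allowed")
    return dict(zip(ascii_uppercase, results))


def is_violation(labeled, mode):
    """Apply the All or Any option to the labeled condition results."""
    if not labeled:
        return False  # assumption: a rule without conditions never violates
    if mode == "All":
        return all(labeled.values())
    if mode == "Any":
        return any(labeled.values())
    raise ValueError(f"unsupported mode: {mode}")


conditions = label_conditions([True, False, True])
print(conditions)                       # {'A': True, 'B': False, 'C': True}
print(is_violation(conditions, "All"))  # False
print(is_violation(conditions, "Any"))  # True
```

The Custom option needs an expression over the same labels; it is covered in Create a Custom Boolean Expression.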
Configure a Condition
In the first field of the condition row, enter a name for the condition.
The conditions are named A, B, C, and so on by default. You can enter a meaningful name for your reference.
- Click the Add Condition button and define a condition to evaluate. You can create a condition for a single metric, an event, or build a metric expression.
- To create a condition for a metric:
- Click Single Metric from the dropdown list. The dropdown lists metrics appropriate to the selected entity types.
- From the Value dropdown list, select a qualifier for the metric.
- If applicable, select a source from the dropdown list.
- If applicable, select attributes to create a filter expression for the metric in Metric attribute filters (Optional). When you select an attribute, the available values are automatically suggested. The supported operators are Equal to, IN, AND, and ~. This option is visible only for a metric that has attributes. For example, you can monitor an entity and trigger an alert when the attribute http.status_code = 400 of the metric Span Count is greater than 5.
- To create a condition using a metric expression, see build a metric expression.
- To create a condition for an event:
- Click Event from the dropdown list. The dropdown lists events appropriate to the selected entity types.
- From the Value dropdown list, select a qualifier for the event. Currently, the supported qualifier for events is Count.
- If applicable, select a source from the dropdown list.
- If applicable, select attributes to create a filter expression for the event in Event attribute filters (Optional). When you select an attribute, the available values are automatically suggested. The supported operators are Equal to, IN, and AND. This option is visible only for an event that has attributes. You can also use the wildcard * to match strings on the attributes. The following regular expressions are supported:
- *abc* : All attributes that contain abc.
- abc* : All attributes that start with abc.
- *abc : All attributes that end with abc.
- To create a condition for a log:
- Click Logs from the dropdown list. The logs are displayed as a generic record.
- From the Value dropdown list, select a qualifier for the log. Currently, the supported qualifier for logs is Count.
- If applicable, select a source from the dropdown list.
- (Optional) If applicable, select attributes to create a filter expression for the log in Log attribute filters. When you select an attribute, the available values are automatically suggested. The supported operators are Equal to, IN, and AND. This option is visible only for a log that has attributes. The following regular expressions are supported:
- *abc* : All attributes that contain abc.
- abc* : All attributes that start with abc.
- *abc : All attributes that end with abc.
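The three wildcard forms listed for event and log attribute filters (contains, starts with, ends with) can be illustrated with a short Python sketch. The helper names are hypothetical, and case-sensitive matching is an assumption; the product may match differently.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a pattern in which * matches any run of characters."""
    # Escape everything except *, which becomes ".*".
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(parts), re.DOTALL)


def matches(pattern, value):
    """Return True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(matches("*abc*", "xx-abc-yy"))  # True: contains abc
print(matches("abc*", "abc-yy"))      # True: starts with abc
print(matches("*abc", "xx-abc"))      # True: ends with abc
print(matches("abc*", "xx-abc"))      # False: does not start with abc
```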
Select one of the comparison types to evaluate the metric or event:
Comparison Type | Description | Examples |
---|---|---|
Within Baseline | To limit the effect of the health rule to conditions during which the metric is within a defined range (standard deviations or percentages) from the baseline. | Maximum of Average Response Time is Within Baseline of the Daily Trend - Last 30 days by 3 Baseline Standard Deviations |
Not Within Baseline | To limit the effect of the health rule to when the metric is not within that defined range (standard deviations or percentages) from the baseline. | Maximum of Average Response Time is Not Within Baseline of the Daily Trend - Last 30 days by 3 Baseline Standard Deviations |
> Baseline | To check whether the metric is more than the highest value of a defined range (standard deviations or percentages) from the baseline. | Maximum of Average Response Time is > Baseline of the Daily Trend - Last 30 days by 3 Baseline Standard Deviations |
< Baseline | To check whether the metric is less than the lowest value of a defined range (standard deviations or percentages) from the baseline. | Maximum of Average Response Time is < Baseline of the Daily Trend - Last 30 days by 3 Baseline Standard Deviations |
> Specific Value | To check whether the metric is greater than a static literal value. | Value of Errors per Minute > 100 |
< Specific Value | To check whether the metric is less than a static literal value. | Value of Errors per Minute < 100 |
= Specific Value | To check whether the metric is equal to a static literal value. | Disk Pressure = 1 |
Some of the comparison types are not available for Events.
- For the comparison type – Within Baseline, Not Within Baseline, > Baseline, and < Baseline:
- Select one of the baselines:
- Monthly Trend - Last 1 year: A monthly trend calculates the baseline from data accumulated at the same hour and same day of the month over the last 365 days.
- Weekly Trend - Last 3 months: A weekly trend calculates the baseline from the data accumulated on the same hour and day of the week over the last 90 days.
- Daily Trend - Last 30 days: A daily trend calculates the baseline from the data accumulated at the same hour each day over the last 30 days.
- All Data - Last 15 days: An all data trend calculates the baseline from the data accumulated across all hours over the last 15 days.
- Select one of the units of evaluation:
- Baseline Standard Deviation(s): A baseline deviation is the standard deviation from a baseline at a point in time, represented as an integer value. For example, 3 Baseline Standard Deviations.
- Baseline Percentage: The baseline percentage is the percentage above or below the established baseline at which the condition triggers. For example, if you have a baseline value of 850 and you have defined a baseline percentage of > 1%, the condition is true if the value is greater than 850 + (850 x 0.01) = 858.5, or approximately 859.
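As a rough illustration of how the baseline comparison types and units of evaluation fit together, the following Python sketch computes the range around a baseline and applies each comparison. The function names, the string keys, and the way the range is derived are assumptions made for this example; they are not the product's implementation.

```python
def baseline_range(baseline, amount, unit, stddev=None):
    """Return the (low, high) bounds of the range around a baseline.

    unit is "stddev" (amount is a number of standard deviations) or
    "percent" (amount is a percentage of the baseline value).
    """
    if unit == "stddev":
        if stddev is None:
            raise ValueError("stddev is required when unit is 'stddev'")
        delta = amount * stddev
    elif unit == "percent":
        delta = abs(baseline) * amount / 100.0
    else:
        raise ValueError(f"unsupported unit: {unit}")
    return baseline - delta, baseline + delta


def compare(value, comparison, low, high):
    """Apply one of the four baseline comparison types."""
    if comparison == "within":
        return low <= value <= high
    if comparison == "not_within":
        return not (low <= value <= high)
    if comparison == ">":
        return value > high  # above the highest value of the range
    if comparison == "<":
        return value < low   # below the lowest value of the range
    raise ValueError(f"unsupported comparison: {comparison}")


# Baseline of 850 with a baseline percentage of 1%.
low, high = baseline_range(850, 1, "percent")
print(high)                          # 858.5
print(compare(859, ">", low, high))  # True
print(compare(858, ">", low, high))  # False
```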
If you want the condition to evaluate as true whenever a configured metric does not return any data during the evaluation time frame:
- Expand Advanced Settings.
- Check the Evaluate to true on no data option.
This option does not affect the evaluation of unknown when there is not enough data for the rule to evaluate. For example, if the health rule is configured to evaluate the last 30 minutes of data and a new pod is added, the condition evaluates to unknown for the first 30 minutes even if the Evaluate to true on no data box is checked.
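The interaction between the Evaluate to true on no data option and the unknown state can be sketched as follows. This is an illustrative model only: the function, its parameters (including entity_age_minutes), and the example condition "maximum of samples > threshold" are hypothetical, and the outcome when the option is unchecked is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


def evaluate_condition(samples, threshold, window_minutes,
                       entity_age_minutes, true_on_no_data):
    """Evaluate 'maximum of samples > threshold' with no-data handling.

    samples: metric values reported during the evaluation time frame.
    entity_age_minutes: hypothetical input, how long the entity has existed.
    """
    # Not enough history for a full window: unknown, regardless of the option.
    if entity_age_minutes < window_minutes:
        return Outcome.UNKNOWN
    if not samples:
        # Assumption: without the option, a no-data window stays unknown.
        return Outcome.TRUE if true_on_no_data else Outcome.UNKNOWN
    return Outcome.TRUE if max(samples) > threshold else Outcome.FALSE


# A pod added 10 minutes ago, evaluated over a 30-minute window.
print(evaluate_condition([], 100, 30, 10, True))   # Outcome.UNKNOWN
# A long-running entity that reported nothing in the window.
print(evaluate_condition([], 100, 30, 120, True))  # Outcome.TRUE
```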
If you want to define a Persistence Threshold for the condition to reduce false alerts:
Select Trigger only when violation occurs __ times in the last __ min(s).
Define the number of times metric performance data should exceed the defined threshold to constitute a violation.
If required, adjust the evaluation time frame by setting an alternate evaluation time frame.
You can define a persistence threshold for a condition only if you have defined an evaluation time frame of 30 minutes or less.
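A persistence threshold of the form "violation occurs N times in the last M minutes" can be sketched in Python as shown here. The sketch assumes one metric value per minute and a simple "greater than" threshold; both are assumptions for illustration, and the function name is hypothetical.

```python
MAX_PERSISTENCE_WINDOW = 30  # minutes, the limit stated above


def persistent_violation(samples, threshold, min_occurrences, window_minutes):
    """Return True if enough recent samples exceed the threshold.

    samples: per-minute metric values, oldest first (assumed granularity).
    """
    if not 1 <= window_minutes <= MAX_PERSISTENCE_WINDOW:
        raise ValueError("evaluation time frame must be 1 to 30 minutes")
    recent = samples[-window_minutes:]
    # Minutes without data (None) are not counted as violations here.
    occurrences = sum(1 for v in recent if v is not None and v > threshold)
    return occurrences >= min_occurrences


values = [90, 120, 95, 130, 110, 80]
print(persistent_violation(values, 100, 3, 5))  # True: 3 of the last 5 exceed 100
print(persistent_violation(values, 100, 4, 5))  # False
```

Requiring several occurrences means that a single spike, such as one slow minute, does not trigger the rule on its own.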
For no data scenarios, you can control the health rule evaluation to one of the following conditions:
- Critical or Warning: The health rule considers the no data scenario as a Critical or Warning condition and the health rule status is shown in Red or Yellow respectively.
- Unknown: The health rule considers the no data scenario as unknown and the health rule status is shown in Gray.
- Healthy: The health rule considers the no data scenario as healthy and the health rule status is shown in Green.
Supported Qualifiers
The following table lists the qualifiers for a single metric and a metric expression:
Qualifier Type | Description |
---|---|
Minimum | The minimum value reported across the configured evaluation time length. Not all metrics have this type. |
Maximum | The maximum value reported across the configured evaluation time length. Not all metrics have this type. |
Value | The arithmetic average of all metric values reported across the configured evaluation time length. This value is based on the type of metric. |
Sum | The sum of all the metric values reported across the configured evaluation time length. |
Percentile | The percentile value of the metric data. This qualifier type applies only to the single metric Response Time- Histogram (ms). Currently, the supported percentile values (integers) are 50, 75, 90, 95, and 99. The percentile type is only applicable for a health rule violation that occurs <x> times in the last <y> minutes, where x and y are positive integers. |
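To make the qualifier definitions concrete, the following Python sketch applies each qualifier to a list of values reported during the evaluation time frame. The function name is hypothetical, and the nearest-rank percentile over raw values is an assumption: the product computes percentiles from a histogram metric, which this sketch does not model.

```python
import math
from statistics import mean

SUPPORTED_PERCENTILES = (50, 75, 90, 95, 99)


def apply_qualifier(values, qualifier, percentile=None):
    """Reduce the values from the evaluation time frame to a single number."""
    if not values:
        raise ValueError("no values to evaluate")
    if qualifier == "Minimum":
        return min(values)
    if qualifier == "Maximum":
        return max(values)
    if qualifier == "Value":
        return mean(values)  # arithmetic average
    if qualifier == "Sum":
        return sum(values)
    if qualifier == "Percentile":
        if percentile not in SUPPORTED_PERCENTILES:
            raise ValueError(f"unsupported percentile: {percentile}")
        ordered = sorted(values)
        # Nearest-rank method (assumption for this sketch).
        rank = math.ceil(percentile * len(ordered) / 100)
        return ordered[rank - 1]
    raise ValueError(f"unsupported qualifier: {qualifier}")


data = [120, 80, 100, 140, 60]
print(apply_qualifier(data, "Minimum"))         # 60
print(apply_qualifier(data, "Maximum"))         # 140
print(apply_qualifier(data, "Value"))           # 100
print(apply_qualifier(data, "Sum"))             # 500
print(apply_qualifier(data, "Percentile", 90))  # 140
```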
Build a Metric Expression
To access the expression builder and create a complex expression as the basis of a condition, click Add Expression. The Metric Expression window appears, where you can construct a mathematical expression to use as a metric.
- In the Variable Declarations pane of the Metric Expression builder, click Add variable.
- Select a metric from the dropdown list. The dropdown list displays the metrics corresponding to the entity type selected on the Select Entity Type wizard. For example, if you select apm:service as the entity type, the metrics available are Calls Per Minute (calls/min), Average Response Time (ns), and Errors Per Minute (errors/min).
- Select a source from the dropdown list. The dropdown list displays the sources appropriate to the selected entity types. For example, if you select apm:service as the entity type, the sources available are sys_derived, infra-agent, and derived_metric.
- Select a qualifier for the metric. See Supported Qualifiers.
- If you want to add more variables to use in the expression, click Add variable and repeat steps 2 through 4. You can remove a variable by clicking the X icon.
- In the Expression pane, build the expression by clicking Insert Variable to insert variables you created along with appropriate mathematical signs.
A health rule is not evaluated if any metric in the expression has a null value. This is to avoid erroneous evaluations as shown in the following examples:
Expression | Null Value | Evaluation |
---|---|---|
a-b-c | a | The entire expression is evaluated as negative. |
a/b | b | The number 'a' is divided by zero, which evaluates to an error. |
a*b | a or b | The entire expression is evaluated as zero. |
When the expression is built, click Submit.
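The null-value rule for metric expressions can be illustrated with a small Python sketch. It assumes that a metric without data is represented as None and that skipping evaluation is represented by returning None; the function name and both conventions are hypothetical.

```python
def evaluate_expression(expression, variables):
    """Evaluate the expression only when every variable has a value."""
    if any(value is None for value in variables.values()):
        return None  # skip evaluation rather than guess a value
    return expression(**variables)


def subtract(a, b, c):
    return a - b - c


# Variable a has no data, so the expression is not evaluated.
print(evaluate_expression(subtract, {"a": None, "b": 5, "c": 3}))  # None

# Treating the missing value as zero would give a misleading negative result.
print(evaluate_expression(subtract, {"a": 0, "b": 5, "c": 3}))     # -8

# With all values present, the expression is evaluated normally.
print(evaluate_expression(subtract, {"a": 20, "b": 5, "c": 3}))    # 12
```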
Create a Custom Boolean Expression
Once you define all the conditions required for a health rule, you can create a custom boolean expression to evaluate the health rule. In the Enter condition combination field, enter a combination of conditions using the AND and OR operators. For example, (A OR B) AND C. This field is displayed only if you select Custom to evaluate the health rule.
You can form a boolean expression by combining multiple conditions of the same type, such as multiple metric-based conditions, event-based conditions, or log-based conditions. However, a boolean expression that combines an event-based condition, a metric-based condition, and a log-based condition is not supported.
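An expression such as (A OR B) AND C can be evaluated against the individual condition results with a small recursive-descent parser. The Python sketch below is illustrative only: the function names are hypothetical, and giving AND higher precedence than OR when parentheses are omitted is an assumption, so use parentheses to make the intent explicit.

```python
import re


def tokenize(expression):
    """Split an expression into parentheses, operators, and condition labels."""
    tokens = re.findall(r"\(|\)|[A-Za-z]+", expression)
    if "".join(tokens) != re.sub(r"\s+", "", expression):
        raise ValueError("expression contains unsupported characters")
    return tokens


def evaluate(expression, results):
    """Evaluate a boolean expression given results such as {'A': True}."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()  # always parse the right side before combining
            value = value or rhs
        return value

    def parse_and():
        value = parse_operand()
        while peek() == "AND":
            take()
            rhs = parse_operand()
            value = value and rhs
        return value

    def parse_operand():
        token = take()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if token in results:
            return results[token]
        raise ValueError(f"unknown condition or misplaced token: {token!r}")

    value = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return value


print(evaluate("(A OR B) AND C", {"A": False, "B": True, "C": True}))   # True
print(evaluate("(A OR B) AND C", {"A": False, "B": True, "C": False}))  # False
```

In this sketch, an expression that refers to a label missing from the results raises a ValueError, which mirrors the need to update the expression after deleting a condition.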
Delete a Condition
Click the delete icon to delete a condition component.
If you delete a condition, update the boolean expression accordingly.