This page applies to an earlier version of the AppDynamics App IQ Platform.
For documentation on the latest version, see the 4.4 Documentation.



This topic describes the detailed steps for configuring health rules using the health rule wizard. For more information about what these settings mean see Health Rules.

To access the health rule wizard:
  1. Click Alert & Respond in the menu bar.
  2. Click Health Rules either in the right panel or the left navigation pane.
  3. Select the context for the health rule from the pulldown menu.
  4. Do one of the following:
    • To create a new health rule, click the + icon.
    • To edit an existing health rule, select the health rule and click the Edit (pencil) icon.
    • To remove an existing health rule, select the health rule and click the Delete (-) icon.

Structure of the Health Rule Wizard

The health rule wizard contains four panels:

  • Overview: Sets the health rule name, enabled status, health rule type, health rule enabled period, and health rule evaluation time.
  • Affects: Sets the entities evaluated by the health rule. The options presented vary according to the health rule type set in the Overview panel.
  • Critical Condition: Sets the conditions, whether all or any of the conditions need to be true for a health rule violation to exist, and the evaluation scope (business transaction and node health policies defined at the tier level only); it also includes an expression builder to create complex expressions containing multiple metrics.
  • Warning Condition: Settings are identical to Critical Condition, but configured separately.

You can navigate among these panels using the Back and Next buttons at the bottom of each panel or by clicking their entries in the left panel of the wizard. You should configure the panels in order, because the configuration of the health rule type in the Overview panel determines the available affected entities in the Affects panel as well as the available metrics in the Condition panels.

Use the Health Rules Wizard

This section describes the procedure for creating health rules of one of the standard types.

Configure Generic Health Rule Settings

You configure generic settings in the Overview panel.

  1. Enter a name. If a name already exists, you can change it.
  2. Check Enabled to enable the rule, or clear the check box to disable it.
  3. Select a health rule type by clicking the name in the list.
    This setting affects metrics offered for configuration in subsequent panels in the wizard, so you must select a health rule type before continuing to other panels.
  4.  If the health rule is always (24/7) enabled, check the Always check box.
    If the health rule is enabled only at certain times, clear the Always check box and either:
    1. Select a predefined time interval from the During these times drop-down menu
      or
    2. Click Create New Schedule. See below for information on creating a new health rule schedule. 

  5. From the Use the last <> minutes of data drop-down menu, select a value between 1 and 360 minutes for the evaluation window. This is the amount of recent data used to determine whether a health rule violation exists. This value applies to both critical and warning conditions. See Health Rule Evaluation Window.

  6.  In the Wait Time after Violation section, enter the number of minutes to wait before evaluating the rule again for the same affected entity in which the violation occurred. See Health Rule Wait Time After Violation.

  7. Save your configuration.

To Create a New Health Rule Schedule:
  1. In the Overview window of the health rule wizard, clear the Always check box if it is checked.
  2. Click Create New Schedule.
  3. Enter a name for the schedule.
  4. Enter an optional description of the schedule.
  5. Enter the start and end times for the schedule. For example, a schedule might start applying the rule at 6:00 and stop applying it at 15:00 every Monday through Friday.

    The Controller cron expression format is based on Quartz Scheduler cron expressions. For more information, see the Quartz Scheduler documentation.
  6. Save your configuration.

After a new health rule schedule has been created, it cannot be modified or deleted.
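
As a rough illustration of the Quartz cron format, the sketch below splits an expression into its named fields. The two expressions are assumptions chosen to match a weekday schedule that starts at 6:00 and ends at 15:00; they are not copied from the Controller UI, and the helper name split_quartz_cron is hypothetical.

```python
# Quartz cron fields, in order. The seventh field (year) is optional.
QUARTZ_FIELDS = ["seconds", "minutes", "hours", "day_of_month",
                 "month", "day_of_week", "year"]


def split_quartz_cron(expression):
    """Split a Quartz cron expression into a dict of named fields."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError("expected 6 or 7 fields, got %d" % len(parts))
    return dict(zip(QUARTZ_FIELDS, parts))


# Assumed start and end expressions for a Monday-Friday, 6:00 to 15:00 schedule.
start = split_quartz_cron("0 0 6 ? * MON-FRI")
end = split_quartz_cron("0 0 15 ? * MON-FRI")
print(start["hours"], end["hours"], start["day_of_week"])
```

In Quartz, the ? in the day-of-month field means "no specific value", which is required when the day-of-week field is set.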

Configure Affected Entities

The Affects panel lets you define what your health rule affects. The choices you are offered depend on the health rule type you chose in the Overview panel. For example, some health rule types affect Tiers or Nodes.

  1. Use the drop-down menu to select the entities affected by this health rule.
    The entity affected and the choices presented in the menu depend on the health rule type configured in the Overview window.
    See Entities Affected by a Health Rule for information about the types of entities that can be affected by the various health rule types.
  2. If you select entities based on matching criteria, specify the matching criteria.
    For nodes, you can restrict the selection by criteria such as meta-info, environment variables, and JVM system environment properties. Meta-info includes key value pairs for:
    • key: supportsDevMode
    • key: ProcessID
    • key: appdynamics.ip.addresses
    • any key passed to the agent in the appdynamics.agent.node.metainfo system property
  3. If you are configuring a JMX health rule, select the JMX objects that the health rule is evaluated on. See JMX Health Rules.
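
The matching criteria in step 2 can be pictured as a filter over each node's key value pairs. The following is a minimal sketch of that idea, not the Controller's implementation; the node names, the values, and the function matches_meta_info are all hypothetical.

```python
def matches_meta_info(node_meta, criteria):
    """Return True if every criterion key is present with the expected value."""
    return all(node_meta.get(key) == value for key, value in criteria.items())


# Hypothetical nodes and their meta-info key value pairs.
nodes = {
    "node-1": {"supportsDevMode": "true", "ProcessID": "4121"},
    "node-2": {"supportsDevMode": "false", "ProcessID": "5220"},
}

# Select only nodes whose supportsDevMode key has the value "true".
criteria = {"supportsDevMode": "true"}
affected = [name for name, meta in nodes.items()
            if matches_meta_info(meta, criteria)]
print(affected)
```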

Configure Health Rule Conditions

The high-level process for configuring conditions is:

  1. Determine the number and kind of metrics the health rule should evaluate. For each performance metric you want to use, create a condition.
    1. You can use a single condition component or multiple condition components for a single condition state. 
    2. You can use values based on complex mathematical expressions.
  2. Decide whether the health rule is violated if all of the tests are true or if any single test is true.
  3. For business transaction performance health rules and node health rule types that specify affected entities at the tier level, decide how many of the nodes must be violating the health rule to produce a violation event. See Health Rule Evaluation Scope.
  4. To configure a critical condition use the Critical Condition window. To configure a warning condition use the Warning Condition window.

The configuration processes for critical and warning conditions are identical.

You can copy the settings between the Critical and Warning condition panels and then edit only the fields you want to change. For example, if you have already defined a critical condition and you want to create a similar warning condition, click Copy from Critical Condition in the Warning Condition window to populate the fields with the settings from the critical condition.

To Create a Condition:
  1. In the Critical Condition or Warning Condition window, click + Add Condition to add a new condition component.
    The row defining the component opens. See To Configure a Condition Component. Continue to add components to the condition as needed.
  2. From the drop-down menu above the components, select All if all of the components must evaluate to true to constitute a violation of the rule. Select Any if a single true component is enough to constitute a violation.
  3. For health rules based on the following health rule types, you must specify the evaluation scope:
    • business transaction
    • node health-hardware
    • node health-transaction performance

    If the Health Rule will violate if the conditions above evaluate to true section is visible, click the appropriate radio button to set the evaluation scope.

    If you select percentage of nodes, enter the percentage. If you select number of nodes, enter the absolute number of nodes.
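
The interaction between the All/Any setting and the evaluation scope can be sketched as follows. This is an illustrative model only: it covers just the percentage and number of nodes scopes, the function names are hypothetical, and treating the threshold as "at least" is an assumption rather than documented Controller behavior.

```python
def condition_is_true(component_results, mode):
    """Combine condition components: 'all' needs every one true, 'any' needs one."""
    return all(component_results) if mode == "all" else any(component_results)


def tier_violates(node_results, scope, threshold):
    """Apply the evaluation scope to per-node condition results for a tier.

    scope is 'percentage' (threshold is a percent of nodes) or
    'number' (threshold is an absolute node count).
    """
    if not node_results:
        return False
    violating = sum(1 for result in node_results if result)
    if scope == "percentage":
        # Assumption: the tier violates at or above the configured percentage.
        return 100.0 * violating / len(node_results) >= threshold
    return violating >= threshold


# Four hypothetical nodes, each with two condition components.
per_node = [
    condition_is_true([True, False], "any"),
    condition_is_true([True, False], "all"),
    condition_is_true([True, True], "all"),
    condition_is_true([False, False], "any"),
]
print(tier_violates(per_node, "percentage", 50))
print(tier_violates(per_node, "number", 3))
```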

To Configure a Condition Component:

  1. In the first field of the condition row, name the condition.
    This name is used in the generated notification text and in the AppDynamics console to identify the violation.
  2. To select the metric on which the condition is based, do one of the following:
    1. To specify a simple metric, click the metric icon to open a small metric browser and select Specify a Metric from the Metric Tree.
      The browser displays metrics appropriate to the health rule type. Select the metric to monitor and click Select Metric.
      or
    2. To build an expression using multiple metric values, click the gear icon at the end of the row and select Use a mathematical expression of 2 or more metric values.
      This opens the mathematical expression builder where you can construct the expression to use as the metric.

      See To Build an Expression for details on how to do this.
  3. From the Value drop-down menu before the metric, select the qualifier to apply to the metric from the following options:

    • Minimum: The minimum value reported across the configured evaluation time length. Not all metrics have this type.
    • Maximum: The maximum value reported across the configured evaluation time length. Not all metrics have this type.
    • Value: The arithmetic average of all metric values reported across the configured evaluation time length.
    • Sum: The sum of all the metric values reported across the configured evaluation time length.
    • Count: The number of times the metric value has been measured across the configured evaluation time length.
    • Current: The value for the current minute.

  4. From the drop-down menu after the metric, select the type of comparison by which the metric is evaluated.

    1. To limit the effect of the health rule to conditions during which the metric is within a defined distance (standard deviations or percentages) from the baseline, select Within Baseline from the menu. To limit the effect of the health rule to when the metric is not within that defined distance, select Not Within Baseline. Then select the baseline to use, the numeric qualifier of the unit of evaluation, and the unit of evaluation. For example:

      Within Baseline of the Default Baseline by 3 Baseline Standard Deviations

    2. To compare the metric with a static literal value, select < Specific Value or > Specific Value from the menu, then enter the specific value in the text field. For example:

      Value of Errors per Minute > 100

    3. To compare the metric with a baseline, select < Baseline or > Baseline from the drop-down menu, and then select the baseline to use, the numeric qualifier of the unit of evaluation, and the unit of evaluation. For example:

      Maximum of Average Response Time is > Baseline of the Daily Trend by 3 Baseline Standard Deviations

      See Dynamic Baselines for information about the baseline options.
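
To make the qualifiers and baseline comparisons more concrete, here is a minimal sketch. It assumes one measurement per minute across the evaluation window (so Count equals the number of minutes and Current is the last entry), and the sample values, baseline, and standard deviation are invented for illustration. The function names are hypothetical.

```python
from statistics import mean


def apply_qualifier(samples, qualifier):
    """Reduce the measurements in the evaluation window to a single number."""
    reducers = {
        "Minimum": min,
        "Maximum": max,
        "Value": mean,  # arithmetic average
        "Sum": sum,
        "Count": len,
        "Current": lambda values: values[-1],  # most recent minute
    }
    return reducers[qualifier](samples)


def within_baseline(value, baseline, std_dev, deviations):
    """True if value is no more than `deviations` standard deviations from baseline."""
    return abs(value - baseline) <= deviations * std_dev


def above_baseline(value, baseline, std_dev, deviations):
    """True if value exceeds baseline by more than `deviations` standard deviations."""
    return value > baseline + deviations * std_dev


# Hypothetical Average Response Time samples (ms), one per minute.
art_samples = [120, 180, 150, 400, 150]
maximum = apply_qualifier(art_samples, "Maximum")
average = apply_qualifier(art_samples, "Value")

# Assumed baseline of 150 ms with a standard deviation of 50 ms.
print(above_baseline(maximum, 150, 50, 3))
print(within_baseline(average, 150, 50, 3))
```

The same window can satisfy one comparison and fail another depending on the qualifier: here the Maximum is well above the baseline even though the average stays within it.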

Baseline Percentages

The "baseline percentage" is the percentage above or below the established baseline at which the condition is triggered. If, for example, you have a baseline value of 850 and you have defined a baseline percentage of "> 1%", the condition is true if the value is > [850 + (850 x 0.01)], that is, greater than 858.5. In addition, to prevent sample sets that are too small from triggering health rule violations, these rules are not evaluated if the load (the number of times the value has been measured) is less than 1000. So if, for example, a very brief time slice is specified, the rule may not violate even if the conditions are met, because the load is not large enough.
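
The arithmetic can be checked with a short sketch. The function name and the use of None to represent "not evaluated" are assumptions made for illustration; only the 850 baseline, the 1% percentage, and the minimum load of 1000 come from the description above.

```python
def exceeds_baseline_percentage(value, baseline, percent, load, min_load=1000):
    """Evaluate a "> percent%" baseline condition.

    Returns None when the load is too small for the rule to be evaluated.
    """
    if load < min_load:
        return None
    threshold = baseline + baseline * percent / 100.0
    return value > threshold


print(exceeds_baseline_percentage(860, 850, 1, load=5000))
print(exceeds_baseline_percentage(858, 850, 1, load=5000))
print(exceeds_baseline_percentage(900, 850, 1, load=200))
```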

Remember to Save.

Using Health Rule Conditions to evaluate agent availability metrics can result in false positives. For example:

  • Agents may not be connecting with controllers due to communication errors for a couple of minutes.
  • Data may be delayed for a couple of minutes due to latency issues.

You can avoid false positives caused by occasional 1-2 minute metric loss due to network issues or late arrival by configuring your health rule as follows:
  1. Use the last 5 minutes, with a wait time of 10 minutes.
  2. Select Node Health as the Type.
  3. Select Agent|App|Availability or Machine Availability (for machine agent) as the Business Transaction Metric.
  4. Set the condition to Sum of the metric < Specific Value of 3.

This generates a violation when the agent is down for more than 2 minutes in the last 5 minutes.
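
The effect of this configuration can be sketched as below. The sketch assumes the availability metric reports 1 for each minute in which the agent reported and 0 for each minute in which it did not; the function name is hypothetical.

```python
def agent_down_violation(availability_per_minute, threshold=3):
    """Sum the last five per-minute availability values; violate if below threshold."""
    window = availability_per_minute[-5:]
    return sum(window) < threshold


# Two missed minutes out of five: tolerated.
print(agent_down_violation([1, 1, 1, 0, 0]))
# Three missed minutes out of five: violation.
print(agent_down_violation([1, 1, 0, 0, 0]))
```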

To remove a condition component:

Remove a condition component by clicking its delete icon.

To build an expression:

To access the expression builder to create a complex expression as the basis of a condition, click the gear icon at the end of the row and select Use a mathematical expression of 2 or more metric values.

In the expression builder, use the Expression pane to construct the expression.
Use the Variable Declaration pane to define variables based on metrics to use in the expression. For example, you could build an expression that measures the percentage of slow business transactions.

  1. In the Variable Declaration pane of the Mathematical Expression builder, click + Add variable to add a variable.
  2. In the Variable Name field enter a name for the variable.
  3. Click Select a metric to open an embedded metric browser.
  4. From the drop-down menu select the qualifier for the metric.
  5. Repeat steps 1 through 4 for each metric that you will use in the expression.
    You can remove a variable by clicking the delete icon.
  6. Build the expression by clicking the Insert Variable button to insert variables created in the Variable Declaration pane.  
     
  7. When the expression is built, click Use Expression.
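
As an illustration of what such an expression computes, the sketch below evaluates a percentage of slow calls from two declared variables. It is written in Python rather than in the expression builder's own syntax, and the variable names slow_calls and total_calls and the sample numbers are hypothetical.

```python
def percent_slow(variables):
    """Percentage of slow calls, computed from two declared variables."""
    total = variables["total_calls"]
    if total == 0:
        return 0.0  # avoid dividing by zero when there is no load
    return 100.0 * variables["slow_calls"] / total


# Each variable stands for one metric selected in the Variable Declaration pane.
print(percent_slow({"slow_calls": 30, "total_calls": 600}))
```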

Custom Metrics in Multiple Entities

To create a health rule on a custom metric in a single business transaction, node, or overall application performance, specify the health rule type as "custom". Then, when you configure the condition component, choose Specify a Metric from the Metric Tree in the Select Metric window and select the metric from the embedded metric browser.

A different use case is to create a rule that evaluates a custom metric that exists across various entities, for example across several nodes, using one health rule rather than a separate health rule for each node. In this case, you need to specify the custom metric using the relative metric path to the metric instead of selecting the metric from the embedded metric browser.

First get the relative path to the metric and then configure the health rule using that relative path.

To get the relative metric path for a multi-entity metric:
  1. Navigate to the Metric Browser by selecting Metric Browser in the left navigation pane.
  2. Select the metric that you want to use for the condition.
  3. Right-click and select Copy Full Path.
  4. Save this value in a file from which you can copy it later.

For example, you could copy the metric path for the CPU %Busy metric for the Inventory Server tier. This path would be appropriate to use in a health rule that affects all the nodes in that tier.

To configure a health rule that evaluates the custom metric over multiple entities:
  1. In the Overview panel of the health rule wizard, choose the health rule type for the kind of entity that you are monitoring.
  2. In the Affects panel select the affected entity.
  3. When you create the condition component that uses the metric, in the Select Metric window choose Specify a Relative Path Metric.
  4. Crop the relative metric path that you saved from the metric browser by doing one of the following:
    1. For all health rule types except Node Health-Hardware, JVM, CLR or Custom, crop the path to use the metric name alone, for example, Average Wait Time (ms).
    2. For Node Health-Hardware, JVM, CLR and Custom health rule types, crop the path to use everything after the entity, for example, after the Node name.

  5. Paste the cropped relative metric path in the relative metric path field of the Select Metric window.
  6. Click Select Metric.
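
The two cropping rules in step 4 can be sketched as simple string operations on a pipe-delimited path. Both full paths below are hypothetical examples written for this sketch, not values copied from a Controller, and the function names are assumptions.

```python
def crop_to_metric_name(full_path):
    """Keep only the final segment of the path (the metric name)."""
    return full_path.split("|")[-1]


def crop_after_entity(full_path, entity_name):
    """Keep everything after the named entity, for example after the node name."""
    segments = full_path.split("|")
    index = segments.index(entity_name)
    return "|".join(segments[index + 1:])


# Hypothetical full paths as they might be copied from the Metric Browser.
bt_path = ("Business Transaction Performance|Business Transactions|"
           "Inventory Server|Checkout|Average Wait Time (ms)")
node_path = ("Application Infrastructure Performance|Inventory Server|"
             "Individual Nodes|Node_8000|Hardware Resources|CPU|%Busy")

print(crop_to_metric_name(bt_path))
print(crop_after_entity(node_path, "Node_8000"))
```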

Additional Considerations

When you are configuring health rules for business transactions with a very fast average response time (ART) such as 25 ms, using standard deviation as a criterion can cause the health rule to be violated too frequently. This is because a very small increase in response time can represent multiple standard deviations. In this case, consider adding a second condition that sets a minimum ART as a threshold. For example, if you don't want to be notified unless ART is over 50 ms, you could set your threshold as: ART > 2 Standard Deviations and ART > 50 ms.

Similarly, when configuring health rules for calls-per-minute (CPM) metrics, the health rule may never be violated if the condition is using standard deviations, and the resulting value is below zero. In this case, consider adding a second condition that checks for a zero value, such as: CPM < 2 Standard Deviations and CPM < 1.
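
The ART example can be sketched as two conditions combined with All. The baseline and standard deviation values are invented for illustration, and the function name is hypothetical.

```python
def art_violation(art_ms, baseline_ms, std_dev_ms, deviations=2, floor_ms=50):
    """Violate only if ART is above the deviation threshold and above the floor."""
    above_deviation = art_ms > baseline_ms + deviations * std_dev_ms
    above_floor = art_ms > floor_ms
    return above_deviation and above_floor


# Assumed baseline of 25 ms with a standard deviation of 3 ms.
# 32 ms is more than 2 standard deviations above baseline but under 50 ms.
print(art_violation(32, 25, 3))
# 60 ms passes both conditions.
print(art_violation(60, 25, 3))
```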
