This page applies to an earlier version of the AppDynamics App IQ Platform.
For documentation on the latest version, see the 4.4 Documentation.



This topic is a "field guide" to certain metrics you may see in the AppDynamics UI. It does not attempt to explain every metric, because most are self-explanatory. Instead, it covers the metrics that users ask about most often. Metrics show up in various places, including dashboard flow maps and tabs, the Metric Browser, and transaction snapshots.

Business Transactions Metrics

AppDynamics displays all business transactions for a single business application on the Business Transactions list. You can configure what columns to show in the View Options menu. Use these performance metrics to troubleshoot your applications. See Set Performance Boundaries Using Thresholds, Configure Transaction Thresholds, and Troubleshoot Slow Response Times.

 

% Slow Transactions

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: Percentage of instances that are slow over the selected time frame.

% Stalled Transactions

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: Percentage of instances that stalled over the selected time frame.

% Very Slow Transactions

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: Percentage of instances that are very slow over the selected time frame.

Block Time (ms)

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: Average time spent when instances are blocked for thread synchronization and locks.  

Calls

Where found: application > Business Transactions list > Key Performance Indicators

Definition: KPI. See Calls.

Calls/min

Where found: application > Business Transactions list > Key Performance Indicators

Definition: See Calls Per Min.

CPU Used (ms)

Where found: application > Business Transactions list > CPU Usage

Definition: Presents the same information as JVM CPU Burnt (ms/min). This is the amount of time the JVM spent using the CPU to process transactions monitored by the Java agent. Transactions monitored by other agents may not present a meaningful metric here.

Relevance: An instance might wait or be blocked when it is not using the CPU. 

Errors/min

Where found: application > Business Transactions list > Key Performance Indicators

Definition: See Errors Per Min.

Error %

Where found: application > Business Transactions list > Key Performance Indicators

Definition: Percentage of instances that are errors.

Health

Where found: application > Business Transactions list > Key Performance Indicators

Definition: The health column shows red, yellow, or green icons corresponding to the health rule settings for the business transaction. Click the icon to get more information. See Default Health Rules.

Max Response Time (ms)

Where found: application > Business Transactions list > Key Performance Indicators

Definition: KPI. Longest time spent processing an instance.

Min Response Time (ms)

Where found: application > Business Transactions list > Key Performance Indicators

Definition: KPI. Shortest time spent processing an instance.

Relevance

Original Name

Where found: application > Business Transactions list > Other

Definition: Default name applied by AppDynamics. 

Relevance: Business transaction identifier. If you renamed the business transaction, viewing the original name can be useful for debugging.

Response Time (ms)

Where found: application > Business Transactions list > Key Performance Indicators

Definition:  See Response Time.

Slow Transactions

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: The number of instances that meet the criteria defined for a slow transaction.

Relevance: See Set Performance Boundaries Using Thresholds.

Spark charts

Where found: application > Business Transactions list > Key Performance Indicators

Definition: Shows the response time, calls per minute, and errors per minute as a graph over the selected time range.

Stalled Transactions

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: See Stall Count.

Tier

Where found: application > Business Transactions list > Other

Definition: Display name of the originating tier for the business transaction.

Relevance: Business transaction identifier.

Type

Where found: application > Business Transactions list > Other

Definition: The types listed depend on the app agent (Java, .NET, PHP, and so on).

Relevance: Business transaction identifier.

Very Slow Transactions

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: Number of instances that meet the criteria defined for a very slow transaction.

Relevance: See Set Performance Boundaries Using Thresholds.

Wait Time (ms)

Where found: application > Business Transactions list > Slow/Stalled Requests

Definition: Average time spent when invocations are in a thread sleep or wait state.

Business Transactions Dashboard

AppDynamics displays summary statistics for a specific business transaction on the Business Transactions Dashboard.

Custom Metrics

You can write a monitoring extension for the Standalone Machine Agent to add custom metrics to the metric set that AppDynamics already collects and reports to the Controller. For example, the Ehcache monitoring extension available from AppDynamics eXchange can collect metrics that appear in the Application Infrastructure Performance > Custom Metrics section of the Metric Browser. You can use these metrics to create a custom dashboard that monitors Ehcache performance.

Flow Map Metrics

Flow maps appear in several of the built-in dashboards in the UI, and show different information depending upon the context in which they appear.

For example, the Application Flow Map displays overall performance statistics for all Business Transactions, including calls per minute; average response time for calls made to other tiers, databases, and remote services; and business transaction errors per minute. These metrics are based on all calls made from a specific tier to another tier, database or remote service across all business transactions. 

Calls per minute and average response time metrics are also presented for calls between tiers, and from tiers to backend systems such as databases.

Metric Browser Metrics

 

For most types of metrics in the browser, you can click any of the points in the graph to view more information about the metric observed at that point in time. The information shown includes the metric identifier, date and time of the observation, along with any of the following values relevant to the metric:

  • Obs (observed value): the average of all data points seen for that interval. For the Percentile Metric for the App Agent for Java, this is the percentile value. For a cluster or a time rollup, this represents the weighted average across nodes or over time. 
  • Min: the minimum data point value seen for that interval
  • Max: the maximum data point value seen for that interval
  • Sum: the sum of all data point values seen for that interval. For the Percentile Metric for the App Agent for Java, this is the result of the percentile value multiplied by the Count.
  • Count: the number of observations aggregated in that one point. For example, a count of 5 indicates that there were 5 1-minute data points aggregated into one point.
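As a rough illustration of how these values relate, the sketch below rolls five 1-minute data points into a single point. The `DataPoint` type and `rollup` function are hypothetical names used only for this example and are not part of any AppDynamics API. The weighted average is assumed here to be the combined sum divided by the combined count.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DataPoint:
    """One metric data point (hypothetical structure for illustration)."""
    minimum: float
    maximum: float
    total: float   # corresponds to Sum
    count: int     # corresponds to Count

    @property
    def obs(self) -> float:
        # Observed value: average of the values aggregated in this point.
        return self.total / self.count if self.count else 0.0


def rollup(points: Iterable[DataPoint]) -> DataPoint:
    """Aggregate several points into one, weighting by count (an assumption)."""
    pts = list(points)
    if not pts:
        raise ValueError("need at least one data point")
    return DataPoint(
        minimum=min(p.minimum for p in pts),
        maximum=max(p.maximum for p in pts),
        total=sum(p.total for p in pts),
        count=sum(p.count for p in pts),
    )


# Five 1-minute points, each holding a single observation (values in ms).
minute_points = [DataPoint(v, v, v, 1) for v in (120.0, 80.0, 100.0, 140.0, 60.0)]
rolled = rollup(minute_points)
print(rolled.obs, rolled.minimum, rolled.maximum, rolled.total, rolled.count)
```

Because the rolled-up point keeps Sum and Count rather than only the average, it can be combined again with other points without losing the weighting.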

The following describe select metrics reported in the Metric Browser:

Availability - App

Where found: Application Infrastructure Performance > tier > Agent > App > Availability

Definition: Application server (such as JVM) availability. The application server is available if it is reporting to the Controller. The application server may be running on more than one node and this metric reflects how many nodes the application server was running on.

Relevance: When the application server is shut down or crashes, its availability metric decreases. Be aware of how often and for how long your application server was down. Availability is a good indicator of server health.

Availability - Machine

Where found: Application Infrastructure Performance > tier > Agent > Machine > Availability

Definition: Machine availability. The Controller reports the machine as available as long as the Standalone Machine Agent is reporting.

Relevance: When the machine is not available, no metrics for your server are available. The machine may need to be restarted or there may be a networking problem preventing connection with the Controller.

Average Block Time

Where found: Business Transaction Performance > Business Transactions > application > business transaction > Average Block Time (ms) 

Definition: Average wait time to get a lock. 

Relevance: A high block time means there is often contention for the lock required for a thread to work on an object. You can use thread dumps to diagnose lock contention problems and optimize application and JVM performance. 

Average Response Time 

Where found: Business Transaction Performance > Overall Application Performance > tier > Average Response Time (ms)

Definition: Average time per minute to process a business transaction. 

Relevance: A high response time may indicate slow or stalled transactions, slow database or remote service calls, or problems with backends.

Average Wait time  

Where found: Business Transaction Performance > Business Transactions > application > business transaction > Average Wait Time (ms)  

Definition: Average wait time for a thread to process an object, in states such as TIMED_WAITING (sleeping while waiting for disk or network I/O, or on an object monitor), WAITING (parking) for a thread lock, and TIMED_WAITING (on an object monitor).

Relevance: A high average wait time may be indicative of problems such as disk lock or network contention. Examine thread dumps to determine what the threads are waiting for. 

Avg Queue time (ms) - Disk 

Where found: Application Infrastructure Performance > Hardware Resources > Disks > disk mount point > Avg Queue Time (ms) 

Definition: Average time spent in the queue before a read or write request could be serviced.  

Relevance: High disk usage can result in bottlenecks that negatively impact application performance. If you see this metric increasing steadily over time, consider adding more disks or using faster ones.

Avg Service Time - Disk

Where found: Application Infrastructure Performance > Hardware Resources > Disks > disk mount point > Avg Service Time (ms) 

Definition: Average time required to service a read or write request.

Relevance: When the service time increases, it may mean that the disk has become fragmented and needs to be defragmented. It also could indicate that the disk has many unreadable/unwritable blocks and should be replaced.  

Calls per Minute 

Where found: 

  • Overall Application Performance  >  Calls per Minute 
  • Overall Application Performance  > tier > Calls per Minute 
  • Overall Application Performance  > tier > node > Calls per Minute 
  • Overall Application Performance  >  tier > node > individual node > Calls per Minute  

Definition: Total number of business transaction executions per minute. 

Relevance: A decrease in the number of calls per minute may indicate problems processing the transactions because of code, network or hardware problems and should be investigated further. See Monitor Errors and Exceptions and Compare Snapshots to Find Slow Calls. 

% CPU Time - Disk 

Where found: Application Infrastructure Performance > Hardware Resources > Disks > disk mount point > %CPU Time 

Definition: The percentage of CPU processing capabilities consumed by the disk during read and write operations.  Application Infrastructure Performance metrics are available for systems monitored by the Standalone Machine Agent.  

Relevance: A high percentage could indicate that a database accessed by the application is missing an index, so many rows are read from the disk before the required information is found. It could also indicate that the database cache is not being used properly and needs tuning. It may also point to an I/O bottleneck, because this metric also takes into consideration the amount of time the CPU is waiting for read or write operations to complete.

Events Exceeding Limits

Where found: application > Agent > Event Upload > Events Exceeding Limit

Definition: The number of events dropped because the maximum number of events uploaded per minute for the agent was exceeded. Events are dropped based on the limits imposed by the events.buffer.size value. No additional events are tracked for the rest of the minute, after which the event count starts over.

Relevance: To ensure you are not missing event notifications, ensure that the events.buffer.size and events.uploaded.per.min Controller settings are set appropriately. You can see event details on the Events tab of the UI. 
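The per-minute behavior described for this metric can be pictured with a small sketch. It is illustrative only and does not reflect the agent's actual implementation; `EventLimiter`, its fields, and the limit of 3 are hypothetical.

```python
from typing import Optional


class EventLimiter:
    """Per-minute event cap (illustrative only, not the agent's actual code)."""

    def __init__(self, max_per_minute: int) -> None:
        self.max_per_minute = max_per_minute
        self.current_minute: Optional[int] = None
        self.accepted = 0   # events accepted in the current minute
        self.dropped = 0    # analogous to Events Exceeding Limit

    def offer(self, timestamp_s: float) -> bool:
        """Return True if the event is accepted, False if it is dropped."""
        minute = int(timestamp_s // 60)
        if minute != self.current_minute:
            # A new minute has started, so the accepted count starts over.
            self.current_minute = minute
            self.accepted = 0
        if self.accepted >= self.max_per_minute:
            self.dropped += 1
            return False
        self.accepted += 1
        return True


limiter = EventLimiter(max_per_minute=3)  # hypothetical limit
# Event timestamps in seconds: five in the first minute, one in the second.
results = [limiter.offer(t) for t in (0, 10, 20, 30, 40, 65)]
print(results, limiter.dropped)
```

In this sketch the fourth and fifth events are dropped, and the event at 65 seconds is accepted because a new minute has begun.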

Events Uploaded 

Where found: application > Agent > Event Upload > Events Uploaded

Definition: Number of agent events that were uploaded.

Relevance: You can see event details on the Events tab of the UI. 

Errors per minute

Where found: Overall Application Performance > tier > Errors per minute 

Definition: Unhandled exceptions and any exception that prevents a business transaction from completing successfully are counted as errors. Error configurations let you define the types of errors that the agent reports, so you see just those that are most useful for monitoring and troubleshooting your application environment. See Configure Error Detection.

Relevance: Errors usually indicate underlying code problems and should be resolved as soon as possible. See Troubleshoot Errors and Exceptions.

Exceptions per minute

Where found: Overall Application Performance > tier > Exceptions per minute 

Definition: An exception is a code-based anomalous or exceptional event, usually requiring special processing. Unhandled exceptions are errors and are not included in this count.  

Relevance: Exceptions usually indicate underlying code problems and should be resolved as soon as possible. See Troubleshoot Errors and Exceptions. 

HTTP Error Codes per Minute 

Where found: Overall Application Performance > tier > HTTP Error Codes per Minute

Definition: HTTP errors include all HTTP calls done outside of a web service call that produced an error.

Relevance: HTTP Errors usually indicate underlying code problems and should be resolved as soon as possible. See Troubleshoot Errors and Exceptions.

Infrastructure Errors per Minute 

Where found: Overall Application Performance > tier > Infrastructure Errors per Minute

Definition: Infrastructure errors include errors relating to everything outside the business transaction, such as disk and network errors monitored by the Standalone Machine Agent.

Relevance: Infrastructure failures can cost thousands of dollars per hour to your company.   

Invalid Metrics 

Where found: Application Infrastructure Performance > Agent > Metric Upload > Invalid Metrics 

Definition: The number of metrics which were invalid and not accepted by the Controller.

Relevance: Each metric sent from the agent is validated by the Controller. An invalid metric is one for which the value for current, min, max, sum, or count is less than zero.
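The validation rule above amounts to a simple check, sketched below. The dictionary layout and the `is_valid_metric` name are assumptions made for this example. The Controller's actual data structures are not documented here.

```python
from typing import Mapping

# Fields named in the rule above.
CHECKED_FIELDS = ("current", "min", "max", "sum", "count")


def is_valid_metric(metric: Mapping[str, float]) -> bool:
    """Return False if any checked field is less than zero."""
    return all(metric[name] >= 0 for name in CHECKED_FIELDS)


# Hypothetical metric payloads.
good = {"current": 12, "min": 3, "max": 40, "sum": 120, "count": 10}
bad = {"current": 12, "min": -1, "max": 40, "sum": 120, "count": 10}

invalid_metrics = sum(not is_valid_metric(m) for m in (good, bad))
print(invalid_metrics)
```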

Number of Application Infrastructure Changes Sent

Where found: Application Infrastructure Performance > Agent > ConfigChannel > Number of Application Infrastructure Changes Sent

Definition: Number of changes sent to the application server to restart the server or deploy code.

Relevance: When using the Standalone Machine Agent, you can set health rule policies to fire under adverse circumstances and run remediation scripts to perform application infrastructure changes such as restarting the server or running code to diagnose or resolve the problem. 

Registration Failed

Where found: Application Infrastructure Performance > tier > Agent > Business Transactions > Registration Failed 

Definition: For the selected time range, indicates the number of unsuccessful registrations for the business transaction. If there are new business transactions that haven’t been seen before by the agent, they are posted to the Controller for registration every 10 seconds.  

Relevance: If the business transaction registration fails it may be because the registration limit for the Controller has been reached. In this case, you may want to review the business transactions that you are currently monitoring. See Application Performance Management to refine business transaction discovery for your agents. 

Registration Successful 

Where found: Application Infrastructure Performance > tier > Agent > Business Transactions > Registration Successful 

Definition: For the selected time range, indicates the number of successful registrations for the business transaction. If there are new business transactions that haven’t been seen before by the agent, they are posted to the Controller for registration every 10 seconds. 

Relevance: New transactions were successfully registered. See Application Performance Management to refine business transaction discovery for your agents.

Request License Errors 

Where found: Application Infrastructure Performance > Agent > Metric Upload > Request License Error 

Definition: Errors that are due to a mismatch between the number of licenses and the maximum number of requests. Counts are incremented when this occurs.

Relevance: The Controller setting rsds.upload.limit.per.min controls the maximum number of requests that can be uploaded per minute per number of license units in the account. Ensure this setting is configured appropriately for your account.

RQ 

Where found: Application Infrastructure Performance > Hardware Resources > System > RQ 

Definition: Number of processes that are ready and waiting to run as soon as the processor finishes running other processes.

Relevance: A high RQ could indicate that the load placed on the processor is too high and perhaps the processor should be upgraded to accommodate this load. Application performance would be faster if the processes could run when they are ready to.  

Space Available

Where found: Application Infrastructure Performance > Hardware Resources > Disks > disk > Space Available 

Definition: The amount of unused or available disk space. 

Relevance: You should monitor this metric to ensure you always have enough disk space. You could, for example, create a health rule to send an email when space available is below a certain level.

Space Used 

Where found: Application Infrastructure Performance > Hardware Resources > Disks > disk > Space Used 

Definition: The amount of used or unavailable disk space. 

Relevance: You should monitor this metric to ensure you always have enough disk space. You could, for example, create a health rule to send an email when space used reaches a certain percentage of available disk space.

Stall Count

Where found: Business Transaction Performance > Business Transactions > application > transaction > Stall Count 

Definition: Number of instances that meet the criteria defined for a stalled transaction. If a transaction hits the stall threshold (takes more than 45 seconds or the set stall threshold to finish), a stall transaction event is sent out because the transaction might take a very long time to eventually finish or time out. Criteria for slow, very slow, and stalled transaction performance are determined by thresholds. See Configure Transaction Thresholds.

Relevance: A high stall count impacts application performance and results in slow transactions. See Troubleshoot Slow Response Times.
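To make the relationship between thresholds and these counts concrete, here is a minimal sketch that classifies response times and tallies the results. The 45-second stall value comes from the definition above. The slow and very slow thresholds are hypothetical static values chosen for the example, and `classify` is not an AppDynamics function.

```python
from collections import Counter


def classify(response_ms: float,
             slow_ms: float = 1_000.0,       # hypothetical threshold
             very_slow_ms: float = 3_000.0,  # hypothetical threshold
             stall_ms: float = 45_000.0) -> str:  # 45 seconds
    """Label one transaction instance by comparing it to the thresholds."""
    if response_ms > stall_ms:
        return "stalled"
    if response_ms > very_slow_ms:
        return "very slow"
    if response_ms > slow_ms:
        return "slow"
    return "normal"


# Hypothetical response times (ms) for five instances.
response_times = [200, 1_500, 3_500, 50_000, 800]
counts = Counter(classify(t) for t in response_times)

stall_count = counts["stalled"]
percent_slow = 100.0 * counts["slow"] / len(response_times)
print(counts, stall_count, percent_slow)
```

The same tally yields both the count columns (Slow Transactions, Very Slow Transactions, Stall Count) and the percentage columns shown in the Business Transactions list.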

Time skews Errors 

Where found: Application Infrastructure Performance > tier > Agent > Event Upload > Time skews Errors 

Definition: Number of event errors that occurred because of a difference between agent and controller timestamp and subsequently the events couldn't be uploaded or were not accepted by the Controller. 

Relevance: If the Controller receives metrics that are time-stamped ahead of its own time, the Controller rejects the metrics. To avoid this possibility, maintain clock-time consistency throughout your monitored environment.
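A minimal sketch of the rejection rule follows, assuming a strict comparison with no tolerance window (the source does not say whether one exists). The function name and timestamps are hypothetical.

```python
def is_accepted(agent_timestamp_ms: int, controller_now_ms: int) -> bool:
    """Reject data stamped ahead of the Controller's own clock."""
    return agent_timestamp_ms <= controller_now_ms


controller_now_ms = 1_700_000_000_000  # hypothetical Controller clock reading
agent_timestamps_ms = [
    controller_now_ms - 5_000,  # agent clock 5 s behind: accepted
    controller_now_ms,          # clocks agree: accepted
    controller_now_ms + 2_000,  # agent clock 2 s ahead: rejected
]

# Each rejection would count as one time skew error in this sketch.
time_skew_errors = sum(
    not is_accepted(ts, controller_now_ms) for ts in agent_timestamps_ms
)
print(time_skew_errors)
```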

Unmonitored calls per minute 

Where found: Application Infrastructure Performance > tier > Agent > Business Transactions > Unmonitored Calls per Minute 

Definition: Number of business transactions that were not monitored by the agent.

Relevance: Number of business transactions detected in excess of the configured business transaction (BT) limit. Instead of tracking all BTs, AppDynamics tracks a maximum of 50 BTs per node by default. When that limit has been reached, AppDynamics tracks additional BTs with a simple counter and reports them in the 'All Other Traffic' category in the Application > Business Transactions list and in the Metric Browser as unmonitored calls per minute.
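The overflow behavior can be sketched as follows. This is an illustration of the idea only, not the agent's implementation; `BusinessTransactionRegistry` and its fields are hypothetical, and the limit is lowered to 2 so the example stays short.

```python
from collections import Counter


class BusinessTransactionRegistry:
    """Tracks up to `limit` named BTs; the rest go to a single counter."""

    def __init__(self, limit: int = 50) -> None:  # 50 is the stated default
        self.limit = limit
        self.calls: Counter = Counter()   # per-BT call counts
        self.unmonitored_calls = 0        # calls beyond the BT limit

    def record(self, name: str) -> str:
        """Count one call and return the category it was tracked under."""
        if name in self.calls or len(self.calls) < self.limit:
            self.calls[name] += 1
            return name
        # Limit reached: count the call without tracking the BT itself.
        self.unmonitored_calls += 1
        return "All Other Traffic"


registry = BusinessTransactionRegistry(limit=2)  # small limit for the example
tracked_as = [registry.record(bt) for bt in ("/cart", "/login", "/search", "/cart", "/search")]
print(tracked_as, registry.unmonitored_calls)
```

Calls to a BT that registered before the limit was reached continue to be tracked by name; only BTs first seen after that point fall into the overflow counter.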

Node Dashboard Metrics

AppDynamics displays summary statistics for calls on the Node Dashboard.

Calls

Where found: application > Servers > App Servers > tier  > node > Dashboard
                        application > Business Transactions >  Business Transactions list

Definition: Call volume, the total number of invocations of the entry point for all instances of the business transaction during the specified time from the node to the destination displayed.

Relevance: The more calls in your system, the busier it is.  You might analyze the number of calls for system sizing.

Calls/min

Where found: application > Servers > App Servers > tier  > node > Dashboard
                        application > Business Transactions >  Business Transactions list

Definition: The average number of incoming or outgoing calls per minute during the specified time from the node to the destination displayed.

Relevance: The more calls in your system, the busier it is. You might analyze the number of calls for promotion purposes or for system sizing. See also Calls per Minute.

Errors

Where found: application > Servers > App Servers > tier > node > Dashboard
                        application > Business Transactions >  Business Transactions list

Definition: The number of errors experienced by the calls or business transactions during the specified time from the node to the destination displayed.

Relevance: Error configurations let you define the types of errors that the agent reports, so you see just those that are most useful for monitoring and troubleshooting your application environment. See Configure Error Detection.

Errors/min

Where found: application > Servers > App Servers > tier > node > Dashboard
                        application > Business Transactions >  Business Transactions list

Definition: The average number of errors per minute experienced by calls or business transactions during the specified time from the node to the destination displayed.

Relevance: Error configurations let you define the types of errors that the agent reports, so you see just those that are most useful for monitoring and troubleshooting your application environment. See Configure Error Detection. See also Errors per Minute in the Metric Browser for Overall Application Performance and Business Transaction Performance.

Response Time

Where found: application > Servers > App Servers > tier > node > Dashboard
                        application > Business Transactions  > Business Transactions list

Definition: Average response time (ART) spent processing the business transaction, for all instances of the business transaction, from start to end of the entry point invocation.

Relevance: An acceptable response time is crucial to your business. A slow response time indicates problem areas that should be investigated further. See also the value for Normal Average Response Time (ms) in the Metric Browser for the Overall Application Performance and Business Transaction Performance. 

Node Dashboard - Memory Metrics

The following metrics appear on the Memory tab of the Node Dashboard.

GC Time Spent (ms/min)

Where found: application > Servers > App Servers > tier > node

Definition: For the selected time range, indicates the average number of milliseconds used by the JVM during the JVM CPU Burnt time to complete Garbage Collection (GC).

Relevance: JVM GC time can negatively impact applications and services. Tune GC management for each application or service. If the GC Time Spent is high, it could indicate that the heap setting for the JVM is undersized for the application load and can be quickly exhausted, triggering the JVM GC. If the JVM GC can't free enough memory, the JVM would likely run again for just a short time before running out of memory again, triggering another JVM GC session.

JVM CPU Burnt (ms/min)

Where found: Memory tab of the Node Dashboard

Definition: JVM CPU Burnt metrics are obtained via the OperatingSystemMXBean's internal API. The underlying value is the total CPU time used by the process since it started. The agent converts this ever-increasing value into per-minute chunks. For the selected time range, JVM CPU Burnt indicates the average number of milliseconds of CPU time the JVM used for its processes in a minute.

Relevance: High CPU usage can degrade performance and may result when the JVM memory settings are too low. If the heap setting is undersized for the application load, it can be quickly exhausted, triggering the JVM GC. If the JVM GC can't free enough memory, the JVM would then run again for just a short time before running out of memory again, triggering another JVM GC session. If the JVM GC session is running over and over again, CPU usage will increase.
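The conversion from a cumulative CPU total to a per-minute figure can be sketched as below. This is not the agent's code: the sample values are invented, the function name is hypothetical, and the sketch assumes one sample per minute with no process restart between samples.

```python
from statistics import mean
from typing import List, Sequence


def per_minute_deltas(cumulative_ms: Sequence[float]) -> List[float]:
    """Turn an ever-increasing CPU total into per-minute amounts.

    Assumes one sample per minute and no process restart between samples.
    """
    return [later - earlier
            for earlier, later in zip(cumulative_ms, cumulative_ms[1:])]


# Hypothetical cumulative CPU time (ms) sampled once per minute.
samples = [0, 1200, 2100, 4500]
deltas = per_minute_deltas(samples)  # CPU ms used in each minute
average_burnt = mean(deltas)         # average ms/min over the time range
print(deltas, average_burnt)
```

Averaging the per-minute deltas over the selected time range gives a value in the same units as the metric (ms/min).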

Transaction Snapshot Metrics

The following shows the Call Graph tab of a transaction snapshot.


See Transaction Snapshots for a discussion of the metrics reported in transaction snapshots.



 
