This page introduces transaction snapshots and call graphs, and describes how to use them.
About Transaction Snapshots
AppDynamics monitors every execution of a business transaction in the instrumented environment, and all such transactions are reflected in the metrics for an application.
For some business transaction instances, AppDynamics retains a snapshot of the transaction. A transaction snapshot gives you a cross-tier view of the processing flow for that particular invocation of the transaction. Call drill downs, where available, let you dive into the details of the transaction execution on a tier.
Subject to the guidelines and limits described in Transaction Snapshot Retention Rules and Limits, AppDynamics retains snapshots in these cases:
- The user experience for the business transaction was determined to be slow or the transaction incurred an error.
- The snapshot was collected as a result of periodic snapshot collection.
- The transaction snapshot was collected during a diagnostic session.
The snapshot can contain a partial or complete call graph. The call graph reflects the code-level view of the transaction at each tier that participated in processing the business transaction.
Viewing Transactions on the Flow Map
You can access business transaction snapshots from several locations in the Controller UI. For example, you can click on Slow Response Times or Errors under Troubleshooting in the left navigation tree for a business application. Another way to access a snapshot is by transaction. From the business transaction page, double click a transaction and then click the Transaction Snapshots tab.
From either location, when you double click on a business transaction snapshot, the snapshot viewer appears, as in this example:
As shown in the screenshot, the transaction flow map includes the following metrics:
Tier Response Time (ms)
The total response time for the call as measured at the calling tier. This includes the processing time on the called tier as well as on any tiers and backends it calls in turn.
Percentage of Time Spent (%)
This percentage represents the fraction of the business transaction's entire execution lifespan that was spent processing at a particular tier or communicating with other tiers or backends. This metric does not include the processing time of asynchronous activities, if any.
Asynchronous Activity Processing Time (ms)
Processing time of all asynchronous activities at this tier. This metric does not contribute to the overall tier response time because the activity is asynchronous by nature. This metric is calculated by adding the execution times of all asynchronous activities at a tier and the time spent in communication between other tiers and backends as follows:
Asynchronous Activity Processing Time = Asynchronous-activity-1-processing-time + Asynchronous-activity-2-processing-time + so on.
Execution Time (ms)
Total time spent processing by the business transaction in all affected tiers and communication with other tiers and backends. This metric does not include processing time of the asynchronous activities. However, in the case of Wait-for-Completion, the originating business transaction will take a longer time processing the request due to blocking and waiting for all the activities to complete before proceeding.
This metric is calculated by summing the processing times of the business transaction at each tier and the time spent communicating between tiers and backends, as follows:
Execution Time = Time-spent-processing-in-Tier-1 + Time-spent-processing-in-Tier-2 + Time-spent-communicating-with-Tier-2 + so on.
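The relationship between Execution Time and Percentage of Time Spent can be illustrated with a minimal sketch. The `Segment` type, the function names, and the sample timings below are hypothetical and only mirror the formulas described above; they are not part of any AppDynamics API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One leg of a transaction: processing on a tier or a call to a tier/backend."""
    name: str
    time_ms: float
    asynchronous: bool = False


def execution_time(segments):
    """Sum synchronous processing and communication times; async work is excluded."""
    return sum(s.time_ms for s in segments if not s.asynchronous)


def percentage_of_time_spent(segment, segments):
    """Share of the execution time attributable to one synchronous segment."""
    total = execution_time(segments)
    return 100.0 * segment.time_ms / total if total else 0.0


# Hypothetical timings for a two-tier transaction with one async activity.
segments = [
    Segment("processing in Tier 1", 40),
    Segment("communicating with Tier 2", 10),
    Segment("processing in Tier 2", 50),
    Segment("async activity on Tier 2", 200, asynchronous=True),
]
print(execution_time(segments))                         # 100
print(percentage_of_time_spent(segments[2], segments))  # 50.0
```

In this sketch the 200 ms asynchronous activity does not change either value, which matches the statement that asynchronous processing time is excluded from both metrics.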
The Potential Issues panel summarizes potential root causes of performance issues for the transaction, such as slow methods, slow SQL calls, and slow remote service calls. Click an item in the Potential Issues list to go to the call in the call graph.
Depending on the transaction, other metrics may appear as well. For example, when a tier makes an exit call that is received by the same tier, the time for the call is displayed. The metric value shows the time spent in the call from the moment the call went out of the tier until it returned to the caller. These are identified by the "async" label.
Note that the flow map or Overview is one of several views of the business transaction in the snapshot viewer. Other views are:
- Slow Calls and Errors, which presents information on the slowest database and remote service calls, slowest methods, and errors. You can gain further insight into these slow calls and errors either by viewing their details or by drilling down into their call graphs.
- Waterfall View, which presents the call execution times as a chart, in the order in which they occur during the end-to-end transaction time.
- Segment List, which shows the various legs of the transaction in descending order of duration, gives access to their snapshots, and allows you to drill down into their details.
Transaction Snapshot Retention Rules
In a flow map for a business transaction, any tier that has retained a snapshot call graph for the transaction includes a Drill Down link. Users who are members of a role with view permissions to the correlated application can follow the link to see the detailed call graph.
For a given transaction instance, a snapshot may be available for some tiers but not all. The following guidelines describe when transaction snapshots are captured for the originating and downstream tiers in a transaction. The guidelines apply to business transaction correlation as well as cross-application flow.
- Any tier (originating or continuing) takes a snapshot when it recognizes that it is experiencing slow, very slow, or stalled response times or has errors.
- An originating tier takes a transaction snapshot:
  - When a diagnostic session is triggered by the originating tier. The agent starts diagnostic sessions when it detects a pattern of performance problems. In addition, you can manually start a diagnostic session from the Business Transaction Dashboard. For details, see Diagnostic Sessions.
  - When the agent identifies slow, very slow, or stalled response times, or errors on the originating tier. These snapshots may have partial call graph information, because they start at the time when the transaction slowed or experienced an error.
  - Based on the periodic collection schedule. By default, the agent captures one snapshot every 10 minutes.
- The downstream tier captures snapshots if the tier immediately upstream to it tells it to take a snapshot. An upstream tier might direct its downstream tier to take a snapshot under these circumstances:
  - The upstream tier is taking a snapshot for a diagnostic session.
  - The upstream tier is taking a snapshot based on the periodic collection schedule.
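The guidelines above can be summarized as a small decision function. This is only a sketch of the rules as described on this page; the function name, parameter names, and state labels are hypothetical, and the actual agent logic is also subject to the retention limits described later.

```python
# Hypothetical labels for the user-experience states that trigger a snapshot.
PROBLEM_STATES = {"slow", "very_slow", "stalled", "error"}


def should_take_snapshot(is_originating, user_experience,
                         in_diagnostic_session=False,
                         periodic_collection_due=False,
                         upstream_requested=False):
    """Return True if a tier would capture a snapshot under the guidelines above."""
    # Any tier (originating or continuing) snapshots when it detects a problem.
    if user_experience in PROBLEM_STATES:
        return True
    if is_originating:
        # Originating tiers also snapshot for diagnostic sessions and on schedule.
        return in_diagnostic_session or periodic_collection_due
    # Downstream tiers otherwise snapshot only when the upstream tier asks them to.
    return upstream_requested


print(should_take_snapshot(True, "normal", periodic_collection_due=True))  # True
print(should_take_snapshot(False, "normal"))                               # False
```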
Within the guidelines, snapshot retention is also subject to snapshot retention limits, as described in the following section.
Transaction Snapshot Limits
Snapshot retention limits prevent excessive resource consumption for a node, and apply even when aggressive snapshot retention is enabled. These limits are:
- Originating transaction snapshots are limited to a maximum of 20 (5 concurrent) snapshots per node per minute.
- Continuing transaction snapshots are limited to a maximum of 200 (100 concurrent) snapshots per node per minute.
AppDynamics applies snapshot retention limits to error transactions as well. As a result, not every error occurrence that is represented in an error count metric, for example, will have a corresponding snapshot. For error transactions, the following limits apply:
- For a single transaction, there is a maximum of two snapshots per minute.
- Across transactions, the maximum is limited to five snapshots per minute. (Specified by the node property max-error-snapshots-per-minute.)
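To show how the two error snapshot limits interact, here is a minimal sketch of a per-minute limiter. The class name, method name, and the way minutes are counted are assumptions made for illustration; only the default values (two per transaction, five across transactions) come from the text above.

```python
from collections import Counter


class ErrorSnapshotLimiter:
    """Illustrative per-minute limiter; not the agent's actual implementation."""

    def __init__(self, per_transaction=2, across_transactions=5):
        self.per_transaction = per_transaction
        self.across_transactions = across_transactions
        self._minute = None
        self._by_transaction = Counter()

    def allow(self, transaction, minute):
        """Return True if an error snapshot may be retained in this minute."""
        if minute != self._minute:
            # A new one-minute window starts: reset all counters.
            self._minute = minute
            self._by_transaction.clear()
        if self._by_transaction[transaction] >= self.per_transaction:
            return False
        if sum(self._by_transaction.values()) >= self.across_transactions:
            return False
        self._by_transaction[transaction] += 1
        return True


limiter = ErrorSnapshotLimiter()
print([limiter.allow("checkout", minute=0) for _ in range(3)])  # [True, True, False]
```

With these limits, a third error on the same transaction within a minute is still counted in the error metrics but has no snapshot, which is why error counts and snapshot counts can differ.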
Configure Snapshot Periodic Collection Frequency
By default, AppDynamics collects a snapshot every 10 minutes. You can modify this default in the Slow Transaction Thresholds configuration page. The value will apply to subsequently created business transactions, but if you check Apply to all Existing Business Transactions, all existing business transactions are affected by the change as well.
If you have a high-load production environment, do not use low values for the snapshot collection interval; in other words, do not configure collection on a very frequent basis. When there are thousands or millions of requests per minute, collecting snapshots too frequently may result in many extra snapshots that are not highly useful. Either turn OFF periodic snapshots and apply the change to all business transactions, or choose a very conservative (high) value depending on the expected load. For example, if you have high load on the application, choose every 1000th execution or every 20 minutes, depending on the load pattern. See Overview of Transaction Snapshots.
In the left navigation pane click Configuration > Slow Transaction Thresholds.
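The two periodic collection modes mentioned above (every Nth execution, or every N minutes) can be sketched as follows. The class and its parameters are hypothetical; in particular, collecting on the very first execution in time-based mode is an assumption of this sketch, not documented agent behavior.

```python
class PeriodicSnapshotPolicy:
    """Collect on every Nth execution and/or once per time interval."""

    def __init__(self, every_nth=None, every_minutes=None):
        self.every_nth = every_nth
        self.interval_s = None if every_minutes is None else every_minutes * 60
        self._executions = 0
        self._last_collected_s = None

    def should_collect(self, now_s):
        """Call once per transaction execution; now_s is a timestamp in seconds."""
        self._executions += 1
        if self.every_nth is not None and self._executions % self.every_nth == 0:
            return True
        if self.interval_s is not None:
            due = (self._last_collected_s is None
                   or now_s - self._last_collected_s >= self.interval_s)
            if due:
                self._last_collected_s = now_s
                return True
        return False


# Under a load of 5000 executions, "every 1000th" yields only 5 periodic snapshots.
policy = PeriodicSnapshotPolicy(every_nth=1000)
print(sum(policy.should_collect(now_s=0) for _ in range(5000)))  # 5
```

Leaving both parameters unset in this sketch corresponds to turning periodic collection off.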
Using the Transaction Snapshot List
You can view transaction snapshots generated in the UI time range from the Transaction Snapshots tab of the application, tier, node, or business transaction dashboards. From there you can:
- Use Compare Snapshots to see the performance of calls in two snapshots as a side-by-side comparison.
- Use Identify the most expensive calls / SQL statements in a group of Snapshots to see the calls that take the most time across the snapshots you have selected. You can select up to 30 snapshots.
- Find snapshots using the filter options.
Normally transaction snapshots are purged after a configurable time, two weeks by default. To save a snapshot beyond the normal snapshot lifespan (for example, if you want to make sure a snapshot associated with a particular problem is retained for future analysis), you can archive the snapshot. To archive a snapshot, select it on the list and choose Actions > Archive.
To archive a snapshot, you need the "Application level - Can create applications" permission.
The file cabinet icon in the far right column indicates that the snapshot is an archived snapshot.
To display only archived snapshots in the snapshot list, filter the snapshot list and check Only Archived.
Transaction Snapshot Call Drill Downs
A call drill down contains details for that business transaction execution on a particular tier. It takes you to the code-level information for the transaction. To get call drill down information, click Drill Down in the transaction snapshot flow map or snapshot list. You can drill down into the node or, if you have snapshot correlation configured for transactions between Java agents and Oracle databases monitored by Database Monitoring, into the database details captured during the snapshot.
The contents of a transaction snapshot containing async segments look slightly different if you access the snapshot via the Business Transaction view or via the App/Tier/Node view. In the Business Transaction view, only the originating segments are shown initially, and then you can drill down to the async segments as desired. Because the App/Tier/Node view surfaces all the segments that are relative to that entity, all segments, originating or async, are listed initially.
The following lists the types of information captured in the call drill down of a node in a transaction snapshot.
Node Drill Down
Problem summary, execution time, CPU timestamp, tier, node, process ID, thread name, etc.
Call graphs show the execution flow for the transaction on a given tier. For details, see Call Graphs.
Slow Calls & Errors
Hot Spots: Hot spots sort calls by execution time with the most expensive calls in the snapshot at the top. To see the invocation trace of a single call in the lower panel, select the call in the upper panel and use the slider to filter which calls to display as hot spots. For example, the following setting filters out all calls faster than 4324 ms from the hot spots list.
Using the force-hotspot-if-diag-session and hotspot-collect-cpu node properties you can respectively control whether or not hot spot snapshots are collected for manually started diagnostic sessions and whether CPU time or real time is collected within the hot spot snapshots.
Note that hot spots that appear in this pane of the snapshot viewer are distinct from a hot spot call graph. A hot spot call graph is a call graph in a snapshot collected in response to a performance issue that includes transaction segments generated before the point at which the transaction was recognized to be slow, very slow, or have another user experience issue. For more information, see Call Graphs.
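The sorting and slider filtering described for the Hot Spots pane can be approximated in a few lines. The function name and the sample method names and timings are hypothetical; the sketch only illustrates ordering calls by execution time and dropping those below a threshold.

```python
def hot_spots(calls, min_time_ms=0):
    """Return (method, time_ms) pairs at or above the threshold, slowest first."""
    return sorted(
        (call for call in calls if call[1] >= min_time_ms),
        key=lambda call: call[1],
        reverse=True,
    )


# Hypothetical calls captured in one snapshot.
calls = [
    ("CartService.total", 310),
    ("OrderDao.load", 5120),
    ("PaymentClient.charge", 4324),
]
# A threshold of 4324 ms filters out every call faster than that.
print(hot_spots(calls, min_time_ms=4324))
# [('OrderDao.load', 5120), ('PaymentClient.charge', 4324)]
```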
Error Details: Exception stack traces and HTTP error codes.
DB & Remote Service Calls
SQL Calls: All SQL queries fired during a request. AppDynamics normalizes the queries and by default does not display raw/bind values. You can configure SQL capture settings to monitor raw SQL data in the queries. Individual calls taking less than 10 ms are not reported.
When returning data to a JDBC client, database management systems often return the results as a batched response. Each batch contains a subset of the total result set, with typically 10 records in each batch. The JDBC client retrieves a batch and iterates through the results. If the query is not satisfied, the JDBC client gets the next batch, and so on.
In the SQL query window, a number followed by an X in the Query column means that the query ran the number of times indicated within a batch. The value in the Count column indicates the number of times that the batch job executed.
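The batched retrieval pattern described above can be pictured with a short sketch. It uses a plain Python list as a stand-in for a result set and a hypothetical `fetch_in_batches` helper; it does not use JDBC or any database driver, and the batch size of 10 simply follows the typical value mentioned in the text.

```python
def fetch_in_batches(rows, batch_size=10):
    """Yield successive batches of at most batch_size rows, as a client would fetch them."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


rows = list(range(25))  # stand-in for a 25-row result set
batches = list(fetch_in_batches(rows))
print([len(batch) for batch in batches])  # [10, 10, 5]
```

Here a single query over 25 rows results in three batch retrievals, which is the kind of repetition that the number in the Query column reflects.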
Remote Service Calls: All queries to remote services such as web services, message queues, or caching servers that were fired during a request.
Graphs for hardware (CPU, Memory, Disk IO, Network IO), memory (Heap, Garbage Collection, Memory Pools), JMX, and more. If you have Server Monitoring, you'll have access to full performance details for the server hardware and operating system.
HTTP Data: HTTP payloads contain basic data such as the URL and session ID, and additional data for Servlet entry points, Struts, JSF, Web Services, etc. You can use HTTP data collectors to specify which query parameter or cookie values should be captured in the transaction snapshot. To enable HTTP parameter collection, see Collecting Application Data.
Cookies: The snapshot can use cookie values to help identify the user who initiated the slow or error transaction. To enable cookie value collection, see Collecting Application Data.
User Data: User data from any method executed during a transaction, including parameter values and return values, to add context to the transaction. You can use method invocation data collectors to specify the method and parameter index. To configure user data collection, see Collecting Application Data.
In cases where an exit call is made just before a business transaction starts, exit call information can show up in this field, particularly if the transaction is marked as slow or having errors. Please note that sensitive information on the exit call may be shown in this situation.
Node Problems: Shows metrics for the node that deviate the most from the established baselines.
Service Endpoints: Shows each service endpoint invoked during the snapshot.
Properties: Servlet URI and Process ID of the transaction.
Queries: Displays the top SQL statements and Stored Procedures. These are the queries that consume the most time in the database. Comparing the query weights to other metrics such as SQL wait times may point you to SQL that requires tuning.
Clients: Displays the hostname or IP addresses of the Top N clients using the database. A database client is any host that accesses the database instance.
Sessions: Displays the Session ID of the Top N sessions using the database sorted by time spent.
Schemas: Displays the names of the Top N busiest schemas on the database server.