This page provides troubleshooting information for issues that may arise during Controller installation and operation.
Controller Server Log
The primary log file for the Controller is server.log, located in the <Controller_Installation_Directory>/logs folder.
The first step in troubleshooting Controller issues typically involves checking the log file. Search the log for errors that may correspond to the issue you are encountering. If found, an error log may help you identify and resolve the issue.
Also, see installation troubleshooting information in Custom Install.
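As a minimal sketch of that first step, the shell function below prints the most recent lines in a log file that look like errors. The function name recent_log_errors, the SEVERE|ERROR|Exception pattern, and the default of 20 lines are assumptions made for illustration, not part of the product. Adjust the pattern to the messages you are looking for.

```shell
# Sketch: print the last N lines of a log file that look like errors.
# Usage: recent_log_errors LOG_FILE [MAX_LINES]
# Example: recent_log_errors <Controller_Installation_Directory>/logs/server.log
recent_log_errors() {
    log_file="$1"
    max_lines="${2:-20}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # -n prefixes each match with its line number in the log file.
    # The pattern is an assumption; adjust it as needed.
    grep -n -E 'SEVERE|ERROR|Exception' "$log_file" | tail -n "$max_lines"
}
```

The line numbers in the output make it easier to open the log at the matching entry and read the surrounding context.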
Identify Controller Performance Issues
The following are indications of Controller performance issues:
- The Controller UI performs slowly. For short time ranges, such as 15 or 30 minutes, responses that take longer than 10 to 20 seconds can indicate that your Controller is under stress.
- When the Controller's metric reporting lags 7 to 10 minutes behind the current time, it can be an indication that your Controller is under stress. A lag of about 3 to 5 minutes is normal.
- When monitoring the Controller environment, you see that CPU, memory, and disk metrics are at about 75% capacity.
If you observe degradation in Controller performance, it may be due to one of the following:
- The hardware resources for the Controller might not match the correct Controller profile.
- The Controller performance profile may be incorrectly configured.
To troubleshoot Controller performance issues:
- Confirm that the hardware matches the Controller profile you use. For details see Controller System Requirements.
- Confirm that your disk performance matches the recommended thresholds for minimum disk performance. For details see Controller System Requirements.
- Confirm that the Java SDK version is exactly the same as the Java version on the Controller. To display the version of Java used by the Controller:
- Open the command-line utility.
- Go to <Controller_Installation_Directory>/jre/bin
- Run java -version.
Monitor heap usage
- On Windows, use the Task Manager to measure the memory usage for the Controller.
- On Linux, use the top command to get statistics for the memory data.
ps -elf    (expect to see a "java" process and a "mysql" process)
top        (expect to see java and mysql with CPU greater than 0)
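The Java version check in the steps above can be sketched as a small shell function. The function name controller_java_version is hypothetical, and the sketch assumes that the Controller installation directory is passed as the first argument.

```shell
# Sketch: print the version of the Java runtime bundled with the Controller.
# Usage: controller_java_version CONTROLLER_INSTALL_DIR
controller_java_version() {
    java_bin="$1/jre/bin/java"
    if [ ! -x "$java_bin" ]; then
        echo "no executable java at $java_bin" >&2
        return 1
    fi
    # java -version writes to stderr, so merge it into stdout.
    "$java_bin" -version 2>&1
}
```

Compare the output with the result of running java -version for the Java SDK to confirm that the two versions match.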
Timeout errors during Controller installation
While installing the Controller, the Enterprise Console attempts to start up the Controller application server and database. At first database startup, the application attempts to create the database schema, tables, and other artifacts needed by the Controller.
By default, the Enterprise Console waits 45 minutes for the Controller app server or database to start. When installing a medium or large profile Controller or into certain types of environments such as virtual machines, the time it takes to start up the system can exceed the default startup timeout period.
Controller does not start properly on Windows
Your Controller may fail to start because of the file extensions of transaction logs created by Glassfish. Excluding the Controller data directory from virus scanning, as specified in Prepare Windows for the Controller, does not cover these files, which are found in the <AppDynamicsInstall>\Controller\appserver\glassfish\domains\domain1\logs\server\tx directory. When your antivirus software detects these extensions, such as WRY, it may mistakenly block the process from using the files, so the Controller ultimately does not start.
These transaction logs are used to recover any failed Glassfish transactions, so deleting these logs on startup is not advised. Instead, configure your virus scanners to ignore the entire Controller directory.
No data in the Metrics Browser
This may indicate that the agents are not correctly configured. Begin troubleshooting by looking at the server.log file.
All log files for Controller are located in the <Controller_Installation_Directory>/logs folder.
Error receiving metrics (node not properly modeled yet: Could not find component for node.
This error means the app agent tried to upload metric data for a specific node, but the node does not belong to any tier. Nodes must belong to tiers and these tiers must belong to a business application in order to receive metric data for that node. See Overview of Application Monitoring.
Received Metric Registration request for a machine that is NOT registered to any nodes. Sending back null!
This error indicates that the Controller received a registration request for metrics for a Machine Agent that listed a machine ID not yet associated with any node. Configure the Machine Agent to associate with the correct application, tier, and node. See Install the Machine Agent.
Agent upload blocked, as its reporting a time well into the future.
The App Agents attempt to report metric data using Controller time. The agents retrieve the time from the Controller every five minutes and report times using a skew of the local machine time, if different.
If for some reason the App Agent reports metrics that are time-stamped ahead of the Controller time, the Controller rejects the metrics. To avoid this event, ensure that the system times for the machine on which the Controller is running and the machines for the app agents are in synchronization.
Controller shutdown does not increase free memory on Linux
You do not generally need to be concerned about the "free memory" value, as it will always trend towards zero. The Linux kernel tries to keep its cache as large as possible. As a result, the Linux kernel does not release the memory even after process termination. The memory is freed only if it is required by another process.
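To see this behavior on a running system, you can compare the free, cached, and available figures that the kernel reports. The following is a minimal sketch that assumes a Linux kernel exposing the MemFree, Cached, and MemAvailable fields in /proc/meminfo. The function name mem_summary is hypothetical.

```shell
# Sketch: show free, cached, and available memory in MiB.
# Usage: mem_summary [MEMINFO_FILE]   (defaults to /proc/meminfo)
mem_summary() {
    awk '
        $1 == "MemFree:"      { free = $2 }
        $1 == "Cached:"       { cached = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            # /proc/meminfo reports values in kB; convert to MiB.
            printf "free=%d cached=%d available=%d\n", free / 1024, cached / 1024, avail / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

A low free value together with a large cached value is the expected pattern, because the kernel can reclaim cache when another process needs the memory.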
Controller process unexpectedly shut down
On Linux, memory allocation failures may cause the Controller process to be shut down unexpectedly by the Linux Out-of-Memory (OOM) Killer. The Controller log, server.log, does not provide information about the shutdown. Instead, to diagnose this event, check the system log (usually /var/log/messages) for "out of memory" entries written by the OOM killer.
If you encounter this log entry, make sure that you have allocated sufficient swap space on the Controller machine. AppDynamics recommends allocating a minimum of 10 GB of swap space.
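A minimal sketch of the system log check, assuming the log is readable by the current user. The function name find_oom_entries is hypothetical, and the default path is the usual location named above, which may differ on your distribution.

```shell
# Sketch: print "out of memory" entries from a system log.
# Usage: find_oom_entries [LOG_FILE]   (defaults to /var/log/messages)
find_oom_entries() {
    # -i matches regardless of case, since capitalization in the log may vary.
    grep -i 'out of memory' "${1:-/var/log/messages}"
}
```

The function returns a non-zero status when no entries are found, so it can also be used in a conditional.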
Controller server swapping too often
If you encounter unexpected swapping on the Controller machine, you can configure how aggressively the operating system swaps by configuring the swappiness parameter. The swappiness parameter controls how often the Linux kernel moves processes out of physical memory and onto the swap disk. The default value for the parameter is usually 60. When you decrease the value, you lower the tendency of the operating system to swap. This results in less default file caching.
See the documentation for your Linux distribution for recommendations on the value for the swappiness parameter. For example, RedHat recommends setting swappiness to 10 for CentOS and RedHat kernels version 2.6.32-303 or later if you encounter OOM issues even though swap space is still available.
Before you configure the swappiness parameter, though, ensure that the machine has sufficient RAM and that the buffer pool size for MySQL is properly configured.
To configure swappiness:
1. Check the current value for swappiness:
/sbin/sysctl -a | grep swappiness
2. Set the swappiness parameter to the new value, for example, 10.
3. Set the swappiness parameter in the /etc/sysctl.conf file to the same value you used in step 2, so that the setting persists across reboots.
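As a minimal sketch of setting and persisting the value, the function below writes a vm.swappiness line to a sysctl configuration file, replacing an existing line if one is present. The function name persist_swappiness is hypothetical. The sketch assumes the standard vm.swappiness sysctl key and that you run it with sufficient privileges to edit the file.

```shell
# Sketch: persist a swappiness value in a sysctl configuration file.
# Usage: persist_swappiness VALUE [CONF_FILE]   (defaults to /etc/sysctl.conf)
# To change the running value as well (requires root):
#   /sbin/sysctl -w vm.swappiness=10
persist_swappiness() {
    value="$1"
    conf="${2:-/etc/sysctl.conf}"
    case "$value" in
        ''|*[!0-9]*) echo "value must be a non-negative integer" >&2; return 1 ;;
    esac
    if grep -q '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf" 2>/dev/null; then
        # Replace the existing setting in place; a .bak copy is kept.
        sed -i.bak "s/^[[:space:]]*vm\.swappiness[[:space:]]*=.*/vm.swappiness = $value/" "$conf"
    else
        # No existing setting, so append one.
        echo "vm.swappiness = $value" >> "$conf"
    fi
}
```

Replacing an existing line rather than always appending avoids conflicting duplicate entries in the file.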
Could not determine the IP address of this host error during installation
During the installation process, the Enterprise Console attempts to ping the Controller by the host name or IP address you enter. If the ping is unsuccessful during the user input validation, the following error message appears: "Could not determine the IP address of this host. Please ensure that the IP address of the Controller host resolves to its hostname or to localhost. You may need to add an entry in the hosts file on the Controller host and retry the operation."
To make the hostname resolvable, add an entry for it to the hosts file on the machine on which you are installing the Controller. On Linux, the hosts file is typically at /etc/hosts. On Windows, look for the file at the following location, C:\Windows\System32\Drivers\etc\hosts, or the location appropriate for your version of Windows.
Add the entry in the form of the following example:
Use the IP address and hostnames appropriate for your system.
For example, the following shows the entry added as the third line of the default RedHat hosts file:
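A hosts file entry consists of an IP address followed by one or more hostnames, separated by whitespace. The following minimal sketch checks whether a hosts file already contains an entry for a given hostname. The function name hosts_entries is hypothetical, and the address 192.0.2.10 and hostname controller.example.com in the comments are placeholders, not values from your system.

```shell
# Sketch: print the hosts-file entries that name a given host.
# Usage: hosts_entries HOSTNAME [HOSTS_FILE]   (defaults to /etc/hosts)
# A matching entry would look like this (placeholder values):
#   192.0.2.10   controller.example.com   controller
hosts_entries() {
    awk -v host="$1" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        {
            # Field 1 is the IP address; the rest are hostnames and aliases.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # ignore trailing comments
                if ($i == host) { print; break }
            }
        }
    ' "${2:-/etc/hosts}"
}
```

If the function prints nothing for the Controller hostname, the hosts file has no entry for it.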
Controller Cannot Connect to the MySQL Database
The following exception message in server.log file indicates that the Controller cannot connect to its embedded database.
If you encounter this error, verify that the Controller database is running properly. You can do so using one of the following commands on Linux or Windows:
- List the files opened by a process. On Windows, SysInternals Process Explorer provides a list of files.
- List all networking ports opened by processes:
On Linux: netstat -anp | grep 3388
On Windows: netstat -ano | find "3388"
- List all processes and check for the mysql process:
On Linux: ps -aef | grep mysql
On Windows: tasklist /v | find "mysql"
If no processes are found, it indicates that the Controller database was incorrectly terminated. Start the Controller database again and check the Controller server.log file for any error messages.
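When you filter a process listing with grep on Linux, the grep command itself can appear in the results and look like a running mysql process. The following is a minimal sketch that excludes it. The function name has_mysql_process is hypothetical.

```shell
# Sketch: succeed if a process listing on stdin contains a mysql process.
# Usage: ps -aef | has_mysql_process && echo "database process found"
has_mysql_process() {
    # Drop lines for grep itself, which would otherwise always match.
    grep -v 'grep' | grep -q 'mysql'
}
```

Reading the listing from standard input keeps the check independent of the exact ps options you use.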
Stack overflow exception when installing the Controller on Windows
This exception is usually caused when you set the -Xss option to a lower value. We recommend changing this value to 96000.
Triggering automatic collection of Controller logs
Use the following console commands to trigger automatic capture of Controller log files:
On Linux, run:
On Windows, open an elevated command prompt (in the Windows start menu, right-click on the Command Prompt icon and choose Run as Administrator) and run:
The logs are copied to the Enterprise Console host under platform-admin/logs-controller-<platform-name>-<date-time-stamp>.zip.
Collecting Troubleshooting Information for the Controller
If opening a support case for Controller troubleshooting, you can facilitate the diagnosis of the problem by providing the following information:
- Submit all platform-admin/logs/* and platform-admin/logs-controller-*.zip, in particular the server.log files. You can also use the log file utility described in Triggering automatic collection of Controller logs to collect logs.
- If the Controller runs out of memory, it generates a heap dump. Submit all files in <controller_home>/appserver/glassfish/domains/domain1/config/hprof.
- Submit all <controller_home>/appserver/glassfish/domains/domain1/config/gc.log files.
- Submit information about the hardware and operating system configuration of the machine that is currently hosting the Controller, including operating system, bit version, CPU cores, clock speed, disk configuration, and RAM.
- Indicate the performance profile of the Controller. Run the Controller diagnosis command, which captures the information in platform-admin-server.log.
On Linux:
bin/platform-admin.sh submit-job --platform-name <platform_name> --job diagnosis --service controller
On Windows:
bin/platform-admin.exe cli submit-job --platform-name <platform_name> --job diagnosis --service controller
Issues Generating Audit Reports Immediately after Upgrading the Controller to 4.5
When the Controller upgrade is complete, audit reports may not work immediately. The audit database table is migrated only after the upgrade process finishes, and the migration takes at least an hour to complete. If audit reports are run before the migration completes, audit table migration messages are logged in the server.log file.
No action is required. Try running the audit reports again after an hour.