Troubleshoot Controller Issues
This page provides troubleshooting information for issues that may arise during Controller installation and operation.
Controller Server Log
The primary log file for the Controller is at the following location:
<controller_home>/logs/server.log
The first step in troubleshooting Controller issues typically involves checking the log file. Search the log for errors that may correspond to the issue you are encountering. If found, an error log may help you identify and resolve the issue.
Also, see installation troubleshooting information in Custom Install.
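As a starting point for searching the log, the following is a minimal sketch of a helper that prints the most recent log lines mentioning an error. The function name `recent_log_errors` and the search patterns (`error`, `exception`, `severe`) are assumptions chosen for illustration, not an official list of Controller error markers.

```shell
# Hypothetical helper: print the most recent lines of a log file that
# mention an error. The patterns below are assumptions, not an official list.
recent_log_errors() {
  local log_file="$1" max_lines="${2:-20}"
  # -i: ignore case, -n: prefix each match with its line number
  grep -inE 'error|exception|severe' "$log_file" | tail -n "$max_lines"
}

# Usage (replace <controller_home> with your installation directory):
#   recent_log_errors "<controller_home>/logs/server.log" 50
```

The line numbers in the output make it easier to open the log at the matching position and read the surrounding context.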
Identify Controller Performance Issues
The following are indications of Controller performance issues:
- The Controller UI performs slowly. For short time ranges, such as 15 or 30 minutes, responses that take longer than 10 to 20 seconds can indicate that your Controller is under stress.
- When the Controller's metric reporting lags 7 to 10 minutes behind the current time, it can be an indication that your Controller is under stress. A lag of about 3 to 5 minutes is normal.
- When monitoring the Controller environment, you see that CPU, memory, and disk metrics are at about 75% capacity.
If you observe degradation in Controller performance, it may be due to one of the following:
- The hardware resources for the Controller might not match the correct Controller profile.
- The Controller performance profile may be incorrectly configured.
To troubleshoot Controller performance issues:
- Confirm that the hardware matches the Controller profile you use. For details, see Controller System Requirements.
- Confirm that your disk performance matches the recommended thresholds for minimum disk performance. For details, see Controller System Requirements.
- Confirm that the Java SDK version is exactly the same as the Java version on the Controller. To display the version of Java used by the Controller:
  - Open the command-line utility.
  - Go to <Controller_Installation_Directory>/jre/bin
  - Run java -version
Monitor Heap Usage
- On Windows, use the Task Manager to measure the memory usage for the Controller.
- On Linux, use the top command to get memory statistics. For example:
ps -elf    # expect to see a "java" process and a "mysql" process
top        # expect to see java and mysql with CPU greater than 0
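To narrow the process list down to the two processes of interest, a filter along the following lines may help. This is a sketch only: the function name `filter_controller_processes` is hypothetical, and the figure it reports is resident memory for the whole process, which is related to but not the same as JVM heap usage.

```shell
# Hypothetical filter: reads "pid rss_kb command" lines on stdin and prints
# only java and mysql processes, with resident memory converted to MB.
filter_controller_processes() {
  awk '$3 ~ /java|mysql/ { printf "%s %d MB %s\n", $1, $2 / 1024, $3 }'
}

# Usage on Linux (the trailing "=" suppresses the ps column headers):
#   ps -eo pid=,rss=,comm= | filter_controller_processes
```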
Timeout Errors During Controller Installation
While installing the Controller, the Enterprise Console attempts to start up the Controller application server and database. At first database startup, the application attempts to create the database schema, tables, and other artifacts needed by the Controller.
By default, the Enterprise Console waits 45 minutes for the Controller app server or database to start. When installing a medium or large profile Controller or into certain types of environments such as virtual machines, the time it takes to start up the system can exceed the default startup timeout period.
Controller Does Not Start Properly on Windows
Your Controller may fail to start because of the file extensions of transaction logs created by Glassfish. Excluding the Controller data directory from virus scanning, as specified on Prepare Windows for the Controller, does not cover the transaction log files found in the <AppDynamicsInstall>\Controller\appserver\glassfish\domains\domain1\logs\server\tx directory. When your antivirus software detects these extensions, such as WRY, it may mistakenly block the process from using these files, so the Controller ultimately does not start.
These transaction logs are used to recover any failed Glassfish transactions, so deleting these logs on startup is not advised. Instead, configure your virus scanners to ignore the entire Controller directory.
No Data in the Metrics Browser
This may indicate that the agents are not correctly configured. Begin troubleshooting by looking at the server.log file.
All log files for the Controller are located in the <Controller_Installation_Directory>/logs folder.
Error Message | Solution |
---|---|
Error receiving metrics (node not properly modeled yet: Could not find component for node. | This error means the app agent tried to upload metric data for a specific node, but the node does not belong to any tier. Nodes must belong to tiers and these tiers must belong to a business application in order to receive metric data for that node. See Overview of Application Monitoring. |
Received Metric Registration request for a machine that is NOT registered to any nodes. Sending back null! | This error indicates that the Controller received a registration request for metrics for a Machine Agent that listed a machine ID not yet associated with any node. Configure the Machine Agent to associate with the correct application, tier, and node. See Install the Machine Agent. |
Agent upload blocked, as its reporting a time well into the future. | The App Agents attempt to report metric data using Controller time. The agents retrieve the time from the Controller every five minutes and report times using a skew of the local machine time, if different. If for some reason the App Agent reports metrics that are time-stamped ahead of the Controller time, the Controller rejects the metrics. To avoid this event, ensure that the system times for the machine on which the Controller is running and the machines for the app agents are in synchronization. |
Multibyte Characters Are Garbled
In a Windows environment, component names that contain multibyte characters (for example, Japanese characters) may appear garbled in the Controller UI or in exported files. To fix this, add the following JVM property to the Controller JVM arguments:
-Dfile.encoding=UTF-8
Controller Shutdown Does Not Increase Free Memory on Linux
You do not generally need to be concerned about the "free memory" value, as it will always trend towards zero. The Linux kernel tries to keep its cache as large as possible. As a result, the Linux kernel does not release the memory even after process termination. The memory is freed only if it is required by another process.
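To see how much of the memory that is not "free" is actually cache the kernel can reclaim, you can compare a few fields from /proc/meminfo. The following is a minimal sketch. The function name `summarize_meminfo` is hypothetical, and it assumes a kernel that reports the `MemAvailable` field (older kernels do not, in which case the value prints as 0).

```shell
# Hypothetical helper: summarize free, cached, and available memory from
# /proc/meminfo-style input on stdin. Values in the input are in kB.
summarize_meminfo() {
  awk '
    $1 == "MemFree:"      { free = $2 }
    $1 == "Cached:"       { cached = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END {
      printf "free=%d MB cached=%d MB available=%d MB\n",
             free / 1024, cached / 1024, avail / 1024
    }
  '
}

# Usage on Linux:
#   summarize_meminfo < /proc/meminfo
```

A low `free` value alongside a large `available` value is consistent with the caching behavior described above.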
Controller Process Unexpectedly Shut Down
On Linux, memory allocation failures may cause the Controller process to be shut down unexpectedly by the Linux Out-of-Memory (OOM) Killer. The Controller log, server.log, does not provide information about the shutdown. Instead, to diagnose this event, check the system log (usually /var/log/messages) for "out of memory" entries written by the OOM killer, for example, as follows:
grep -i "Out of memory" /var/log/messages
If you encounter this log entry, make sure that you have allocated sufficient swap space on the Controller machine. AppDynamics recommends allocating a minimum of 10 GB of swap space.
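The swap recommendation above can be checked with a short script. The following is a sketch under the assumption that swap size is read from the `SwapTotal` field of /proc/meminfo. The function names `swap_total_gb` and `has_recommended_swap` are hypothetical.

```shell
# Hypothetical helpers: read /proc/meminfo-style input on stdin.

# Print the configured swap size in whole GB (SwapTotal is reported in kB).
swap_total_gb() {
  awk '$1 == "SwapTotal:" { printf "%d\n", $2 / 1024 / 1024 }'
}

# Succeed (exit status 0) only if at least 10 GB of swap is configured.
has_recommended_swap() {
  local gb
  gb=$(swap_total_gb)
  [ "${gb:-0}" -ge 10 ]
}

# Usage on Linux:
#   if has_recommended_swap < /proc/meminfo; then
#     echo "swap meets the 10 GB recommendation"
#   else
#     echo "swap is below the 10 GB recommendation"
#   fi
```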
Controller Server Swapping Too Often
If you encounter unexpected swapping on the Controller machine, you can configure how aggressively the operating system swaps by configuring the swappiness parameter. The swappiness parameter controls how often the Linux kernel moves processes out of physical memory and onto the swap disk. The default value for the parameter is usually 60. When you decrease the value, you lower the tendency of the operating system to swap. This results in less default file caching.
See the documentation for your Linux distribution for recommendations on the value for the swappiness parameter. For example, Red Hat recommends setting swappiness to 10 for CentOS and Red Hat kernels version 2.6.32-303 or later if you encounter OOM issues even though swap space is still available.
Before you configure the swappiness parameter, though, ensure that the machine has sufficient RAM and that the buffer pool size for MySQL is properly configured.
To configure swappiness:
1. Check the current value for swappiness:
   /sbin/sysctl -a | grep swappiness
2. Set the swappiness parameter. For example, run the following command to set the swappiness parameter to 10:
   echo 10 > /proc/sys/vm/swappiness
3. Set the swappiness parameter in the /etc/sysctl.conf file to the same value you used in step 2. For example, add the following line to the /etc/sysctl.conf file:
   vm.swappiness = 10
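After completing the steps above, it can be useful to confirm that the running value and the persisted value agree. The following is a minimal sketch. The function name `persisted_swappiness` is hypothetical, and it assumes the setting is stored in /etc/sysctl.conf itself rather than in a drop-in file elsewhere.

```shell
# Hypothetical helper: print the vm.swappiness value stored in a
# sysctl.conf-style file. Commented-out lines are ignored. If the setting
# appears more than once, the last occurrence is printed.
persisted_swappiness() {
  local conf_file="${1:-/etc/sysctl.conf}"
  awk -F= '
    /^[ \t]*vm\.swappiness[ \t]*=/ { gsub(/[ \t]/, "", $2); value = $2 }
    END { if (value != "") print value }
  ' "$conf_file"
}

# Usage on Linux:
#   cat /proc/sys/vm/swappiness    # value currently in effect
#   persisted_swappiness           # value that is applied at boot
```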
Could Not Determine the IP address of This Host Error During Installation
During the installation process, the Enterprise Console attempts to ping the Controller by the hostname or IP address you enter. If the ping is unsuccessful during the user input validation, the following error message appears: "Could not determine the IP address of this host. Please ensure that the IP address of the Controller host resolves to its hostname or to localhost. You may need to add an entry in the hosts file on the Controller host and retry the operation."
To make the hostname resolvable, add an entry for it to the hosts file on the machine on which you are installing the Controller. On Linux, the hosts file is typically at /etc/hosts. On Windows, look for the file at the following location, C:\Windows\System32\Drivers\etc\hosts, or the location appropriate for your version of Windows.
Add the entry in the form of the following example:
127.0.0.1 localhost myhostname
Use the IP address and hostnames appropriate for your system.
For example, the following shows the entry added as the third line of the default RedHat hosts file:
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
198.51.100.2 myhost myhost.example.org
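To confirm that the entry you added is picked up, you can look the hostname up in the hosts file before retrying the installation. The following is a sketch only. The function name `hosts_file_lookup` is hypothetical, and it inspects the file directly rather than going through the system resolver.

```shell
# Hypothetical helper: print every IP address that a hosts-style file
# maps to the given hostname. Text after "#" is treated as a comment.
hosts_file_lookup() {
  local name="$1" hosts_file="${2:-/etc/hosts}"
  awk -v name="$name" '
    { sub(/#.*/, "") }
    { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
  ' "$hosts_file"
}

# Usage on Linux:
#   hosts_file_lookup myhost
# On Linux systems with glibc, "getent hosts myhost" performs a similar
# check through the system resolver.
```

An empty result means the hosts file has no entry for that name.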
Controller Cannot Connect to the MySQL Database
The following exception message in the server.log file indicates that the Controller cannot connect to its embedded database.
*Server log exception:* "Caused by: java.net.ConnectException: Connection refused"
If you encounter this error, verify that the Controller database is running properly. On Linux, you can do so using one of the following commands:
Linux | Windows | Description |
---|---|---|
 | SysInternals Process Explorer provides a list of files opened by the process with pid 3388. | List files opened by the process with pid 3388. |
 | | List all networking ports opened by the process with pid 3388. |
 | | List all processes and check whether a process named "mysql" is active. |
If no processes are found, it indicates that the Controller database was incorrectly terminated. Start the Controller database again and check the Controller server.log file for any error messages.
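The process check described above can be scripted so that it returns a success or failure status. The following is a minimal sketch and not the command from the original table. The function name `is_process_listed` is hypothetical, and it assumes the database process name contains the string "mysql".

```shell
# Hypothetical helper: reads a list of process names on stdin and succeeds
# (exit status 0) if any of them contains the given string.
is_process_listed() {
  local name="$1"
  awk -v name="$name" '
    index($0, name) { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Usage on Linux ("comm=" prints only the command name, without a header):
#   if ps -e -o comm= | is_process_listed mysql; then
#     echo "Controller database process is running"
#   else
#     echo "no mysql process found"
#   fi
```

Listing only command names avoids the false positive that occurs when a search command matches its own entry in the process list.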
Stack Overflow Exception When Installing the Controller on Windows
This exception is usually caused by setting the -Xss option to too low a value. We recommend changing this value to 96000.
Triggering Automatic Collection of Controller Logs
Use the following console commands to trigger automatic capture of Controller log files:
On Linux, run:
bin/platform-admin.sh submit-job --platform-name test --service controller --job retrieve-log
On Windows, open an elevated command prompt (in the Windows start menu, right-click the Command Prompt icon and choose Run as Administrator) and run:
bin/platform-admin.exe cli submit-job --platform-name test --service controller --job retrieve-log
The logs will be copied to the Enterprise Console host under platform-admin/logs-controller-<platform-name>-<date-time-stamp>.zip.
See Platform Log Files to learn how to manage your Controller logs.
Collecting Troubleshooting Information for the Controller
If you open a support case for Controller troubleshooting, you can facilitate the diagnosis of the problem by providing the following information:
- Submit all platform-admin/logs/* and platform-admin/logs-controller-*.zip, in particular the server.log files. You can also use the log file utility described in Triggering Automatic Collection of Controller Logs to collect logs.
- If the Controller runs out of memory, it generates a heap dump. Submit all files in <controller_home>/appserver/glassfish/domains/domain1/config/hprof.
- Submit all <controller_home>/appserver/glassfish/domains/domain1/config/gc.log files.
- Submit information about the hardware and operating system configuration of the machine that is currently hosting the Controller, including operating system, bit version, CPU cores, clock speed, disk configuration, and RAM.
- Indicate the performance profile of the Controller. Run the Controller diagnosis command, which captures the information in platform-admin-server.log.
  On Linux, run:
  bin/platform-admin.sh submit-job --platform-name <platform_name> --job diagnosis --service controller
  On Windows, run:
  bin/platform-admin.exe cli submit-job --platform-name <platform_name> --job diagnosis --service controller
  See a sample of Controller diagnostic data on Manage a High Availability Deployment.
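Part of the hardware and operating system information listed above can be gathered with standard commands. The following is a sketch for Linux hosts. The function name `collect_host_summary` and the output field names are hypothetical, and it covers only a subset of the requested details (it does not report clock speed or the full disk configuration).

```shell
# Hypothetical helper: print a short summary of the host that runs the
# Controller, one key=value pair per line.
collect_host_summary() {
  echo "os=$(uname -sr)"                          # kernel name and release
  echo "arch=$(uname -m)"                         # for example x86_64
  echo "cpu_cores=$(getconf _NPROCESSORS_ONLN)"   # online CPU cores
  if [ -r /proc/meminfo ]; then
    awk '$1 == "MemTotal:" { printf "ram_mb=%d\n", $2 / 1024 }' /proc/meminfo
  fi
  # Size of the root file system in MB (df -k reports 1024-byte blocks)
  df -P -k / | awk 'NR == 2 { printf "root_disk_total_mb=%d\n", $2 / 1024 }'
}

# Usage:
#   collect_host_summary > host-summary.txt
```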
Issues Generating Audit Reports Immediately after Upgrading the Controller to 4.5
When the Controller upgrade is complete, audit reports may not work immediately. The audit database table is migrated only after the upgrade process completes, and the migration takes at least an hour. If you run audit reports before the migration finishes, audit table migration messages are logged in the server.log file.
No action is required. Try running the audit reports again after an hour.