Summary

Performing a full Controller upgrade, an upgrade of MySQL, or changing the database configuration in db.cnf (through the Enterprise Console) may cause mysqld to crash at shutdown, potentially resulting in data corruption and/or loss.

The crash is due to a known MySQL bug, MySQL Bug #95285, which causes mysqld_safe to restart mysqld unexpectedly. Processes that shut MySQL down expect it to stay down, and subsequent operations based on that expectation can corrupt MySQL data. For example:

  • Reboots and other service-level operations usually initiate a MySQL shutdown and then proceed after a timeout without checking that the shutdown completed cleanly

  • Enterprise Console performs log-file management operations after a shutdown and prior to a MySQL upgrade or db.cnf setting change

Affected Software

On-premises Controllers with MySQL v5.7.26 or higher (see the database.log banner)
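
As a rough aid for checking whether a Controller falls in the affected range, the sketch below compares a MySQL version string against 5.7.26. It is an illustration under stated assumptions, not an AppDynamics tool: the function names are hypothetical, the sort utility is assumed to support version sorting (-V, as in GNU coreutils), and the sample string only imitates the style of mysqld --version output.

```shell
#!/bin/sh
# Sketch: decide whether a MySQL version string falls in the affected range
# (5.7.26 or higher). Function names are illustrative, not AppDynamics tools.

# Print the first x.y.z version number found in the given text.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (an assumption about your platform).
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Succeed if the version found in the given text is 5.7.26 or higher.
is_affected() {
    found=$(extract_version "$1")
    [ -n "$found" ] && version_ge "$found" "5.7.26"
}

# Example with a made-up string in the style of `mysqld --version` output.
if is_affected "mysqld  Ver 5.7.26 for Linux on x86_64"; then
    echo "affected: apply the workaround"
else
    echo "not in the affected range"
fi
```

The same check could be fed the version line copied from the database.log banner, or the output of the mysqld binary shipped with your Controller, assuming you know where it resides.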

Workaround

You must apply the Oracle workaround, particularly if you plan to perform an upgrade.


For a Standalone Controller

Before you start this procedure, you must take the system down because a Controller restart is required. 

  1. Perform a backup of <Controller Home>/db/db.cnf, enter:

    cp <Controller Home>/db/db.cnf <Controller Home>/db/db.cnf.pre-oracle-patch
  2. Apply the configuration change that Oracle has provided to prevent the crash during shutdown:
    1. In <Controller Home>/db/db.cnf, add the internal_tmp_disk_storage_engine setting if it is not present.
    2. Set its value to MYISAM; that is, change internal_tmp_disk_storage_engine=INNODB to internal_tmp_disk_storage_engine=MYISAM.
  3. Shut down the Controller App Server, enter:

    <Controller Home>/bin/controller.sh stop-appserver 
  4. Shut down the Controller database, enter:

    <Controller Home>/bin/controller.sh stop-db
  5. Verify that the database is stopped by checking the process, enter:

    ps -ef | grep mysqld
    • No "mysqld" or "mysqld_safe" process should be running from the <Controller Home>/db directory.
    • If you have Enterprise Console running on the same server, there may be another set of "mysqld" and "mysqld_safe" processes running from your <platform>/mysql directory. You can ignore them.
    • If you still see "mysqld" or "mysqld_safe" processes running from the <Controller Home>/db directory, re-run the stop-db command until the processes are shut down.
    • Do not kill the MySQL process; doing so may cause data corruption.
  6. Add or update the configuration from the Enterprise Console UI so that the values will persist during upgrades.
    1. If the setting is not present in <Controller Home>/db/db.cnf, add it; otherwise, update its value.
    2. Change internal_tmp_disk_storage_engine=INNODB to internal_tmp_disk_storage_engine=MYISAM.
    3. Select Save to start the database and the Controller.  
  7. Update the Enterprise Console to the latest version.
  8. Upgrade the Controller. 
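
The db.cnf edit in step 2 (the same edit applies on each server of an HA pair) can be sketched as a small shell helper. This is a minimal sketch under assumptions, not an official script: set_tmp_engine is a hypothetical name, the file is assumed to contain a [mysqld] section header, and only the underscore spelling of the option is handled. Take the backup from step 1 first, and still make the change in the Enterprise Console UI as described above so that it persists across upgrades.

```shell
#!/bin/sh
# Sketch: set internal_tmp_disk_storage_engine=MYISAM in a MySQL option file.
# set_tmp_engine is a hypothetical helper, not part of the Controller tooling.

set_tmp_engine() {
    cnf=$1
    key=internal_tmp_disk_storage_engine
    tmp=$(mktemp) || return 1
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$cnf"; then
        # Setting present: rewrite its value.
        sed -E "s/^[[:space:]]*${key}[[:space:]]*=.*/${key}=MYISAM/" "$cnf" > "$tmp"
    else
        # Setting absent: add it directly below the [mysqld] section header.
        awk -v line="${key}=MYISAM" '{ print } /^\[mysqld\]/ { print line }' "$cnf" > "$tmp"
    fi
    # Overwrite the contents rather than the file, keeping owner and permissions.
    cat "$tmp" > "$cnf"
    rm -f "$tmp"
    # Report success only if the setting is now in place.
    grep -q "^${key}=MYISAM" "$cnf"
}

# Usage (the path is a placeholder for your <Controller Home>/db/db.cnf):
#   set_tmp_engine "/path/to/controller/db/db.cnf" || echo "setting not applied"
```

The helper returns a non-zero status if the setting could not be written, for instance when no [mysqld] header exists, so a caller can stop before shutting anything down.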

For an HA Controller

Before you start this procedure, you must take the system down because a Controller restart is required. 

  1. Perform a backup of <Controller Home>/db/db.cnf on both servers, enter:

    cp <Controller Home>/db/db.cnf <Controller Home>/db/db.cnf.pre-oracle-patch
  2. Apply the configuration change that Oracle has provided to prevent the crash during shutdown:
    1. In <Controller Home>/db/db.cnf, add the internal_tmp_disk_storage_engine setting if it is not present.
    2. Set its value to MYISAM; that is, change internal_tmp_disk_storage_engine=INNODB to internal_tmp_disk_storage_engine=MYISAM.
  3. Restart the secondary database. To stop the secondary database, enter:

    <Controller Home>/bin/controller.sh stop-db
  4. Verify that the secondary database is stopped by checking the process, enter:

    ps -ef | grep mysqld
    • No "mysqld" or "mysqld_safe" process should be running from the <Controller Home>/db directory.
    • If you have Enterprise Console running on the same server, there may be another set of "mysqld" and "mysqld_safe" processes running from your <platform>/mysql directory. You can ignore them.
    • If you still see "mysqld" or "mysqld_safe" processes running from the <Controller Home>/db directory, re-run the stop-db command until the processes are shut down.
    • Do not kill the MySQL process; doing so may cause data corruption.
  5. Start the secondary database, enter:

    <Controller Home>/bin/controller.sh start-db
  6. If the Watchdog is running on the secondary, stop it.
    • If you manage HA using the HA Module, then: 
      1. From the UI, stop the Watchdog.
      2. Verify the Watchdog status in the UI.
    • If you manage HA using the HA Toolkit, then enter:

      [[ -z $(pgrep -f "[w]atchdog.sh") ]] || kill -9 $(pgrep -f "[w]atchdog.sh")
      
      ps -eaf | grep "[w]atchdog.sh"

      No "HA/watchdog.sh" process should be running. To verify, enter:

      <Controller Home>/HA/appdstatus.sh
  7. On the primary, shut down the Controller App Server, enter:

    <Controller Home>/bin/controller.sh stop-appserver 
  8. Shut down the primary database, enter:

    <Controller Home>/bin/controller.sh stop-db
  9. Verify that the primary database is stopped by checking the process, enter:

    ps -ef | grep mysqld
    • No "mysqld" or "mysqld_safe" process should be running from the <Controller Home>/db directory.
    • If you have Enterprise Console running on the same server, there may be another set of "mysqld" and "mysqld_safe" processes running from your <platform>/mysql directory. You can ignore them.
    • If you still see "mysqld" or "mysqld_safe" processes running from the <Controller Home>/db directory, re-run the stop-db command until the processes are shut down.
    • Do not kill the MySQL process; doing so may cause data corruption.
  10. Add or update the configuration from the Enterprise Console UI so that the values will persist during upgrades.
    1. If the setting is not present in <Controller Home>/db/db.cnf, add it; otherwise, update its value.
    2. Change internal_tmp_disk_storage_engine=INNODB to internal_tmp_disk_storage_engine=MYISAM.
    3. Select Save to start the database and the Controller.  
  11. Update the Enterprise Console to the latest version.
  12. Upgrade the Controller. 
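
The "verify that the database is stopped" steps in both procedures can be sketched as a polling check that looks only at processes started from the Controller db directory and ignores any Enterprise Console MySQL running elsewhere. This is a sketch under assumptions: the helper names are hypothetical, the number of checks and the 2-second interval are arbitrary, and the check only reports status. It never kills a process, in line with the warning above.

```shell
#!/bin/sh
# Sketch: confirm that no mysqld or mysqld_safe process from the Controller's
# db directory is still running. Helper names are hypothetical.

# Read `ps -ef` output on stdin and print only the lines that mention mysqld
# (which also covers mysqld_safe) together with the given directory.
filter_controller_mysqld() {
    grep -F mysqld | grep -F -- "$1"
}

# Poll until no matching process remains. Never kills anything.
# $1 = Controller db directory, $2 = number of checks (default 30, 2 s apart).
wait_for_db_stop() {
    db_dir=$1
    attempts=${2:-30}
    while [ "$attempts" -gt 0 ]; do
        ps_out=$(ps -ef)
        if ! printf '%s\n' "$ps_out" | filter_controller_mysqld "$db_dir" > /dev/null; then
            return 0    # nothing left running from db_dir
        fi
        sleep 2
        attempts=$((attempts - 1))
    done
    return 1            # still running: re-run stop-db, do not kill
}

# Usage (the path is a placeholder for your <Controller Home>/db):
#   wait_for_db_stop "/path/to/controller/db" || echo "mysqld still running"
```

Filtering on the directory rather than on the process name alone is what lets the check skip a MySQL instance running from the <platform>/mysql directory on the same server.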


Resolution

AppDynamics is actively working on wrapper code for the shutdown procedure as a fix until Oracle issues a long-term solution.

Revision History

05/05/2020, v1 (initial publication of this advisory)