This page describes how to validate and troubleshoot Cluster Agent auto-instrumentation.

Validate Auto-Instrumentation

  1. After applying auto-instrumentation, confirm that the application is reporting to the Controller using the application name assigned based on the name strategy you chose. See Auto-Instrument Applications with the Cluster Agent.
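
    For reference, the name strategy is controlled in the Cluster Agent spec. A minimal sketch, assuming your Cluster Agent version supports the spec-level appNameStrategy and defaultAppName properties, and using a hypothetical application name:

    apiVersion: cluster.appdynamics.com/v1alpha1
    kind: Clusteragent
    metadata:
      name: k8s-cluster-agent
      namespace: appdynamics
    spec:
      # content removed for brevity
      appNameStrategy: manual   # or namespace, label, expression
      defaultAppName: my-app    # hypothetical name, used with the manual strategy
    YML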

  2. Verify that the application pods targeted for auto-instrumentation have been recreated. As auto-instrumentation is being applied, use kubectl get pods with the -w flag to print updates to pod status in real time. A successfully auto-instrumented application shows that the app pod was recreated, and that the correct init container was applied and copied the App Server Agent to the application container.

    kubectl -n <app ns> get pods -w
    NAME                          READY   STATUS            RESTARTS   AGE
    dotnet-app-85c7d66557-s8hzm   1/1     Running           0          3m11s
    dotnet-app-6c45b6d4f-fpp75    0/1     Pending           0          0s
    dotnet-app-6c45b6d4f-fpp75    0/1     Init:0/1          0          0s
    dotnet-app-6c45b6d4f-fpp75    0/1     PodInitializing   0          5s
    dotnet-app-6c45b6d4f-fpp75    1/1     Running           0          6s
    dotnet-app-85c7d66557-s8hzm   1/1     Terminating       0          20m
    BASH

    You can also use kubectl get pods -o yaml to check whether the application spec was updated to include the init container, and to view the state of the init container:

    kubectl -n <app-ns> get pod <app-pod> -o yaml
     ...
     initContainers:
      - command:
        - cp
        - -r
        - /opt/appdynamics/.
        - /opt/appdynamics-java
        image: docker.io/appdynamics/java-agent:latest
      ...
      initContainerStatuses:
      - containerID: docker://8bb892f322e5a043866d038631392a2272b143e54c8c431b3590312729043eb9
        image: appdynamics/java-agent:20.9.0
        imageID: docker-pullable://appdynamics/java-agent@sha256:077ac1c4f761914c1742f22b2f591a37a127713e3e96968e9e570747f7ba6134
        ...
        state:
          terminated:
            containerID: docker://8bb892f322e5a043866d038631392a2272b143e54c8c431b3590312729043eb9
            exitCode: 0
            finishedAt: "2021-02-03T22:39:25Z"
            reason: Completed
    YML

    The pod annotation APPD_POD_INSTRUMENTATION_STATE may display failed for Node.js and .NET Core (Linux) applications even when the instrumentation is successful.
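
    To check the annotation yourself, you can read it directly from the pod metadata (a minimal sketch using standard kubectl jsonpath syntax):

    kubectl -n <app-ns> get pod <app-pod> -o jsonpath='{.metadata.annotations.APPD_POD_INSTRUMENTATION_STATE}'
    BASH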

Troubleshoot Auto-Instrumentation When Not Applied

If the application pod was not recreated, and an init container was not applied, there may be an issue with the namespace and application matching rules used by the Cluster Agent.

  1. First, confirm that the Cluster Agent is using the latest auto-instrumentation configuration.

    kubectl -n appdynamics get cm instrumentation-config -o yaml 
    CODE

    If the YAML output does not reflect the latest auto-instrumentation configuration, then validate the Cluster Agent YAML configuration, and apply or upgrade the Cluster Agent:

    kubectl apply -f cluster-agent.yaml
    BASH
    helm upgrade -f ./ca1-values.yaml "<my-cluster-agent-helm-release>" appdynamics-charts/cluster-agent --namespace appdynamics
    BASH
  2. If the pods are still not recreated, then enable DEBUG logging in the Cluster Agent configuration:

    cluster-agent.yaml

    apiVersion: cluster.appdynamics.com/v1alpha1
    kind: Clusteragent
    metadata:
      name: k8s-cluster-agent
      namespace: appdynamics
    spec:
      # content removed for brevity
      logLevel: DEBUG
    YML

    ca1-values.yaml

    clusterAgent:
      # content removed for brevity
      logProperties:
        logLevel: DEBUG
    YML
  3. Apply or upgrade the Cluster Agent to update the DEBUG logging configuration:

    kubectl apply -f cluster-agent.yaml
    BASH
    helm upgrade -f ./ca1-values.yaml "<my-cluster-agent-helm-release>" appdynamics-charts/cluster-agent --namespace appdynamics
    BASH
  4. After you update the Cluster Agent logging configuration, tail the logs and search for messages indicating that the Cluster Agent has detected the application and considers it in scope for auto-instrumentation:

    # apply grep filter to see instrumentation related messages:
    kubectl -n appdynamics logs <cluster-agent-pod> -f | grep -E 'instrumentationconfig.go|deploymenthandler.go'
    BASH

    However, if output similar to this example displays for the application you want to auto-instrument, then the Cluster Agent does not consider the application in scope for auto-instrumentation:

    [DEBUG]: 2021-03-30 21:22:09 - instrumentationconfig.go:660 - No matching rule found for Deployment dotnet-app in namespace stage with labels map[appName:jah-stage framework:dotnetcore]
    [DEBUG]: 2021-03-30 21:22:09 - instrumentationconfig.go:249 - Instrumentation state for Deployment dotnet-app in namespace stage with labels map[appName:jah-stage framework:dotnetcore] is false
    BASH

    Review your auto-instrumentation configuration to determine any necessary updates and save the changes. Then, delete and re-create the Cluster Agent as described previously. For example, if you overrode the namespace configuration in an instrumentationRule using namespaceRegex, be sure to use a value that is included in nsToInstrumentRegex (dev in the example):

    apiVersion: cluster.appdynamics.com/v1alpha1
    kind: Clusteragent
    metadata:
      name: k8s-cluster-agent
      namespace: appdynamics
    spec:
      # content removed for brevity
      nsToInstrumentRegex: stage|dev
      instrumentationRules:
        - namespaceRegex: dev
    
    YML

    After you update the configuration and the Cluster Agent has been recreated, your output should be similar to this example, indicating that the Cluster Agent recognizes the application as in scope for auto-instrumentation:

    [DEBUG]: 2021-03-30 21:22:10 - instrumentationconfig.go:645 - rule stage matches Deployment spring-boot-multicontainer in namespace stage with labels map[appName:jah-stage acme.com/framework:java]
    [DEBUG]: 2021-03-30 21:22:10 - instrumentationconfig.go:656 - Found a matching rule {stage  map[acme.com/framework:[java]]   java select .*  JAVA_TOOL_OPTIONS map[agent-mount-path:/opt/appdynamics image:docker.io/appdynamics/java-agent:21.3.0 image-pull-policy:IfNotPresent] map[bci-enabled:true port:3892] 0 0 appName  0 false []} for Deployment spring-boot-multicontainer in namespace stage with labels map[appName:jah-stage acme.com/framework:java]
    [DEBUG]: 2021-03-30 21:22:10 - instrumentationconfig.go:249 - Instrumentation state for Deployment spring-boot-multicontainer in namespace stage with labels map[appName:jah-stage acme.com/framework:java] is true
    [DEBUG]: 2021-03-30 21:22:10 - deploymenthandler.go:312 - Added instrument task to queue stage/spring-boot-multicontainer
    BASH

Troubleshoot Auto-Instrumentation When It Fails to Complete

If the application pod was recreated but the instrumented application is not reporting to the Controller, then auto-instrumentation did not complete successfully, or the App Server Agent failed to register with the Controller.

  1. If the application pods are being restarted but the init container is not completing successfully, check for events in the application namespace using the Cluster Agent Events dashboard or kubectl get events. Common issues that may prevent the init container from completing include image pull failures and resource quota issues (a filtered-events sketch follows the command below).

    kubectl -n <app ns> get events
    BASH
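
    To narrow the output to likely failures, you can filter for warnings and sort by time (standard kubectl flags; a minimal sketch):

    kubectl -n <app ns> get events --field-selector type=Warning --sort-by=.lastTimestamp
    BASH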
  2. Verify that the application pod spec was updated with an init container, and check whether the status information includes any errors (a jsonpath shortcut follows the command below):

    kubectl -n <app-ns> get pod <app-pod> -o yaml
    
    BASH
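
    As a quicker check, you can print just the init containers and their state with jsonpath (a minimal sketch using standard kubectl syntax):

    kubectl -n <app-ns> get pod <app-pod> -o jsonpath='{.spec.initContainers[*].name}'
    kubectl -n <app-ns> get pod <app-pod> -o jsonpath='{.status.initContainerStatuses[*].state}'
    BASH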
  3. Start a shell inside the application container to confirm the auto-instrumentation results. Check that the App Server Agent environment variables are set correctly and that the logs are free of errors.

    kubectl -n <app-ns> exec -it <app-pod> -- /bin/sh
    # Issue these commands in the container
    env | grep APPD
    
    # For Java app
    env | grep JAVA_TOOL_OPTIONS # or value of defaultEnv
    ls /opt/appdynamics-java
    cat /opt/appdynamics-java/ver<version>/logs/<node-name>/agent.<date>.log
    
    # For .NET Core app
    ls /opt/appdynamics-dotnetcore
    cat /tmp/appd/dotnet/1_agent_0.log
    
    # For Node.js app
    ls /opt/appdynamics-nodejs
    cat /tmp/appd/<id>/appd_node_agent_<date>.log
    BASH
  4. If you are instrumenting a Node.js or .NET Core application, and the auto-instrumentation files have been copied to the app container, but the App Server Agent logs do not exist, or errors about binary incompatibilities display, then you may have a mismatch between the app image and the App Server Agent image (referenced in the image property of the auto-instrumentation configuration).
    For .NET Core, the image OS versions must match (Linux, for example).
    For Node.js, both the image OS versions and the Node.js versions must match. To confirm, check the image tags, the Dockerfiles, or exec into the app container (a quick image comparison follows these commands):

    kubectl -n <app-ns> exec -it <app-pod> -- /bin/sh
    cat /etc/os-release   # (or /etc/issue)
    NAME="Alpine Linux"... 
    # for Node.js app
    node -v
    CODE
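
    To compare the images without inspecting Dockerfiles, you can print the agent (init container) and application images side by side (a minimal sketch):

    kubectl -n <app-ns> get pod <app-pod> -o jsonpath='{.spec.initContainers[*].image}'
    kubectl -n <app-ns> get pod <app-pod> -o jsonpath='{.spec.containers[*].image}'
    BASH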

Troubleshoot Re-instrumentation Issues for an Upgraded Deployment

  • When you edit the deployment specification using helm upgrade, the Cluster Agent does not re-instrument the upgraded deployment. If your changes overwrite specification properties that the Cluster Agent applied, instrumentation stops working.
    This issue occurs most often with Java instrumentation: the Java Agent consumes its system properties through the JAVA_TOOL_OPTIONS environment variable, so a helm upgrade that overrides JAVA_TOOL_OPTIONS in the deployment specification causes instrumentation to fail. A sketch for checking the variable follows this list.
  • The Cluster Agent uses annotations to check the instrumentation state of the deployment. The annotations on the deployment are not updated when the deployment tool or script uses the edit or patch Kubernetes APIs to edit the deployment specification. Because the instrumentation state appears unchanged, the Cluster Agent does not re-instrument the deployment.
    You can work around this issue by setting the APPD_POD_INSTRUMENTATION_STATE annotation on the deployment spec to APPD_POD_INSTRUMENTATION_STATE: Failed (a sketch follows this list).
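
To check whether JAVA_TOOL_OPTIONS survived a helm upgrade, you can read it from the deployment's pod template (a minimal sketch using standard kubectl jsonpath; <app-deployment> is a placeholder for your deployment name):

    kubectl -n <app-ns> get deployment <app-deployment> -o jsonpath='{.spec.template.spec.containers[*].env[?(@.name=="JAVA_TOOL_OPTIONS")].value}'
    BASH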
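
A minimal sketch of the annotation workaround, assuming the Cluster Agent reads the annotation from the deployment's pod template (adjust the patch path if your Cluster Agent version reads it from the deployment metadata instead):

    # set the instrumentation state annotation to Failed to force re-instrumentation
    kubectl -n <app-ns> patch deployment <app-deployment> --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"APPD_POD_INSTRUMENTATION_STATE":"Failed"}}}}}'
    BASH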