This page describes how to monitor your Kubernetes entities.

Monitor Kubernetes Entities

After you install Kubernetes and App Service Monitoring, you can monitor your entities:

  1. Log in to the Cisco Cloud Observability UI.
  2. On the Observe page, navigate to the Kubernetes domain.
    This domain contains links to entity-centric pages, which are UI pages that display everything of relevance (e.g., metrics, metadata, health status, events, logs, relationships) for a given entity.
  3. Click an entity name to monitor your applications. For more information on the entities that can be monitored, see:

Retention and Purge Time-to-Live (TTL)

The following table lists the retention TTL and purge TTL for all Kubernetes entities.

| Entity | Retention TTL | Purge TTL |
|---|---|---|
| Clusters | 180 minutes (3 hours) | 525,600 minutes (365 days) |
| Namespaces | 180 minutes (3 hours) | 525,600 minutes (365 days) |
| Workloads | 180 minutes (3 hours) | 525,600 minutes (365 days) |
| Pods | 180 minutes (3 hours) | 30,240 minutes (21 days) |
| Containers | 180 minutes (3 hours) | 30,240 minutes (21 days) |
| Persistent Volume Claims (PVCs) | 180 minutes (3 hours) | 525,600 minutes (365 days) |
| Ingresses | 180 minutes (3 hours) | 525,600 minutes (365 days) |
| Configurations | 180 minutes (3 hours) | 525,600 minutes (365 days) |
| Autoscalers | 180 minutes (3 hours) | 525,600 minutes (365 days) |
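
The TTL values are expressed in minutes, so it can help to cross-check them against larger units. The following is a minimal sketch of that arithmetic only; the helper name `describe_minutes` is hypothetical and is not part of the product.

```python
def describe_minutes(total_minutes: int) -> str:
    """Express a duration given in minutes as days, hours, and minutes."""
    days, rest = divmod(total_minutes, 24 * 60)  # 1,440 minutes per day
    hours, minutes = divmod(rest, 60)
    parts = [
        f"{value} {unit}{'' if value == 1 else 's'}"
        for value, unit in ((days, "day"), (hours, "hour"), (minutes, "minute"))
        if value
    ]
    return " ".join(parts) or "0 minutes"


# The three TTL values that appear in the table above.
for ttl in (180, 30_240, 525_600):
    print(f"{ttl:,} minutes = {describe_minutes(ttl)}")
```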

Predefined Health Rules for Kubernetes Entities

The Kubernetes and App Service Monitoring solution provides the following predefined health rules. You can use these health rules as they are for a particular entity, or modify them as custom rules based on your requirements.


| Health Rule Name | Description | Enabled by Default | Available for Entities |
|---|---|---|---|
| K8s CronJob | Determines if a cronjob has failed. | Yes | Managed Job, Cronjob, Namespace |
| K8s Pod Restart Count | Triggers a warning alert if the pod restart count in one minute is greater than 0. Violations are counted over the last 10 minutes. Triggers a critical alert if the total pod restart count in the last 10 minutes is greater than 4. | Yes | Pod |
| K8s Ingress Error Count Above Baseline | Triggers when the error count of an ingress crosses the baseline. | Yes | Namespace, Cluster, Ingress |
| K8s Node Resource Memory, PID, Disk Pressure | Determines if a cluster is unhealthy. If 10% of the nodes in a cluster are under pressure (exceed the limit) for memory, process ID, and disk space, the cluster is marked as unhealthy. | No | Cluster |
| K8s Pod Resource Usage vs Limits | Determines if the CPU or memory usage is too high compared to the limits. | No | Pod, Workload, Namespace, Cluster |
| K8s Deployment Running vs Desired | Determines if enough pods are running in a deployment. | No | Namespace, Cluster, Workload |
| K8s Statefulset Running vs Desired | Determines if enough pods are running in a statefulset. | No | Namespace, Cluster, Workload |
| K8s Daemonset Running vs Desired | Determines if enough pods are running in a daemonset. | No | Namespace, Cluster, Workload |
| K8s Replicaset Running vs Desired | Determines if enough pods are running in a replicaset. | No | Namespace, Cluster, Workload |
| K8s Replication Controller Running vs Desired | Determines if enough pods are running in a replication controller. | No | Namespace, Cluster, Workload |

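To make the thresholds of two of these rules concrete, here is a minimal sketch of how they could be evaluated. It is one reading of the descriptions, not the product's implementation: the function names, the status labels `normal`, `warning`, and `critical`, and the interpretation of "10% of nodes" as "at least 10%" are all assumptions.

```python
from typing import Sequence


def pod_restart_severity(restarts_per_minute: Sequence[int]) -> str:
    """Classify a pod from its per-minute restart counts (oldest first).

    Assumed reading of K8s Pod Restart Count:
      - critical if the total over the last 10 minutes is greater than 4
      - warning if any single minute in that window has more than 0 restarts
    """
    window = list(restarts_per_minute)[-10:]  # only the last 10 minutes count
    if sum(window) > 4:
        return "critical"
    if any(count > 0 for count in window):
        return "warning"
    return "normal"


def cluster_under_node_pressure(nodes_under_pressure: int, total_nodes: int) -> bool:
    """Return True if at least 10% of the nodes are under pressure (assumed reading)."""
    if total_nodes <= 0:
        return False
    # Integer arithmetic avoids floating-point rounding at the 10% boundary.
    return nodes_under_pressure * 10 >= total_nodes


print(pod_restart_severity([0, 0, 1, 0, 0, 0, 0, 0, 0, 0]))  # one restart in the window
print(pod_restart_severity([1, 1, 1, 1, 1, 0, 0, 0, 0, 0]))  # five restarts in the window
print(cluster_under_node_pressure(2, 30))
```
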
  • If a health rule is violated continuously, the next alert is triggered after 30 minutes.
  • If you increase the time range beyond one hour (Last 1 hour), the Pods Running Status Sum metric in the following predefined health rules shows more spikes on the graph because of the longer time range:
    • K8s Deployment Running vs Desired
    • K8s Daemonset Running vs Desired
    • K8s Replicaset Running vs Desired
    • K8s Statefulset Running vs Desired
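
The 30-minute interval between alerts for a continuously violated rule can be pictured as a simple suppression check. This is a sketch under assumptions, not the product's alerting logic: `should_alert` and `REALERT_INTERVAL` are hypothetical names, and the sketch assumes that the interval is measured from the previous alert.

```python
from datetime import datetime, timedelta
from typing import Optional

# Interval stated in the note above; the constant name is hypothetical.
REALERT_INTERVAL = timedelta(minutes=30)


def should_alert(violating: bool, now: datetime, last_alert: Optional[datetime]) -> bool:
    """Decide whether a new alert is due for a rule that may still be violated."""
    if not violating:
        return False  # nothing to report
    if last_alert is None:
        return True  # first alert for this violation
    # Suppress repeats until 30 minutes have passed since the previous alert.
    return now - last_alert >= REALERT_INTERVAL


first = datetime(2024, 1, 1, 12, 0)
print(should_alert(True, first, None))
print(should_alert(True, first + timedelta(minutes=10), first))
print(should_alert(True, first + timedelta(minutes=30), first))
```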

For information about using Health Rules, see Health Rules.

Kubernetes Health Rollup Path

The health rollup allows you to monitor a group of entities. You can define the health rollup relationship between the Kubernetes entities and the parent entities. For more information, see Health Rollup.

When you select a Kubernetes entity type while creating a health rule, the UI suggests health rollup paths based on the selected entity. Select the rollup path that meets your requirements. The rule applies to the child entity and rolls up to the parent entity. In the UI, unhealthy Kubernetes entities appear in the list view of the respective entity page. For some Kubernetes entity types, however, you must view the details on a specific entity page. The following table lists these entity types and the entity pages where you can observe the details:

| Entity Type | Displayed on Entity Pages | Description |
|---|---|---|
| Nodes | Clusters | Any health rule created on the metrics of a Kubernetes node entity can be rolled up and displayed on the cluster entity page. |
| Unmanaged Replicaset, Unmanaged Job | Workloads | To create health rules for an unmanaged replicaset or an unmanaged job, select the entity type replicaset or job, respectively. For health rules defined on pod entities that are deployed by an unmanaged job or an unmanaged replica set, you can use the following health rollup paths: Pod>unmanaged replica set>namespace>cluster and Pod>unmanaged job>namespace>cluster |
| deployment, statefulset, daemonset, cronjob | Workloads | Workload represents all Kubernetes workloads, such as deployment, statefulset, daemonset, cronjob, unmanaged replicaset, and unmanaged job. You can select the health rollup path Pod>Workload>namespace>cluster. If you need a health rollup path for a specific sub-type of workload, choose the rollup path of that sub-type from the list. |
| ConfigMap, Secret | Configurations | Configuration represents all Kubernetes configurations, that is, K8s ConfigMap and Secret. You can select the health rollup path Configuration>namespace>cluster |
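
A rollup path such as Pod>Workload>namespace>cluster reads from child (left) to parent (right). The sketch below illustrates one plausible way health could propagate along such a path, where each parent shows the worst status found at or below it. The worst-of rule, the status labels, and the function names are assumptions made for illustration; the source states only that the rule applies to the child entity and rolls up to the parent entity.

```python
from typing import Dict, List

# Hypothetical status labels, ordered from least to most severe.
SEVERITY = {"normal": 0, "warning": 1, "critical": 2}


def parse_rollup_path(path: str) -> List[str]:
    """Split 'Pod>Workload>namespace>cluster' into lowercase levels, child first."""
    return [level.strip().lower() for level in path.split(">")]


def roll_up(path: str, own_health: Dict[str, str]) -> Dict[str, str]:
    """Return the effective health per level, assuming parents inherit the worst child status."""
    effective: Dict[str, str] = {}
    worst = "normal"
    for level in parse_rollup_path(path):  # walk from child to parent
        own = own_health.get(level, "normal")
        if SEVERITY[own] > SEVERITY[worst]:
            worst = own
        effective[level] = worst
    return effective


# A critical pod makes every parent on the path critical under this assumption.
print(roll_up("Pod>Workload>namespace>cluster", {"pod": "critical"}))
```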

OpenTelemetry™ and Kubernetes® (as applicable) are trademarks of The Linux Foundation®.