Kubernetes Entities

This page describes how to monitor your Kubernetes entities.

Monitor Kubernetes Entities

After you install Kubernetes and App Service Monitoring, you can monitor your entities:

Log in to the Cisco Cloud Observability UI.
On the Observe page, navigate to the Kubernetes domain.
This domain contains links to entity-centric pages, which are UI pages that display everything of relevance (e.g., metrics, metadata, health status, events, logs, relationships) for a given entity.
Click an entity name to monitor your applications. For more information on the entities that can be monitored, see:

Clusters

Namespaces

Workloads

Pods

Containers

Persistent Volume Claims

Ingresses

Configurations

Autoscalers

Retention and Purge Time-to-Live (TTL)

The following table lists the retention TTL and purge TTL for all Kubernetes entities.

Entity	Retention TTL	Purge TTL
Clusters	180 minutes (3 hours)	525,600 minutes (365 days)
Namespaces	180 minutes (3 hours)	525,600 minutes (365 days)
Workloads	180 minutes (3 hours)	525,600 minutes (365 days)
Pods	180 minutes (3 hours)	30,240 minutes (21 days)
Containers	180 minutes (3 hours)	30,240 minutes (21 days)
Persistent Volume Claims (PVCs)	180 minutes (3 hours)	525,600 minutes (365 days)
Ingresses	180 minutes (3 hours)	525,600 minutes (365 days)
Configurations	180 minutes (3 hours)	525,600 minutes (365 days)
Autoscalers	180 minutes (3 hours)	525,600 minutes (365 days)
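
The TTL values in the table are expressed in minutes. As a quick way to check the unit conversions, the following minimal Python sketch renders a minute count in the largest whole unit. The helper name format_ttl is hypothetical and is not part of Cisco Cloud Observability.

```python
MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * MINUTES_PER_HOUR


def format_ttl(minutes: int) -> str:
    """Render a TTL given in minutes using the largest whole unit."""
    if minutes % MINUTES_PER_DAY == 0:
        return f"{minutes // MINUTES_PER_DAY} days"
    if minutes % MINUTES_PER_HOUR == 0:
        return f"{minutes // MINUTES_PER_HOUR} hours"
    return f"{minutes} minutes"


# The three distinct minute values that appear in the table.
for ttl in (180, 30_240, 525_600):
    print(f"{ttl} minutes = {format_ttl(ttl)}")
```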

Predefined Health Rules for Kubernetes Entities

The Kubernetes and App Service Monitoring solution provides the following predefined health rules. You can apply these health rules to a particular entity or modify them as custom rules based on your requirements.

Health Rule Name	Description	Enabled by Default	Available for Entities
K8s CronJob	Determines if a cronjob has failed.	Yes	Managed Job, Cronjob, Namespace
K8s Pod Restart Count	Triggers a warning alert if the pod restart count in a minute is greater than 0; violations are counted over the last 10 minutes. Triggers a critical alert if the total pod restart count in the last 10 minutes is greater than 4.	Yes	Pod
K8s Ingress Error Count Above Baseline	Triggers when the error count of an ingress crosses the baseline.	Yes	Namespace, Cluster, Ingress
K8s Node Resource Memory, PID, Disk Pressure	Determines if a cluster is unhealthy. If 10% of the nodes in a cluster are under pressure (exceed the limit) for memory, process ID (PID), and disk space, the cluster is marked as unhealthy.	No	Cluster
K8s Pod Resource Usage vs Limits	Determines if the CPU or memory usage is too high compared to the limits.	No	Pod, Workload, Namespace, Cluster
K8s Deployment Running vs Desired	Determines if enough pods are running in a deployment.	No	Namespace, Cluster, Workload
K8s Statefulset Running vs Desired	Determines if enough pods are running in a statefulset.	No	Namespace, Cluster, Workload
K8s Daemonset Running vs Desired	Determines if enough pods are running in a daemonset.	No	Namespace, Cluster, Workload
K8s Replicaset Running vs Desired	Determines if enough pods are running in a replicaset.	No	Namespace, Cluster, Workload
K8s Replication Controller Running vs Desired	Determines if enough pods are running in a replication controller.	No	Namespace, Cluster, Workload
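
To make the thresholds of two of these rules concrete, the following Python sketch shows one possible reading of K8s Pod Restart Count and K8s Node Resource Memory, PID, Disk Pressure. It is an illustration under stated assumptions, not the product's implementation: the function names are hypothetical, the input is assumed to be one restart count per minute with the newest sample last, and "10% of nodes" is assumed to mean "at least 10%".

```python
from typing import Sequence


def pod_restart_severity(restarts_per_minute: Sequence[int]) -> str:
    """Classify a pod from per-minute restart counts (newest sample last).

    Assumed reading of the rule: only the last 10 minutes are evaluated;
    more than 4 restarts in total is critical, and any minute with more
    than 0 restarts is a warning.
    """
    window = list(restarts_per_minute)[-10:]
    if sum(window) > 4:
        return "critical"
    if any(count > 0 for count in window):
        return "warning"
    return "normal"


def cluster_is_unhealthy(nodes_under_pressure: int, total_nodes: int,
                         threshold_percent: int = 10) -> bool:
    """Return True if at least threshold_percent of nodes are under pressure.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    if total_nodes <= 0:
        return False
    return nodes_under_pressure * 100 >= total_nodes * threshold_percent


print(pod_restart_severity([0, 0, 1, 0, 0, 0, 0, 0, 0, 0]))
print(pod_restart_severity([1, 1, 1, 1, 1, 0, 0, 0, 0, 0]))
print(cluster_is_unhealthy(nodes_under_pressure=2, total_nodes=20))
```

In this sketch, the first pod is classified as a warning (one restart in the window), the second as critical (five restarts in the window), and a 20-node cluster with two nodes under pressure is treated as unhealthy.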

If a health rule is violated continuously, the next alert is triggered after 30 minutes.
If you increase the time range beyond one hour (Last 1 hour), the Pods Running Status Sum metric in the following predefined health rules shows more spikes on the graph because of the longer time range:
- K8s Deployment Running vs Desired
- K8s Daemonset Running vs Desired
- K8s Replicaset Running vs Desired
- K8s Statefulset Running vs Desired

For information about using Health Rules, see Health Rules.

Kubernetes Health Rollup Path

The health rollup allows you to monitor a group of entities. You can define the health rollup relationship between the Kubernetes entities and the parent entities. For more information, see Health Rollup.

When you select a Kubernetes entity type while creating a health rule, the UI suggests health rollup paths based on the selected entity. Select the rollup path that meets your requirements. The rule applies to the child entity and rolls up to the parent entity.

In the UI, you can observe unhealthy Kubernetes entities in the list view of the respective entity page. To observe some Kubernetes entity types, however, you must view the details on a specific entity page. The following table lists these entity types and the entity pages where you can observe the details:

Entity Type	Displayed on Entity Pages	Description
Nodes	Clusters	Any health rule created on the metrics of a Kubernetes node entity can be rolled up and displayed on the cluster entity page.
Unmanaged Replicaset, Unmanaged Job	Workloads	To create health rules for an unmanaged replicaset or an unmanaged job, select the entity type replicaset or job, respectively. For health rules defined on pod entities that are deployed by an unmanaged replicaset or an unmanaged job, you can use the following health rollup paths: Pod>unmanaged replica set>namespace>cluster and Pod>unmanaged job>namespace>cluster.
Deployment, Statefulset, Daemonset, Cronjob	Workloads	Workload represents all Kubernetes workloads, such as deployment, statefulset, daemonset, cronjob, unmanaged replicaset, and unmanaged job. You can select the following health rollup path: Pod>Workload>namespace>cluster. To create a health rollup path for a specific sub-type of workload, choose the rollup path of that sub-type from the list.
ConfigMap, Secret	Configurations	Configuration represents all Kubernetes configurations, that is, ConfigMap and Secret. You can select the following health rollup path: Configuration>namespace>cluster.
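
The following Python sketch illustrates the idea of a rollup path such as Pod>Workload>namespace>cluster. It assumes that a parent shows the most severe status among its own status and the statuses of its descendants; that aggregation policy, the entity names, and the function roll_up are assumptions made for illustration and are not taken from the product.

```python
from typing import Dict

# Assumed severity ordering; a higher number is more severe.
SEVERITY = {"normal": 0, "warning": 1, "critical": 2}


def roll_up(parent_of: Dict[str, str],
            own_health: Dict[str, str]) -> Dict[str, str]:
    """Return the effective health of each entity after rollup.

    parent_of maps a child entity to its parent along the rollup path.
    own_health holds the status produced by rules evaluated on an entity.
    Each status is propagated to every ancestor, keeping the worst one.
    """
    effective = dict(own_health)
    for entity, status in own_health.items():
        ancestor = parent_of.get(entity)
        while ancestor is not None:
            current = effective.setdefault(ancestor, "normal")
            if SEVERITY[status] > SEVERITY[current]:
                effective[ancestor] = status
            ancestor = parent_of.get(ancestor)
    return effective


# Hypothetical entities that follow Pod>Workload>namespace>cluster.
parent_of = {
    "pod/checkout-1": "workload/checkout",
    "pod/cart-1": "workload/cart",
    "workload/checkout": "namespace/shop",
    "workload/cart": "namespace/shop",
    "namespace/shop": "cluster/prod",
}
own_health = {"pod/checkout-1": "critical", "pod/cart-1": "normal"}

for name, status in sorted(roll_up(parent_of, own_health).items()):
    print(name, status)
```

Under these assumptions, the critical pod marks its workload, namespace, and cluster as critical, while the sibling workload with only healthy pods stays normal.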

OpenTelemetry™ and Kubernetes® (as applicable) are trademarks of The Linux Foundation®.