Cisco Cloud Observability ingests metrics in OpenTelemetry™ Line Protocol (OTLP) format from your OpenTelemetry-compatible agents or collectors, converts them into the Splunk AppDynamics metrics model, enriches them with derived data, and stores them in the Cisco Cloud Observability metrics datastore in Splunk AppDynamics metrics model format. The Splunk AppDynamics metrics model is similar to the OpenTelemetry data model for metrics, with important differences. This page describes the Splunk AppDynamics metrics model and explains how it relates to the OTLP format.

For help with individual metrics you might see in Cisco Cloud Observability, see Understand the Observe UI.

This document contains references to third-party documentation. Splunk AppDynamics does not own any rights and assumes no responsibility for the accuracy or completeness of such third-party documentation.

What is a Metric?

A metric is a numerical measurement, sampled over a specific timeframe and typically with a fixed frequency, such as TargetConnectionErrorCount. Splunk AppDynamics metrics are registered by a domain or feature in a type system. A metric has properties, a content type, and a category.

Terminology

Term | Definition
Measurement event

The act of recording one metric. For example, the act of recording the request latency of entity <N>. A measurement event is associated with exactly one timestamp.

Metric category

The default consumption function used when no consumption function is supplied in a query. The metric category defines how Cisco Cloud Observability computes a metric's value from a given content type. See AppDynamics Metric Categories.

Metric data point

A summary or aggregation of multiple numerical measurements, typically taken over a specific time range and at a fixed frequency. For example, a request duration is reported for the last minute. 

Metric timeseries

A series of metric data points having the same entity ID, metric type, source, and a unique set of attributes. For entities, see Understand the Observe UI. For sources, see Observe UI Overview.

Cisco Cloud Observability displays metric timeseries as graphs in which time ranges are represented as (startTime, granularity), whereas OpenTelemetry represents time ranges as (startTime, endTime). The two representations are interchangeable. On graphs, Cisco Cloud Observability associates each metric data point with a single timestamp, startTime.
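The interchangeability of the two time-range representations can be illustrated with a minimal Python sketch. The class and function names are hypothetical and are not part of any Splunk AppDynamics or OpenTelemetry API. The sketch assumes times are expressed in Unix seconds and that granularity is the difference between endTime and startTime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OtelTimeRange:
    """OpenTelemetry-style range: (startTime, endTime), in Unix seconds."""
    start_time: int
    end_time: int


@dataclass(frozen=True)
class GranularityTimeRange:
    """Range expressed as (startTime, granularity), in Unix seconds."""
    start_time: int
    granularity: int


def to_granularity_range(r: OtelTimeRange) -> GranularityTimeRange:
    # granularity is the length of the interval: endTime - startTime
    return GranularityTimeRange(r.start_time, r.end_time - r.start_time)


def to_otel_range(r: GranularityTimeRange) -> OtelTimeRange:
    # endTime is recovered by adding the granularity back to startTime
    return OtelTimeRange(r.start_time, r.start_time + r.granularity)


otel = OtelTimeRange(start_time=1_700_000_000, end_time=1_700_000_060)
converted = to_granularity_range(otel)   # one-minute granularity
round_trip = to_otel_range(converted)    # identical to the original range
```

Because the conversion is lossless in both directions, either form can be used to describe the same data point.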

Metric type

A unique way to identify what the metric corresponds to. It consists of the name of the metric, its content type, data type, and so on. For example, calls or https.response.size.

Time aggregation

An aggregation of measurement events or metric data points within the same timeseries. There are two types of time aggregations:

    • Measurement events are converted into metric data points. Typically, this conversion is done on the client side, but it can also be done on the server side.

    • Metric data points are aggregated into fewer metric data points. Typically, this aggregation is done on the server side, but it can also be done on the client side.

Space aggregation

An aggregation of metric data points having the same time ranges from multiple metric timeseries.

The OpenTelemetry Data Model for Metrics

The OpenTelemetry data model for metrics specifies data formats and protocols for the import, transportation, and export of metrics. It includes the OpenTelemetry Line Protocol (OTLP) and the OpenTelemetry Timeseries Model.

OpenTelemetry Line Protocol

The OpenTelemetry Line Protocol (OTLP) defines how the OpenTelemetry metric stream is encoded and transported over gRPC or HTTP 1.1 to an OpenTelemetry timeseries store. Each metric stream is identified by its name, attributes, originating resource, and OTLP type (point kind). There can be more than one metric stream per instrument in the event model.

Splunk AppDynamics supports the following OTLP types. Each OTLP type maps to a specific aggregation function or functions.

OTLP Type (Point Kind) | Description | Aggregation Function | Monotonic Supported? | Supported Aggregation Temporalities
Sum | The sum of all measurement event values. | sum() | Yes | Delta
Gauge | A sampled value at a given time. Gauges do not provide an aggregation semantic. Instead, they provide a "last sample value". For this reason, the startTime is not meaningful for gauges; instead, it is a point event associated with endTimestamp, unlike the other OTLP types above. See the gauge definition in the metrics protobuf. | latest() | No | -
Summary | Splunk AppDynamics supports this OTLP type only when p0 and p100 are provided along with sum and count. | - | - | -

OpenTelemetry Timeseries Model

The OpenTelemetry Timeseries Model specifies how OpenTelemetry backends store metrics; in other words, it defines the at-rest format of metrics at their destination.

The Splunk AppDynamics Metrics Model

Pre-ingest, Ingest, and Post-ingest Granularity

  • Pre-ingest metric granularity depends on the data source. The granularity of sampling can vary depending on the collector's configuration or the data source.
  • Supported ingest granularities are one minute and five minutes. Granularities that fall within a threshold range (±3s) of the defined ingest granularities are also acceptable. At ingest time, the Cisco Observability Platform aggregates metrics collected at sub-minute granularities into one-minute granularities by default. Cisco Observability Platform schemas define metrics and ingest granularities for all registered entity types.
  • Post-ingest, Cisco Cloud Observability may aggregate metrics into higher granularities and compute roll-ups and summaries both at an entity's relationship level and at an entity's attribute level. 
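As a rough illustration of the ±3s threshold rule, the following Python sketch checks whether a reported granularity matches one of the supported ingest granularities. The function name and constants are hypothetical. The sketch assumes the threshold is inclusive at its boundaries, which this page does not state, and it does not model the default aggregation of sub-minute data into one-minute granularities.

```python
from typing import Optional

SUPPORTED_GRANULARITIES_S = (60, 300)  # one minute and five minutes
TOLERANCE_S = 3                        # the ±3s threshold range


def match_ingest_granularity(reported_s: float) -> Optional[int]:
    """Return the supported granularity that reported_s falls within
    tolerance of, or None if it matches neither.

    Assumption: the boundary values (for example, exactly 57s or 63s)
    are treated as acceptable.
    """
    for granularity in SUPPORTED_GRANULARITIES_S:
        if abs(reported_s - granularity) <= TOLERANCE_S:
            return granularity
    return None


one_minute = match_ingest_granularity(58)    # within 3s of 60
five_minutes = match_ingest_granularity(303) # within 3s of 300
unsupported = match_ingest_granularity(120)  # matches neither
```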

Metric Retention

Cisco Cloud Observability retains one-minute aggregations for eight days, and one-hour aggregations for 367 days. 

Metric Content Types

The Splunk AppDynamics metrics model defines the Sum, Distribution, and Gauge metric content types. The following table explains how Splunk AppDynamics maps OTLP types to Splunk AppDynamics metric content types.

OTLP Type: Gauge
Splunk AppDynamics Metric Content Type: Gauge
Description:
  • Same as OTLP type.
  • Can be long or double type depending on an entity's metricTypes attribute.
  • The startTime attribute is mandatory, unlike in the OTLP Gauge type.
Examples:
  • system.cpu.utilization
  • system.memory.utilization
  • Room Temperature
Fields:
  • current
  • groupCount

OTLP Type: Sum
Splunk AppDynamics Metric Content Type: Sum
Description:
  • Same as OTLP type.
  • Can be long or double type depending on an entity's metricTypes attribute.
Examples:
  • The monetary value of transactions
  • The number of requests
  • system.paging.faults
  • system.cpu.time
  • Net profit value of stocks
  • Current Queue Size
  • Active Requests
  • system.memory.usage
  • system.paging.usage
  • Heap size
  • Memory Buffer sizes
Fields:
  • sum
  • groupCount: Number of base entities participating in space aggregation. Default value is 1.

OTLP Type: Summary, when p0 and p100 are provided along with sum and count.
Splunk AppDynamics Metric Content Type: Distribution
Description:
  • Captures sum, min, max, and count. Useful for getting averages.
  • Can be long or double type depending on an entity's metricTypes attribute.
  • All fields are mandatory.
  • Is a superset of Sum, so all Sum use cases can be addressed using Distribution. However, Distribution is costlier in processing and storage. Do not use Distribution unless required.
Examples:
  • http.server.duration
  • rpc.client.request.size
Fields:
  • sum
  • groupCount
  • count
  • min
  • max
Additional rules for conversion from OpenTelemetry Summary to Splunk AppDynamics Distribution:
  • OpenTelemetry Summary supports only double values. If the MetricType.type is defined as Long, then the Summary.sum double value is rounded to the Distribution.sum long value. The same applies to the p0 and p100 values. This may result in a loss of precision. It is the responsibility of the domain or agent not to use the fractional part of the OpenTelemetry Summary when declaring the MetricType.type as long. If fractional parts need to be preserved, the distribution should be of the type double.
  • If the summary reports quantiles other than p0 and p100, they are ignored during the conversion.
Metric Categories 

The Splunk AppDynamics metrics model introduces the concept of metric categories. Metric categories do not exist in the OpenTelemetry data model for metrics. When you look at metric graphs on Cisco Cloud Observability, you see a single value for each timestamp even though some Splunk AppDynamics metric content types can have multiple values. This single value is a calculation based on the metric category. The metric category is a consumption function, that is, a mathematical function that defines how that single value is calculated. Therefore, the best practice is to assign a metric category to each metric in your OpenTelemetry collector's configuration.

The Splunk AppDynamics metric model defines the following metric categories:

  • AVERAGE
  • CURRENT
  • CURRENT_PER_INSTRUMENTED_ENTITY
  • RATE_PER_MIN
  • RATE_PER_SEC
  • SUM
  • SUM_PER_INSTRUMENTED_ENTITY

The following table lists each Splunk AppDynamics metric category, the mathematical formula that Splunk AppDynamics uses to calculate a single value to display, and the metric content types that Splunk AppDynamics can assign to that metric category. For example, Splunk AppDynamics assigns metrics of category CURRENT to content type Gauge.

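Splunk AppDynamics Metric Category: AVERAGE
Description: Mathematical average
Mathematical Formula: sum / count
Allowed Metric Content Types: Distribution
Sample Usage: For a metric request-latency using content type Distribution, and sending latencies of 100 requests:
  • sum = 320
  • count = 100
  • AVERAGE = 320/100 = 3.2 seconds

Splunk AppDynamics Metric Category: CURRENT
Description: The current value
Mathematical Formula: current
Allowed Metric Content Types: Gauge
Sample Usage: -

Splunk AppDynamics Metric Category: CURRENT_PER_INSTRUMENTED_ENTITY
Description: Average in spatial dimension
Mathematical Formula: current / groupCount
Allowed Metric Content Types: Gauge
Sample Usage: For a metric system.cpu.utilization reported from 2 nodes as 10%, 20%:
  • current = 30 (see Space Aggregation for Gauge)
  • groupCount = 2
  • CURRENT_PER_INSTRUMENTED_ENTITY = 30/2 = 15%

Splunk AppDynamics Metric Category: RATE_PER_MIN
Description: Rate of change per minute
Mathematical Formula: sum / granularity (in minutes), where granularity = endTime - startTime
Allowed Metric Content Types: Sum, Distribution, Gauge
Sample Usage: For a metric Number-Of-Requests with content type Sum, one call every second, and an agent reporting every 30 seconds, RATE_PER_MIN and RATE_PER_SEC are:

timestamp | sum | RATE_PER_MIN | RATE_PER_SEC
60s | 30 | 30/0.5 = 60 | 30/30 = 1
120s | 30 | 30/0.5 = 60 | 30/30 = 1

Splunk AppDynamics Metric Category: RATE_PER_SEC
Description: Rate of change per second
Mathematical Formula: Same as RATE_PER_MIN, but granularity, endTime, and startTime are in seconds.
Allowed Metric Content Types: Sum, Distribution, Gauge
Sample Usage: See the RATE_PER_MIN example.

Splunk AppDynamics Metric Category: SUM
Description: Mathematical sum
Mathematical Formula: sum
Allowed Metric Content Types: Sum, Distribution
Sample Usage: -

Splunk AppDynamics Metric Category: SUM_PER_INSTRUMENTED_ENTITY
Description: Average in spatial dimension
Mathematical Formula: sum / groupCount
Allowed Metric Content Types: Sum, Distribution
Sample Usage: -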
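The category formulas above can be reproduced with a short Python sketch. The DataPoint class and category_value function are hypothetical names used only for illustration. The sketch assumes times are in seconds, applies the rate formulas to the sum field only (this page does not say which field a Gauge contributes to a rate), and assumes that the SUM category returns the sum field.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    """One stored data point; fields that do not apply keep their defaults."""
    start_time_s: int
    end_time_s: int
    sum: float = 0.0
    count: int = 0
    current: float = 0.0
    group_count: int = 1  # default groupCount is 1


def category_value(category: str, p: DataPoint) -> float:
    """Compute the single displayed value for a data point."""
    granularity_s = p.end_time_s - p.start_time_s  # endTime - startTime
    if category == "AVERAGE":
        return p.sum / p.count
    if category == "CURRENT":
        return p.current
    if category == "CURRENT_PER_INSTRUMENTED_ENTITY":
        return p.current / p.group_count
    if category == "RATE_PER_MIN":
        return p.sum / (granularity_s / 60)  # granularity in minutes
    if category == "RATE_PER_SEC":
        return p.sum / granularity_s         # granularity in seconds
    if category == "SUM":
        return p.sum                         # assumption: the sum field
    if category == "SUM_PER_INSTRUMENTED_ENTITY":
        return p.sum / p.group_count
    raise ValueError(f"unknown metric category: {category}")


# The sample usages from the table:
latency = DataPoint(0, 60, sum=320, count=100)     # AVERAGE -> 3.2
cpu = DataPoint(0, 60, current=30, group_count=2)  # per entity -> 15.0
requests = DataPoint(30, 60, sum=30)               # 30-second report

average = category_value("AVERAGE", latency)
per_entity = category_value("CURRENT_PER_INSTRUMENTED_ENTITY", cpu)
per_min = category_value("RATE_PER_MIN", requests)
per_sec = category_value("RATE_PER_SEC", requests)
```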

Consumption Functions

A consumption function is a mathematical function that can show a different view or aggregation of metric data. For example, a max function gives the maximum value of all metric data points in the given time range. A consumption function is similar to a metric category, except that it can be supplied dynamically at query time. In other words, a query can either rely on the default value derived from the metric category or supply its own consumption function. This is useful when you want to override the metric category and apply a different mathematical function for a query.

Consumption Function | Description or Underlying Formula | Allowed Content Types
min | min | Distribution
max | max | Distribution
p | Percentile. Any percentile value can be queried from the underlying digest summary using this consumption function. Example: p99.98 | Histogram
count | count | Distribution
groupCount | Number of base entities participating in a space aggregation. | Sum, Distribution, Gauge
stdDev | Standard deviation | Sum, Distribution, Gauge
sumCumulative | Latest sum value of cumulative metrics | Sum, Distribution
value | A reference to the metric category's underlying function. VALUE = CATEGORY function | Sum, Distribution, Gauge
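The relationship between the value consumption function and the metric category can be sketched in Python as follows. This is not the query API. The consume function, the dictionary layout, and the subset of functions covered are hypothetical. The sketch assumes the data point has already been aggregated to the queried time range, and it omits p, stdDev, and sumCumulative because they need data that a single Distribution point does not carry.

```python
from typing import Callable, Dict

Point = Dict[str, float]

# A few category formulas, keyed by metric category name.
CATEGORY_FORMULAS: Dict[str, Callable[[Point], float]] = {
    "AVERAGE": lambda p: p["sum"] / p["count"],
    "SUM": lambda p: p["sum"],
    "SUM_PER_INSTRUMENTED_ENTITY": lambda p: p["sum"] / p["groupCount"],
}

# Consumption functions that simply read a stored field.
FIELD_FUNCTIONS = ("min", "max", "count", "groupCount")


def consume(point: Point, category: str, function: str = "value") -> float:
    """Evaluate a consumption function; 'value' falls back to the category."""
    if function == "value":
        # VALUE = CATEGORY function
        return CATEGORY_FORMULAS[category](point)
    if function in FIELD_FUNCTIONS:
        # A query-time override ignores the metric category.
        return point[function]
    raise ValueError(f"unsupported consumption function in this sketch: {function}")


latency = {"sum": 320, "count": 100, "min": 1, "max": 9, "groupCount": 1}
default_view = consume(latency, "AVERAGE")      # category default
max_view = consume(latency, "AVERAGE", "max")   # override at query time
```

The same data point therefore yields the category's default (the average) when no function is supplied and a different view (the maximum) when the query overrides it.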

Splunk AppDynamics Metric Aggregations

Space Aggregations

Cisco Cloud Observability does not support space aggregations if the timeseries have different aggregation temporalities, such as Delta and Cumulative timeseries.

Splunk AppDynamics Metric Content Type | Aggregation Temporality | Space Aggregations | Space Aggregated Type
Sum | Delta | sum = sum(sums); groupCount = sum(groupCounts) | Sum
Distribution | Delta | sum = sum(sums); groupCount = sum(groupCounts); count = sum(counts); min = min(mins); max = max(maxes) | Distribution
Gauge | Not applicable | current = sum(currents); groupCount = sum(groupCounts) | Gauge
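The space aggregation rules can be expressed as a minimal Python sketch. The class and function names are hypothetical. The inputs are assumed to be data points with the same time range taken from different timeseries, as in the definition of space aggregation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Distribution:
    sum: float
    count: int
    min: float
    max: float
    group_count: int = 1


@dataclass
class Gauge:
    current: float
    group_count: int = 1


def space_aggregate_distributions(points: Sequence[Distribution]) -> Distribution:
    """Combine same-time-range points from several timeseries."""
    return Distribution(
        sum=sum(p.sum for p in points),
        count=sum(p.count for p in points),
        min=min(p.min for p in points),
        max=max(p.max for p in points),
        group_count=sum(p.group_count for p in points),  # entities add up
    )


def space_aggregate_gauges(points: Sequence[Gauge]) -> Gauge:
    return Gauge(
        current=sum(p.current for p in points),
        group_count=sum(p.group_count for p in points),
    )


# Two nodes reporting system.cpu.utilization as 10% and 20%.
cpu = space_aggregate_gauges([Gauge(current=10), Gauge(current=20)])
per_entity = cpu.current / cpu.group_count  # CURRENT_PER_INSTRUMENTED_ENTITY

merged = space_aggregate_distributions([
    Distribution(sum=100, count=40, min=1, max=7),
    Distribution(sum=220, count=60, min=0.5, max=9),
])
```

Summing the Gauge values is what makes CURRENT_PER_INSTRUMENTED_ENTITY meaningful: dividing the summed current by groupCount recovers the per-entity average.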

Time Aggregations

Cisco Cloud Observability converts all Cumulative metrics into Delta before storing them. We handle resets and gaps in Cumulative metrics as follows:

  • We detect resets and gaps based on the StartTimeUnixNano present in the metric packet, which we compare with the StartTimeUnixNano of the previously received metric packet. If the current StartTimeUnixNano is greater than the previous one, we treat it as a reset. In addition, for monotonically increasing metrics, if the current metric value is less than the previous metric value, we treat it as a reset.
  • We treat drops in the continuous data flow of metric data as gaps. The first metric we receive after a gap is not stored in our backend but is instead used as the new reference point to calculate future Delta metrics.
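The reset and gap rules above can be sketched in Python as follows. This is an illustration under several assumptions, not the platform's implementation. The class names and the max_gap_ns threshold are hypothetical, because this page does not define how long a drop must last to count as a gap. The sketch also assumes that the very first point serves only as a reference, and that after a reset the new cumulative value is emitted as the delta; this page does not state what is stored after a reset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CumulativePoint:
    start_time_unix_nano: int
    time_unix_nano: int
    value: float


class CumulativeToDelta:
    def __init__(self, max_gap_ns: int, monotonic: bool = True):
        self.max_gap_ns = max_gap_ns  # hypothetical gap threshold
        self.monotonic = monotonic
        self.prev: Optional[CumulativePoint] = None

    def convert(self, p: CumulativePoint) -> Optional[float]:
        """Return the delta to store, or None if the point is only a reference."""
        prev, self.prev = self.prev, p
        if prev is None:
            return None  # assumption: first point is a reference only
        if p.time_unix_nano - prev.time_unix_nano > self.max_gap_ns:
            return None  # gap: not stored, becomes the new reference
        is_reset = (
            p.start_time_unix_nano > prev.start_time_unix_nano
            or (self.monotonic and p.value < prev.value)
        )
        if is_reset:
            return p.value  # assumption: counter restarted from zero
        return p.value - prev.value


NS = 1_000_000_000
converter = CumulativeToDelta(max_gap_ns=90 * NS)
stream = [
    CumulativePoint(0, 60 * NS, 10.0),          # first point: reference
    CumulativePoint(0, 120 * NS, 25.0),         # delta 15
    CumulativePoint(0, 180 * NS, 40.0),         # delta 15
    CumulativePoint(0, 600 * NS, 100.0),        # after a gap: reference
    CumulativePoint(0, 660 * NS, 110.0),        # delta 10
    CumulativePoint(700 * NS, 720 * NS, 5.0),   # later start time: reset
]
deltas = [converter.convert(p) for p in stream]
```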

Splunk AppDynamics Metric Content Type | Aggregation Temporality | Time Aggregations | Time Aggregated Type
Sum | Delta | sum = sum(sums); groupCount = max(groupCounts) | Sum
Distribution | Delta | sum = sum(sums); groupCount = max(groupCounts); count = sum(counts); min = min(mins); max = max(maxes) | Distribution
Gauge | Not applicable | current = latest(currents); groupCount = latest(groupCounts) | Gauge
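The time aggregation rules differ from the space aggregation rules mainly in how groupCount is combined: the maximum is taken for Sum and Distribution, and the latest value for Gauge. A minimal Python sketch follows. The names are hypothetical, and the input sequences are assumed to belong to one timeseries and to be ordered from oldest to newest.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Distribution:
    sum: float
    count: int
    min: float
    max: float
    group_count: int = 1


@dataclass
class Gauge:
    current: float
    group_count: int = 1


def time_aggregate_distributions(points: Sequence[Distribution]) -> Distribution:
    """Roll consecutive points of one timeseries into a coarser point."""
    return Distribution(
        sum=sum(p.sum for p in points),
        count=sum(p.count for p in points),
        min=min(p.min for p in points),
        max=max(p.max for p in points),
        group_count=max(p.group_count for p in points),  # max, not sum
    )


def time_aggregate_gauges(points: Sequence[Gauge]) -> Gauge:
    # latest(): the newest point wins; assumes oldest-to-newest ordering.
    latest = points[-1]
    return Gauge(current=latest.current, group_count=latest.group_count)


rolled_up = time_aggregate_distributions([
    Distribution(sum=100, count=40, min=1, max=7, group_count=2),
    Distribution(sum=220, count=60, min=0.5, max=9, group_count=3),
])
latest_gauge = time_aggregate_gauges([Gauge(10, 2), Gauge(12, 2), Gauge(11, 3)])
```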

OpenTelemetry™ and Kubernetes® (as applicable) are trademarks of The Linux Foundation®.