GCP Dataflow is a fully managed, scalable data processing service for executing batch, stream, and ETL processing patterns.

You must configure cloud connections to monitor this entity. See Configure Google Cloud Platform Connection.

Cisco Cloud Observability displays GCP entities on the Observe page. Metrics are displayed for specific entity instances in the list and detail views.

This document contains references to third-party documentation. Splunk AppDynamics does not own any rights and assumes no responsibility for the accuracy or completeness of such third-party documentation.

Detail View 

To display the detail view of a GCP Dataflow job:

  1. Navigate to the Observe page.
  2. Under App Integrations, click GCP Dataflow Jobs.
    The list view displays.
  3. From the list, click a Name to display the detail view.
    The detail view displays metrics, key performance indicators, and properties (attributes) related to the instance you selected.

Metrics and Key Performance Indicators 

Cisco Cloud Observability displays the following metrics and key performance indicators (KPIs) for GCP Dataflow jobs.

Some GCP metrics have been modified in Cisco Cloud Observability, so metric display names and descriptions may differ from those of the source metrics.

| Display Name | Source Metric Name | Description |
|---|---|---|
| Errors (Binary) | job/is_failed | Indicates whether this job has failed. |
| Elapsed Time in Running State (s) | job/elapsed_time | The duration that the current run of this pipeline has been in the Running state so far, in seconds. |
| System Lag (s) | job/system_lag | The current maximum duration that an item of data has been processing or awaiting processing, in seconds. |
| Data Watermark Age (s) | job/data_watermark_age | The age (time since event timestamp) up to which all data has been processed by the pipeline. |
| Data Processed (By) | job/billable_shuffle_data_processed | The billable bytes of shuffle data processed by this Dataflow job. |
| Data Processed (By) | job/total_shuffle_data_processed | The total bytes of shuffle data processed by this Dataflow job. |
| Data Processed (By) | job/total_streaming_data_processed | The total bytes of streaming data processed by this Dataflow job. |
| Elements (Count) | job/element_count | The number of elements added to the PCollection so far. |
| Estimated Bytes (By) | job/estimated_byte_count | An estimated number of bytes added to the PCollection so far. |
| Backlog Elements (Count) | job/backlog_elements | The amount of known, unprocessed input for a stage, in elements. |
| Backlog Bytes (By) | job/backlog_bytes | The amount of known, unprocessed input for a stage, in bytes. |
| Current vCPUs (Count) | job/current_num_vcpus | The number of vCPUs currently being used by this Dataflow job. |
| vCPU Usage (s) | job/total_vcpu_time | The total vCPU-seconds used by this Dataflow job. |
| Memory Capacity (By) | job/memory_capacity | The amount of memory currently allocated to all workers associated with this Dataflow job. |
| Memory Usage (GBy.s) | job/total_memory_usage_time | The total GB-seconds of memory allocated to this Dataflow job. |
| Disk Capacity (By) | job/disk_space_capacity | The amount of persistent disk currently allocated to all workers associated with this Dataflow job. |
| Disk Usage (GBy.s) | job/total_pd_usage_time | The total GB-seconds of persistent disk used by all workers associated with this Dataflow job. |
| GPU Utilization (%) | job/gpu_memory_utilization, job/gpu_utilization | Percent of time over the past sample period during which global (device) memory was being read or written. |
| Worker Instances | job/max_worker_instances_limit | The maximum number of workers that autoscaling is allowed to request. |
| Worker Instances | job/min_worker_instances_limit | The minimum number of workers that autoscaling is allowed to request. |
| Current Shuffle Slots (Count) | job/current_shuffle_slots | The number of shuffle slots currently used by this Dataflow job. |
| User Counter | job/user_counter | A user-defined counter metric. |
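To cross-check any of these values against the source data, you can query the underlying metric directly from the Cloud Monitoring API. The following is a minimal sketch, assuming the google-cloud-monitoring Python client, Application Default Credentials, and a placeholder project ID; it reads the last hour of dataflow.googleapis.com/job/system_lag, the source metric behind System Lag (s).

```python
import time

from google.cloud import monitoring_v3

# Placeholder project ID -- replace with your own GCP project.
PROJECT_ID = "my-gcp-project"

client = monitoring_v3.MetricServiceClient()

# Query the last hour of data.
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        "end_time": {"seconds": now},
        "start_time": {"seconds": now - 3600},
    }
)

results = client.list_time_series(
    request={
        "name": f"projects/{PROJECT_ID}",
        "filter": 'metric.type = "dataflow.googleapis.com/job/system_lag"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

# Each returned time series corresponds to one Dataflow job.
for series in results:
    if not series.points:
        continue
    job_name = series.resource.labels.get("job_name", "<unknown>")
    latest = series.points[0]  # points are returned newest first
    # job/system_lag is reported as an integer gauge, in seconds.
    print(f"{job_name}: system lag = {latest.value.int64_value}s")
```

To narrow the results to a single job, extend the filter with a clause such as `resource.labels.job_name = "<job-name>"`.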

Properties (Attributes)

Cisco Cloud Observability displays the following properties for GCP Dataflow jobs.

| Display Name | Source Property Name | Description |
|---|---|---|
| ID | - | The ID of the Dataflow job. |
| Name | name | The user-specified Cloud Dataflow job name. |
| Numeric ID | - | The numeric ID of the Dataflow job. |
| Project ID | - | The ID of the GCP project. |
| Region | - | The geographical region in which the resource is running. |
| Type | type | The type of Cloud Dataflow job. |
| Current State | currentState | The current state of the job. |
| Current State Time | currentStateTime | The timestamp associated with the current state. |
| Replaced By Job ID | replacedByJobId | If another job is an update of this job, the ID of that job. |
| Start Time | startTime | The timestamp when the job started running. |
| Service Account Email | serviceAccountEmail | The identity to run virtual machines as. Defaults to the default account. |
| Worker Region | workerRegion | The Compute Engine region in which worker processing occurs. |
| Worker Zone | workerZone | The Compute Engine zone in which worker processing occurs. |
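Most of these properties correspond to fields on the Job resource returned by the Dataflow API. The following is a minimal sketch, assuming the google-api-python-client library with Application Default Credentials; the project, region, and job IDs are placeholders. It fetches one job and prints the same fields.

```python
from googleapiclient import discovery

# Placeholder identifiers -- replace with your own values.
PROJECT_ID = "my-gcp-project"
REGION = "us-central1"
JOB_ID = "2024-01-01_00_00_00-1234567890123456789"

# discovery.build picks up Application Default Credentials.
dataflow = discovery.build("dataflow", "v1b3")

job = (
    dataflow.projects()
    .locations()
    .jobs()
    .get(projectId=PROJECT_ID, location=REGION, jobId=JOB_ID)
    .execute()
)

# Top-level fields match the source property names in the table above.
for field in ("name", "type", "currentState", "currentStateTime", "startTime"):
    print(f"{field}: {job.get(field)}")

# serviceAccountEmail, workerRegion, and workerZone, when set, are nested
# under the job's "environment" object.
env = job.get("environment", {})
for field in ("serviceAccountEmail", "workerRegion", "workerZone"):
    print(f"{field}: {env.get(field)}")
```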

Retention and Purge Time-To-Live (TTL)

For all cloud and infrastructure entities, the retention TTL is 180 minutes (3 hours) and the purge TTL is 525,600 minutes (365 days). 

Third party names, logos, marks, and general references used in these materials are the property of their respective owners or their affiliates in the United States and/or other countries. Inclusion of such references are for informational purposes only and are not intended to promote or otherwise suggest a relationship between Splunk AppDynamics and the third party.