Default JMX Metrics for Apache Kafka Backends

This page describes default metrics for Apache Kafka Backends. The Java Agent includes rules for key metrics exposed by Apache Kafka producers and consumers. To monitor JMX metrics not collected by default, you can use the MBean browser to select the Kafka JMX metric and create a rule for it.

Kafka Producer JMX Metrics

Response rate: the rate at which the producer receives responses from brokers
Request rate: the rate at which producers send request data to brokers
Request latency average: average time between the producer's execution of KafkaProducer.send() and when it receives a response from the broker
Outgoing byte rate: producer network throughput
IO wait time: average time the producer's I/O thread spends waiting for a socket to become ready for reads or writes
Record error rate: average number of record sends per second that result in errors
Waiting threads: number of user threads blocked waiting for buffer memory to enqueue their records
Requests in flight: current number of outstanding requests awaiting a response
Network IO rate: average number of network operations (reads or writes) per second across all connections
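
The producer metrics above can also be read directly over JMX, which may help when checking which attribute to select in the MBean browser. The following is a minimal sketch, not the Java Agent's own implementation. It assumes the producer runs in the same JVM and registers its MBeans under the Apache Kafka client naming scheme (kafka.producer:type=producer-metrics,client-id=...). The class name KafkaProducerJmxReader, the MBean pattern, and the attribute names are assumptions that may differ between Kafka client versions.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper class, for illustration only.
final class KafkaProducerJmxReader {
    // Assumed MBean pattern; the wildcard matches every producer client-id.
    static final String PRODUCER_PATTERN =
            "kafka.producer:type=producer-metrics,client-id=*";

    // Assumed attribute names corresponding to the metrics listed above.
    static final List<String> PRODUCER_ATTRIBUTES = List.of(
            "response-rate", "request-rate", "request-latency-avg",
            "outgoing-byte-rate", "io-wait-time-ns-avg", "record-error-rate",
            "waiting-threads", "requests-in-flight", "network-io-rate");

    // Reads the given attributes from every MBean that matches the pattern.
    // Keys have the form "<canonical MBean name>/<attribute>".
    static Map<String, Object> read(MBeanServer server, String pattern,
                                    List<String> attributes) {
        ObjectName query;
        try {
            query = new ObjectName(pattern);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid MBean pattern: " + pattern, e);
        }
        Map<String, Object> values = new LinkedHashMap<>();
        for (ObjectName name : server.queryNames(query, null)) {
            for (String attribute : attributes) {
                try {
                    values.put(name.getCanonicalName() + "/" + attribute,
                            server.getAttribute(name, attribute));
                } catch (JMException e) {
                    // Attribute not exposed by this MBean or client version: skip it.
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Prints nothing when no Kafka producer is registered in this JVM.
        read(server, PRODUCER_PATTERN, PRODUCER_ATTRIBUTES)
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The same read method should work for consumer MBeans if it is given a consumer pattern and attribute list. Those names would likewise need to be confirmed in the MBean browser.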

Kafka Consumer JMX Metrics

Records lag max: maximum lag in terms of number of records for any partition within the timeframe
Bytes consumed rate: average number of bytes consumed per second
Fetch rate: number of fetch requests per second
Records consumed rate: average number of records consumed per second
Fetch latency max: maximum time taken for any fetch request

Kafka Server JMX Metrics

broker-request-total-time-ms: Total end-to-end time, in milliseconds, that the broker takes to serve a request.
broker-request-send-response-ms: Dequeued responses are sent to the remote client through non-blocking I/O. This metric indicates the time between dequeuing a response and completing the send.
broker-request-response-queue-ms: Responses are also added to a queue, with one queue per network processor. The network processor dequeues each response and sends it back. This metric indicates the time a response spends waiting in that queue.
broker-request-remote-time-ms: If the request needs remote processing, this metric indicates the time spent in that processing. For example, for a producer request with acks set to -1, the request is not completed until acknowledgments are received from the followers. The time spent waiting for the followers is indicated by this metric. A fetch request can also be delayed if there is not enough data to fetch, and that time is also included in this metric.
broker-request-processing-ms: The request is then processed by the KafkaAPI. This metric indicates the time spent in that processing. If this metric is high, debug the relevant request handler.
broker-request-queue-time-ms: The request is added to a common queue. The items in the queue are processed by the request handler threads, whose number can be configured through the num.io.threads parameter. This metric indicates the average time a request spends in this queue. If this metric is high, increase the number of handler threads.
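
The per-stage metrics above can be compared to see where a slow request spends most of its time. The following is a minimal sketch under stated assumptions. The class BrokerRequestBreakdown and its methods are hypothetical, the sample values in main are made up for illustration, and the hints only restate the guidance given on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class BrokerRequestBreakdown {

    // Returns the name of the stage with the largest time, or null for an empty map.
    // On a tie, the stage that appears first in iteration order wins.
    static String slowestStage(Map<String, Double> stageMillis) {
        String slowest = null;
        double max = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : stageMillis.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                slowest = entry.getKey();
            }
        }
        return slowest;
    }

    // Maps a stage to the tuning hint stated on this page, where one exists.
    static String hint(String stage) {
        if (stage == null) {
            return "no data";
        }
        switch (stage) {
            case "broker-request-queue-time-ms":
                return "increase the handler threads (num.io.threads)";
            case "broker-request-processing-ms":
                return "debug the relevant request handler";
            case "broker-request-remote-time-ms":
                return "check follower acknowledgments (acks=-1) or fetches waiting for data";
            default:
                return "no tuning hint given on this page";
        }
    }

    public static void main(String[] args) {
        // Made-up sample values in milliseconds, not real measurements.
        Map<String, Double> sample = new LinkedHashMap<>();
        sample.put("broker-request-queue-time-ms", 42.0);
        sample.put("broker-request-processing-ms", 3.5);
        sample.put("broker-request-remote-time-ms", 12.0);
        sample.put("broker-request-response-queue-ms", 0.4);
        sample.put("broker-request-send-response-ms", 0.9);

        String stage = slowestStage(sample);
        System.out.println(stage + ": " + hint(stage));
    }
}
```

In this sample the queue time dominates, so the sketch points to num.io.threads. A real check would read the values from the broker's JMX metrics over a sustained period rather than from a single reading.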