This page describes default metrics for Apache Kafka Backends. The Java Agent includes rules for key metrics exposed by Apache Kafka producers and consumers. To monitor JMX metrics not collected by default, you can use the MBean browser to select the Kafka JMX metric and create a rule for it.

Kafka Producer JMX Metrics

  • Response rate: the rate at which the producer receives responses from brokers
  • Request rate: the rate at which the producer sends requests to brokers
  • Request latency average: average time between the producer's execution of KafkaProducer.send() and when it receives a response from the broker
  • Outgoing byte rate: producer network throughput
  • IO wait time: average length of time the producer's I/O thread spends waiting for a socket to be ready for reads or writes
  • Record error rate: average number of record sends per second that result in errors
  • Waiting threads: number of user threads blocked waiting for buffer memory to enqueue their records
  • Requests in flight: current number of outstanding requests awaiting a response
  • Network IO rate: average number of network operations (reads or writes) per second across all connections
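
To see what the raw values behind these metrics look like, the following is a minimal sketch that reads a few of them over JMX from inside the JVM that runs the producer. It assumes the standard Kafka Java client, which registers its producer metrics under `kafka.producer:type=producer-metrics,client-id=*`; the class name `ProducerMetricsSketch` and the choice of attributes are illustrative only and are not part of the Java Agent.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class ProducerMetricsSketch {

    // Attribute names as exposed by the Kafka Java producer client (assumed).
    static final String[] ATTRIBUTES = {
        "response-rate", "request-rate", "request-latency-avg",
        "outgoing-byte-rate", "record-error-rate", "requests-in-flight"
    };

    // Reads the attributes above from every producer-metrics MBean on the
    // given server. Keys have the form "<client-id>/<attribute>".
    static Map<String, Object> readProducerMetrics(MBeanServer server) {
        ObjectName pattern;
        try {
            pattern = new ObjectName("kafka.producer:type=producer-metrics,client-id=*");
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException(e);
        }
        Map<String, Object> values = new LinkedHashMap<>();
        for (ObjectName name : server.queryNames(pattern, null)) {
            String clientId = name.getKeyProperty("client-id");
            for (String attribute : ATTRIBUTES) {
                try {
                    values.put(clientId + "/" + attribute, server.getAttribute(name, attribute));
                } catch (JMException e) {
                    // Attribute missing or MBean unregistered meanwhile: skip it.
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Map<String, Object> metrics = readProducerMetrics(server);
        if (metrics.isEmpty()) {
            System.out.println("No Kafka producer MBeans are registered in this JVM.");
        }
        metrics.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Run in a JVM without a Kafka producer, the sketch finds no matching MBeans and prints only the notice. The same object name pattern and attribute names are what you would look for in the MBean browser when creating a rule for a metric that is not collected by default.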

Kafka Consumer JMX Metrics

  • Records lag max: maximum lag, in number of records, for any partition within the measurement window
  • Bytes consumed rate: average number of bytes consumed per second
  • Fetch rate: number of fetch requests per second
  • Records consumed rate: average number of records consumed per second
  • Fetch latency max: maximum time taken for any fetch request

Kafka Server JMX Metrics

  • broker-request-total-time-ms: Total end-to-end time in milliseconds.
  • broker-request-send-response-ms: A dequeued response is sent to the remote client through non-blocking I/O. This metric indicates the time between dequeuing the response and completing the send.
  • broker-request-response-queue-ms: Responses are added to a response queue, one per network processor. The network processor dequeues each response and sends it back. This metric indicates the time a response spends waiting in that queue.
  • broker-request-remote-time-ms: If the request needs remote processing, this metric indicates the time spent in it. For example, for a producer request with acks set to -1, the request is not complete until acknowledgment is received from the followers, and the time spent waiting for the followers is counted here. A fetch request can also be delayed if there is not enough data to fetch; that time is counted here as well.
  • broker-request-processing-ms: The request is processed by the KafkaAPI. This metric indicates the time spent in processing. If this metric is high, debug the relevant request handler.
  • broker-request-queue-time-ms: The request is added to a common queue, whose items are processed by the request handler threads. The number of handler threads is configured through the num.io.threads parameter. This metric indicates the average time a request spends in this queue. If this metric is high, increase the number of handler threads.
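
Because each of these metrics covers one stage of request handling, comparing them side by side shows where a slow request spends its time. The following is a minimal sketch of that comparison; the class `RequestTimeBreakdown`, the helper `slowestStage`, and the sample values are hypothetical and serve only as an illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestTimeBreakdown {

    // Returns the name of the stage with the largest time, or null if the map is empty.
    static String slowestStage(Map<String, Double> stageMillis) {
        String slowest = null;
        double max = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : stageMillis.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                slowest = entry.getKey();
            }
        }
        return slowest;
    }

    public static void main(String[] args) {
        // Hypothetical sample values in milliseconds, not real measurements.
        Map<String, Double> stages = new LinkedHashMap<>();
        stages.put("broker-request-queue-time-ms", 42.0);
        stages.put("broker-request-processing-ms", 3.5);
        stages.put("broker-request-remote-time-ms", 8.0);
        stages.put("broker-request-response-queue-ms", 0.4);
        stages.put("broker-request-send-response-ms", 1.1);

        System.out.println("Slowest stage: " + slowestStage(stages));
    }
}
```

With these sample values the queue time dominates, which, following the description of broker-request-queue-time-ms, would point toward increasing num.io.threads rather than debugging a request handler.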