
Metrics

Info: Monitoring is changing to improve the ease of setup and usability.

The metrics surfaced within Conduktor Monitoring are detailed below.

You will have access to some metrics without any additional configuration, but you should install the agent to use Monitoring at full capacity.

| Context | Metric | Definition |
| --- | --- | --- |
| Apps Monitoring | Consumer group status | Indication of healthy or critical status based on Lag(s). Critical if the maximum Lag(s) exceeds 180 seconds. |
| Apps Monitoring | Lag messages count | Number of messages each consumer group is behind, per partition. |
| Apps Monitoring | Lag(s) | Estimated number of seconds each consumer group is behind in the topic. |
| Cluster Health | Messages count per broker (s) | Gauges how active your producers are. Because of batching and other factors, this metric will change over time. |
| Cluster Health | Messages in per broker (B/s) | The bandwidth, per broker, taken up by producers as well as by replication from partitions the broker leads in your cluster. Useful for planning well-distributed leader placement. |
| Cluster Health | Messages out per broker (B/s) | The bandwidth, per broker, used by consumers as well as by replication to the broker. Useful for planning replica and leader placement. |
| Cluster Health | Offline partitions count | Offline partitions can be caused by lingering capacity issues, crashed brokers, or cluster-wide faults. This is a critical factor in the health of your cluster because an offline partition cannot be produced to or consumed from. The view here is that of the controller: if the controller believes a partition is offline, it may not reassign it or bring a leader online. |
| Cluster Health | Under replicated partitions count | Under-replicated partitions are a risk to data durability as well as availability. They can occur for various reasons, including replicas being unable to keep up and network splits. |
| Cluster Health | Under min ISR partitions count | Partitions under the minimum ISR do not meet the durability requirements to be produced to. Producers that try to produce messages to a partition that is under the specified minimum ISR will have the messages rejected and be forced to handle the exception. |
| Cluster Health | Disk - FS usage | If a Kafka broker fills up its disk, the durability and availability of data are at risk, and producers will be unable to produce to that broker. A full broker disk is also a hard incident to recover from and often involves loss of data. |
| Cluster Health | Partitions count | Total number of partitions (including replicas) across the selected Kafka cluster. |
| Cluster Health | Active brokers count | Number of active brokers on the selected Kafka cluster. |
| Cluster Health | Active partitions count | Total number of active partitions on the selected Kafka cluster. |
| Cluster Health | Active controllers count | Total number of active controllers on the selected Kafka cluster. |
| Topic Monitoring | Messages count per topic (/s) | Number of messages produced per second, per broker, at topic granularity. |
| Topic Monitoring | Topic traffic in (B/s) | Bytes per second of messages produced, per broker, at topic granularity. |
| Topic Monitoring | Topic traffic out (B/s) | Bytes per second of messages consumed, per broker, at topic granularity. |
| Topic Monitoring | Total size of messages | Total size of messages in the topic. |
| Topic Monitoring | Duplicates count | Duplicate message count over the last N minutes, per topic. Note: this depends on Topic Analyzer being enabled. |
| Topic Monitoring | Distinct duplicates count | Number of distinct duplicated messages over the last N minutes, per topic. Note: this depends on Topic Analyzer being enabled. |
| Topic Monitoring | Transaction abort count | Number of transactions aborted, per topic. Note: this depends on Topic Analyzer being enabled. |
| Topic Monitoring | Batch size | Average size of batches produced to a topic. Note: this depends on Topic Analyzer being enabled. |
| Topic Monitoring | Messages per batch | Average number of messages produced per batch to a topic. Note: this depends on Topic Analyzer being enabled. |
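
To make the Apps Monitoring metrics above more concrete, here is a minimal Python sketch of how a per-partition lag message count and a consumer group status could be derived. The function names, the dictionary-shaped inputs, treating a missing committed offset as 0, and reading the threshold as "the largest Lag(s) across partitions exceeds 180" are all assumptions made for illustration; this is not Conduktor's actual implementation.

```python
from typing import Dict

# Threshold quoted in the metrics table: critical if max Lag(s) exceeds 180.
CRITICAL_LAG_SECONDS = 180


def lag_message_counts(
    end_offsets: Dict[int, int], committed_offsets: Dict[int, int]
) -> Dict[int, int]:
    """Messages each partition is behind: log end offset minus committed offset.

    Assumption: a partition with no committed offset is treated as offset 0.
    Negative differences are clamped to 0.
    """
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


def consumer_group_status(lag_seconds_by_partition: Dict[int, float]) -> str:
    """Return 'critical' if the largest lag in seconds exceeds the threshold."""
    max_lag = max(lag_seconds_by_partition.values(), default=0.0)
    return "critical" if max_lag > CRITICAL_LAG_SECONDS else "healthy"


if __name__ == "__main__":
    # Hypothetical offsets for a two-partition topic.
    print(lag_message_counts({0: 1500, 1: 900}, {0: 1200, 1: 900}))
    # Hypothetical per-partition time lag, in seconds.
    print(consumer_group_status({0: 42.0, 1: 181.5}))
```

In this sketch a single slow partition is enough to mark the whole group as critical, which matches the "max" wording of the threshold in the table.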