Kafka producer partitioners determine which partition receives each message, affecting load distribution, ordering guarantees, and overall system performance.

How partitioning works

When a producer sends a message, the partition is chosen as follows:
  1. Partition specification: An explicit partition number in the ProducerRecord always takes priority
  2. Message key: If present, the partitioner typically uses it to determine the assignment
  3. Partitioner logic: Otherwise the default or custom partitioning algorithm decides (sticky or round-robin for keyless messages)
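These three cases correspond to the common ProducerRecord constructors; a brief illustration (topic name, keys, and values are placeholders):
// Explicit partition: bypasses the partitioner entirely
ProducerRecord<String, String> explicit =
    new ProducerRecord<>("orders", 3, "order-42", "payload");

// Keyed record: the partitioner hashes the key to pick a partition
ProducerRecord<String, String> keyed =
    new ProducerRecord<>("orders", "order-42", "payload");

// Keyless record: the partitioner's keyless strategy (sticky or round-robin) applies
ProducerRecord<String, String> keyless =
    new ProducerRecord<>("orders", "payload");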
[Diagram: Kafka producer partitioning]
Partition assignment impact: Partitioning affects message ordering, consumer distribution, load balancing, and data locality. Choose your partitioning strategy carefully based on your use case requirements.

Default partitioner behavior

With message keys

When messages have keys, the default partitioner hashes the key bytes with murmur2 and takes the result modulo the partition count:
partition = toPositive(murmur2(keyBytes)) % number_of_partitions
Characteristics:
  • Same key always goes to the same partition (as long as the partition count is unchanged)
  • Provides ordering guarantees per key
  • Distributes load according to the key distribution
  • Key-to-partition mapping changes when the partition count changes
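A minimal sketch of that key-to-partition mapping, using the same murmur2 hash the producer ships with (a simplification; the real partitioner also handles null keys and other cases):
import org.apache.kafka.common.utils.Utils;

public class KeyHashExample {
    // toPositive(murmur2(keyBytes)) % numPartitions is the keyed-record mapping
    static int partitionForKey(byte[] keyBytes, int numPartitions) {
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = "user123".getBytes();
        // Same key, same partition, as long as the partition count stays at 6
        System.out.println(partitionForKey(key, 6));
    }
}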

Without message keys

For messages without keys, behavior depends on the Kafka version:
Kafka < 2.4 (round-robin partitioning):
  • Cycles through partitions sequentially
  • Each message goes to next partition in sequence
  • Poor batching efficiency due to partition spreading
Kafka >= 2.4 (Sticky partitioner):
  • Sticks to one partition until batch is full
  • Switches to different partition for next batch
  • Improves batching efficiency and throughput

Sticky partitioner

The sticky partitioner, introduced in Kafka 2.4, improves performance for messages without keys.

How sticky partitioner works

  1. Partition selection: Choose random available partition
  2. Batch filling: Send all messages to chosen partition until batch fills
  3. Partition switching: Switch to different partition for next batch
  4. Repeat process: Continue cycle for optimal batching
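The following is a simplified sketch of that cycle, not Kafka's internal code; the real producer tracks one sticky partition per topic and switches it when the current batch completes:
import java.util.concurrent.ThreadLocalRandom;

public class StickyChoiceSketch {
    private int stickyPartition = -1;

    // Return the partition for a keyless record; keep returning the same one
    // until the caller signals that its batch has been completed.
    int partitionFor(int numPartitions, boolean batchCompleted) {
        if (stickyPartition < 0 || batchCompleted) {
            int next;
            do {
                next = ThreadLocalRandom.current().nextInt(numPartitions);
            } while (numPartitions > 1 && next == stickyPartition);  // pick a different partition than last time
            stickyPartition = next;
        }
        return stickyPartition;
    }
}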

Benefits of sticky partitioner

Improved throughput:
  • Better batch utilization (more messages per batch)
  • Reduced network requests
  • More efficient compression
Better resource utilization:
  • Fewer partially filled batches
  • Improved producer buffer usage
  • Enhanced broker processing efficiency
Maintained load distribution:
  • Over time, messages distribute evenly across partitions
  • No hot partitions with sustained load

Configuration

Sticky partitioning is the default for keyless records in Kafka 2.4+:
# Explicitly configure (usually not needed)
partitioner.class=org.apache.kafka.clients.producer.internals.DefaultPartitioner
Note that since Kafka 3.3 the DefaultPartitioner class is deprecated; leaving partitioner.class unset gives the built-in sticky behavior.

Round-robin partitioner

The original default partitioner (pre-2.4) that cycles through partitions.

Configuration

partitioner.class=org.apache.kafka.clients.producer.RoundRobinPartitioner
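The same setting can be applied programmatically with the ProducerConfig constant (props here is an existing Properties object for the producer):
// Equivalent to partitioner.class=org.apache.kafka.clients.producer.RoundRobinPartitioner
props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, RoundRobinPartitioner.class.getName());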

When to use round-robin

  • Legacy compatibility: Matching behavior of older Kafka versions
  • Strict distribution: When perfectly even distribution is critical
  • Partition-level processing: When downstream processing benefits from spread
Round-robin performance impact: Round-robin partitioning creates smaller batches and reduces throughput compared to sticky partitioning for keyless messages.

Uniform sticky partitioner

An alternative sticky partitioner that applies the sticky strategy to every record, ignoring the message key.

Configuration

partitioner.class=org.apache.kafka.clients.producer.UniformStickyPartitioner

Benefits

  • Uniform distribution: Spreads keyed and keyless records evenly, so skewed keys cannot create hot partitions
  • Better batching for keyed traffic: Sticky batching applies even when records carry keys
  • Availability awareness: New sticky partitions are chosen from the currently available partitions
Note that records with the same key are no longer guaranteed to reach the same partition, so per-key ordering is lost.

Custom partitioner implementation

Create custom partitioners for specific business requirements.

Implementing a custom partitioner

import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.utils.Utils;

public class CustomPartitioner implements Partitioner {
    
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                        Object value, byte[] valueBytes, Cluster cluster) {
        
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int partitionCount = partitions.size();
        
        if (key == null) {
            // Handle keyless messages
            return ThreadLocalRandom.current().nextInt(partitionCount);
        }
        
        // Custom key-based partitioning logic
        String keyString = key.toString();
        
        if (keyString.startsWith("priority-")) {
            // Route priority messages to partition 0
            return 0;
        } else if (keyString.contains("analytics")) {
            // Route analytics to last partition
            return partitionCount - 1;
        } else {
            // Use hash for other messages
            return Utils.toPositive(Utils.murmur2(keyBytes)) % partitionCount;
        }
    }
    
    @Override
    public void configure(Map<String, ?> configs) {
        // Initialize partitioner with configuration
    }
    
    @Override
    public void close() {
        // Cleanup resources if needed
    }
}

Registering custom partitioner

partitioner.class=com.example.CustomPartitioner
Or programmatically:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("partitioner.class", "com.example.CustomPartitioner");
Producer<String, String> producer = new KafkaProducer<>(props);

Partitioning strategies

Key-based partitioning

Use case: When you need ordering guarantees per key
// Messages with same key go to same partition
ProducerRecord<String, String> record = 
    new ProducerRecord<>("topic", "user123", "user data");
Benefits:
  • Ordering guarantees within key
  • Data locality for key-based processing
  • Predictable partition assignment
Considerations:
  • Key distribution affects load balancing
  • Partition count changes break key-to-partition mapping
  • Hot keys can create hot partitions

Explicit partition assignment

Use case: When you need complete control over partition assignment
// Explicitly specify partition
ProducerRecord<String, String> record = 
    new ProducerRecord<>("topic", 2, "key", "value");  // Partition 2
Benefits:
  • Complete control over placement
  • Can implement complex routing logic
  • Useful for testing and debugging
Considerations:
  • Must handle partition availability
  • Need to track partition count changes
  • Bypasses partitioner logic entirely
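One defensive pattern for those considerations is to check the topic's current partition metadata before assigning explicitly; a sketch assuming an existing producer instance:
// Look up current partitions so the explicit assignment stays valid
List<PartitionInfo> partitions = producer.partitionsFor("topic");
int target = 2;
if (target < partitions.size()) {
    producer.send(new ProducerRecord<>("topic", target, "key", "value"));
} else {
    // Fall back to the partitioner rather than targeting a partition that doesn't exist
    producer.send(new ProducerRecord<>("topic", "key", "value"));
}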

Random/sticky partitioning

Use case: When you don’t need ordering and want optimal throughput
// No key, let partitioner decide
ProducerRecord<String, String> record = 
    new ProducerRecord<>("topic", "value");
Benefits:
  • Optimal batching and throughput
  • Even load distribution over time
  • Simple implementation
Considerations:
  • No ordering guarantees
  • Random distribution per batch

Partition count considerations

Impact of partition changes

Adding partitions affects key-to-partition mapping:
Original: hash(key) % 3 partitions
After adding: hash(key) % 6 partitions
Consequences:
  • Same key may go to different partition
  • Breaks ordering guarantees temporarily
  • May create temporary hotspots
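Running synthetic keys through the same hash before and after the change shows the effect; roughly half of them land on a different partition when going from 3 to 6:
import org.apache.kafka.common.utils.Utils;

public class RepartitionExample {
    public static void main(String[] args) {
        int moved = 0;
        for (int i = 0; i < 10_000; i++) {
            int hash = Utils.toPositive(Utils.murmur2(("key-" + i).getBytes()));
            if (hash % 3 != hash % 6) {
                moved++;  // this key maps to a different partition after the change
            }
        }
        System.out.println(moved + " of 10000 keys change partition going from 3 to 6 partitions");
    }
}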

Best practices for partition counts

  1. Plan ahead: Choose partition count based on expected load
  2. Consider consumers: Partition count limits consumer parallelism
  3. Think about keys: More partitions = better key distribution
  4. Monitor hotspots: Watch for uneven partition usage
  5. Avoid frequent changes: Partition changes disrupt key distribution

Performance optimization

Maximizing throughput

For highest throughput with keyless messages:
# Use sticky partitioner (default in Kafka 2.4+)
partitioner.class=org.apache.kafka.clients.producer.internals.DefaultPartitioner

# Optimize batching
batch.size=32768
linger.ms=5
buffer.memory=67108864

Balancing load

For even load distribution:
# Consider uniform sticky partitioner
partitioner.class=org.apache.kafka.clients.producer.UniformStickyPartitioner

# Monitor partition metrics
# Adjust partition count if needed

Custom business logic

For specific routing requirements:
public class BusinessLogicPartitioner implements Partitioner {
    public int partition(String topic, Object key, byte[] keyBytes,
                        Object value, byte[] valueBytes, Cluster cluster) {
        
        // Extract a business identifier from the message payload
        // (parseJson is an application-specific helper, not a Kafka API)
        JsonNode json = parseJson(valueBytes);
        String department = json.get("department").asText();
        
        // Route by department
        switch (department) {
            case "sales": return 0;
            case "marketing": return 1;
            case "engineering": return 2;
            default: 
                return Utils.toPositive(Utils.murmur2(keyBytes)) % 
                       cluster.partitionsForTopic(topic).size();
        }
    }
}

Monitoring partitioner behavior

Key metrics to track

Partition distribution:
Messages per partition = total_messages / partition_count
Distribution variance = variance across partition message counts
Batch efficiency:
Average batch size = total_bytes_sent / batch_count
Batch utilization = average_batch_size / configured_batch_size

JMX metrics

kafka.producer:type=producer-topic-metrics,client-id=<client-id>,topic=<topic>
- record-send-rate per topic
- byte-rate per topic

kafka.producer:type=producer-metrics,client-id=<client-id>
- batch-size-avg
- records-per-request-avg
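The same figures can also be read in-process from the producer's metrics() map instead of JMX; a minimal sketch assuming an existing producer instance:
// Print batch-size-avg and records-per-request-avg from the producer-metrics group
for (Map.Entry<MetricName, ? extends Metric> entry : producer.metrics().entrySet()) {
    String name = entry.getKey().name();
    if (name.equals("batch-size-avg") || name.equals("records-per-request-avg")) {
        System.out.println(name + " = " + entry.getValue().metricValue());
    }
}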

Monitoring partition hotspots

Track broker-level metrics to identify hotspots:
# Check partition leader distribution
kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --topic my-topic

# Monitor broker metrics
kafka-run-class.sh kafka.tools.JmxTool \
  --object-name kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec

Common partitioning patterns

Geographic partitioning

Route messages based on geographic regions:
public class GeographicPartitioner implements Partitioner {
    private static final Map<String, Integer> REGION_PARTITIONS = Map.of(
        "US", 0, "EU", 1, "APAC", 2
    );
    
    public int partition(String topic, Object key, byte[] keyBytes,
                        Object value, byte[] valueBytes, Cluster cluster) {
        String region = extractRegion(value);
        return REGION_PARTITIONS.getOrDefault(region, 
            ThreadLocalRandom.current().nextInt(cluster.partitionsForTopic(topic).size()));
    }
}

Time-based partitioning

Route messages based on time periods:
public class TimeBasedPartitioner implements Partitioner {
    public int partition(String topic, Object key, byte[] keyBytes,
                        Object value, byte[] valueBytes, Cluster cluster) {
        
        int partitionCount = cluster.partitionsForTopic(topic).size();
        long timestamp = extractTimestamp(value);
        
        // Partition by hour of day
        int hourOfDay = (int) ((timestamp / (1000 * 60 * 60)) % 24);
        return hourOfDay % partitionCount;
    }
}

Priority-based partitioning

Route high-priority messages to specific partitions:
public class PriorityPartitioner implements Partitioner {
    public int partition(String topic, Object key, byte[] keyBytes,
                        Object value, byte[] valueBytes, Cluster cluster) {
        
        int priority = extractPriority(value);
        int partitionCount = cluster.partitionsForTopic(topic).size();
        
        if (priority >= 9) {
            return 0; // High priority partition
        } else if (priority >= 5) {
            return 1; // Medium priority partition
        } else {
            // Low priority messages use the remaining partitions
            // (assumes the topic has at least 3 partitions)
            if (keyBytes == null) {
                return 2 + ThreadLocalRandom.current().nextInt(partitionCount - 2);
            }
            return 2 + (Utils.toPositive(Utils.murmur2(keyBytes)) % (partitionCount - 2));
        }
    }
}

Best practices

Production recommendations

  1. Use sticky partitioner: Default choice for keyless messages (Kafka 2.4+)
  2. Consider key distribution: Ensure keys distribute evenly across partitions
  3. Monitor partition usage: Watch for hotspots and uneven distribution
  4. Plan partition count: Choose based on expected throughput and consumer count
  5. Test custom partitioners: Validate custom logic under load (a small offline harness is sketched below)
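A minimal offline harness for that last point, assuming the CustomPartitioner shown earlier and a hand-built six-partition topic; it feeds synthetic keys through the partitioner and prints where they land:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.PartitionInfo;

public class PartitionerDistributionTest {
    public static void main(String[] args) {
        // In-memory Cluster with one broker and six partitions for "my-topic"
        Node node = new Node(0, "localhost", 9092);
        List<PartitionInfo> partitions = new ArrayList<>();
        for (int p = 0; p < 6; p++) {
            partitions.add(new PartitionInfo("my-topic", p, node, new Node[]{node}, new Node[]{node}));
        }
        Cluster cluster = new Cluster("test-cluster", List.of(node), partitions, Set.of(), Set.of());

        CustomPartitioner partitioner = new CustomPartitioner();
        partitioner.configure(Map.of());

        // Count where synthetic keys land and inspect the spread
        int[] counts = new int[6];
        for (int i = 0; i < 100_000; i++) {
            String key = "user-" + i;
            counts[partitioner.partition("my-topic", key, key.getBytes(), null, null, cluster)]++;
        }
        partitioner.close();
        System.out.println(Arrays.toString(counts));  // heavily skewed counts indicate hot partitions
    }
}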

Troubleshooting partitioning issues

Uneven load distribution:
  • Check key distribution patterns
  • Monitor per-partition message rates
  • Consider custom partitioner for better distribution
Poor batching performance:
  • Switch from round-robin to sticky partitioner
  • Increase linger.ms to allow batch filling
  • Monitor batch utilization metrics
Ordering issues:
  • Verify key-based partitioning for ordering requirements
  • Check if partition count changed recently
  • Consider using single partition for strict ordering
Hot partitions:
  • Analyze key distribution
  • Implement custom partitioner for better balance
  • Consider increasing partition count
Partition count changes: Changing partition count breaks key-to-partition mapping and can disrupt ordering guarantees. Plan partition counts carefully and avoid frequent changes.
Testing partitioners: Always test custom partitioners with realistic data distributions and load patterns. Partition assignment affects performance significantly.