Kafka producer batching groups multiple messages together before sending them to brokers, dramatically improving throughput at the cost of slightly increased latency.

How batching works

Instead of sending messages individually, the Kafka producer accumulates messages in memory and sends them in batches:
  1. Message accumulation: Producer collects messages in memory buffers per partition
  2. Batch creation: Messages are grouped into batches based on size or time limits
  3. Network transmission: Complete batches are sent to brokers in single network requests
  4. Broker processing: Brokers process entire batches more efficiently than individual messages
Batching is automatic: Kafka producers handle batching automatically; you only need to configure the parameters that control when batches are sent.
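
The accumulation model above can be sketched as a toy simulation. This is illustrative only (hypothetical names, not the actual client internals): messages pile up in a per-partition buffer, and a "batch" is handed off once the size threshold is hit.

```python
from collections import defaultdict

# Toy model of per-partition accumulation (NOT the real client's code).
BATCH_SIZE = 100  # bytes; stands in for batch.size

buffers = defaultdict(list)   # partition -> pending messages (step 1)
sent_batches = []             # batches handed to the "network" (step 3)

def send(partition, message):
    """Append to the partition's buffer; seal and send when full (step 2)."""
    buffers[partition].append(message)
    if sum(len(m) for m in buffers[partition]) >= BATCH_SIZE:
        sent_batches.append((partition, buffers.pop(partition)))

for i in range(30):
    send(i % 3, b"x" * 20)  # 30 messages round-robined over 3 partitions

# 10 messages per partition, 5 per 100-byte batch -> 6 batches total
```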

Key batching parameters

batch.size

Controls the maximum size of a batch in bytes.
# Default: 16384 (16KB)
batch.size=32768  # 32KB batches
Characteristics:
  • Larger batches improve throughput but increase memory usage
  • Batches are sent when they reach this size, regardless of time
  • Each partition has its own batch buffer
  • Acts as a target, not a hard cap: a message larger than batch.size is still sent, in its own batch
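
The target-not-cap behavior can be illustrated with a small decision function (toy logic, not the client's actual batching code):

```python
BATCH_SIZE = 16384  # default batch.size

def batch_for(message_bytes, current_batch_bytes):
    """Where does a message land? Oversized messages go out in their own batch."""
    if message_bytes >= BATCH_SIZE:
        return "own batch"      # exceeds batch.size; sent alone
    if current_batch_bytes + message_bytes > BATCH_SIZE:
        return "next batch"     # current batch is sealed first
    return "current batch"
```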

linger.ms

Controls how long to wait for additional messages before sending a batch.
# Default: 0 (send immediately)
linger.ms=10  # Wait up to 10ms for more messages
Characteristics:
  • Adds artificial delay to allow batches to fill up
  • Improves throughput by creating larger batches
  • Increases end-to-end latency slightly
  • Balances throughput vs latency trade-off
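
A rough back-of-the-envelope for this trade-off: under steady, uniform traffic, linger.ms bounds how many records can accumulate per partition before a time-triggered send (illustrative arithmetic, not a measured result):

```python
def expected_records_per_batch(records_per_sec, partitions, linger_ms):
    """Records arriving at one partition during the linger window."""
    per_partition_rate = records_per_sec / partitions
    return per_partition_rate * linger_ms / 1000

# 50,000 rec/s spread over 10 partitions, linger.ms=10 -> ~50 records/batch
print(expected_records_per_batch(50_000, 10, 10))
```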

buffer.memory

Total memory available for batching across all partitions.
# Default: 33554432 (32MB)
buffer.memory=67108864  # 64MB total buffer
Characteristics:
  • Shared across all partitions and topics
  • Producer blocks when buffer is full
  • Must accommodate batch.size x number of active partitions
  • Affects producer’s ability to handle traffic spikes

Batching behavior

When batches are sent

Batches are sent when ANY of these conditions are met:
  1. Batch size reached: Batch reaches batch.size bytes
  2. Linger time elapsed: linger.ms milliseconds have passed since first message
  3. Buffer full: Producer needs space for new messages
  4. Producer flush: Explicit flush() call or producer close
  5. Metadata update: Topic metadata changes require sending pending batches
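
The trigger logic above can be expressed as a single predicate. This is a simplified sketch (the real accumulator also tracks retries, in-flight limits, and per-broker readiness):

```python
def should_send(batch_bytes, batch_size, ms_since_first, linger_ms,
                buffer_full=False, flushing=False, metadata_changed=False):
    """True when ANY of the five send conditions holds."""
    return (batch_bytes >= batch_size          # 1. batch size reached
            or ms_since_first >= linger_ms     # 2. linger time elapsed
            or buffer_full                     # 3. producer needs space
            or flushing                        # 4. flush() or close
            or metadata_changed)               # 5. metadata update
```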

Batch size dynamics

Batch sizes are dynamic and depend on:
  • Message arrival rate: High rates create fuller batches
  • Message size: Large messages fill batches faster
  • Partition distribution: Messages spread across partitions reduce batch efficiency
  • Linger time: Longer waits allow larger batches to form
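
These dynamics determine which condition seals a batch first. A hypothetical estimator under steady traffic (illustrative only):

```python
def sealing_trigger(msg_bytes, msgs_per_sec_per_partition, batch_size, linger_ms):
    """Does the size limit or the linger timeout seal the batch first?"""
    ms_to_fill = batch_size / msg_bytes / msgs_per_sec_per_partition * 1000
    return "size" if ms_to_fill <= linger_ms else "linger"

# Small messages at a modest rate: linger.ms fires before 16KB fills
print(sealing_trigger(100, 1000, 16384, 10))    # "linger"
# Large messages at a high rate: the size limit fires first
print(sealing_trigger(1000, 5000, 16384, 10))   # "size"
```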

Throughput optimization

Maximizing throughput

For maximum throughput, optimize batching parameters:
# Large batches for high throughput
batch.size=65536          # 64KB batches
linger.ms=20              # Wait longer for full batches
buffer.memory=134217728   # 128MB buffer for more batches
compression.type=lz4      # Compress batches for efficiency

Throughput benefits

Batching improves throughput through:
  • Reduced network overhead: Fewer network requests
  • Better compression: Larger batches compress more efficiently
  • Broker efficiency: Brokers process batches more efficiently
  • Reduced per-message overhead: Amortized protocol overhead
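
The reduced-overhead point in numbers (a hypothetical comparison, ignoring retries and protocol details):

```python
import math

def network_requests(total_records, records_per_batch):
    """Requests needed to ship total_records at a given batch fill."""
    return math.ceil(total_records / records_per_batch)

# 1M records: unbatched -> 1,000,000 requests; 500-record batches -> 2,000
print(network_requests(1_000_000, 1))
print(network_requests(1_000_000, 500))
```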

Measuring throughput impact

Monitor these metrics to measure batching effectiveness:
Records per batch = total_records_sent / total_batches_sent
Batch utilization = average_batch_size / batch_size_config
Network efficiency = bytes_sent / network_requests_sent
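
The three formulas above as a helper over counters you would collect yourself (the parameter names here are assumptions, not producer metric names):

```python
def batching_stats(total_records, total_batches, avg_batch_size,
                   batch_size_config, bytes_sent, requests):
    """Derive the three batching-effectiveness ratios from raw counters."""
    return {
        "records_per_batch": total_records / total_batches,
        "batch_utilization": avg_batch_size / batch_size_config,
        "network_efficiency": bytes_sent / requests,
    }

stats = batching_stats(total_records=10_000, total_batches=100,
                       avg_batch_size=16_384, batch_size_config=32_768,
                       bytes_sent=3_200_000, requests=100)
# 100 records/batch, batches half-full, 32KB per network request
```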

Latency considerations

Latency trade-offs

Batching affects latency in opposing ways:
Increased latency:
  • Batching delay: Messages wait in batches before sending
  • Processing time: Larger batches take longer to process
Reduced latency:
  • Network efficiency: Fewer network round-trips
  • Broker processing: More efficient batch processing

Minimizing latency

For low-latency scenarios:
# Minimize batching delay
linger.ms=0               # Send immediately
batch.size=16384          # Moderate batch size
buffer.memory=33554432    # Default buffer size

Latency measurement

Track end-to-end latency components:
Total latency = produce_time + network_time + broker_processing_time
Batching delay = time a record waits in the accumulator (bounded by linger.ms)

Memory management

Memory allocation

Producer memory is allocated as:
Total memory = buffer.memory
Per-partition memory = batch.size (when active)
Active partitions supported ≈ buffer.memory / batch.size
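
The allocation arithmetic with the default settings (illustrative upper bound; real usage varies with batch fill):

```python
buffer_memory = 33_554_432   # 32MB, the buffer.memory default
batch_size = 16_384          # 16KB, the batch.size default

# Upper bound on partitions that can each hold one full batch at once
concurrent_full_batches = buffer_memory // batch_size
print(concurrent_full_batches)  # 2048
```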

Memory pressure handling

When buffer memory is exhausted:
  1. Block producer: send() calls block until memory available
  2. Timeout: after max.block.ms, send() throws a TimeoutException
  3. Batch completion: Existing batches complete and free memory
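
The block-then-timeout behavior can be mimicked with a bounded pool (a toy model using a semaphore; not the client's actual BufferPool, and the names here are made up):

```python
import threading

class ToyBufferPool:
    """Mimics producer memory: allocations block until a batch completes
    and frees memory, then time out in the style of max.block.ms."""

    def __init__(self, total_chunks):
        self._sem = threading.BoundedSemaphore(total_chunks)

    def allocate(self, max_block_ms):
        # Block waiting for memory; give up after max_block_ms
        if not self._sem.acquire(timeout=max_block_ms / 1000):
            raise TimeoutError("buffer.memory exhausted within max.block.ms")

    def release(self):
        # Called when a batch completes, freeing its memory (step 3)
        self._sem.release()

pool = ToyBufferPool(total_chunks=2)
pool.allocate(max_block_ms=100)
pool.allocate(max_block_ms=100)
# A third allocate would block, then raise TimeoutError after ~100ms
```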

Memory optimization

# Balance memory usage with performance
buffer.memory=67108864    # 64MB buffer
batch.size=32768          # 32KB batches
max.block.ms=60000        # 1 minute block timeout

Advanced batching scenarios

High partition count

For topics with many partitions:
# More memory needed for many partitions
buffer.memory=134217728   # 128MB for more partitions
batch.size=16384          # Smaller batches to spread memory
linger.ms=10              # Longer wait for batch filling

Variable message sizes

For mixed message sizes:
# Accommodate large messages
batch.size=65536          # Larger batches for big messages
buffer.memory=67108864    # More memory for size variation
linger.ms=5               # Moderate wait time

Bursty traffic patterns

For irregular traffic:
# Handle traffic bursts
buffer.memory=134217728   # Large buffer for bursts
batch.size=32768          # Moderate batch size
linger.ms=15              # Longer wait for batch efficiency
max.block.ms=30000        # Reasonable block timeout

Monitoring batching performance

Key metrics to track

Batch efficiency metrics:
  • batch-size-avg: Average batch size in bytes
  • batch-size-max: Maximum batch size observed
  • records-per-request-avg: Average records per batch
  • requests-in-flight: Number of in-flight requests awaiting acknowledgment
Performance metrics:
  • record-send-rate: Messages sent per second
  • byte-rate: Bytes sent per second
  • request-rate: Batches sent per second
  • request-latency-avg: Average batch send latency
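
These rate metrics relate to each other: dividing the record and byte rates by the request rate recovers per-batch averages (simple derived arithmetic, shown with made-up sample values):

```python
def records_per_request(record_send_rate, request_rate):
    """Average records per batch, derived from the two rate metrics."""
    return record_send_rate / request_rate

def avg_batch_bytes(byte_rate, request_rate):
    """Average bytes per batch, derived from the two rate metrics."""
    return byte_rate / request_rate

# e.g. 50,000 rec/s over 100 req/s -> 500 records per batch
print(records_per_request(50_000, 100))
print(avg_batch_bytes(3_276_800, 100))
```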

JMX monitoring

kafka.producer:type=producer-metrics,client-id=<client-id>
- batch-size-avg
- batch-size-max
- records-per-request-avg
- request-rate
- byte-rate
- record-send-rate

Configuration examples

High-throughput configuration

# Optimize for maximum throughput
batch.size=65536          # Large 64KB batches
linger.ms=20              # Wait for full batches
buffer.memory=134217728   # 128MB buffer
compression.type=lz4      # Fast compression
acks=1                    # Balance durability/speed

Low-latency configuration

# Optimize for minimum latency
batch.size=16384          # Moderate batch size
linger.ms=0               # Send immediately
buffer.memory=33554432    # Default buffer
compression.type=none     # No compression delay
acks=1                    # Fast acknowledgment

Balanced configuration

# Balance throughput and latency
batch.size=32768          # 32KB batches
linger.ms=5               # Short wait time
buffer.memory=67108864    # 64MB buffer
compression.type=snappy   # Fast compression
acks=1                    # Reasonable durability

Memory-constrained configuration

# Minimize memory usage
batch.size=8192           # Smaller 8KB batches
linger.ms=10              # Wait for efficiency
buffer.memory=16777216    # 16MB buffer
compression.type=lz4      # Compress to save space

Best practices

Production recommendations

  1. Start with defaults: Default settings work well for most use cases
  2. Monitor batch utilization: Ensure batches are reasonably full
  3. Adjust based on traffic patterns: Tune for your specific workload
  4. Consider memory constraints: Don’t over-allocate buffer memory
  5. Test latency impact: Measure end-to-end latency changes

Tuning methodology

  1. Establish baseline: Measure current throughput and latency
  2. Adjust one parameter: Change batch.size OR linger.ms, not both
  3. Monitor impact: Watch throughput, latency, and memory usage
  4. Iterate gradually: Make incremental adjustments
  5. Load test: Validate under production-like conditions

Common pitfalls

Mistakes to avoid:
  • Setting linger.ms too high (increases latency unnecessarily)
  • Making batch.size too large (wastes memory for small messages)
  • Insufficient buffer.memory for partition count
  • Not monitoring batch utilization rates
  • Changing multiple parameters simultaneously

Troubleshooting batching issues

Poor batch utilization:
  • Increase linger.ms to allow batches to fill
  • Check if messages are spread across too many partitions
  • Monitor message arrival patterns
High memory usage:
  • Reduce batch.size or buffer.memory
  • Check for inactive partitions consuming memory
  • Monitor buffer pool utilization
Increased latency:
  • Reduce linger.ms to decrease batching delay
  • Monitor end-to-end latency components
  • Consider compression impact on processing time
Memory blocking: when buffer memory is exhausted, the producer blocks send() calls. Monitor buffer-available-bytes to prevent blocking in production systems.
Batch size vs message size: the batch.size parameter is a target, not a strict limit. A batch can exceed this size when a single message is larger, and will be smaller when the linger.ms timeout fires first.