How batching works
Instead of sending messages individually, the Kafka producer accumulates messages in memory and sends them in batches:
- Message accumulation: Producer collects messages in memory buffers per partition
- Batch creation: Messages are grouped into batches based on size or time limits
- Network transmission: Complete batches are sent to brokers in single network requests
- Broker processing: Brokers process entire batches more efficiently than individual messages

Batching is automatic
Kafka producers handle batching automatically; you only need to configure the parameters that control when batches are sent.
Key batching parameters
batch.size
Controls the maximum size of a batch in bytes.
- Larger batches improve throughput but increase memory usage
- Batches are sent when they reach this size, regardless of time
- Each partition has its own batch buffer
- Does not enforce a hard limit on message size
linger.ms
Controls how long to wait for additional messages before sending a batch.
- Adds artificial delay to allow batches to fill up
- Improves throughput by creating larger batches
- Increases end-to-end latency slightly
- Balances throughput vs latency trade-off
buffer.memory
Total memory available for batching across all partitions.
- Shared across all partitions and topics
- Producer blocks when the buffer is full
- Must accommodate `batch.size` × the number of active partitions
- Affects the producer's ability to handle traffic spikes
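To ground these three parameters, here is a minimal, self-contained producer sketch; the broker address, topic name, and serializer choices are illustrative assumptions, not requirements:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BatchingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // The three batching parameters described above (values are the defaults,
        // except linger.ms, raised slightly to let batches fill):
        props.put("batch.size", "16384");       // 16 KB per-partition batch target
        props.put("linger.ms", "5");            // wait up to 5 ms for a batch to fill
        props.put("buffer.memory", "33554432"); // 32 MB shared pool

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1000; i++) {
                // Records accumulate in per-partition buffers and ship in batches
                producer.send(new ProducerRecord<>("events",
                        Integer.toString(i), "value-" + i));
            }
        } // close() flushes any batches still pending
    }
}
```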
Batching behavior
When batches are sent
Batches are sent when ANY of these conditions is met:
- Batch size reached: Batch reaches `batch.size` bytes
- Linger time elapsed: `linger.ms` milliseconds have passed since the first message
- Buffer full: Producer needs space for new messages
- Producer flush: Explicit `flush()` call or producer close (see the sketch after this list)
- Metadata update: Topic metadata changes require sending pending batches
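A hedged sketch of the flush condition, reusing the producer and the hypothetical `events` topic from the example above:

```java
// With a long linger.ms, this record could sit in the accumulator for a while.
producer.send(new ProducerRecord<>("events", "key", "value"));
// flush() blocks until every pending batch is sent, overriding linger.ms.
producer.flush();
```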
Batch size dynamics
Batch sizes are dynamic and depend on:
- Message arrival rate: High rates create fuller batches
- Message size: Large messages fill batches faster
- Partition distribution: Messages spread across partitions reduce batch efficiency
- Linger time: Longer waits allow larger batches to form
Throughput optimization
Maximizing throughput
For maximum throughput, increase `batch.size` and `linger.ms` together and enable compression; see the high-throughput configuration example later in this section.
Throughput benefits
Batching improves throughput through:
- Reduced network overhead: Fewer network requests
- Better compression: Larger batches compress more efficiently
- Broker efficiency: Brokers process batches more efficiently
- Reduced per-message overhead: Amortized protocol overhead
Measuring throughput impact
Monitor batch efficiency and throughput metrics such as `batch-size-avg` and `records-per-request-avg` (listed under Monitoring batching performance below) to measure batching effectiveness.
Latency considerations
Latency trade-offs
Batching affects latency in both directions.
Increased latency:
- Batching delay: Messages wait in batches before sending
- Processing time: Larger batches take longer to process
Reduced latency:
- Network efficiency: Fewer network round-trips
- Broker processing: More efficient batch processing
Minimizing latency
For low-latency scenarios, keep `linger.ms` at 0 and batches small; see the low-latency configuration example later in this section.
Latency measurement
Track end-to-end latency components, for example `record-queue-time-avg` (time records spend in the accumulator) and `request-latency-avg` (the broker round trip); a measurement sketch follows.
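One way to capture the send-to-acknowledgment component is a send callback; a minimal sketch, assuming the producer from the earlier examples:

```java
// Measures time from send() to broker acknowledgment, which includes the
// batching delay (linger.ms), request latency, and broker processing time.
long start = System.nanoTime();
producer.send(new ProducerRecord<>("events", "key", "value"), (metadata, exception) -> {
    if (exception == null) {
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("send-to-ack latency: " + elapsedMs + " ms");
    }
});
```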
Memory management
Memory allocation
Producer memory is allocated from a single pool of `buffer.memory` bytes; each partition's batches draw `batch.size`-sized buffers from this shared pool.
Memory pressure handling
When buffer memory is exhausted:
- Block producer: `send()` calls block until memory is available
- Exception: After `max.block.ms`, the producer throws a `TimeoutException` (see the sketch after this list)
- Batch completion: Existing batches complete and free memory
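A hedged sketch of handling the exhausted-buffer case; `max.block.ms` defaults to 60 seconds, and the five-second value here is only illustrative:

```java
// Set before creating the producer: block send() for at most 5 s when full.
props.put("max.block.ms", "5000");
try {
    producer.send(new ProducerRecord<>("events", "key", "value"));
} catch (org.apache.kafka.common.errors.TimeoutException e) {
    // buffer.memory stayed full for max.block.ms: apply backpressure,
    // shed load, or route the record to a fallback store.
}
```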
Memory optimization
Size `buffer.memory` for the worst case: at least `batch.size` × the number of active partitions, plus headroom for traffic spikes. For example, 32 KB batches across 100 active partitions need roughly 3 MB, so the 32 MB default leaves ample room.
Advanced batching scenarios
High partition count
For topics with many partitions, remember that each active partition holds its own batch buffer: reduce `batch.size` or increase `buffer.memory` so the shared pool can cover a batch per active partition.
Variable message sizes
For mixed message sizes, size batches for typical messages and rely on `linger.ms` to fill them; a single message larger than `batch.size` is sent in its own oversized batch.
Bursty traffic patterns
For irregular traffic, increase `buffer.memory` to absorb spikes and use a moderate `linger.ms` so quiet periods still produce reasonably full batches.
Monitoring batching performance
Key metrics to track
Batch efficiency metrics:
- `batch-size-avg`: Average batch size in bytes
- `batch-size-max`: Maximum batch size observed
- `records-per-request-avg`: Average records per request
- `requests-in-flight`: Number of in-flight requests
Throughput metrics:
- `record-send-rate`: Messages sent per second
- `byte-rate`: Bytes sent per second
- `request-rate`: Requests sent per second
- `request-latency-avg`: Average request latency
JMX monitoring
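These metrics are exposed over JMX. A minimal in-process sketch, assuming a producer in the same JVM configured with `client.id` set to `my-producer` (a hypothetical name):

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Producer metrics are registered under
// kafka.producer:type=producer-metrics,client-id=<client.id>
MBeanServer server = ManagementFactory.getPlatformMBeanServer();
ObjectName name = new ObjectName(
        "kafka.producer:type=producer-metrics,client-id=my-producer");
for (String metric : new String[] {
        "batch-size-avg", "records-per-request-avg",
        "record-send-rate", "request-latency-avg"}) {
    System.out.println(metric + " = " + server.getAttribute(name, metric));
}
```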
Configuration examples
High-throughput configuration
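A hedged example of settings that favor throughput; the numbers are starting points to tune, not canonical values:

```java
// High throughput: large batches, a longer linger, and compression.
props.put("batch.size", "65536");        // 64 KB batches
props.put("linger.ms", "20");            // wait up to 20 ms to fill batches
props.put("buffer.memory", "67108864");  // 64 MB pool for many in-flight batches
props.put("compression.type", "lz4");    // larger batches compress efficiently
```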
Low-latency configuration
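A hedged example biased toward latency, again with illustrative values:

```java
// Low latency: send immediately, keep batches small.
props.put("batch.size", "16384");       // the 16 KB default keeps batches small
props.put("linger.ms", "0");            // do not wait for batches to fill (the default)
props.put("compression.type", "none"); // skip compression CPU on the hot path
```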
Balanced configuration (recommended)
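A hedged middle-ground example, close to the defaults with a small linger:

```java
// Balanced: modest linger yields fuller batches at little latency cost.
props.put("batch.size", "32768");        // 32 KB batches
props.put("linger.ms", "5");             // a few milliseconds of linger
props.put("buffer.memory", "33554432");  // 32 MB (the default)
```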
Memory-constrained configuration
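A hedged example for tight memory budgets; keep the pool large enough to cover a batch per active partition:

```java
// Memory constrained: smaller batches and a smaller pool.
props.put("batch.size", "8192");         // 8 KB batches
props.put("linger.ms", "10");            // lean on linger instead of batch size
props.put("buffer.memory", "16777216");  // 16 MB; ensure >= batch.size x partitions
```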
Best practices
Production recommendations
- Start with defaults: Default settings work well for most use cases
- Monitor batch utilization: Ensure batches are reasonably full
- Adjust based on traffic patterns: Tune for your specific workload
- Consider memory constraints: Don’t over-allocate buffer memory
- Test latency impact: Measure end-to-end latency changes
Tuning methodology
- Establish baseline: Measure current throughput and latency
- Adjust one parameter: Change `batch.size` or `linger.ms`, not both
- Monitor impact: Watch throughput, latency, and memory usage
- Iterate gradually: Make incremental adjustments
- Load test: Validate under production-like conditions
Common pitfalls
Mistakes to avoid:
- Setting `linger.ms` too high (increases latency unnecessarily)
- Making `batch.size` too large (wastes memory for small messages)
- Insufficient `buffer.memory` for the partition count
- Not monitoring batch utilization rates
- Changing multiple parameters simultaneously
Troubleshooting batching issues
Poor batch utilization:
- Increase `linger.ms` to allow batches to fill
- Check if messages are spread across too many partitions
- Monitor message arrival patterns
High memory usage:
- Reduce `batch.size` or `buffer.memory`
- Check for inactive partitions consuming memory
- Monitor buffer pool utilization
High latency:
- Reduce `linger.ms` to decrease batching delay
- Monitor end-to-end latency components
- Consider compression impact on processing time
Memory blocking
When buffer memory is exhausted, the producer blocks `send()` calls. Monitor `buffer-available-bytes` to prevent blocking in production systems.
Batch size vs message size
The `batch.size` parameter is a target, not a strict limit. Batches can exceed this size when single messages are larger, and will be smaller when the `linger.ms` timeout fires first.