Beyond basic producer settings, advanced configurations provide fine-grained control over producer behavior, performance, and reliability for production deployments.

Buffer memory configuration

Producer buffer memory controls how much memory is available for batching messages before sending to brokers.

buffer.memory

Total memory allocated for producer buffering:
# Default: 33554432 (32MB)
buffer.memory=67108864  # 64MB total buffer
Impact of buffer size:
  • Too small: Producer blocks frequently when buffer fills
  • Too large: Excessive memory usage, potential GC pressure
  • Optimal size: Balance between memory efficiency and producer throughput

max.block.ms

Time to block when buffer is full:
# Default: 60000 (1 minute)
max.block.ms=30000  # Block for 30 seconds maximum
Behavior:
  • Producer blocks send() calls when the buffer is exhausted or topic metadata is not yet available
  • After timeout, throws TimeoutException
  • Prevents indefinite blocking in high-load scenarios

Memory calculation

Calculate required buffer memory:
Required buffer = batch.size × active_partitions × safety_factor
Safety factor = 1.5-2.0 (account for traffic spikes)

Example:
- 100 active partitions
- 32KB batch size
- 2x safety factor
= 32KB × 100 × 2 = 6.4MB minimum
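
The same arithmetic can be kept next to the code that builds the producer configuration. The following is a minimal sketch of the formula above; the class and method names are hypothetical, and the partition count and safety factor are inputs you have to estimate for your own workload.

```java
/** Sizing helper for buffer.memory, following the formula above (hypothetical helper). */
final class BufferSizing {
    private BufferSizing() {}

    /**
     * @param batchSizeBytes   value of batch.size, in bytes
     * @param activePartitions partitions the producer writes to concurrently
     * @param safetyFactor     headroom multiplier, typically 1.5 to 2.0
     * @return suggested minimum for buffer.memory, in bytes
     */
    static long requiredBufferBytes(int batchSizeBytes, int activePartitions, double safetyFactor) {
        if (batchSizeBytes <= 0 || activePartitions <= 0 || safetyFactor < 1.0) {
            throw new IllegalArgumentException("sizes must be positive and safetyFactor >= 1.0");
        }
        // Multiply in double precision, then round up to whole bytes.
        return (long) Math.ceil((double) batchSizeBytes * activePartitions * safetyFactor);
    }
}
```

For the example above, BufferSizing.requiredBufferBytes(32 * 1024, 100, 2.0) returns 6553600 bytes, which is the 6.4MB figure expressed as 6400KB.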

Request timeout settings

Timeout configurations control how long producers wait for various operations.

request.timeout.ms

Timeout for individual requests:
# Default: 30000 (30 seconds)
request.timeout.ms=15000  # 15 second request timeout
Applies to:
  • Message send requests
  • Metadata requests
  • Administrative operations

delivery.timeout.ms

Overall timeout for message delivery including retries:
# Default: 120000 (2 minutes)
delivery.timeout.ms=300000  # 5 minute total delivery timeout
Relationship:
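delivery.timeout.ms >= linger.ms + request.timeout.ms
The producer rejects a configuration that violates this bound. Retries, each separated by retry.backoff.ms, continue only until delivery.timeout.ms expires, so this setting is the effective upper bound on the total time a send can take.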
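
To get a feel for how many attempts fit inside the delivery window, the sketch below makes a deliberately pessimistic assumption: every attempt runs for the full request.timeout.ms and is followed by a constant retry.backoff.ms. It is an estimate for reasoning about the settings, not a reproduction of the client's internal bookkeeping, and the class name is hypothetical.

```java
/** Rough worst-case estimate of send attempts within delivery.timeout.ms (hypothetical helper). */
final class DeliveryBudget {
    private DeliveryBudget() {}

    static long maxAttempts(long deliveryTimeoutMs, long lingerMs,
                            long requestTimeoutMs, long retryBackoffMs) {
        if (requestTimeoutMs <= 0 || lingerMs < 0 || retryBackoffMs < 0) {
            throw new IllegalArgumentException("timeouts must be non-negative, request timeout positive");
        }
        // Time left for network attempts once the batch has lingered.
        long budget = deliveryTimeoutMs - lingerMs;
        if (budget < requestTimeoutMs) {
            return 0; // not even one full attempt fits
        }
        // The first attempt costs one request timeout; every further attempt
        // costs one backoff plus one request timeout.
        return 1 + (budget - requestTimeoutMs) / (retryBackoffMs + requestTimeoutMs);
    }
}
```

With the defaults (delivery 120000, linger 0, request 30000, backoff 100) this yields 3 attempts in the worst case; with the 300000 and 15000 values shown above it yields 19.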

metadata.max.age.ms

How long to cache topic metadata:
# Default: 300000 (5 minutes)
metadata.max.age.ms=60000  # Refresh metadata every minute
Considerations:
  • Shorter intervals detect partition changes faster
  • Longer intervals reduce metadata requests
  • Balance between responsiveness and efficiency

Connection configurations

Producer connection settings affect network behavior and resource usage.

connections.max.idle.ms

Close idle connections after specified time:
# Default: 540000 (9 minutes)
connections.max.idle.ms=300000  # Close idle connections after 5 minutes
Benefits:
  • Reduces connection overhead
  • Prevents resource leaks
  • Balances connection reuse with resource management

max.in.flight.requests.per.connection

Maximum unacknowledged requests per connection:
# Default: 5
max.in.flight.requests.per.connection=3  # More conservative
Impact on ordering:
  • max.in.flight.requests.per.connection=1: Strict ordering, lower throughput
  • max.in.flight.requests.per.connection>1 without idempotence: Higher throughput, but a retried batch can land after a later batch
  • With enable.idempotence=true: Can use up to 5 while maintaining ordering
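
The three cases above reduce to a small predicate. This is a sketch of the rule as stated in the list, with a hypothetical class name; it checks only these two settings and says nothing about ordering across partitions.

```java
/** Encodes the per-partition ordering rule from the list above (hypothetical helper). */
final class OrderingCheck {
    private OrderingCheck() {}

    static boolean preservesOrdering(boolean idempotenceEnabled, int maxInFlight) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException("max.in.flight.requests.per.connection must be >= 1");
        }
        if (maxInFlight == 1) {
            return true; // only one request outstanding, nothing to reorder
        }
        // With idempotence, ordering holds for up to 5 in-flight requests.
        return idempotenceEnabled && maxInFlight <= 5;
    }
}
```

For example, preservesOrdering(false, 3) is false, while preservesOrdering(true, 5) is true.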

reconnect.backoff.ms

Wait time before reconnecting failed connections:
# Default: 50
reconnect.backoff.ms=100  # Wait 100ms before reconnect attempts

reconnect.backoff.max.ms

Maximum reconnect backoff time:
# Default: 1000 (1 second)
reconnect.backoff.max.ms=5000  # Cap backoff at 5 seconds
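
The two reconnect settings work together: the wait starts at reconnect.backoff.ms and grows with each consecutive failure until it reaches reconnect.backoff.max.ms. The sketch below assumes a plain doubling schedule and leaves out the random jitter that the real client adds, so treat the numbers as approximate. The class name is hypothetical.

```java
/** Approximate reconnect wait per consecutive failure, without jitter (hypothetical helper). */
final class ReconnectBackoff {
    private ReconnectBackoff() {}

    static long backoffMs(long baseMs, long maxMs, int consecutiveFailures) {
        if (baseMs < 0 || maxMs < 0 || consecutiveFailures < 1) {
            throw new IllegalArgumentException("backoffs must be >= 0 and failures >= 1");
        }
        long backoff = baseMs;
        // Double once per additional failure, stopping as soon as the cap is reached.
        for (int i = 1; i < consecutiveFailures && backoff < maxMs; i++) {
            backoff *= 2;
        }
        return Math.min(backoff, maxMs);
    }
}
```

With reconnect.backoff.ms=100 and reconnect.backoff.max.ms=5000, the schedule is 100, 200, 400, 800, 1600, 3200, and then 5000 for every later failure.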

Metadata and discovery

Producer metadata management configurations.

bootstrap.servers

Initial broker list for cluster discovery:
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
Best practices:
  • Include multiple brokers for redundancy
  • Use broker DNS names rather than IP addresses
  • Include brokers from different racks if possible

metadata.max.idle.ms

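How long the producer keeps metadata for a topic it has stopped producing to (after this, the topic's metadata is dropped and fetched again on next use):
# Default: 300000 (5 minutes)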
metadata.max.idle.ms=180000  # 3 minutes

retry.backoff.ms

Wait time before retrying a failed request to a given topic partition:
# Default: 100
retry.backoff.ms=200  # Wait 200ms between retries

Message size and batching

Advanced message size configurations.

max.request.size

Maximum size of request (batch + headers):
# Default: 1048576 (1MB)
max.request.size=5242880  # 5MB maximum request
Considerations:
  • Keep at or below the broker’s message.max.bytes (or the topic’s max.message.bytes); larger batches are rejected by the broker
  • Affects largest possible batch size
  • Consider network MTU and broker memory

send.buffer.bytes

TCP send buffer size:
# Default: 131072 (128KB)
send.buffer.bytes=262144  # 256KB TCP send buffer

receive.buffer.bytes

TCP receive buffer size:
# Default: 32768 (32KB) for producers
receive.buffer.bytes=131072  # 128KB TCP receive buffer

Performance tuning

Advanced performance-related configurations.

max.poll.interval.ms

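This is a consumer setting, not a producer setting, and a producer ignores it. It matters only when the same application also consumes (for example, a consume-transform-produce loop), because time spent blocked in send() counts against the consumer's poll interval:
# Consumer configuration. Default: 300000 (5 minutes)
max.poll.interval.ms=300000  # 5 minutes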

socket.connection.setup.timeout.ms

Timeout for socket connections:
# Default: 10000 (10 seconds)
socket.connection.setup.timeout.ms=30000  # 30 second connection timeout

socket.connection.setup.timeout.max.ms

Maximum socket connection timeout with backoff:
# Default: 30000 (30 seconds) in recent releases; Kafka 2.7 used 127000 (127 seconds)
socket.connection.setup.timeout.max.ms=60000  # 1 minute maximum

Monitoring and metrics

Producer monitoring configurations.

metrics.recording.level

Granularity of metrics collection:
# Options: INFO (default), DEBUG, TRACE
metrics.recording.level=INFO
Levels:
  • INFO: Basic metrics, low overhead
  • DEBUG: Detailed metrics, moderate overhead
  • TRACE: All metrics, high overhead (development only)

metrics.sample.window.ms

Metrics sampling window:
# Default: 30000 (30 seconds)
metrics.sample.window.ms=60000  # 1 minute sampling windows

metrics.num.samples

Number of samples to maintain:
# Default: 2
metrics.num.samples=3  # Keep 3 sample windows

metric.reporters

Custom metrics reporters:
metric.reporters=com.example.CustomMetricsReporter

Client identification

Producer identification and naming.

client.id

Unique identifier for this producer:
client.id=order-service-producer-01
Benefits:
  • Easier troubleshooting with meaningful names
  • Better monitoring and alerting
  • Correlation with application logs

client.dns.lookup

DNS resolution strategy:
# Options: use_all_dns_ips (default), resolve_canonical_bootstrap_servers_only
# The legacy value "default" is deprecated
client.dns.lookup=use_all_dns_ips

Advanced reliability settings

Additional reliability configurations.

enable.idempotence

Enable idempotent producer (prevents duplicates):
# Default: true (Kafka 3.0+)
enable.idempotence=true
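Requires compatible values for three settings. The defaults already satisfy them; setting a conflicting value explicitly alongside enable.idempotence=true raises a ConfigException:
  • retries greater than 0 (default: Integer.MAX_VALUE)
  • max.in.flight.requests.per.connection of at most 5 (default: 5)
  • acks=all (default in Kafka 3.0+)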
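
The three settings listed above can be checked before the producer is constructed, which gives a clearer error message than a failure at startup. A minimal sketch, assuming the values have already been read from your configuration source; the class name and message wording are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Pre-flight check of the settings that idempotence depends on (hypothetical helper). */
final class IdempotenceCheck {
    private IdempotenceCheck() {}

    /** Returns one message per incompatible setting; an empty list means compatible. */
    static List<String> violations(String acks, int retries, int maxInFlight) {
        List<String> problems = new ArrayList<>();
        // "-1" is the numeric spelling of acks=all.
        if (!"all".equals(acks) && !"-1".equals(acks)) {
            problems.add("acks must be all (or -1), got " + acks);
        }
        if (retries <= 0) {
            problems.add("retries must be greater than 0, got " + retries);
        }
        if (maxInFlight > 5) {
            problems.add("max.in.flight.requests.per.connection must be at most 5, got " + maxInFlight);
        }
        return problems;
    }
}
```

Calling violations("all", Integer.MAX_VALUE, 5) returns an empty list, while violations("1", 0, 6) reports all three problems.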

transaction.timeout.ms

Timeout for transactions (when using transactions); must not exceed the broker's transaction.max.timeout.ms (default 15 minutes):
# Default: 60000 (1 minute)
transaction.timeout.ms=300000  # 5 minute transaction timeout

transactional.id

Unique identifier for transactional producer:
transactional.id=order-service-tx-producer

Configuration templates

High-throughput configuration

# Optimize for maximum throughput
batch.size=65536
linger.ms=20
buffer.memory=134217728
compression.type=lz4
max.in.flight.requests.per.connection=5
request.timeout.ms=30000
delivery.timeout.ms=120000
enable.idempotence=true

Low-latency configuration

# Optimize for minimum latency
batch.size=16384
linger.ms=0
buffer.memory=33554432
compression.type=none
max.in.flight.requests.per.connection=1
request.timeout.ms=10000
delivery.timeout.ms=30000

High-reliability configuration

# Optimize for maximum reliability
acks=all
retries=2147483647
enable.idempotence=true
max.in.flight.requests.per.connection=5
delivery.timeout.ms=600000
request.timeout.ms=30000
buffer.memory=67108864
batch.size=32768

Memory-constrained configuration

# Minimize memory usage
buffer.memory=16777216      # 16MB
batch.size=8192             # 8KB
max.request.size=1048576    # 1MB
connections.max.idle.ms=60000  # 1 minute

Network-optimized configuration

# Optimize for slow/unreliable networks
request.timeout.ms=45000
delivery.timeout.ms=300000
retry.backoff.ms=200
reconnect.backoff.ms=200
reconnect.backoff.max.ms=10000
send.buffer.bytes=262144
receive.buffer.bytes=131072

Environment-specific tuning

Development environment

# Fast feedback, loose reliability
batch.size=16384
linger.ms=0
request.timeout.ms=10000
delivery.timeout.ms=30000
retries=3
buffer.memory=33554432

Testing environment

# Balance speed with some reliability
batch.size=32768
linger.ms=5
request.timeout.ms=15000
delivery.timeout.ms=60000
retries=10
enable.idempotence=true

Production environment

# Full reliability and monitoring
acks=all
retries=2147483647
enable.idempotence=true
delivery.timeout.ms=300000
request.timeout.ms=30000
batch.size=32768
linger.ms=10
buffer.memory=67108864
client.id=myapp-producer-${instance.id}  # Placeholder: substitute in your deployment tooling; Kafka does not expand it
metrics.recording.level=INFO
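
When the configuration is built in code rather than loaded from a file, the per-instance client.id can be assembled directly. The sketch below mirrors the production values above using only java.util.Properties; the class name, method name, and the appName and instanceId parameters are hypothetical, and bootstrap.servers and the serializers still have to be added before the result can be passed to a producer.

```java
import java.util.Properties;

/** Builds the production template above as Properties (hypothetical helper). */
final class ProducerConfigTemplates {
    private ProducerConfigTemplates() {}

    static Properties production(String appName, String instanceId) {
        Properties p = new Properties();
        // Reliability settings from the production template.
        p.setProperty("acks", "all");
        p.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        p.setProperty("enable.idempotence", "true");
        p.setProperty("delivery.timeout.ms", "300000");
        p.setProperty("request.timeout.ms", "30000");
        // Batching and memory settings.
        p.setProperty("batch.size", "32768");
        p.setProperty("linger.ms", "10");
        p.setProperty("buffer.memory", "67108864");
        // Identification and metrics: one client.id per running instance.
        p.setProperty("client.id", appName + "-producer-" + instanceId);
        p.setProperty("metrics.recording.level", "INFO");
        return p;
    }
}
```

For example, production("myapp", "01") yields client.id=myapp-producer-01.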

Monitoring key configurations

Essential metrics to track

Buffer utilization:
  • buffer-available-bytes: Available buffer space
  • buffer-exhausted-rate: Rate of buffer exhaustion events
Request performance:
  • request-latency-avg: Average request latency
  • request-rate: Requests per second
  • response-rate: Responses per second
Connection health:
  • connection-count: Number of active connections
  • connection-creation-rate: Rate of new connections
  • connection-close-rate: Rate of closed connections
Error rates:
  • record-error-rate: Rate of failed records
  • record-retry-rate: Rate of retried records

JMX monitoring configuration

# Enable JMX for monitoring
# Add to JVM arguments:
# -Dcom.sun.management.jmxremote
# -Dcom.sun.management.jmxremote.port=9999
# Development only: the next two flags disable authentication and TLS
# -Dcom.sun.management.jmxremote.authenticate=false
# -Dcom.sun.management.jmxremote.ssl=false
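
Inside the producer's own JVM, the same metrics can be read without a remote connection. The sketch below uses the standard javax.management API to find producer MBeans registered under kafka.producer:type=producer-metrics and read a numeric attribute such as buffer-available-bytes. The class and method names are hypothetical, and the lookup returns an empty set when no producer is running in the process.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** In-process lookup of producer metric MBeans (hypothetical helper). */
final class ProducerJmxProbe {
    private ProducerJmxProbe() {}

    /** Finds every MBean of type producer-metrics, one per client.id. */
    static Set<ObjectName> producerMetricBeans() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // The trailing ",*" matches any remaining key properties, such as client-id.
            ObjectName pattern = new ObjectName("kafka.producer:type=producer-metrics,*");
            return server.queryNames(pattern, null);
        } catch (JMException e) {
            throw new IllegalStateException("invalid MBean name pattern", e);
        }
    }

    /** Reads one numeric attribute, for example "buffer-available-bytes". */
    static double readDouble(ObjectName bean, String attribute) {
        try {
            Object value = ManagementFactory.getPlatformMBeanServer().getAttribute(bean, attribute);
            return ((Number) value).doubleValue();
        } catch (JMException e) {
            throw new IllegalStateException("cannot read " + attribute + " from " + bean, e);
        }
    }
}
```

A periodic health check could loop over producerMetricBeans() and log readDouble(bean, "buffer-available-bytes") for each one.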

Common configuration mistakes

Pitfalls to avoid

  1. Insufficient buffer memory: Causes frequent blocking
  2. Too high linger.ms: Increases latency unnecessarily
  3. Mismatched timeouts: delivery.timeout.ms too small for retry configuration
  4. Ignoring idempotence: Missing duplicate prevention in production
  5. No client.id: Makes troubleshooting difficult
  6. Wrong max.in.flight.requests: Affects ordering guarantees

Configuration validation

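import java.util.Properties;

// Validate configuration relationships before creating the producer
Properties props = new Properties();
props.put("delivery.timeout.ms", 120000);
props.put("request.timeout.ms", 30000);
props.put("linger.ms", 10);
props.put("retries", Integer.MAX_VALUE);
props.put("retry.backoff.ms", 100);

int deliveryTimeout = (Integer) props.get("delivery.timeout.ms");
int requestTimeout = (Integer) props.get("request.timeout.ms");
int lingerMs = (Integer) props.get("linger.ms");
int retryBackoff = (Integer) props.get("retry.backoff.ms");

// Hard rule, enforced by the producer: delivery.timeout.ms >= linger.ms + request.timeout.ms
// (an explicit check is used because Java assert statements are disabled by default)
if (deliveryTimeout < lingerMs + requestTimeout) {
    throw new IllegalArgumentException("delivery.timeout.ms must be >= linger.ms + request.timeout.ms");
}
// Soft rule, a heuristic not enforced by Kafka: leave room for at least one retry
if (deliveryTimeout < lingerMs + 2 * requestTimeout + retryBackoff) {
    System.err.println("Warning: delivery.timeout.ms leaves no room for a retry");
}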

Troubleshooting configurations

High latency issues

# Reduce batching delay
linger.ms=0
# Fail slow requests sooner (does not speed up healthy requests)
request.timeout.ms=10000
# Ensure no compression delay
compression.type=none
# Reduce in-flight requests (lowers throughput; helps only when requests queue behind one another)
max.in.flight.requests.per.connection=1

Memory pressure

# Reduce buffer size
buffer.memory=16777216
# Smaller batches
batch.size=8192
# Shorter connection idle time
connections.max.idle.ms=60000
# Aggressive blocking timeout
max.block.ms=5000

Connection issues

# Longer connection timeouts
socket.connection.setup.timeout.ms=30000
# Longer reconnect backoff
reconnect.backoff.max.ms=10000
# Longer retry backoff, to avoid hammering a recovering broker
retry.backoff.ms=500
# Close idle connections sooner than the 9-minute default
connections.max.idle.ms=120000
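
Configuration compatibility

Some configurations are interdependent. For example, enable.idempotence=true requires retries greater than 0, max.in.flight.requests.per.connection of at most 5, and acks=all. The defaults in Kafka 3.0+ already satisfy all three, but an explicit conflicting value causes a configuration error.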

Start simple

Begin with default configurations and adjust one parameter at a time based on monitoring data. Most applications work well with minimal configuration changes.