Learn in 10 minutes how idempotent producers prevent duplicate messages
Idempotent producers ensure that messages are not duplicated when the producer retries a send, so each message is written exactly once per partition within a producer session. This is essential for building reliable data pipelines where duplicates cause problems.
What you’ll learn:
- How idempotent producers prevent duplicates
- The mechanisms Kafka uses for deduplication
- Configuration requirements and best practices
- Limitations to be aware of
What is producer idempotency?
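Producer idempotency means that when the producer retries a send, for example after a lost acknowledgement or a broker failure, exactly one copy of the message is written to the topic partition. It covers retries made by the producer itself: if the application calls send() twice with the same payload, Kafka treats these as two distinct messages.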
Enable idempotent producers
Idempotent producers are enabled by default in Kafka 3.0+. For older versions, enable them explicitly by setting enable.idempotence=true.
When idempotency is enabled, the producer runs with these settings:
retries=Integer.MAX_VALUE
max.in.flight.requests.per.connection=5
acks=all
Default in Kafka 3.0+: Idempotent producers are enabled by default in Kafka 3.0 and later versions. This provides better out-of-the-box reliability without requiring explicit configuration.
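As an illustration of enabling idempotence explicitly from application code, the following minimal sketch builds the settings with plain java.util.Properties. The class name, the broker address, and the choice of String serializers are assumptions made for the example, not part of the original guide.

```java
import java.util.Properties;

// Minimal sketch: builds producer settings with idempotence enabled explicitly.
// The broker address and the String serializers are placeholder assumptions.
class IdempotentProducerConfig {

    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Needed explicitly on clients older than 3.0; harmless on newer ones.
        props.setProperty("enable.idempotence", "true");
        // Idempotence requires acknowledgement from all in-sync replicas.
        props.setProperty("acks", "all");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = build("localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With the kafka-clients library on the classpath, the resulting Properties object can be passed to the KafkaProducer constructor.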
How Kafka achieves idempotency
Kafka uses two key mechanisms to ensure idempotency:
1. Producer ID (PID)
Each producer instance gets a unique Producer ID from the broker:
- Assigned when producer starts up
- Valid for the lifetime of the producer session
- Used to track message sequences
2. Sequence numbers
Each message gets a sequence number per topic-partition:
- Starts at 0 for each producer-topic-partition combination
- Incremented for each message sent
- Used by broker to detect duplicates
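The producer-side bookkeeping described above can be pictured with a small model. This is a simplified sketch, not the Kafka client's actual implementation: the SequenceAssigner class and its methods are hypothetical names, and the real client assigns sequences to batches internally.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of producer-side bookkeeping; not the Kafka client's code.
class SequenceAssigner {
    private final long producerId;
    // Next sequence number per "topic-partition" key, for example "orders-0".
    private final Map<String, Integer> nextSequence = new HashMap<>();

    SequenceAssigner(long producerId) {
        this.producerId = producerId;
    }

    long producerId() {
        return producerId;
    }

    // Returns the sequence for the next message and advances the counter.
    int next(String topic, int partition) {
        String key = topic + "-" + partition;
        int seq = nextSequence.getOrDefault(key, 0);
        nextSequence.put(key, seq + 1);
        return seq;
    }

    public static void main(String[] args) {
        SequenceAssigner assigner = new SequenceAssigner(123L);
        System.out.println(assigner.next("orders", 0)); // 0
        System.out.println(assigner.next("orders", 0)); // 1
        System.out.println(assigner.next("orders", 1)); // 0 (separate partition)
    }
}
```

Each topic-partition has its own counter, which is why a sequence number only has meaning together with the producer ID and the partition.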
How deduplication works
When a broker receives a message, it checks:
| Scenario | Action |
|---|---|
| Expected sequence | Message is written normally |
| Duplicate sequence | Message is discarded, success response sent |
| Out-of-order sequence | OutOfOrderSequenceException thrown |
Broker state: Producer 123, Partition 0, Last sequence: 42
Incoming message: Sequence 43 ✅ Accept
Incoming message: Sequence 42 ⚠️ Duplicate (ignore)
Incoming message: Sequence 45 ❌ Out of order (reject)
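The check in the table and trace above can be sketched as a small state machine. This is a toy model under simplifying assumptions: the BrokerSequenceCheck class is hypothetical, it tracks single messages rather than batches, and it ignores producer epochs, which a real broker also considers.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the broker-side check; real brokers track batches
// and producer epochs as well.
class BrokerSequenceCheck {
    enum Result { ACCEPT, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence per "producerId:partition" key.
    private final Map<String, Integer> lastSequence = new HashMap<>();

    Result check(long producerId, int partition, int sequence) {
        String key = producerId + ":" + partition;
        int last = lastSequence.getOrDefault(key, -1);
        if (sequence == last + 1) {
            lastSequence.put(key, sequence); // expected next sequence: write it
            return Result.ACCEPT;
        }
        if (sequence <= last) {
            return Result.DUPLICATE; // already written: acknowledge, do not append
        }
        return Result.OUT_OF_ORDER; // gap detected: reject
    }

    public static void main(String[] args) {
        BrokerSequenceCheck broker = new BrokerSequenceCheck();
        // Bring the state to "Producer 123, Partition 0, Last sequence: 42".
        for (int seq = 0; seq <= 42; seq++) {
            broker.check(123L, 0, seq);
        }
        System.out.println(broker.check(123L, 0, 43)); // ACCEPT
        System.out.println(broker.check(123L, 0, 42)); // DUPLICATE
        System.out.println(broker.check(123L, 0, 45)); // OUT_OF_ORDER
    }
}
```

A duplicate is acknowledged as a success so that the retrying producer stops resending, while a gap is rejected because accepting it could hide a lost message.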
Configuration requirements
Required settings
enable.idempotence=true
# The next three values are applied automatically when idempotence is enabled
acks=all
# 2147483647 is Integer.MAX_VALUE
retries=2147483647
# 5 is the maximum allowed with idempotence
max.in.flight.requests.per.connection=5
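To catch conflicting overrides before a deployment, a pre-flight check along these lines may help. It is a hypothetical helper written for illustration (the Kafka client performs its own validation), and it assumes the Kafka 3.0+ defaults for any setting that is absent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check; the Kafka client performs its own validation.
class IdempotenceConfigCheck {

    // Returns a list of settings that conflict with idempotence (empty if none).
    static List<String> violations(Properties props) {
        List<String> problems = new ArrayList<>();
        // Assumes the Kafka 3.0+ default (true) when the key is absent.
        if (!"true".equals(props.getProperty("enable.idempotence", "true"))) {
            return problems; // idempotence is off: nothing to check
        }
        String acks = props.getProperty("acks", "all");
        if (!acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be all (or -1), found " + acks);
        }
        int retries = Integer.parseInt(
                props.getProperty("retries", String.valueOf(Integer.MAX_VALUE)));
        if (retries < 1) {
            problems.add("retries must be greater than 0");
        }
        int inFlight = Integer.parseInt(
                props.getProperty("max.in.flight.requests.per.connection", "5"));
        if (inFlight > 5) {
            problems.add("max.in.flight.requests.per.connection must be at most 5");
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "1");
        violations(props).forEach(System.out::println);
    }
}
```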
Recommended production configuration
# Complete idempotent producer configuration
enable.idempotence=true
acks=all
retries=2147483647
max.in.flight.requests.per.connection=5
delivery.timeout.ms=120000
compression.type=snappy
batch.size=32768
linger.ms=5
| Configuration | Throughput | Latency | Duplicates |
|---|---|---|---|
| No idempotency, retries=0 | Highest | Lowest | None (data loss possible) |
| No idempotency, retries>0 | High | Medium | Possible |
| Idempotent producer | Medium-High | Medium | None |
Trade-offs
Benefits:
- Exactly-once writes per partition within a producer session
- Simplified error handling
- Better reliability
Costs:
- Memory overhead on broker for sequence state
- Slightly higher latency for sequence checking
- Max 5 in-flight requests per connection
Error handling
Retriable errors (automatic)
Idempotent producers automatically retry:
- TimeoutException
- RetriableException
- Network connectivity issues
- Broker leadership changes
Non-retriable errors (require handling)
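import java.util.concurrent.ExecutionException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.UnknownProducerIdException;

try {
    producer.send(record).get();
} catch (ExecutionException e) {
    // get() wraps the send failure; the underlying error is the cause
    Throwable cause = e.getCause();
    if (cause instanceof OutOfOrderSequenceException
            || cause instanceof UnknownProducerIdException) {
        // Sequence state is broken or the producer ID expired:
        // close this producer, then create a new one
        producer.close();
    } else {
        throw new RuntimeException(cause);
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}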
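One detail worth checking when adapting a handler like the one above: Future.get() reports an asynchronous failure by throwing ExecutionException, with the original error available as its cause. The following stdlib-only sketch demonstrates that behavior. FatalSendError is a hypothetical stand-in for a Kafka exception, and FutureErrorDemo is an illustrative name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Stdlib-only demo; FatalSendError is a hypothetical stand-in for a Kafka exception.
class FutureErrorDemo {
    static class FatalSendError extends RuntimeException {
        FatalSendError(String message) {
            super(message);
        }
    }

    // Blocks on the future and returns the underlying failure, or null on success.
    static Throwable failureOf(Future<?> future) {
        try {
            future.get();
            return null;
        } catch (ExecutionException e) {
            return e.getCause(); // the error raised by the asynchronous operation
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new FatalSendError("sequence out of order"));
        Throwable cause = failureOf(failed);
        System.out.println(cause instanceof FatalSendError); // true
    }
}
```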
Limitations
Producer restarts: If a producer application restarts, it will get a new Producer ID and sequence numbers reset to 0. This means potential duplicates across application restarts, even with idempotency enabled.
| Limitation | Description |
|---|---|
| Session-based | Idempotency only guaranteed within single producer session |
| Partition scope | No deduplication across different partitions |
| Topic scope | No deduplication across different topics |
| Memory | Brokers maintain state per producer-partition |
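The session-based limitation can be made concrete with a toy model of one partition. The RestartDuplicateDemo class below is hypothetical and deliberately simplified (it ignores out-of-order detection). It only shows that deduplication is keyed by producer ID, so a resend under a new ID is not recognized as a duplicate.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a single partition: deduplication is keyed by producer ID,
// so a resend after a restart (new ID) is written again.
class RestartDuplicateDemo {
    private final Map<Long, Integer> lastSequence = new HashMap<>();
    private int storedCopies = 0;

    // Returns true when the message is written; duplicates are dropped.
    boolean append(long producerId, int sequence) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // same producer ID and sequence already seen
        }
        lastSequence.put(producerId, sequence);
        storedCopies++;
        return true;
    }

    int storedCopies() {
        return storedCopies;
    }

    public static void main(String[] args) {
        RestartDuplicateDemo partition = new RestartDuplicateDemo();
        partition.append(1L, 0); // first attempt: written
        partition.append(1L, 0); // retry in the same session: dropped
        partition.append(2L, 0); // resend after restart with a new ID: written again
        System.out.println(partition.storedCopies()); // 2
    }
}
```

If duplicates across restarts matter for a pipeline, they have to be handled outside producer idempotency, for example with application-level deduplication on a unique message key.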
When to use idempotent producers
| Use case | Recommendation |
|---|---|
| Production applications | Always recommended |
| Financial data | Essential |
| Audit logs | Essential |
| Metrics/logs | Recommended |
| Development/testing | Optional |
See it in practice with Conduktor: Conduktor Console lets you monitor topic messages and verify no duplicates are written. Use the message browser to inspect message headers and validate your idempotent producer configuration.