How replica reading works
By default, Kafka consumers always read from the partition leader. However, starting with Kafka 2.4, consumers can be configured to read from follower replicas that are “close” to the consumer.
Default behavior (leader-only reads)
- All consumers read from partition leaders
- Followers only replicate data, never serve reads
- Simple and consistent behavior
- May result in cross-datacenter traffic
Closest replica reading
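- Consumers can read from the closest replica, where "closest" means a replica whose rack identifier matches the consumer's (for example, the same datacenter or availability zone)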
- Reduces network latency and cross-datacenter bandwidth
- Requires proper rack awareness configuration
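- Preserves ordering and committed-read guarantees: followers only serve records up to the high watermark, although they may trail the leader slightly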
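The selection behavior described above can be sketched as a small model. This is a simplified illustration of rack-aware selection, not Kafka's implementation: the `Replica` class, the `select_read_replica` function, and the rack names are hypothetical, and the model assumes the list contains only in-sync replicas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Replica:
    broker_id: int
    rack: Optional[str]
    log_end_offset: int
    is_leader: bool = False


def select_read_replica(replicas: list[Replica], client_rack: Optional[str]) -> Replica:
    """Simplified model of rack-aware selection among in-sync replicas."""
    leader = next(r for r in replicas if r.is_leader)
    if not client_rack:
        return leader  # no rack configured on the consumer: leader-only reads
    same_rack = [r for r in replicas if r.rack == client_rack]
    if not same_rack:
        return leader  # no replica in the consumer's rack: fall back to the leader
    if leader in same_rack:
        return leader  # the leader itself is local
    # Otherwise prefer the most caught-up follower in the consumer's rack.
    return max(same_rack, key=lambda r: r.log_end_offset)


replicas = [
    Replica(1, "dc-a", 1000, is_leader=True),
    Replica(2, "dc-b", 990),
    Replica(3, "dc-b", 998),
]
chosen = select_read_replica(replicas, "dc-b")
```

In this model a consumer in `dc-b` is served by broker 3, the more caught-up of the two local followers, while a consumer with no rack or an unknown rack falls back to the leader.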
Configuration
Enable closest replica reading
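On the consumer side, closest replica reading is enabled by setting `client.rack` to the consumer's location. A minimal sketch follows, assuming a client that accepts Kafka-style property names as a dictionary; the helper name, host name, group id, and rack value are placeholders.

```python
from typing import Optional


def consumer_config(bootstrap_servers: str, group_id: str,
                    client_rack: Optional[str] = None) -> dict:
    """Build consumer settings; client.rack opts the consumer into follower reads."""
    config = {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
    }
    if client_rack:
        # Must equal the broker.rack value used by brokers in the same location.
        config["client.rack"] = client_rack
    return config


config = consumer_config("kafka-1.example.internal:9092", "orders-reader", "dc-a")
```

Leaving `client_rack` unset keeps the default leader-only behavior.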
Broker configuration for rack awareness
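Each broker declares its own location with `broker.rack` and enables follower reads with `replica.selector.class`. The sketch below renders those two `server.properties` lines; the helper name and the rack value are illustrative assumptions.

```python
RACK_AWARE_SELECTOR = "org.apache.kafka.common.replica.RackAwareReplicaSelector"


def broker_rack_properties(rack: str) -> str:
    """Render the server.properties lines that enable rack-aware follower reads."""
    lines = [
        f"broker.rack={rack}",
        f"replica.selector.class={RACK_AWARE_SELECTOR}",
    ]
    return "\n".join(lines)


print(broker_rack_properties("dc-a"))
```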
Topic configuration
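When `broker.rack` is set, Kafka's default assignment already tries to spread replicas across racks. If you prefer to control placement explicitly, the following sketch builds a manual assignment string in the `partition0_replicas,partition1_replicas` form (broker ids joined by `:`) accepted by the `--replica-assignment` option of `kafka-topics.sh`. The function name, broker ids, and rack names are hypothetical.

```python
def rack_spread_assignment(brokers_by_rack: dict[str, list[int]], partitions: int) -> str:
    """Pick one broker per rack for every partition, rotating the leader rack."""
    racks = sorted(brokers_by_rack)
    per_partition = []
    for p in range(partitions):
        # Rotate the rack order so the first replica (preferred leader) moves between racks.
        shift = p % len(racks)
        order = racks[shift:] + racks[:shift]
        replicas = [brokers_by_rack[r][p % len(brokers_by_rack[r])] for r in order]
        per_partition.append(":".join(str(b) for b in replicas))
    return ",".join(per_partition)


assignment = rack_spread_assignment({"dc-a": [1, 2], "dc-b": [3, 4]}, partitions=3)
```

Every partition ends up with exactly one replica in each rack, so a consumer in either rack has a local copy to read from.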
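When creating topics, consider replica placement: every rack that hosts consumers needs at least one replica of each partition those consumers read.
Benefits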
Reduced latency
- Consumers read from local replicas instead of remote leaders
- Avoids cross-datacenter read traffic whenever an in-sync replica exists in the consumer's rack
- Lower response times for geographically distributed applications
Cost savings
- Reduces expensive cross-region data transfer
- Particularly beneficial in cloud environments with data transfer charges
- Optimizes bandwidth usage in WAN scenarios
Improved availability
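- Consumers can keep reading already-replicated records from a local replica when cross-datacenter links are degraded
- Better resilience of the read path in network partition scenarios
- Maintains read availability for replicated data during issues in a remote datacenter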
Consistency considerations
Read-after-write consistency
With closest replica reading, you may encounter scenarios where:
- Producer writes to leader in datacenter A
- Consumer reads from follower in datacenter B
- Replication lag may cause temporary inconsistency
Mitigation strategies
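One possible mitigation, offered here as an assumption rather than a prescribed method, is to hand the acknowledged offset of a write to the reader and have it wait until its position has moved past that offset. In the sketch below, `position` is a hypothetical callable standing in for whatever your client exposes to report the next offset to be read.

```python
import time
from typing import Callable


def wait_until_consumed(position: Callable[[], int], written_offset: int,
                        timeout_s: float = 5.0, interval_s: float = 0.05) -> bool:
    """Return True once the consumer position has moved past written_offset."""
    deadline = time.monotonic() + timeout_s
    while True:
        # A position is the offset of the next record to read, so it must exceed the write.
        if position() > written_offset:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Demo with a fake position source standing in for a real consumer.
fake_positions = iter([40, 41, 43])
caught_up = wait_until_consumed(lambda: next(fake_positions), written_offset=42, interval_s=0.0)
```

A `False` result means the follower did not catch up in time, and the caller can decide whether to retry or fall back to a consumer that reads from the leader.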
Use cases
Multi-region deployments
Availability zone optimization
Monitoring and observability
Key metrics to monitor
- Replica read rates: Track reads from leaders vs followers
- Cross-datacenter traffic: Monitor reduction in inter-region bandwidth
- Consumer lag by replica: Ensure follower replicas are keeping up
- Replication lag: Monitor lag between leader and follower replicas
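Replication lag can be expressed as the gap between the leader's log end offset and each follower's. The sketch below assumes you have already collected log end offsets per broker for one partition; the function names and the threshold are illustrative.

```python
def follower_lag(log_end_offsets: dict[int, int], leader_id: int) -> dict[int, int]:
    """Messages each follower is behind the leader, keyed by broker id."""
    leader_leo = log_end_offsets[leader_id]
    return {
        broker: max(leader_leo - leo, 0)
        for broker, leo in log_end_offsets.items()
        if broker != leader_id
    }


def lagging_followers(log_end_offsets: dict[int, int], leader_id: int,
                      max_lag: int) -> list[int]:
    """Broker ids of followers that are more than max_lag messages behind."""
    lag = follower_lag(log_end_offsets, leader_id)
    return sorted(b for b, behind in lag.items() if behind > max_lag)


lag = follower_lag({1: 1000, 2: 990, 3: 1000}, leader_id=1)
```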
JMX metrics
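The sketch below collects a few MBean names that are relevant to follower reads, as listed in the Apache Kafka monitoring documentation; verify them against the Kafka version you run. The per-partition consumer MBean is where the `preferred-read-replica` attribute is reported (the broker id being read from, or -1 when reading from the leader). The helper names, client id, and topic are hypothetical.

```python
def broker_mbeans() -> dict[str, str]:
    """Broker-side MBeans that indicate whether followers are keeping up."""
    return {
        "under_replicated_partitions":
            "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
        "max_follower_lag":
            "kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica",
    }


def consumer_partition_mbean(client_id: str, topic: str, partition: int) -> str:
    """Per-partition consumer fetch MBean; read its preferred-read-replica attribute."""
    return (
        "kafka.consumer:type=consumer-fetch-manager-metrics,"
        f"client-id={client_id},topic={topic},partition={partition}"
    )


name = consumer_partition_mbean("orders-reader-1", "orders", 0)
```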
Configuration examples
Cloud deployment (AWS)
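In AWS, a common choice is to use the availability zone as the rack. The sketch below assumes the zone ID (for example `use1-az1`) is used rather than the zone name, because zone names such as `us-east-1a` can map to different physical zones in different accounts. How the zone ID is obtained is left out, and the function name is hypothetical.

```python
def aws_rack_settings(az_id: str) -> dict[str, dict[str, str]]:
    """Use the availability zone ID as the rack for brokers and consumers in that zone."""
    return {
        "broker": {
            "broker.rack": az_id,
            "replica.selector.class":
                "org.apache.kafka.common.replica.RackAwareReplicaSelector",
        },
        # Consumers in the same zone must use the identical string.
        "consumer": {"client.rack": az_id},
    }


settings = aws_rack_settings("use1-az1")
```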
On-premises multi-datacenter
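On premises, the rack is typically a datacenter name. Because rack values are compared as exact strings, a mismatch in spelling or case silently sends a consumer back to the leader. The following sketch checks a hypothetical inventory of broker and consumer rack values; all names are placeholders.

```python
def unmatched_consumer_racks(broker_racks: dict[int, str],
                             consumer_racks: dict[str, str]) -> dict[str, str]:
    """Return consumers whose client.rack matches no broker.rack (exact string match)."""
    known = set(broker_racks.values())
    return {name: rack for name, rack in consumer_racks.items() if rack not in known}


brokers = {1: "dc1", 2: "dc1", 3: "dc2", 4: "dc2"}
consumers = {"billing-dc1": "dc1", "reports-dc2": "DC2"}
problems = unmatched_consumer_racks(brokers, consumers)
```

Here `reports-dc2` is flagged because `DC2` does not equal `dc2`.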
Best practices
Deployment guidelines
- Configure rack awareness on all brokers and consumers
- Ensure sufficient replicas in each rack/region
- Monitor replication lag to avoid stale reads
- Test failover scenarios to ensure proper behavior
Performance optimization
Consistency requirements
- Strong consistency needs: Consider leader-only reads
- Eventual consistency acceptable: Closest replica reading works well
- Mixed requirements: Use different consumer groups with different configurations
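For mixed requirements, the difference between the two kinds of consumer group can be as small as whether `client.rack` is set. A minimal sketch, with hypothetical group names and rack value:

```python
def group_config(group_id: str, local_rack: str, needs_leader_reads: bool) -> dict:
    """Set client.rack only for groups that tolerate slightly stale reads."""
    config = {"group.id": group_id}
    if not needs_leader_reads:
        config["client.rack"] = local_rack
    return config


payments = group_config("payments", "dc-b", needs_leader_reads=True)
analytics = group_config("analytics", "dc-b", needs_leader_reads=False)
```

The `payments` group keeps reading from leaders, while `analytics` reads from replicas in `dc-b` when one is available.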
Replication lag impact
When reading from follower replicas, consumers may see slightly stale data due to replication lag. Ensure your application can tolerate this eventual consistency model.
Gradual rollout
Consider implementing closest replica reading gradually:
- Start with non-critical consumer groups
- Monitor metrics and consistency behavior
- Expand to more critical workloads as confidence builds
Troubleshooting
Common issues
- Missing rack configuration: Ensure brokers set broker.rack and consumers set client.rack, and that the values match exactly
- Insufficient replicas: Need replicas in consumer’s rack for local reads
- High replication lag: Follower replicas falling behind leader
- Network configuration: Ensure proper connectivity between racks/regions
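The first three issues above can be checked mechanically once the relevant settings have been gathered. The sketch below is a hypothetical diagnostic over data you collect yourself (replica broker ids for one partition, each broker's rack, and per-broker lag); it does not cover network connectivity.

```python
from typing import Optional


def diagnose_local_reads(client_rack: Optional[str], replica_brokers: list[int],
                         broker_racks: dict[int, Optional[str]],
                         lag_by_broker: dict[int, int], max_lag: int) -> list[str]:
    """List reasons why a consumer may not be served by a healthy local replica."""
    findings = []
    if not client_rack:
        findings.append("consumer has no client.rack")
    missing = [b for b in replica_brokers if not broker_racks.get(b)]
    if missing:
        findings.append(f"brokers without broker.rack: {missing}")
    local = [b for b in replica_brokers
             if client_rack and broker_racks.get(b) == client_rack]
    if client_rack and not local:
        findings.append(f"no replica in rack {client_rack}")
    slow = [b for b in local if lag_by_broker.get(b, 0) > max_lag]
    if slow:
        findings.append(f"local replicas lagging: {slow}")
    return findings


findings = diagnose_local_reads("dc-b", [1, 2], {1: "dc-a", 2: "dc-a"}, {}, max_lag=100)
```

An empty result means none of the checked conditions applies to that partition.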