Learn about ZooKeeper’s role in Kafka in five minutes
ZooKeeper has been a critical component of Kafka clusters, handling metadata management and cluster coordination. While ZooKeeper is being phased out in favor of KRaft mode, understanding its role is important for managing existing deployments.
What you’ll learn:
- What ZooKeeper does in a Kafka cluster
- Why ZooKeeper is being removed from Kafka
- Best practices for ZooKeeper with current Kafka versions
ZooKeeper is being eliminated from Kafka
- Kafka 0.x, 1.x & 2.x must use ZooKeeper
- Kafka 3.x can work without ZooKeeper (KRaft mode), and KRaft mode is production ready as of 3.3
- Kafka 4.x will not have ZooKeeper
For new deployments, consider using KRaft mode instead.
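The version rules above can be written down as a small lookup. This is only an illustrative sketch: the function name `metadata_options` and the strings it returns are made up for this example and are not part of any Kafka tooling.

```python
def metadata_options(version: str) -> str:
    """Map a Kafka version string such as "3.3" or "2.8.1" to its metadata options.

    Hypothetical helper that encodes only the version rules listed above.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("expected a version like 'major.minor'")
    major, minor = int(parts[0]), int(parts[1])
    if major >= 4:
        return "KRaft only"
    if major == 3:
        if minor >= 3:
            return "ZooKeeper or KRaft (production ready)"
        return "ZooKeeper or KRaft (not yet production ready)"
    return "ZooKeeper only"


for v in ("2.8", "3.1", "3.3", "4.0"):
    print(v, "->", metadata_options(v))
```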
What is ZooKeeper in Kafka and what does it do?
How do the Kafka brokers and clients keep track of all the Kafka brokers if there is more than one? The Kafka team decided to use ZooKeeper for this purpose.
ZooKeeper is used for metadata management in the Kafka world. For example:
- ZooKeeper keeps track of which brokers are part of the Kafka cluster
- Kafka brokers use ZooKeeper to determine which broker is the leader of a given topic partition and to perform leader elections
- ZooKeeper stores configurations for topics and permissions
- ZooKeeper sends notifications to Kafka in case of changes (e.g. a new topic, a broker dying, a broker coming up, a topic being deleted, etc.)
Consumer offsets: ZooKeeper does NOT store consumer offsets with Kafka clients >= v0.10. Offsets are stored in the internal __consumer_offsets topic.
ZooKeeper ensemble
A ZooKeeper cluster is called an ensemble. It is recommended to operate the ensemble with an odd number of servers (e.g., 3, 5, or 7), because a strict majority of ensemble members (a quorum) has to be working for ZooKeeper to respond to requests. One server acts as the leader and handles writes; the rest of the servers are followers and handle reads.
Size your ZooKeeper ensemble
| Servers | Quorum needed | Failures tolerated |
|---|---|---|
| 1 | 1 | 0 |
| 3 | 2 | 1 |
| 5 | 3 | 2 |
| 7 | 4 | 3 |
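The numbers in the table follow from strict-majority arithmetic: the quorum is more than half of the servers, and the ensemble tolerates as many failures as there are servers beyond the quorum. A minimal sketch, where the function name `ensemble_tolerance` is chosen only for illustration:

```python
def ensemble_tolerance(servers: int) -> tuple[int, int]:
    """Return (quorum_needed, failures_tolerated) for an ensemble of this size."""
    if servers < 1:
        raise ValueError("an ensemble needs at least one server")
    quorum = servers // 2 + 1  # strict majority of the ensemble
    return quorum, servers - quorum


for n in (1, 3, 4, 5, 7):
    quorum, tolerated = ensemble_tolerance(n)
    print(f"{n} servers: quorum {quorum}, tolerates {tolerated} failure(s)")
```

Running this for 4 servers gives a quorum of 3 and 1 tolerated failure, the same tolerance as 3 servers, which is why an even-sized ensemble adds cost without adding resilience.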
Should you use ZooKeeper with Kafka brokers?
For Kafka versions prior to 3.3, you have to use ZooKeeper in your production deployments. For Kafka 3.3 and later, KRaft mode is production ready and recommended for new deployments.
Should you use ZooKeeper with Kafka clients?
Over time, the Kafka clients and CLI have been migrated to use the brokers as a connection endpoint instead of ZooKeeper.
This means that:
- Since Kafka 0.10, consumers store offsets in Kafka and should not connect to ZooKeeper, as that option is deprecated
- Since Kafka 2.2, the kafka-topics.sh CLI command references Kafka brokers and not ZooKeeper for topic management (creation, deletion, etc.), and the ZooKeeper CLI argument is deprecated.
- All of the APIs and commands that were previously using ZooKeeper are migrated to use Kafka instead, so that when clusters are migrated to be without ZooKeeper, the change is invisible to clients.
- ZooKeeper is also less secure than Kafka, and therefore ZooKeeper ports should only be opened to allow traffic from Kafka brokers, and not Kafka clients
Never connect clients to ZooKeeper
To be a great modern-day Kafka developer, never use a ZooKeeper connection in the configuration of your Kafka clients or of other programs that connect to Kafka. Always use the bootstrap server configuration pointing to Kafka brokers.
# Correct: Connect to Kafka brokers
bootstrap.servers=broker1:9092,broker2:9092
# Wrong: Never use ZooKeeper connection for clients
# zookeeper.connect=zk1:2181,zk2:2181 # DEPRECATED
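One way to enforce this rule is to check client configuration before it is used. The following is a minimal sketch, assuming the configuration is plain `key=value` text like the snippet above; `check_client_config` is a hypothetical helper, not part of any Kafka client library.

```python
def check_client_config(properties_text: str) -> dict[str, str]:
    """Parse simple key=value properties and reject ZooKeeper-based client settings."""
    config: dict[str, str] = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        config[key.strip()] = value.strip()
    if "zookeeper.connect" in config:
        raise ValueError("clients must not set zookeeper.connect; use bootstrap.servers")
    if not config.get("bootstrap.servers"):
        raise ValueError("bootstrap.servers is required")
    return config


good = "bootstrap.servers=broker1:9092,broker2:9092\n# zookeeper.connect=zk1:2181,zk2:2181"
print(check_client_config(good))
```

A configuration that sets `zookeeper.connect`, or that omits `bootstrap.servers`, raises a `ValueError` instead of being passed on to a client.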
Next steps