- For high availability and performance, run multiple brokers in different data centers (racks) so that load is distributed and the loss of one location does not take down the cluster. If you are setting up your cluster in AWS, this means spreading brokers across at least three availability zones.
- You also want a cluster of at least three Zookeeper nodes (if you are using Zookeeper; the alternative is KRaft mode).
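The two sizing rules above can be checked against a planned layout before any machine is provisioned. The following is a minimal sketch, not an official Kafka tool: `check_spread`, its parameters, and the sample availability-zone names are hypothetical. The rack label per broker is assumed to be the value you would put in that broker's `broker.rack` setting.

```python
def check_spread(broker_racks, zookeeper_count, min_racks=3, min_zookeepers=3):
    """Return a list of problems found in a planned cluster layout.

    broker_racks maps a broker id to its rack / availability zone label
    (hypothetical input format, chosen for this sketch).
    """
    problems = []

    # Count the distinct racks (or availability zones) the brokers live in.
    racks = set(broker_racks.values())
    if len(racks) < min_racks:
        problems.append(
            f"brokers span {len(racks)} rack(s), want at least {min_racks}"
        )

    # Only relevant when the cluster uses Zookeeper rather than KRaft mode.
    if zookeeper_count < min_zookeepers:
        problems.append(
            f"{zookeeper_count} Zookeeper node(s) planned, want at least {min_zookeepers}"
        )

    return problems


# Example plan: three brokers in three AWS availability zones, three Zookeeper nodes.
planned = {1: "us-east-1a", 2: "us-east-1b", 3: "us-east-1c"}
print(check_spread(planned, zookeeper_count=3))  # prints []
```

An empty list means the plan satisfies both rules. Each string in a non-empty list names one rule that is violated.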
Kafka cluster setup gotchas
A Kafka cluster has many components, for reasons of performance and high availability. Keep the following points in mind when setting one up.
- It’s not easy to set up a cluster. It is a large undertaking that needs dedicated effort, which is why managed Kafka solutions are becoming more and more popular (Amazon MSK, Confluent Cloud, Aiven, CloudKarafka, Instaclustr, Upstash, etc.).
- You want to isolate each Zookeeper node and each broker on its own server. It is not safe to host multiple Kafka components on the same machine: they compete for resources, and an error on that one machine can bring down several components at once.
- Monitoring needs to be implemented. As with any distributed system, a good monitoring setup is essential for understanding how the whole system is behaving.
- Operations have to be mastered. Several operations must be performed on a Kafka cluster while it is running, for example upgrades and backups.
- You need a really good Kafka Admin. This role is essential: the cluster needs a dedicated Kafka administrator to manage it.
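The isolation point in the list above lends itself to a quick mechanical check of a deployment plan. This is a small sketch under stated assumptions: `colocated_hosts`, the `(host, component)` pair format, and the host and component names are all hypothetical and are not part of any Kafka tooling.

```python
def colocated_hosts(placements):
    """Return the hosts that are planned to run more than one Kafka component.

    placements is an iterable of (host, component) pairs
    (hypothetical input format, chosen for this sketch).
    """
    by_host = {}
    for host, component in placements:
        by_host.setdefault(host, []).append(component)

    # Keep only the hosts that carry two or more components.
    return {host: comps for host, comps in by_host.items() if len(comps) > 1}


# Example plan in which host-b wrongly carries both a broker and a Zookeeper node.
placements = [
    ("host-a", "zookeeper-1"),
    ("host-b", "broker-1"),
    ("host-b", "zookeeper-2"),
    ("host-c", "broker-2"),
]
print(colocated_hosts(placements))  # prints {'host-b': ['broker-1', 'zookeeper-2']}
```

An empty result means every Zookeeper node and broker has a server to itself.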