How to Make Multiple Consumers Read Data from a Kafka Topic
In this tutorial, we will learn how to run multiple Kafka consumers that read data from a single Kafka topic.
Before doing so, it is necessary to understand the concepts of consumers and consumer groups.
Applications that read and process data from Kafka topics are called consumers. A consumer application subscribes to one or more Kafka topics in order to receive the messages published to them.
Kafka Consumer Group
A consumer group is a set of consumers that share the same group ID (the group.id configuration) and cooperate to consume the topics they subscribe to.
We can configure consumers either with the same consumer group or with different consumer groups.
When you use the same consumer group for all consumers, the partitions of a topic are divided among the consumers of that group, and no two consumers in the group are assigned the same partition. This ensures that no two consumers in the group receive the same data.
When each consumer uses a different consumer group, every consumer is assigned all the partitions of the topic, so each consumer receives the same data.
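The difference between the two configurations can be illustrated with a small simulation in plain Python. This is only a sketch and does not talk to Kafka: the group names, the consumer names, and the round-robin split are assumptions made for illustration, and a real cluster decides the split through its configured assignment strategy.

```python
def assign(partitions, members):
    """Spread the partitions over the members of one group, round-robin."""
    assignment = {member: [] for member in members}
    for index, partition in enumerate(partitions):
        assignment[members[index % len(members)]].append(partition)
    return assignment


def partitions_per_consumer(partitions, groups):
    """groups maps a (hypothetical) group id to its list of consumer names.

    Every group is assigned the full set of partitions independently
    of the other groups.
    """
    result = {}
    for members in groups.values():
        result.update(assign(partitions, members))
    return result


partitions = [0, 1, 2, 3]

# Two consumers in the same group: the partitions are split between them.
same_group = partitions_per_consumer(partitions, {"group-a": ["c1", "c2"]})

# Two consumers in different groups: each one gets every partition.
different_groups = partitions_per_consumer(
    partitions, {"group-a": ["c1"], "group-b": ["c2"]}
)

print(same_group)        # {'c1': [0, 2], 'c2': [1, 3]}
print(different_groups)  # {'c1': [0, 1, 2, 3], 'c2': [0, 1, 2, 3]}
```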
Multiple Kafka Consumers
A consumer application may perform several time-consuming tasks, such as reading data from Kafka topics, validating and formatting the data, and storing it in a database. If your application is limited to a single consumer, reading and processing data can become increasingly slow. As a result, the consumer may find it difficult to keep up with the rate at which data is produced by many producers. To solve this problem, we can run multiple consumers configured with the same consumer group.
When multiple consumers are configured with the same consumer group, each consumer receives messages from a different subset of the topic's partitions. This prevents consumers from reading the same data several times. In other words, a message read by one consumer will not be consumed by another consumer in the same consumer group. If consumers are configured with different consumer groups, each consumer reads data from the same Kafka topic independently, so a single message is read multiple times, once by each group.
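As a rough sketch of what such a consumer could look like, the example below uses the third-party kafka-python client. The choice of client, the broker address localhost:9092, the topic name topic-a, and the group names are all assumptions made for illustration and are not prescribed by this tutorial.

```python
def consumer_settings(group_id):
    """Keyword arguments for kafka-python's KafkaConsumer (placeholder values)."""
    return {
        "bootstrap_servers": "localhost:9092",  # assumed broker address
        "group_id": group_id,  # consumers sharing this id split the partitions
        "auto_offset_reset": "earliest",  # start at the oldest message if no offset is committed
    }


def run_consumer(group_id, topic="topic-a"):
    """Read the topic until interrupted and print each record.

    Requires a reachable broker and the kafka-python package.
    """
    from kafka import KafkaConsumer  # third-party: pip install kafka-python

    consumer = KafkaConsumer(topic, **consumer_settings(group_id))
    try:
        for record in consumer:
            print(record.partition, record.offset, record.value)
    finally:
        consumer.close()
```

Calling run_consumer("group-a") from two separate processes should result in each process printing records from a different subset of partitions, while a third process calling run_consumer("group-b") should print records from all partitions.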
Optimum Number of Kafka Consumers
You can run as many consumers as you want. However, within a single consumer group, the number of consumers should be less than or equal to the number of partitions of the topic, because each partition is assigned to only one consumer in the group and any extra consumers receive no data.
Suppose we have a Kafka topic A with five partitions and a single consumer in consumer group A that is subscribed to topic A. This single consumer will receive messages from all five partitions.
If we have two consumers in consumer group A, the five partitions are divided between consumer 1 and consumer 2, for example three partitions for one consumer and two for the other.
If the number of consumers in consumer group A is equal to the number of partitions, each consumer is assigned exactly one partition.
If the number of consumers in consumer group A is greater than the number of partitions, some consumers are not assigned any partition and remain idle.
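The four cases above can be reproduced with a short simulation. It is a minimal sketch that splits the partitions into contiguous ranges; the function name range_assign and the consumer names are hypothetical, and the exact split in a real cluster depends on the partition assignment strategy the consumers are configured with.

```python
def range_assign(num_partitions, consumers):
    """Split partitions 0..num_partitions-1 into contiguous ranges, one per consumer."""
    base, extra = divmod(num_partitions, len(consumers))
    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional partition.
        size = base + (1 if index < extra else 0)
        assignment[consumer] = list(range(start, start + size))
        start += size
    return assignment


# Topic A has five partitions; try 1, 2, 5 and 6 consumers in the group.
for count in (1, 2, 5, 6):
    consumers = [f"consumer-{i}" for i in range(1, count + 1)]
    assignment = range_assign(5, consumers)
    idle = [name for name, parts in assignment.items() if not parts]
    print(count, assignment, "idle:", idle)
```

With six consumers, the last consumer ends up with an empty list, which corresponds to the idle consumer described above.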