Flink kafka producer partitioning scheme


Flink kafka producer partitioning scheme

Vishwas Siravara
Hi guys,
From the Flink docs:
By default, if a custom partitioner is not specified for the Flink Kafka Producer, the producer will use a FlinkFixedPartitioner that maps each Flink Kafka Producer parallel subtask to a single Kafka partition (i.e., all records received by a sink subtask will end up in the same Kafka partition).

Does this mean that if my downstream topic has 40 partitions, I will need 40 parallel subtasks?

Thanks,
Vishwas 

Re: Flink kafka producer partitioning scheme

Eduardo Winpenny Tejedor
Hi Vishwas,

Just because a Kafka topic has 40 partitions doesn't mean all 40 need to have messages in them. It isn't a requirement, but populating all 40 partitions is certainly convenient for load balancing.

What this means is that if you want to use all partitions, you either need 40 sink subtasks (as you've said) OR you can override the partitioner to decide which partition each message should go to.
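The difference between the two options can be sketched in plain Java. This is an illustrative, self-contained sketch, not the real connector API (Flink's actual FlinkKafkaPartitioner.partition() method also receives the record, key, value, and target topic); the round-robin scheme is just one example of a custom mapping you could implement.

```java
// Illustrative sketch of the two partitioning strategies discussed above.
// Class and method names here are hypothetical, not Flink's actual API.
public class PartitionerSketch {

    // Mirrors the documented FlinkFixedPartitioner behavior: each sink
    // subtask always writes to the same partition, chosen by subtask index.
    static int fixedPartition(int subtaskIndex, int[] partitions) {
        return partitions[subtaskIndex % partitions.length];
    }

    // One possible custom scheme: round-robin over all partitions, so even
    // a low-parallelism sink eventually writes to every partition.
    static int roundRobinPartition(long recordCount, int[] partitions) {
        return partitions[(int) (recordCount % partitions.length)];
    }

    public static void main(String[] args) {
        int[] partitions = new int[40];
        for (int i = 0; i < partitions.length; i++) partitions[i] = i;

        // With a sink parallelism of 3, the fixed scheme only ever uses
        // partitions 0, 1, and 2; the other 37 receive nothing.
        for (int subtask = 0; subtask < 3; subtask++) {
            System.out.println("fixed: subtask " + subtask
                    + " -> partition " + fixedPartition(subtask, partitions));
        }

        // Round-robin from a single subtask cycles through all 40 partitions.
        for (long record = 0; record < 3; record++) {
            System.out.println("round-robin: record " + record
                    + " -> partition " + roundRobinPartition(record, partitions));
        }
    }
}
```

In real Flink code, the custom mapping would live in a FlinkKafkaPartitioner subclass passed to the FlinkKafkaProducer constructor; if you supply no partitioner at all (rather than the default fixed one), records are partitioned by Kafka itself based on the record key.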

Regards,
Eduardo

On Fri, 13 Sep 2019, 17:29 Vishwas Siravara, <[hidden email]> wrote: