Hi Sagar,
At the moment the number of partitions in the Kafka source topics and the parallelism of the Flink Kafka source operator are completely independent. Flink internally distributes the partitions among however many source subtasks you configure, and with dynamic partition or topic discovery enabled, partitions found while the job is running are assigned automatically as well.
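For illustration, a rough sketch of a pattern-based source with discovery enabled (I am assuming the universal Kafka connector here; the broker address, group id, discovery interval and the parallelism of 8 are placeholders for your own values):

import java.util.Properties;
import java.util.regex.Pattern;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PatternSourceJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "my-group");                // placeholder group
        // Re-check the pattern periodically so that new topics and new
        // partitions of existing topics are picked up while the job runs.
        props.setProperty("flink.partition-discovery.interval-millis", "30000");

        FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(
                Pattern.compile("my_topic_.*"),
                new SimpleStringSchema(),
                props);

        // The source parallelism is chosen independently of the partition
        // counts; Flink spreads all discovered partitions over these subtasks.
        DataStream<String> stream = env.addSource(consumer).setParallelism(8);

        stream.print();
        env.execute("pattern-source");
    }
}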
The job or source parallelism can be set, for example, to the total number of Kafka partitions over all topics if it is known in advance; to determine it programmatically you can query the counts with the Kafka client.
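Something along these lines, as a rough sketch with the Kafka AdminClient (PartitionCounter and totalPartitions are made-up names just for illustration; listTopics and describeTopics are the actual AdminClient calls):

import java.util.Properties;
import java.util.Set;
import java.util.stream.Collectors;

import org.apache.kafka.clients.admin.AdminClient;

public class PartitionCounter {
    /** Sums the partition counts of all topics matching the given regex. */
    public static int totalPartitions(String bootstrapServers, String topicRegex) throws Exception {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            // List all topics visible to the client, keep the matching ones.
            Set<String> topics = admin.listTopics().names().get().stream()
                    .filter(t -> t.matches(topicRegex))
                    .collect(Collectors.toSet());
            // Describe them and add up the partition counts.
            return admin.describeTopics(topics).all().get().values().stream()
                    .mapToInt(d -> d.partitions().size())
                    .sum();
        }
    }
}

Then you can call env.addSource(consumer).setParallelism(PartitionCounter.totalPartitions("localhost:9092", "my_topic_.*")) when building the job. Note that the count is a snapshot taken at submission time; topics matching the pattern that appear later are still consumed, their partitions are simply distributed over the existing subtasks.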
Cheers,
Andrey
Hi,
We have a use case where we are consuming from over a hundred Kafka topics, each with a different number of partitions.
As per the documentation, to fully parallelize a Kafka topic we need to set setParallelism() equal to the number of Kafka partitions for that topic.
But if we are consuming multiple topics in Flink via a pattern, e.g. my_topic_*, and each topic is configured with a different partition count,
how should we connect all of this together so that we can map Kafka partitions to Flink parallelism correctly and programmatically (so that we don't have to hard-code all the topic names and parallelisms, given that we can access the Kafka topic <-> number-of-partitions mapping in Flink)?