Kafka partitions -> task slots? (keyed stream)


Moiz Jinia
For a keyed stream (where the key is also the message key in the source Kafka topic), is the parallelism of the job restricted to the number of partitions in the topic?

The source topic has 5 partitions, but 12 task slots are available (3 task managers with 4 slots each).

Moiz

Re: Kafka partitions -> task slots? (keyed stream)

Stefan Richter
Hi,

It does not restrict the parallelism of your job. Only the sources are affected: increasing the parallelism of your job's sources beyond 5 will not bring any improvement, because there are only 5 partitions to read from. All other operators can still benefit from a higher parallelism.
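
To make the distinction concrete, here is a minimal plain-Java sketch (no Flink dependency) of why extra source subtasks sit idle while keyed operators can still spread over all 12 slots. It is a simplified model under stated assumptions: the round-robin partition assignment and the `hashCode`-modulo routing are stand-ins chosen for illustration, not Flink's actual assignment or key-group logic, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SourceParallelismSketch {

    // Assumption: partition p is read by source subtask (p % sourceParallelism).
    // The real Kafka consumer has its own assignment scheme; the only property
    // this sketch relies on is that each partition has exactly one reader.
    static List<List<Integer>> assignPartitions(int numPartitions, int sourceParallelism) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int i = 0; i < sourceParallelism; i++) {
            assignment.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(p % sourceParallelism).add(p);
        }
        return assignment;
    }

    // Counts source subtasks that end up with no partition to read.
    static int idleSourceSubtasks(int numPartitions, int sourceParallelism) {
        int idle = 0;
        for (List<Integer> partitions : assignPartitions(numPartitions, sourceParallelism)) {
            if (partitions.isEmpty()) {
                idle++;
            }
        }
        return idle;
    }

    // Simplified stand-in for keyed routing after the source: the target subtask
    // depends only on the key, not on the partition the record came from.
    static int keyedSubtask(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        int partitions = 5;
        int slots = 12;

        System.out.println("Idle source subtasks at parallelism " + slots + ": "
                + idleSourceSubtasks(partitions, slots));
        System.out.println("Idle source subtasks at parallelism " + partitions + ": "
                + idleSourceSubtasks(partitions, partitions));

        // Keyed operators downstream can use every slot, because records are
        // redistributed by key after the source.
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            System.out.println(key + " -> keyed subtask " + keyedSubtask(key, slots));
        }
    }
}
```

In this model, a source parallelism of 12 leaves 7 of the 12 source subtasks without a partition, while the keyed step can still address any of the 12 subtasks.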

> On 30.05.2017 at 09:49, Moiz S Jinia <[hidden email]> wrote:
>
> For a keyed stream (where the key is also the message key in the source Kafka topic), is the parallelism of the job restricted to the number of partitions in the topic?
>
> The source topic has 5 partitions, but 12 task slots are available (3 task managers with 4 slots each).
>
> Moiz

Re: Kafka partitions -> task slots? (keyed stream)

Moiz Jinia
I have just 1 job (that has a ProcessFunction with timers).

You're saying that giving more task slots to my job than the number of partitions on the source topic is not going to help.

This implies that 1 partition cannot be assigned to more than 1 task slot. That makes sense as otherwise ordering for a partition would not be guaranteed.

Thanks.
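
The ordering argument above can be sketched in plain Java. This is an illustrative model only: it assumes a single reader per partition and uses `hashCode` modulo the parallelism as a stand-in for keyed routing (Flink's real routing goes through key groups), and the class, method, and record names are hypothetical. It shows that when one reader forwards a partition's records in order, every record of a given key arrives at the same downstream subtask in its original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerKeyOrderingSketch {

    // Simplified stand-in for keyed routing: the same key always maps
    // to the same downstream subtask index.
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Forwards one partition's records, in partition order, to downstream subtasks.
    // Each record has the form "key:sequenceNumber".
    // Returns, per subtask index, the records in the order they arrived.
    static Map<Integer, List<String>> route(List<String> partitionRecords, int parallelism) {
        Map<Integer, List<String>> received = new HashMap<>();
        for (String record : partitionRecords) {
            String key = record.substring(0, record.indexOf(':'));
            received
                .computeIfAbsent(subtaskFor(key, parallelism), i -> new ArrayList<>())
                .add(record);
        }
        return received;
    }

    public static void main(String[] args) {
        // One partition, read by a single source subtask, with interleaved keys.
        List<String> partition = Arrays.asList("a:1", "b:1", "a:2", "c:1", "b:2", "a:3");

        Map<Integer, List<String>> received = route(partition, 12);
        received.forEach((subtask, records) ->
            System.out.println("subtask " + subtask + " <- " + records));
    }
}
```

If two readers consumed the same partition concurrently, the relative arrival order of records for one key would no longer be defined, which is the reasoning behind the one-reader-per-partition constraint.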

On Tue, May 30, 2017 at 8:43 PM, Stefan Richter <[hidden email]> wrote:
Hi,

It does not restrict the parallelism of your job. Only the sources are affected: increasing the parallelism of your job's sources beyond 5 will not bring any improvement, because there are only 5 partitions to read from. All other operators can still benefit from a higher parallelism.