Using numberOfTaskSlots to control parallelism

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Using numberOfTaskSlots to control parallelism

Zach Cox
What would the differences be between these scenarios?

1) one task manager with numberOfTaskSlots=1 and one job with parallelism=1

2) one task manager with numberOfTaskSlots=10 and one job with parallelism=10

In both cases all of the job's tasks get executed within the one task manager's jvm. Are there any downsides to doing #2 instead of #1?

I ask this question because of current issues related to watermarks with Kafka sources [1] [2] and changing parallelism with savepoints [3]. I'm writing a Flink job that processes events from Kafka topics that have 12 partitions. I'm wondering if I should just set the job parallelism=12 and make numberOfTaskSlots sum to 12 across however many task managers I set up. It seems like watermarks would work properly then, and I could effectively change job parallelism using the number of task managers (e.g. 1 TM with slots=12, or 2 TMs with slots=6, or 12 TMs with slots=1, etc).

Am I missing any important details that would make this a bad idea? It seems like a bit of abuse of numberOfTaskSlots, but also seems like a fairly simple solution to a few current issues.

Thanks,
Zach

Reply | Threaded
Open this post in threaded view
|

Re: Using numberOfTaskSlots to control parallelism

Aljoscha Krettek
IMHO the only change for 2) is that you possibly get better machine utilization because it will use more parallel threads.  So I think it’s a valid approach.

@Ufuk, could there be problems with the number of network buffers? I think not, because the connections are multiplexed in one channel, is this correct?

I’ll also talk with the others so see if we can resolve the watermark/kafka partition issues before the 1.0 release.

> On 20 Feb 2016, at 02:14, Zach Cox <[hidden email]> wrote:
>
> What would the differences be between these scenarios?
>
> 1) one task manager with numberOfTaskSlots=1 and one job with parallelism=1
>
> 2) one task manager with numberOfTaskSlots=10 and one job with parallelism=10
>
> In both cases all of the job's tasks get executed within the one task manager's jvm. Are there any downsides to doing #2 instead of #1?
>
> I ask this question because of current issues related to watermarks with Kafka sources [1] [2] and changing parallelism with savepoints [3]. I'm writing a Flink job that processes events from Kafka topics that have 12 partitions. I'm wondering if I should just set the job parallelism=12 and make numberOfTaskSlots sum to 12 across however many task managers I set up. It seems like watermarks would work properly then, and I could effectively change job parallelism using the number of task managers (e.g. 1 TM with slots=12, or 2 TMs with slots=6, or 12 TMs with slots=1, etc).
>
> Am I missing any important details that would make this a bad idea? It seems like a bit of abuse of numberOfTaskSlots, but also seems like a fairly simple solution to a few current issues.
>
> Thanks,
> Zach
>
> [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kafka-partition-alignment-for-event-time-tt4782.html
> [2] https://issues.apache.org/jira/browse/FLINK-3375
> [3] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Changing-parallelism-tt4967.html
>

Reply | Threaded
Open this post in threaded view
|

Re: Using numberOfTaskSlots to control parallelism

Ufuk Celebi
On Sat, Feb 20, 2016 at 10:12 AM, Aljoscha Krettek <[hidden email]> wrote:
> IMHO the only change for 2) is that you possibly get better machine utilization because it will use more parallel threads.  So I think it’s a valid approach.
>
> @Ufuk, could there be problems with the number of network buffers? I think not, because the connections are multiplexed in one channel, is this correct?

I would not expect it to become a problem. If it does, it's easy to
resolve by throwing a little more memory at the problem. [1]

– Ufuk

[1] https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers
Reply | Threaded
Open this post in threaded view
|

Re: Using numberOfTaskSlots to control parallelism

Zach Cox
Thanks for the input Aljoscha and Ufuk! I will try out the #2 approach and report back.

Thanks,
Zach


On Sat, Feb 20, 2016 at 7:26 AM Ufuk Celebi <[hidden email]> wrote:
On Sat, Feb 20, 2016 at 10:12 AM, Aljoscha Krettek <[hidden email]> wrote:
> IMHO the only change for 2) is that you possibly get better machine utilization because it will use more parallel threads.  So I think it’s a valid approach.
>
> @Ufuk, could there be problems with the number of network buffers? I think not, because the connections are multiplexed in one channel, is this correct?

I would not expect it to become a problem. If it does, it's easy to
resolve by throwing a little more memory at the problem. [1]

– Ufuk

[1] https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers