tasks running in parallel beyond configured parallelism/slots

Antony Mayi
Hi,

I am new to Flink and a bit confused about the execution pipeline of my Flink job. I run it on a cluster of three task managers (Flink 1.1.2), each configured with just a single slot. I submit my job with parallelism set to 3.

This is the global plan (low res - just to show the initial forking): http://pasteboard.co/weyMrFlZl.png

This is a detail of the front part: http://pasteboard.co/wez3DVvfW.png

My confusion is: how come all the parallel operations in the second column (10 operations) are being executed at the same time, if there should be capacity for a maximum of 3 running at once? Also, they are mostly executed on the same node while the others are idle.

Thanks for anything useful,
Antony.

Re: tasks running in parallel beyond configured parallelism/slots

Aljoscha Krettek
Hi,
A Flink operator will not always (in fact almost never) get a slot to itself. Usually a whole parallel slice of the pipeline runs in one slot, so the number of slots limits the parallelism, not the number of operators running at once. In your case you get three parallel instances of every operator in your topology, and each slot holds one instance of each operator.
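
To make the counting concrete, here is a small toy model in plain Python. It is not Flink code and not how the Flink scheduler is implemented; it only assumes the simplest case, where subtask i of every operator is placed in slot i. The operator names op1..op10 are hypothetical stand-ins for the 10 operations in the plan.

```python
from collections import defaultdict


def assign_subtasks_to_slots(operators, parallelism, num_slots):
    """Toy model of slot sharing: subtask i of every operator goes to slot i."""
    if parallelism > num_slots:
        raise ValueError("the job needs at least as many slots as its parallelism")
    slots = defaultdict(list)
    for op in operators:
        for subtask in range(parallelism):
            # One parallel instance of each operator ends up in each slot.
            slots[subtask].append(f"{op}[{subtask}]")
    return dict(slots)


# Hypothetical operator names; the real plan has its own operators.
operators = [f"op{n}" for n in range(1, 11)]
slots = assign_subtasks_to_slots(operators, parallelism=3, num_slots=3)

for slot_id, tasks in sorted(slots.items()):
    print(f"slot {slot_id}: {len(tasks)} subtasks")

total = sum(len(tasks) for tasks in slots.values())
print("subtasks running concurrently:", total)
```

In this model each of the 3 slots holds 10 subtasks, so 30 subtasks run at once even though the parallelism is only 3. Where a real job's subtasks actually land depends on the scheduler and on the job's slot sharing settings.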

Cheers,
Aljoscha

On Thu, 9 Feb 2017 at 12:33 Antony Mayi <[hidden email]> wrote:
> [...]