Re: Flink 1.6 Yarn Session behavior

Posted by Tzu-Li (Gordon) Tai on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-1-6-Yarn-Session-behavior-tp26074p26081.html

Hi,

I'm forwarding this question to Gary (CC'ed), who most likely would have an answer for your question here.

Cheers,
Gordon

On Wed, Feb 13, 2019 at 8:33 AM Jins George <[hidden email]> wrote:

Hello community,

I am trying to  upgrade a  Flink Yarn session cluster running BEAM pipelines  from version 1.2.0 to 1.6.3.

Here is my session start command: yarn-session.sh -d -n 4  -jm 1024 -tm 3072 -s 7

Because of the dynamic resource allocation,  no taskmanager gets created initially. Now once I submit a job with parallelism 5, I see that 1 task-manager gets created and all 5 parallel instances are scheduled on the same taskmanager( because I have 7 slots).  This can create hot spot as only one physical node ( out of 4 in my case) is utilized for processing.

I noticed the legacy mode, which would provision all task managers at cluster creation, but since legacy mode is expected to go away soon, I didn't want to try that route.

Is there a way I can configure the multiple jobs or parallel instances of same job spread across all the available Yarn nodes and continue using the 'new' mode ?

Thanks,

Jins George