Flink TaskManagers do not start until job is submitted in YARN


suraj7
Hi,

I am using Amazon EMR to run a Flink cluster on YARN. My setup consists of
m4.large instances: 1 master and 2 core nodes. I started the Flink cluster
on YARN with the command: flink-yarn-session -n 2 -d -tm 4096 -s 4.

The Flink JobManager and the YARN ApplicationMaster start, but no
TaskManagers are running. The Flink web interface shows 0 for task managers,
task slots, and available slots. However, when I submit a job to the Flink
cluster, TaskManagers get allocated, the job runs, and the web UI shows the
expected values. They go back to 0 once the job is complete.

I would like Task Managers to be running even when no Job is submitted as I
want to use Flink's REST API to monitor and modify parallelism based on the
available slots value while scaling Core Nodes.
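
For reference, the monitoring described above could look roughly like the
following. This is a minimal sketch, not a tested setup: it assumes the
JobManager's REST endpoint is reachable at a base URL you supply and that its
/overview response carries the "taskmanagers", "slots-total" and
"slots-available" fields. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def summarize_overview(overview: dict) -> dict:
    """Pick the slot-related fields out of a Flink /overview response.

    Missing fields default to 0, which is also what the web UI shows
    while no TaskManagers have been allocated.
    """
    return {
        "taskmanagers": overview.get("taskmanagers", 0),
        "slots_total": overview.get("slots-total", 0),
        "slots_available": overview.get("slots-available", 0),
    }


def fetch_overview(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/overview and decode the JSON body.

    base_url is an assumption, e.g. the JobManager address reported by
    YARN for the running session.
    """
    with urlopen(f"{base_url.rstrip('/')}/overview", timeout=timeout) as resp:
        return json.load(resp)


# Example (hypothetical address):
#   summarize_overview(fetch_overview("http://localhost:8081"))
```

With the behaviour described above, such a poll would report zeros for all
three values whenever no job is running.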

Is there a configuration that I've missed which prevents Task Managers from
running all the time?

Thanks,
Suraj




Re: Flink TaskManagers do not start until job is submitted in YARN

Dawid Wysakowicz-2
Hi Suraj,

As far as I know, this was changed with FLIP-6 to allow dynamic resource
allocation.

Till (cc'ed) might know whether there is a switch to restore the old
behavior or whether there are plans to support it.

Best,

Dawid

On 24/09/18 12:24, suraj7 wrote:

> [...]

Re: Flink TaskManagers do not start until job is submitted in YARN

Till Rohrmann
Hi Suraj,

At the moment, Flink's new mode does not support this behaviour. There are plans to allow setting a minimum number of running TaskManagers that won't be released, but as far as I know no work has been done in this direction yet. If you want, you can help the community with this effort.

Cheers,
Till

On Mon, Sep 24, 2018 at 3:07 PM Dawid Wysakowicz <[hidden email]> wrote:
> [...]

Re: Flink TaskManagers do not start until job is submitted in YARN

suraj7
Thanks for the clarification, Dawid and Till.

@Till We have a few streaming jobs that need to be running all the time, and
we plan on using the modify tool to update the parallelism of jobs as we
scale the cluster in and out. Knowing the total slots value is crucial to
this workflow.

As Dawid asked, is there a switch to restore the old behavior?
If not, is there a way to find or predict the total slots value from YARN
metrics? Are you aware of any such workflow?

Thanks,
Suraj




Re: Flink TaskManagers do not start until job is submitted in YARN

Till Rohrmann
With Flink 1.5.x and 1.6.x you can put `mode: legacy` into your flink-conf.yaml and Flink will start in the old mode, which gives you the old behaviour.

What do you mean by total slots? The current number of total slots? With resource elasticity this number can of course change, because if there are not enough slots, Flink will try to start a new TaskExecutor.

Cheers,
Till

On Mon, Sep 24, 2018 at 7:11 PM suraj7 <[hidden email]> wrote:
> [...]

Re: Flink TaskManagers do not start until job is submitted in YARN

suraj7
Hi Till,

What I was ideally looking for was a completely managed service for Flink
via AWS EMR, in which the YARN cluster would be dedicated to a single Flink
session and, as EMR scales in and out, EMR/YARN would add or remove TMs
accordingly. I could then get the total number of task slots across all
running TMs from the Flink REST API and change my job parallelism
accordingly.

As I understand it, this kind of feature is not currently available, and new
TMs will not be started as EMR scales out. The only way to get new TMs would
be to scale the EMR cluster to the required size, change the job parallelism,
and rely on Flink and YARN to spawn new TMs through Flink's resource
elasticity.

So we are now planning to poll the YARN REST API for the current number of
active nodes, which will equal the number of TMs the cluster is capable of
running (Flink will be configured to run 1 TM per EC2 instance), and then
modify the job parallelism accordingly. Can you validate this strategy and/or
suggest something better?
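
To make the strategy concrete, here is a rough sketch of the polling step.
It assumes the ResourceManager exposes its cluster metrics at
/ws/v1/cluster/metrics with an "activeNodes" field, and that every active
node hosts exactly one TM. The function names, the ResourceManager address,
and the reserved_nodes parameter (for nodes that cannot host a TM, for
example because the ApplicationMaster container takes up the memory) are
hypothetical.

```python
import json
from urllib.request import urlopen


def active_nodes_from(metrics: dict) -> int:
    """Extract activeNodes from a decoded cluster-metrics response."""
    return int(metrics["clusterMetrics"]["activeNodes"])


def fetch_active_nodes(rm_url: str, timeout: float = 5.0) -> int:
    """GET <rm_url>/ws/v1/cluster/metrics and return the active node count."""
    url = f"{rm_url.rstrip('/')}/ws/v1/cluster/metrics"
    with urlopen(url, timeout=timeout) as resp:
        return active_nodes_from(json.load(resp))


def target_parallelism(active_nodes: int, slots_per_tm: int,
                       reserved_nodes: int = 0) -> int:
    """Slots the cluster could offer, assuming 1 TM per usable node."""
    usable = max(active_nodes - reserved_nodes, 0)
    return usable * slots_per_tm


# Example (hypothetical ResourceManager address, 4 slots per TM as in -s 4):
#   target_parallelism(fetch_active_nodes("http://master:8088"), 4)
```

Note that this only estimates what the cluster could run. The slots Flink
actually reports still depend on TMs being started on demand.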

Thanks,
Suraj


