Hi,
I am using Amazon EMR to run a Flink cluster on YARN. My setup consists of m4.large instances for 1 master and 2 core nodes. I started the Flink cluster on YARN with the command:

    flink-yarn-session -n 2 -d -tm 4096 -s 4

The Flink Job Manager and Application Manager start, but no Task Managers are running. The Flink web interface shows 0 for task managers, task slots and slots available. However, when I submit a job to the Flink cluster, Task Managers get allocated, the job runs, and the web UI shows the expected values. They go back to 0 once the job is complete.

I would like Task Managers to keep running even when no job is submitted, because I want to use Flink's REST API to monitor the available slots value and modify parallelism based on it while scaling core nodes.

Is there a configuration that I've missed which prevents Task Managers from running all the time?

Thanks,
Suraj

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Suraj,
As far as I know, this was changed with FLIP-6 to allow dynamic resource allocation. Till (cc'ed) might know whether there is a switch to restore the old behavior, or whether there are plans to support it.

Best,
Dawid

On 24/09/18 12:24, suraj7 wrote:
> Hi,
>
> I am using Amazon EMR to run Flink Cluster on YARN. My setup consists of
> m4.large instances for 1 master and 2 core nodes. I have started the Flink
> Cluster on YARN with the command: flink-yarn-session -n 2 -d -tm 4096 -s 4.
>
> Flink Job Manager and Application Manager starts but there are no Task
> Managers running. The Flink Web interface shows 0 for task managers, task
> slots and slots available. However when I submit a job to flink cluster,
> then Task Managers get allocated and the job runs and the Web UI shows
> correct values as expected and goes back to 0 once the job is complete.
>
> I would like Task Managers to be running even when no Job is submitted as I
> want to use Flink's REST API to monitor and modify parallelism based on the
> available slots value while scaling Core Nodes.
>
> Is there a configuration that I've missed which prevents Task Managers from
> running all the time?
>
> Thanks,
> Suraj
Hi Suraj,

At the moment, Flink's new mode does not support such behaviour. There are plans to allow setting a minimum number of running TaskManagers which won't be released, but no work has been done in this direction yet, afaik. If you want, you can help the community with this effort.

Cheers,
Till

On Mon, Sep 24, 2018 at 3:07 PM Dawid Wysakowicz <[hidden email]> wrote:
> Hi Suraj,
Thanks for the clarification, Dawid and Till.
@Till We have a few streaming jobs that need to run all the time, and we plan to use the modify tool to update the parallelism of jobs as we scale the cluster in and out. Knowing the total slots value is crucial to this workflow.

As Dawid asked, is there a switch to restore the old behavior? If not, is there a way to find or predict the total slots value from YARN metrics? Are you aware of any such workflow?

Thanks,
Suraj
With Flink 1.5.x and 1.6.x you can put `mode: legacy` into your flink-conf.yaml and it will start the old mode, which gives you the old behaviour.

What do you mean by total slots? The current number of total slots? With resource elasticity this number can of course change, because if you don't have enough slots, Flink will try to start a new TaskExecutor.

Cheers,
Till

On Mon, Sep 24, 2018 at 7:11 PM suraj7 <[hidden email]> wrote:
> Thanks for the clarification, Dawid and Till.
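
For reference, the slot counts discussed in this thread are the ones the web UI shows, and they can be read from the JobManager's REST API. The following is only a minimal Python sketch, assuming the `/overview` endpoint returns the fields `taskmanagers`, `slots-total` and `slots-available`. The helper names (`slot_summary`, `fetch_overview`) and the base URL are hypothetical and not part of Flink.

```python
import json
from urllib.request import urlopen


def slot_summary(overview: dict) -> dict:
    """Pick the TaskManager and slot counts out of a Flink /overview payload.

    Assumes the payload carries 'taskmanagers', 'slots-total' and
    'slots-available'; other fields are ignored.
    """
    return {
        "taskmanagers": int(overview["taskmanagers"]),
        "slots_total": int(overview["slots-total"]),
        "slots_available": int(overview["slots-available"]),
    }


def fetch_overview(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/overview from the JobManager and decode the JSON body.

    base_url is a placeholder for wherever the Flink REST endpoint is
    reachable, e.g. through the YARN proxy.
    """
    with urlopen(f"{base_url.rstrip('/')}/overview", timeout=timeout) as resp:
        return json.load(resp)


# Offline example with a hand-written payload (no network access needed).
sample = {"taskmanagers": 2, "slots-total": 8, "slots-available": 8}
print(slot_summary(sample))
```

In the new (FLIP-6) mode with no job running, these counts would be expected to read 0, matching what the web UI shows. With `mode: legacy` they should reflect the TaskManagers started with the session.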
Hi Till,
What I was ideally looking for was a completely managed service for Flink via AWS EMR, in which the YARN cluster would be dedicated to a single Flink session and, as EMR scales in and out, EMR/YARN would add or remove TMs accordingly. I could then get the total number of task slots across all running TMs from the Flink REST API and change my job parallelism accordingly.

As I understand it, this kind of feature is not currently available, and new TMs will not be started as EMR scales out. The only way to get new TMs would be to scale the EMR cluster to the required size, change the job parallelism, and expect Flink and YARN to take care of spawning new TMs through Flink's resource elasticity.

So we are now planning to poll the YARN REST API to get the current number of active nodes, which will be equal to the number of TMs the cluster is capable of running (Flink will be configured to run 1 TM per EC2 instance), and then modify the job parallelism accordingly.

Can you validate this strategy and/or suggest something better?

Thanks,
Suraj
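
The polling strategy described above could be sketched roughly as follows. This is a hedged Python illustration, not a validated implementation: it assumes the YARN ResourceManager REST endpoint `/ws/v1/cluster/metrics` returns a `clusterMetrics` object with an `activeNodes` field, that each active node runs exactly one TM, and that each TM offers 4 slots (the `-s 4` from the session command). The function names and the ResourceManager URL are hypothetical.

```python
import json
from urllib.request import urlopen


def active_nodes(cluster_metrics: dict) -> int:
    """Read the active NodeManager count from a YARN cluster-metrics payload."""
    return int(cluster_metrics["clusterMetrics"]["activeNodes"])


def target_parallelism(nodes: int, slots_per_tm: int = 4, tms_per_node: int = 1) -> int:
    """Upper bound on job parallelism: nodes * TMs per node * slots per TM.

    Assumption: every active node can actually host a TM. The node that
    also hosts the JobManager container may not have room for a full TM,
    so treat the result as an estimate.
    """
    if nodes < 0 or slots_per_tm < 1 or tms_per_node < 1:
        raise ValueError("nodes must be >= 0; slots_per_tm and tms_per_node must be >= 1")
    return nodes * tms_per_node * slots_per_tm


def fetch_cluster_metrics(rm_url: str, timeout: float = 10.0) -> dict:
    """GET <rm_url>/ws/v1/cluster/metrics from the YARN ResourceManager.

    rm_url is a placeholder, e.g. the master node address with the
    ResourceManager web port.
    """
    with urlopen(f"{rm_url.rstrip('/')}/ws/v1/cluster/metrics", timeout=timeout) as resp:
        return json.load(resp)


# Offline example with a hand-written payload: 2 core nodes, 4 slots per TM.
sample = {"clusterMetrics": {"activeNodes": 2}}
print(target_parallelism(active_nodes(sample)))
```

The computed value would then be passed to whatever mechanism changes the job parallelism, such as the modify tool mentioned earlier in the thread. Keeping the calculation separate from the HTTP call makes it easy to check against sample payloads before pointing it at a live cluster.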