Utilising EMR's master node

Averell
Hello everyone,

I'm trying to run Flink on AWS EMR following the guides from the Flink docs
<https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/yarn_setup.html#run-a-single-flink-job-on-hadoop-yarn>
and from AWS
<https://docs.aws.amazon.com/emr/latest/ReleaseGuide/flink-configure.html>,
and it looks like the EMR master is never used, neither for the JM nor for a TM.
"bin/yarn-session.sh -q" only shows the core nodes. We are only running
Flink on that EMR cluster, so this is a waste of resources.

So, is there any way to use the master node for the job, at least for the JM
only?

If that is not possible, should I have different hardware configurations
between the master node and core nodes (smaller server for the master)?

Thanks and best regards,
Averell




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Utilising EMR's master node

Gary Yao-2
Hi Averell,

According to the AWS documentation [1], the master node only runs the YARN
ResourceManager and the HDFS NameNode. Containers can only be launched on
nodes that are running the YARN NodeManager [2]. Therefore, if you want TMs or
JMs to be launched on your EMR master node, you have to start the NodeManager
process there, but I do not know how well this is supported by AWS EMR.

You can choose a smaller server for the master node but keep in mind that it is
running the HDFS NameNode as well. The hardware requirements will therefore
partially depend on the HDFS workload.

Best,
Gary

[1] https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-instances.html
[2] https://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-site/NodeManager.html

On Mon, Sep 17, 2018 at 5:22 AM, Averell <[hidden email]> wrote:
> [...]

Re: Utilising EMR's master node

Averell
Thank you Gary.

Regarding the option to use a smaller server for the master node: when
starting a Flink job, I get an error like the following:

    Caused by: org.apache.flink.configuration.IllegalConfigurationException:
    The number of virtual cores per node were configured with 16 but Yarn only
    has 4 virtual cores available. Please note that the number of virtual cores
    is set to the number of task slots by default unless configured in the Flink
    config with 'yarn.containers.vcores.'

To get around that error, I need to start the job from one of the core nodes.
Is that the expected behaviour?

Thanks and regards,
Averell




Re: Utilising EMR's master node

Gary Yao-2
Hi Averell,

Flink compares the number of user-selected vcores to the vcores configured in
the yarn-site.xml of the submitting node, i.e., in your case the master node.
If there are not enough configured vcores, the client throws an exception.
This behavior is not ideal and I found an old JIRA ticket for it [1]. We could
either remove this check, or – as the original ticket suggests – reuse the
logic from "yarn-session.sh -q" to determine if there is enough capacity in
the cluster.

As a workaround, you can set in the yarn-site.xml

    yarn.nodemanager.resource.cpu-vcores

to 16, or alternatively run multiple smaller TaskManagers on each node [2].
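
The check and both workarounds can be illustrated with a small sketch. This is not Flink's actual code: the function names and the embedded yarn-site.xml content are made up for illustration. It assumes the client reads yarn.nodemanager.resource.cpu-vcores from the local yarn-site.xml, falling back to YARN's default of 8 when the property is absent, and that the requested vcores equal the number of task slots unless configured otherwise.

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml as it might look on the small master node.
YARN_SITE_XML = """<configuration>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
</configuration>"""

VCORES_KEY = "yarn.nodemanager.resource.cpu-vcores"


def configured_vcores(yarn_site_xml: str, default: int = 8) -> int:
    """Read the NodeManager vcores from a yarn-site.xml string."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == VCORES_KEY:
            return int(prop.findtext("value"))
    return default  # assumed fallback: YARN's default value


def check_vcores(requested: int, yarn_site_xml: str) -> None:
    """Illustrative client-side check against the local configuration only."""
    available = configured_vcores(yarn_site_xml)
    if requested > available:
        raise ValueError(
            f"requested {requested} vcores per container, but the local "
            f"yarn-site.xml only configures {available}"
        )


# 16 slots per TM means 16 vcores by default: rejected on the master node.
try:
    check_vcores(16, YARN_SITE_XML)
except ValueError as err:
    print(err)

# Workaround 1: raise the configured value on the submitting node to 16.
patched = YARN_SITE_XML.replace("<value>4</value>", "<value>16</value>")
check_vcores(16, patched)

# Workaround 2: smaller TaskManagers, e.g. 4 slots (4 vcores) each.
check_vcores(4, YARN_SITE_XML)
```

Note that the sketch, like the check it imitates, only looks at the local file and says nothing about the capacity actually available in the cluster.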

Best,
Gary

[1] https://issues.apache.org/jira/browse/FLINK-5542
[2] https://github.com/apache/flink/blob/09abba37c7d760236c2ba002fa4a3aac11c2641b/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L288

On Tue, Sep 18, 2018 at 4:43 AM, Averell <[hidden email]> wrote:
> [...]

Re: Utilising EMR's master node

Averell
Hi Gary,
Thanks for your help.

Regarding TM configurations, in terms of performance, when my 2 servers have
16 vcores each, should I have 2 TMs with 16GB mem and 16 task slots each, or 8
TMs with 4GB mem and 4 task slots each?

Thanks and regards,
Averell




Re: Utilising EMR's master node

Gary Yao-2
Hi Averell,

There is no general answer to your question. If you are running more TMs, you
get better isolation between different Flink jobs because one TM is backed by
one JVM [1]. However, every TM brings additional overhead (heartbeating,
running more threads, etc.) [1]. It also depends on the maximum heap memory
requirements of your operators. Data skew and non-parallel operators can
cause uneven memory requirements.
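
As a rough illustration of the trade-off, the two layouts from the question can be compared with simple arithmetic. The Layout class and its field names are made up for this sketch, and the per-slot memory figure is only an average: slots in one TM share the same JVM, so it is not a hard per-slot limit, and per-JVM overhead is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of a TaskManager layout across the cluster."""
    task_managers: int      # number of TMs, i.e. number of JVMs
    memory_gb_per_tm: int   # memory given to each TM
    slots_per_tm: int       # task slots offered by each TM

    @property
    def total_slots(self) -> int:
        return self.task_managers * self.slots_per_tm

    @property
    def total_memory_gb(self) -> int:
        return self.task_managers * self.memory_gb_per_tm

    @property
    def memory_gb_per_slot(self) -> float:
        # Average only; slots in the same TM share one JVM.
        return self.memory_gb_per_tm / self.slots_per_tm


few_large = Layout(task_managers=2, memory_gb_per_tm=16, slots_per_tm=16)
many_small = Layout(task_managers=8, memory_gb_per_tm=4, slots_per_tm=4)

for name, layout in (("2 x 16 slots", few_large), ("8 x 4 slots", many_small)):
    print(
        f"{name}: {layout.task_managers} JVMs, {layout.total_slots} slots, "
        f"{layout.total_memory_gb} GB total, "
        f"{layout.memory_gb_per_slot} GB per slot on average"
    )
```

Both layouts offer the same number of slots and the same total memory, so the difference comes down to the number of JVMs: more JVMs give more isolation but more overhead, while fewer, larger JVMs leave more room for a single operator with a large heap requirement.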

For capacity planning, you can also take a look at Robert Metzger's slides
[2].

Best,
Gary

[1] https://ci.apache.org/projects/flink/flink-docs-stable/concepts/runtime.html#task-slots-and-resources
[2] https://www.slideshare.net/FlinkForward/flink-forward-berlin-2017-robert-metzger-keep-it-going-how-to-reliably-and-efficiently-operate-apache-flink

On Thu, Sep 20, 2018 at 1:13 AM Averell <[hidden email]> wrote:
> [...]

Re: Utilising EMR's master node

Averell
Thank you Gary.
Regarding your previous suggestion to change the configured number of vcores
on the EMR master node, I tried it and found one funny/bad behaviour:
 * Hardware configuration: master node with 4 vcores + 8GB RAM; 2x executors
with 16 vcores + 32GB RAM each.
 * Flink launch parameters: -yn 2 -ys 16 -ytm 4g...
4 TMs were created; 2 of them were in use (0 free slots) and the other two were
unused (16 free slots). The bad thing is that most of the time the 2 free TMs
are on the same machine, and the two occupied ones are on the other machine.
If I don't change the Hadoop configuration, 4 TMs are still created, but the
occupied ones are always on two different servers.

I'm not sure whether that's EMR's issue, or YARN's or Flink's.

Thanks and regards,
Averell




Re: Utilising EMR's master node

Gary Yao-2
Hi Averell,

It is up to the YARN scheduler on which hosts the containers are started.

What Flink version are you using? I assume you are using 1.4 or earlier
because you are specifying a fixed number of TMs. If you launch Flink with -yn
2, you should only be seeing 2 TMs in total (not 4). Are you starting two
clusters?

Beginning with Flink 1.5, -yn is obsolete because resources are acquired
dynamically, and it is not well-defined in what order TM slots are exhausted
[1].

Best,
Gary

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Flink-1-5-job-distribution-over-cluster-nodes-td23364.html

On Wed, Sep 26, 2018 at 9:25 AM Averell <[hidden email]> wrote:
> [...]

Re: Utilising EMR's master node

Averell
Hi Gary,

Thanks for the information. I didn't know that -yn is obsolete :( I am using
Flink 1.6.
I'm not sure whether setting -yn explicitly triggers a bug, but I started only
1 cluster.

Thanks and regards,
Averell


