(DEPRECATED) Apache Flink User Mailing List archive.

Setting Flink Monitoring API Port on YARN Cluster


austin.ce

Sep 06, 2018; 10:33pm

Setting Flink Monitoring API Port on YARN Cluster

Hi everyone,

I'm running a YARN session on a cluster with one master and one core and would like to use the Monitoring API programmatically to submit jobs. I have found that the configuration variables are read but ignored when starting the session - it seems to choose a random port each run.

Here's a snippet from the startup logs:

2018-09-06 21:44:38,763 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: env.yarn.conf.dir, /etc/hadoop/conf
2018-09-06 21:44:38,764 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: env.hadoop.conf.dir, /etc/hadoop/conf
2018-09-06 21:44:38,765 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: rest.port, 44477
2018-09-06 21:44:38,765 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 44477
2018-09-06 21:44:38,765 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.jobmanager.port, 44477
2018-09-06 21:44:38,775 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-hadoop.
2018-09-06 21:44:39,615 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-06 21:44:39,799 INFO org.apache.flink.runtime.security.modules.HadoopModule - Hadoop user set to hadoop (auth:SIMPLE)
2018-09-06 21:44:40,045 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at ip-10-2-3-71.ec2.internal/10.2.3.71:8032
2018-09-06 21:44:40,312 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, numberTaskManagers=1, slotsPerTaskManager=1}
2018-09-06 21:44:43,564 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1536250520330_0007
2018-09-06 21:44:43,802 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1536250520330_0007
2018-09-06 21:44:43,802 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2018-09-06 21:44:43,804 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2018-09-06 21:44:48,326 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
2018-09-06 21:44:48,326 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1536250520330_0007
Please also note that the temporary files of the YARN session in the home directory will not be removed.
2018-09-06 21:44:48,821 INFO org.apache.flink.runtime.rest.RestClient - Rest client endpoint started.
Flink JobManager is now running on ip-10-2-3-25.ec2.internal:38683 with leader id 00000000-0000-0000-0000-000000000000.
JobManager Web Interface: http://ip-10-2-3-25.ec2.internal:38683

I'm setting both the rest.port and jobmanager.web.port, but both are ignored. Has anyone seen this before?

Thanks!

tison

Sep 06, 2018; 10:59pm

Re: Setting Flink Monitoring API Port on YARN Cluster

Hi Austin,

`rest.port` is the current option for "The port that the server listens on / the client connects to." It supersedes the deprecated key `web.port`, which in turn superseded `jobmanager.web.port`, so setting `rest.port` alone is enough (at least for 1.6). That said, I would have expected your configuration to work.

Since Flink reads configuration from both flink-conf.yaml and the command line, it would help if you showed us how you set these options.

Best,

tison.
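
To make the deprecation chain above concrete, here is a minimal Python sketch of how a lookup with deprecated fallback keys could behave. It is only an illustration of the idea, not Flink's actual implementation; the helper name `resolve_rest_port` and the plain-dict configuration are assumptions made for the example.

```python
from typing import Mapping, Optional

# Key order follows the deprecation chain described above: newest key first.
# (Illustrative only; this is not how Flink itself is implemented.)
REST_PORT_KEYS = ("rest.port", "web.port", "jobmanager.web.port")


def resolve_rest_port(conf: Mapping[str, str]) -> Optional[int]:
    """Return the port from the first key that is set, or None if none is set."""
    for key in REST_PORT_KEYS:
        if key in conf:
            return int(conf[key])
    return None
```

With both keys set, as in the logs above, the newer `rest.port` value would be the one picked up; the older keys only matter when the newer ones are absent.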

Gary Yao-2

Sep 07, 2018; 6:24am

Re: Setting Flink Monitoring API Port on YARN Cluster

In reply to this post by austin.ce

Hi Austin,

The config options rest.port, jobmanager.web.port, etc. are intentionally
ignored on YARN. The port is chosen randomly to avoid conflicts with other
containers [1]. I do not see a way to set a fixed port at the moment, but
there is a related ticket for that [2]. The Flink CLI determines the hostname
and port from the YARN ApplicationReport [3][4] – you can do the same.

Best,
Gary

[1] https://github.com/apache/flink/blob/d036417985d3e2b1ca63909007db9710e842abf4/flink-yarn/src/main/java/org/apache/flink/yarn/entrypoint/YarnEntrypointUtils.java#L103

[2] https://issues.apache.org/jira/browse/FLINK-5758

[3] https://github.com/apache/flink/blob/d036417985d3e2b1ca63909007db9710e842abf4/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L387

[4] https://hadoop.apache.org/docs/r2.8.3/api/org/apache/hadoop/yarn/api/records/ApplicationReport.html#getRpcPort()
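
Once the host and port are known from the ApplicationReport, reaching the Monitoring API is a matter of building the URL. The following Python sketch assumes the reported port is the one serving the REST endpoint, as the startup log in this thread suggests; the helper names `monitoring_url` and `fetch_json` are made up for the example, while `/overview` and `/jobs` are paths from Flink's REST API.

```python
import json
from urllib.request import urlopen


def monitoring_url(host: str, port: int, path: str = "/overview") -> str:
    """Build a Monitoring API URL from the host and port that YARN reports."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET a Monitoring API endpoint and decode its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For the session in the original post, `monitoring_url("ip-10-2-3-25.ec2.internal", 38683, "/jobs")` would give the URL to pass to `fetch_json`. Because the port changes with every session, the URL should be rebuilt after each restart rather than cached.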

austin.ce

Sep 07, 2018; 1:37pm

Re: Setting Flink Monitoring API Port on YARN Cluster

Hi Gary,

Thank you so much for the detailed explanation and links. Extremely helpful. For anyone else interested, the host and port are also available through the YARN CLI command `yarn application -status {appId}`.

Once again, thanks for your help!

Austin
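
For scripting this, a small Python sketch along the following lines may help. It assumes the report printed by `yarn application -status` contains `AM Host : ...` and `RPC Port : ...` lines, which should be checked against your Hadoop version; the function names are hypothetical.

```python
import re
import subprocess
from typing import Tuple

# Assumed report layout: one "Name : value" pair per line, including
# "AM Host" and "RPC Port". Verify against your Hadoop version.
_FIELD = re.compile(r"^\s*(AM Host|RPC Port)\s*:\s*(\S+)\s*$", re.MULTILINE)


def parse_app_status(report: str) -> Tuple[str, int]:
    """Extract (host, port) from the text of a YARN application report."""
    fields = dict(_FIELD.findall(report))
    try:
        return fields["AM Host"], int(fields["RPC Port"])
    except KeyError as missing:
        raise ValueError(f"field {missing} not found in report") from None


def jobmanager_address(app_id: str) -> Tuple[str, int]:
    """Run `yarn application -status <app_id>` and parse its output."""
    out = subprocess.run(
        ["yarn", "application", "-status", app_id],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_app_status(out)
```

Calling `jobmanager_address("application_1536250520330_0007")` on a machine with the `yarn` CLI configured would then return the pair to use for Monitoring API requests. Parsing is kept separate from the subprocess call so it can be tested against saved output.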
