Hi everyone,
I'm running a YARN session on a cluster with one master and one core and would like to use the Monitoring API programmatically to submit jobs. I have found that the configuration variables are read but ignored when starting the session - it seems to choose a random port each run.

Here's a snippet from the startup logs:

2018-09-06 21:44:38,763 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: env.yarn.conf.dir, /etc/hadoop/conf
2018-09-06 21:44:38,764 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: env.hadoop.conf.dir, /etc/hadoop/conf
2018-09-06 21:44:38,765 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: rest.port, 44477
2018-09-06 21:44:38,765 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 44477
2018-09-06 21:44:38,765 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.jobmanager.port, 44477
2018-09-06 21:44:38,775 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-hadoop.
2018-09-06 21:44:39,615 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-06 21:44:39,799 INFO org.apache.flink.runtime.security.modules.HadoopModule - Hadoop user set to hadoop (auth:SIMPLE)
2018-09-06 21:44:40,045 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at ip-10-2-3-71.ec2.internal/10.2.3.71:8032
2018-09-06 21:44:40,312 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, numberTaskManagers=1, slotsPerTaskManager=1}
2018-09-06 21:44:43,564 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1536250520330_0007
2018-09-06 21:44:43,802 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1536250520330_0007
2018-09-06 21:44:43,802 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2018-09-06 21:44:43,804 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2018-09-06 21:44:48,326 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
2018-09-06 21:44:48,326 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1536250520330_0007
Please also note that the temporary files of the YARN session in the home directory will not be removed.
2018-09-06 21:44:48,821 INFO org.apache.flink.runtime.rest.RestClient - Rest client endpoint started.
Flink JobManager is now running on ip-10-2-3-25.ec2.internal:38683 with leader id 00000000-0000-0000-0000-000000000000.
JobManager Web Interface: http://ip-10-2-3-25.ec2.internal:38683

I'm setting both the rest.port and jobmanager.web.port, but both are ignored. Has anyone seen this before?

Thanks!
|
Hi Austin,

`rest.port` is the current option for "The port that the server listens on / the client connects to." It supersedes the deprecated key `web.port`, which in turn superseded the deprecated key `jobmanager.web.port`, so setting `rest.port` alone is enough (at least for 1.6).

However, in your case the configuration should have worked. Since Flink reads configuration from both flink-conf.yaml and the command line, it would help if you showed us how you set these options.

Best,
tison.

Austin Cawley-Edwards <[hidden email]> wrote on Fri, Sep 7, 2018 at 6:33 AM:
|
Hi Austin,

The config options rest.port, jobmanager.web.port, etc. are intentionally ignored on YARN. The port is chosen randomly to avoid conflicts with other containers [1]. I do not see a way to set a fixed port at the moment, but there is a related ticket for that [2]. The Flink CLI determines the hostname and port from the YARN ApplicationReport [3][4]; you can do the same.

Best,
Gary

[1] https://github.com/apache/flink/blob/d036417985d3e2b1ca63909007db9710e842abf4/flink-yarn/src/main/java/org/apache/flink/yarn/entrypoint/YarnEntrypointUtils.java#L103
[2] https://issues.apache.org/jira/browse/FLINK-5758
[3] https://github.com/apache/flink/blob/d036417985d3e2b1ca63909007db9710e842abf4/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L387
[4] https://hadoop.apache.org/docs/r2.8.3/api/org/apache/hadoop/yarn/api/records/ApplicationReport.html#getRpcPort()

On Fri, Sep 7, 2018 at 12:33 AM, Austin Cawley-Edwards <[hidden email]> wrote:
|
Hi Gary,
Thank you so much for the detailed explanation and links. Extremely helpful.

For all others interested, the host and port are also available through the YARN CLI command `yarn application -status {appId}`.

Once again, thanks for your help!

Austin

On Fri, Sep 7, 2018, 2:24 AM Gary Yao <[hidden email]> wrote:
|
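For anyone who wants to script the `yarn application -status {appId}` lookup mentioned above, here is a minimal Python sketch. It assumes the command prints one `Key : Value` pair per line and that the labels `AM Host` and `RPC Port` appear in that output; those labels are assumptions about the CLI's layout and may differ between Hadoop versions, so check them against your own output first. It also assumes the reported RPC port is the REST port, which is what Gary's pointer to `getRpcPort()` suggests. The function names are made up for illustration.

```python
import subprocess


def parse_status_report(text):
    """Turn 'Key : Value' lines (assumed CLI layout) into a dict of strings."""
    report = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip lines that do not contain the ' : ' separator
            report[key.strip()] = value.strip()
    return report


def rest_endpoint(report):
    """Build the REST base URL from the assumed 'AM Host' and 'RPC Port' fields."""
    return "http://{}:{}".format(report["AM Host"], report["RPC Port"])


def lookup_rest_endpoint(app_id):
    """Run the YARN CLI for one application and return its REST base URL."""
    completed = subprocess.run(
        ["yarn", "application", "-status", app_id],
        check=True,
        capture_output=True,
        text=True,
    )
    return rest_endpoint(parse_status_report(completed.stdout))
```

Calling `lookup_rest_endpoint("application_1536250520330_0007")` on a machine with the `yarn` client installed should then give the base URL to use for Monitoring API requests, provided the assumptions above hold.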