Flink YARN job manager web port

Shannon Carey
The documentation states: "The ports Flink is using for its services are the standard ports configured by the user + the application id as an offset"

When I launch Flink via YARN in an AWS EMR cluster, stdout says:
JobManager Web Interface: http://ip-xxx.us-west-2.compute.internal:20888/proxy/application_1461178294210_0010/

I need to be able to create an IAM Security Group that allows access to the JobManager web interface so that I can make use of it. However, I am confused about how port 20888 is chosen. Based on the code, I would have guessed that it would use the same port as given by: "yarn application -status application_1461178294210_0010". However, that's not the case (they don't match). It gives "Tracking-URL : http://ip-xxx.us-west-2.compute.internal:36495"

On the other hand, I see that YarnApplicationMasterRunner sets the port to 0, which InetSocketAddress says results in "A port number of zero will let the system pick up an ephemeral port in a bind operation."
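To illustrate the port-0 behaviour described above, here is a minimal sketch using only the JDK. It is not Flink code; the class name EphemeralPortDemo and the method bindEphemeral are made up for this example. It binds a server socket to port 0 and reports the port the operating system actually assigned, which differs from run to run and so cannot be known in advance for a firewall rule.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical demo class, not part of Flink.
public class EphemeralPortDemo {

    /** Binds to port 0 and returns the port the OS actually assigned. */
    static int bindEphemeral() throws IOException {
        // Port 0 asks the OS to pick any free ephemeral port.
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("OS-assigned port: " + bindEphemeral());
    }
}
```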

I couldn't find anything in the code that adds an offset to a port. Changing the value of "jobmanager.web.port" appears to have no effect. The documentation on "Running Flink on YARN behind Firewalls" only talks about the JobManager and BlobServer ports.

Does Flink need logic to allow users to specify a range of ports for jobmanager.web.port in the same way as is done in BootstrapTools#startActorSystem? If so, I am happy to make that contribution!

-Shannon

Re: Flink YARN job manager web port

Till Rohrmann
Hi Shannon,

If you need this feature (assigning a range of web server ports) for your use case, then we would have to add it. If you would like to implement it yourself, that would help us a lot.

I think the documentation is a bit outdated here. The port is either chosen from the range of ports, or an ephemeral port is picked. I'll create a JIRA to fix the documentation.
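As a rough sketch of the "range of ports, or an ephemeral port" idea, the following JDK-only example tries each port in a range and binds to the first free one. It is not Flink's implementation and does not reproduce BootstrapTools#startActorSystem; the class PortRangeBinder, its methods, and the "start-end" string format are assumptions made for illustration. A spec of "0" falls back to an OS-chosen ephemeral port.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of Flink.
public class PortRangeBinder {

    /** Parses "start-end" or a single port such as "8081" into {start, end}. */
    static int[] parseRange(String spec) {
        String[] parts = spec.trim().split("-", 2);
        int start = Integer.parseInt(parts[0].trim());
        int end = parts.length == 2 ? Integer.parseInt(parts[1].trim()) : start;
        if (start < 0 || end > 65535 || start > end) {
            throw new IllegalArgumentException("Invalid port range: " + spec);
        }
        return new int[] {start, end};
    }

    /** Returns a socket bound to the first free port in the given range. */
    static ServerSocket bindFirstFree(String spec) throws IOException {
        int[] range = parseRange(spec);
        for (int port = range[0]; port <= range[1]; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                // Port is unavailable (for example, already in use); try the next one.
            }
        }
        throw new IOException("No free port in range " + spec);
    }

    public static void main(String[] args) throws IOException {
        // The range below is an arbitrary example value.
        try (ServerSocket socket = bindFirstFree("50100-50200")) {
            System.out.println("Bound to port " + socket.getLocalPort());
        }
    }
}
```

With a fixed range like this, a firewall or security group rule only needs to open that range instead of every ephemeral port.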

Cheers,
Till

On Thu, Apr 21, 2016 at 10:55 PM, Shannon Carey <[hidden email]> wrote: