Breakage in Flink CLI in 1.5.0

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Breakage in Flink CLI in 1.5.0

Sampath Bhat
Hello

I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink cluster.

In flink 1.4.2 only job manager rpc address and job manager rpc port were sufficient for flink client to connect to job manager and submit the job.

But in flink 1.5.0 the flink client additionally requires the rest.address and rest.port for submitting the job to job manager. What is the advantage of this new method over the 1.4.2 method of submitting job?

Moreover if we make rest.port = -1 the web server will not be instantiated then how should we submit the job?

Regards
Sampath
Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Chesnay Schepler
In 1.5 we reworked the job-submission to go through the REST API instead of akka.

I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc address is still required due to some technical implementations; it may be that you can set this to some arbitrary value however.

As a result the REST API (i.e. the web server) must be running in order to submit jobs.

On 19.06.2018 14:12, Sampath Bhat wrote:
Hello

I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink cluster.

In flink 1.4.2 only job manager rpc address and job manager rpc port were sufficient for flink client to connect to job manager and submit the job.

But in flink 1.5.0 the flink client additionally requires the rest.address and rest.port for submitting the job to job manager. What is the advantage of this new method over the 1.4.2 method of submitting job?

Moreover if we make rest.port = -1 the web server will not be instantiated then how should we submit the job?

Regards
Sampath


Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Sampath Bhat
Hi Chesnay

If REST API (i.e. the web server) is mandatory for submitting jobs then why is there an option to set rest.port to -1? I think it should be mandatory to set some valid port for rest.port and make sure flink job manager does not come up if valid port is not set for rest.port? Or else there must be some way to submit jobs even if REST API (i.e. the web server) is not instantiated.

If jobmanger.rpc.address is not required for flink client then why is it still looking for that property in flink-conf.yaml? Isn't it not a bug? Because if we comment out the jobmanger.rpc.address and jobmanger.rpc.port then flink client will not be able to submit the job.


On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]> wrote:
In 1.5 we reworked the job-submission to go through the REST API instead of akka.

I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc address is still required due to some technical implementations; it may be that you can set this to some arbitrary value however.

As a result the REST API (i.e. the web server) must be running in order to submit jobs.


On 19.06.2018 14:12, Sampath Bhat wrote:
Hello

I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink cluster.

In flink 1.4.2 only job manager rpc address and job manager rpc port were sufficient for flink client to connect to job manager and submit the job.

But in flink 1.5.0 the flink client additionally requires the rest.address and rest.port for submitting the job to job manager. What is the advantage of this new method over the 1.4.2 method of submitting job?

Moreover if we make rest.port = -1 the web server will not be instantiated then how should we submit the job?

Regards
Sampath



Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Sampath Bhat
Hi Chesnay

Adding on to this point you made - " the rpc address is still required due to some technical implementations; it may be that you can set this to some arbitrary value however."

For job submission to happen successfully we should give specific rpc address and not any arbitrary value. If any arbitrary value is given the job submission fails with the following error -
org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't retrieve standalone cluster
        at org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:51)
        at org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:31)
        at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:249)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
        at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
        at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
Caused by: java.net.UnknownHostException: flinktest-flink-jobmanager1233445: Name or service not known
 (Random name flinktest-flink-jobmanager1233445)
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
        at java.net.InetAddress.getAllByName(InetAddress.java:1192)
        at java.net.InetAddress.getAllByName(InetAddress.java:1126)
        at java.net.InetAddress.getByName(InetAddress.java:1076)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:171)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:136)
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:83)
        at org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:158)
        at org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:184)
        at org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:157)
        at org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:49)
        ... 7 more


On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <[hidden email]> wrote:
Hi Chesnay

If REST API (i.e. the web server) is mandatory for submitting jobs then why is there an option to set rest.port to -1? I think it should be mandatory to set some valid port for rest.port and make sure flink job manager does not come up if valid port is not set for rest.port? Or else there must be some way to submit jobs even if REST API (i.e. the web server) is not instantiated.

If jobmanger.rpc.address is not required for flink client then why is it still looking for that property in flink-conf.yaml? Isn't it not a bug? Because if we comment out the jobmanger.rpc.address and jobmanger.rpc.port then flink client will not be able to submit the job.


On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]> wrote:
In 1.5 we reworked the job-submission to go through the REST API instead of akka.

I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc address is still required due to some technical implementations; it may be that you can set this to some arbitrary value however.

As a result the REST API (i.e. the web server) must be running in order to submit jobs.


On 19.06.2018 14:12, Sampath Bhat wrote:
Hello

I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink cluster.

In flink 1.4.2 only job manager rpc address and job manager rpc port were sufficient for flink client to connect to job manager and submit the job.

But in flink 1.5.0 the flink client additionally requires the rest.address and rest.port for submitting the job to job manager. What is the advantage of this new method over the 1.4.2 method of submitting job?

Moreover if we make rest.port = -1 the web server will not be instantiated then how should we submit the job?

Regards
Sampath




Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Chesnay Schepler
I was worried this might be the case.

The rest.port handling was simply copied from the legacy web-server,
which explicitly allowed shutting it down.
It may (I'm not entirely sure) also not be necessary for all deployment
modes; for example if the job is baked into the job/taskmanager images.

I'm not quite sure whether the rpc address is actually required for the
REST job submission, or only since we still rely partly on some legacy
code (ClusterClient). Maybe Till (cc) knows the answer to that.

> Adding on to this point you made - " the rpc address is still *required *due
> to some technical implementations; it may be that you can set this to some
> arbitrary value however."
>
> For job submission to happen successfully we should give specific rpc
> address and not any arbitrary value. If any arbitrary value is given the
> job submission fails with the following error -
> org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
> retrieve standalone cluster
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:51)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:31)
>          at
> org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:249)
>          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
>          at
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
>          at
> org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
>          at
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>          at
> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
> Caused by: java.net.UnknownHostException: flinktest-flink-jobmanager1233445:
> Name or service not known
>   (Random name flinktest-flink-jobmanager1233445)
>          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
>          at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
>          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>          at java.net.InetAddress.getByName(InetAddress.java:1076)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:171)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:136)
>          at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:83)
>          at
> org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:158)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:184)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:157)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:49)
>          ... 7 more
>
>
> On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <[hidden email]>
> wrote:
>
>> Hi Chesnay
>>
>> If REST API (i.e. the web server) is mandatory for submitting jobs then
>> why is there an option to set rest.port to -1? I think it should be
>> mandatory to set some valid port for rest.port and make sure flink job
>> manager does not come up if valid port is not set for rest.port? Or else
>> there must be some way to submit jobs even if REST API (i.e. the web
>> server) is not instantiated.
>>
>> If jobmanger.rpc.address is not required for flink client then why is it
>> still looking for that property in flink-conf.yaml? Isn't it not a bug?
>> Because if we comment out the jobmanger.rpc.address and jobmanger.rpc.port
>> then flink client will not be able to submit the job.
>>
>>
>> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> In 1.5 we reworked the job-submission to go through the REST API instead
>>> of akka.
>>>
>>> I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc
>>> address is still *required *due to some technical implementations; it
>>> may be that you can set this to some arbitrary value however.
>>>
>>> As a result the REST API (i.e. the web server) must be running in order
>>> to submit jobs.
>>>
>>>
>>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>>
>>> Hello
>>>
>>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> cluster.
>>>
>>> In flink 1.4.2 only job manager rpc address and job manager rpc port were
>>> sufficient for flink client to connect to job manager and submit the job.
>>>
>>> But in flink 1.5.0 the flink client additionally requires the
>>> rest.address and rest.port for submitting the job to job manager. What is
>>> the advantage of this new method over the 1.4.2 method of submitting job?
>>>
>>> Moreover if we make rest.port = -1 the web server will not be
>>> instantiated then how should we submit the job?
>>>
>>> Regards
>>> Sampath
>>>
>>>
>>>

Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Till Rohrmann
Hi Sampath,

it is no longer possible to not start the rest server endpoint by setting rest.port to -1. If you do this, then the cluster won't start. The comment in the flink-conf.yaml holds only true for the legacy mode.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname of the rest server. The jobmanager.rpc.port is no longer needed for the client but only for the other cluster components (TMs). When using the HA mode, then every address will be retrieved from ZooKeeper.

I hope this clarifies things.

Cheers,
Till

On Wed, Jun 20, 2018 at 9:24 AM Chesnay Schepler <[hidden email]> wrote:
I was worried this might be the case.

The rest.port handling was simply copied from the legacy web-server,
which explicitly allowed shutting it down.
It may (I'm not entirely sure) also not be necessary for all deployment
modes; for example if the job is baked into the job/taskmanager images.

I'm not quite sure whether the rpc address is actually required for the
REST job submission, or only since we still rely partly on some legacy
code (ClusterClient). Maybe Till (cc) knows the answer to that.

> Adding on to this point you made - " the rpc address is still *required *due
> to some technical implementations; it may be that you can set this to some
> arbitrary value however."
>
> For job submission to happen successfully we should give specific rpc
> address and not any arbitrary value. If any arbitrary value is given the
> job submission fails with the following error -
> org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
> retrieve standalone cluster
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:51)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:31)
>          at
> org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:249)
>          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
>          at
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
>          at
> org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
>          at
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>          at
> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
> Caused by: java.net.UnknownHostException: flinktest-flink-jobmanager1233445:
> Name or service not known
>   (Random name flinktest-flink-jobmanager1233445)
>          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
>          at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
>          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>          at java.net.InetAddress.getByName(InetAddress.java:1076)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:171)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:136)
>          at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:83)
>          at
> org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:158)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:184)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:157)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:49)
>          ... 7 more
>
>
> On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <[hidden email]>
> wrote:
>
>> Hi Chesnay
>>
>> If REST API (i.e. the web server) is mandatory for submitting jobs then
>> why is there an option to set rest.port to -1? I think it should be
>> mandatory to set some valid port for rest.port and make sure flink job
>> manager does not come up if valid port is not set for rest.port? Or else
>> there must be some way to submit jobs even if REST API (i.e. the web
>> server) is not instantiated.
>>
>> If jobmanger.rpc.address is not required for flink client then why is it
>> still looking for that property in flink-conf.yaml? Isn't it not a bug?
>> Because if we comment out the jobmanger.rpc.address and jobmanger.rpc.port
>> then flink client will not be able to submit the job.
>>
>>
>> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> In 1.5 we reworked the job-submission to go through the REST API instead
>>> of akka.
>>>
>>> I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc
>>> address is still *required *due to some technical implementations; it
>>> may be that you can set this to some arbitrary value however.
>>>
>>> As a result the REST API (i.e. the web server) must be running in order
>>> to submit jobs.
>>>
>>>
>>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>>
>>> Hello
>>>
>>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> cluster.
>>>
>>> In flink 1.4.2 only job manager rpc address and job manager rpc port were
>>> sufficient for flink client to connect to job manager and submit the job.
>>>
>>> But in flink 1.5.0 the flink client additionally requires the
>>> rest.address and rest.port for submitting the job to job manager. What is
>>> the advantage of this new method over the 1.4.2 method of submitting job?
>>>
>>> Moreover if we make rest.port = -1 the web server will not be
>>> instantiated then how should we submit the job?
>>>
>>> Regards
>>> Sampath
>>>
>>>
>>>

Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Chesnay Schepler
Shouldn't the non-HA case be covered by rest.address?

On 20.06.2018 09:40, Till Rohrmann wrote:
Hi Sampath,

it is no longer possible to not start the rest server endpoint by setting rest.port to -1. If you do this, then the cluster won't start. The comment in the flink-conf.yaml holds only true for the legacy mode.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname of the rest server. The jobmanager.rpc.port is no longer needed for the client but only for the other cluster components (TMs). When using the HA mode, then every address will be retrieved from ZooKeeper.

I hope this clarifies things.

Cheers,
Till

On Wed, Jun 20, 2018 at 9:24 AM Chesnay Schepler <[hidden email]> wrote:
I was worried this might be the case.

The rest.port handling was simply copied from the legacy web-server,
which explicitly allowed shutting it down.
It may (I'm not entirely sure) also not be necessary for all deployment
modes; for example if the job is baked into the job/taskmanager images.

I'm not quite sure whether the rpc address is actually required for the
REST job submission, or only since we still rely partly on some legacy
code (ClusterClient). Maybe Till (cc) knows the answer to that.

> Adding on to this point you made - " the rpc address is still *required *due
> to some technical implementations; it may be that you can set this to some
> arbitrary value however."
>
> For job submission to happen successfully we should give specific rpc
> address and not any arbitrary value. If any arbitrary value is given the
> job submission fails with the following error -
> org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
> retrieve standalone cluster
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:51)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:31)
>          at
> org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:249)
>          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
>          at
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
>          at
> org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
>          at
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>          at
> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
> Caused by: java.net.UnknownHostException: flinktest-flink-jobmanager1233445:
> Name or service not known
>   (Random name flinktest-flink-jobmanager1233445)
>          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
>          at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
>          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>          at java.net.InetAddress.getByName(InetAddress.java:1076)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:171)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:136)
>          at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:83)
>          at
> org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:158)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:184)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:157)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:49)
>          ... 7 more
>
>
> On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <[hidden email]>
> wrote:
>
>> Hi Chesnay
>>
>> If REST API (i.e. the web server) is mandatory for submitting jobs then
>> why is there an option to set rest.port to -1? I think it should be
>> mandatory to set some valid port for rest.port and make sure flink job
>> manager does not come up if valid port is not set for rest.port? Or else
>> there must be some way to submit jobs even if REST API (i.e. the web
>> server) is not instantiated.
>>
>> If jobmanger.rpc.address is not required for flink client then why is it
>> still looking for that property in flink-conf.yaml? Isn't it not a bug?
>> Because if we comment out the jobmanger.rpc.address and jobmanger.rpc.port
>> then flink client will not be able to submit the job.
>>
>>
>> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> In 1.5 we reworked the job-submission to go through the REST API instead
>>> of akka.
>>>
>>> I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc
>>> address is still *required *due to some technical implementations; it
>>> may be that you can set this to some arbitrary value however.
>>>
>>> As a result the REST API (i.e. the web server) must be running in order
>>> to submit jobs.
>>>
>>>
>>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>>
>>> Hello
>>>
>>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> cluster.
>>>
>>> In flink 1.4.2 only job manager rpc address and job manager rpc port were
>>> sufficient for flink client to connect to job manager and submit the job.
>>>
>>> But in flink 1.5.0 the flink client additionally requires the
>>> rest.address and rest.port for submitting the job to job manager. What is
>>> the advantage of this new method over the 1.4.2 method of submitting job?
>>>
>>> Moreover if we make rest.port = -1 the web server will not be
>>> instantiated then how should we submit the job?
>>>
>>> Regards
>>> Sampath
>>>
>>>
>>>


Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Till Rohrmann
It will, but it defaults to jobmanager.rpc.address if no rest.address has been specified.

On Wed, Jun 20, 2018 at 9:49 AM Chesnay Schepler <[hidden email]> wrote:
Shouldn't the non-HA case be covered by rest.address?

On 20.06.2018 09:40, Till Rohrmann wrote:
Hi Sampath,

it is no longer possible to not start the rest server endpoint by setting rest.port to -1. If you do this, then the cluster won't start. The comment in the flink-conf.yaml holds only true for the legacy mode.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname of the rest server. The jobmanager.rpc.port is no longer needed for the client but only for the other cluster components (TMs). When using the HA mode, then every address will be retrieved from ZooKeeper.

I hope this clarifies things.

Cheers,
Till

On Wed, Jun 20, 2018 at 9:24 AM Chesnay Schepler <[hidden email]> wrote:
I was worried this might be the case.

The rest.port handling was simply copied from the legacy web-server,
which explicitly allowed shutting it down.
It may (I'm not entirely sure) also not be necessary for all deployment
modes; for example if the job is baked into the job/taskmanager images.

I'm not quite sure whether the rpc address is actually required for the
REST job submission, or only since we still rely partly on some legacy
code (ClusterClient). Maybe Till (cc) knows the answer to that.

> Adding on to this point you made - " the rpc address is still *required *due
> to some technical implementations; it may be that you can set this to some
> arbitrary value however."
>
> For job submission to happen successfully we should give specific rpc
> address and not any arbitrary value. If any arbitrary value is given the
> job submission fails with the following error -
> org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
> retrieve standalone cluster
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:51)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:31)
>          at
> org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:249)
>          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
>          at
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
>          at
> org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
>          at
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>          at
> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
> Caused by: java.net.UnknownHostException: flinktest-flink-jobmanager1233445:
> Name or service not known
>   (Random name flinktest-flink-jobmanager1233445)
>          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
>          at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
>          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>          at java.net.InetAddress.getByName(InetAddress.java:1076)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:171)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:136)
>          at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:83)
>          at
> org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:158)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:184)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:157)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:49)
>          ... 7 more
>
>
> On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <[hidden email]>
> wrote:
>
>> Hi Chesnay
>>
>> If REST API (i.e. the web server) is mandatory for submitting jobs then
>> why is there an option to set rest.port to -1? I think it should be
>> mandatory to set some valid port for rest.port and make sure flink job
>> manager does not come up if valid port is not set for rest.port? Or else
>> there must be some way to submit jobs even if REST API (i.e. the web
>> server) is not instantiated.
>>
>> If jobmanger.rpc.address is not required for flink client then why is it
>> still looking for that property in flink-conf.yaml? Isn't it not a bug?
>> Because if we comment out the jobmanger.rpc.address and jobmanger.rpc.port
>> then flink client will not be able to submit the job.
>>
>>
>> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> In 1.5 we reworked the job-submission to go through the REST API instead
>>> of akka.
>>>
>>> I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc
>>> address is still *required *due to some technical implementations; it
>>> may be that you can set this to some arbitrary value however.
>>>
>>> As a result the REST API (i.e. the web server) must be running in order
>>> to submit jobs.
>>>
>>>
>>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>>
>>> Hello
>>>
>>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> cluster.
>>>
>>> In flink 1.4.2 only job manager rpc address and job manager rpc port were
>>> sufficient for flink client to connect to job manager and submit the job.
>>>
>>> But in flink 1.5.0 the flink client additionally requires the
>>> rest.address and rest.port for submitting the job to job manager. What is
>>> the advantage of this new method over the 1.4.2 method of submitting job?
>>>
>>> Moreover if we make rest.port = -1 the web server will not be
>>> instantiated then how should we submit the job?
>>>
>>> Regards
>>> Sampath
>>>
>>>
>>>


Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Sampath Bhat
Hello Till

Thanks for clarification. But I've few questions based on your reply.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname of the rest server.
why is there dependency on jobmanager.rpc.address to get the hostname rest server? This holds good only for normal deployments such as on bare metal, virtual machine where flink cluster runs as another process in a machine. But if we try deploy flink on kubernetes then there could be possiblity where jobmanager.rpc.address and rest.address different from each other.

So if rest.address is not provided in flink-conf.yaml then looking for jobmanager.rpc.address for deriving the hostname of rest server makes sense, but when the user has already provided the rest.address but flink still looks into
jobmanager.rpc.address for getting hostname of rest server is an unwanted dependency IMO.

In HA setup the rpc.address is obtained from zookeeper so user need not worry about unnecessary properties while submitting job.

On Wed, Jun 20, 2018 at 1:25 PM, Till Rohrmann <[hidden email]> wrote:
It will, but it defaults to jobmanager.rpc.address if no rest.address has been specified.

On Wed, Jun 20, 2018 at 9:49 AM Chesnay Schepler <[hidden email]> wrote:
Shouldn't the non-HA case be covered by rest.address?

On 20.06.2018 09:40, Till Rohrmann wrote:
Hi Sampath,

it is no longer possible to not start the rest server endpoint by setting rest.port to -1. If you do this, then the cluster won't start. The comment in the flink-conf.yaml holds only true for the legacy mode.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname of the rest server. The jobmanager.rpc.port is no longer needed for the client but only for the other cluster components (TMs). When using the HA mode, then every address will be retrieved from ZooKeeper.

I hope this clarifies things.

Cheers,
Till

On Wed, Jun 20, 2018 at 9:24 AM Chesnay Schepler <[hidden email]> wrote:
I was worried this might be the case.

The rest.port handling was simply copied from the legacy web-server,
which explicitly allowed shutting it down.
It may (I'm not entirely sure) also not be necessary for all deployment
modes; for example if the job is baked into the job/taskmanager images.

I'm not quite sure whether the rpc address is actually required for the
REST job submission, or only since we still rely partly on some legacy
code (ClusterClient). Maybe Till (cc) knows the answer to that.

> Adding on to this point you made - " the rpc address is still *required *due
> to some technical implementations; it may be that you can set this to some
> arbitrary value however."
>
> For job submission to happen successfully we should give specific rpc
> address and not any arbitrary value. If any arbitrary value is given the
> job submission fails with the following error -
> org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
> retrieve standalone cluster
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:51)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:31)
>          at
> org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:249)
>          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
>          at
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
>          at
> org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
>          at
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>          at
> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
> Caused by: java.net.UnknownHostException: flinktest-flink-jobmanager1233445:
> Name or service not known
>   (Random name flinktest-flink-jobmanager1233445)
>          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
>          at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
>          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>          at java.net.InetAddress.getByName(InetAddress.java:1076)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:171)
>          at
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:136)
>          at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:83)
>          at
> org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:158)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:184)
>          at
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:157)
>          at
> org.apache.flink.client.deployment.StandaloneClusterDescriptor.retrieve(StandaloneClusterDescriptor.java:49)
>          ... 7 more
>
>
> On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <[hidden email]>
> wrote:
>
>> Hi Chesnay
>>
>> If REST API (i.e. the web server) is mandatory for submitting jobs then
>> why is there an option to set rest.port to -1? I think it should be
>> mandatory to set some valid port for rest.port and make sure flink job
>> manager does not come up if valid port is not set for rest.port? Or else
>> there must be some way to submit jobs even if REST API (i.e. the web
>> server) is not instantiated.
>>
>> If jobmanger.rpc.address is not required for flink client then why is it
>> still looking for that property in flink-conf.yaml? Isn't it not a bug?
>> Because if we comment out the jobmanger.rpc.address and jobmanger.rpc.port
>> then flink client will not be able to submit the job.
>>
>>
>> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> In 1.5 we reworked the job-submission to go through the REST API instead
>>> of akka.
>>>
>>> I believe the jobmanager rpc port shouldn't be necessary anymore, the rpc
>>> address is still *required *due to some technical implementations; it
>>> may be that you can set this to some arbitrary value however.
>>>
>>> As a result the REST API (i.e. the web server) must be running in order
>>> to submit jobs.
>>>
>>>
>>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>>
>>> Hello
>>>
>>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> cluster.
>>>
>>> In flink 1.4.2 only job manager rpc address and job manager rpc port were
>>> sufficient for flink client to connect to job manager and submit the job.
>>>
>>> But in flink 1.5.0 the flink client additionally requires the
>>> rest.address and rest.port for submitting the job to job manager. What is
>>> the advantage of this new method over the 1.4.2 method of submitting job?
>>>
>>> Moreover if we make rest.port = -1 the web server will not be
>>> instantiated then how should we submit the job?
>>>
>>> Regards
>>> Sampath
>>>
>>>
>>>



Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Till Rohrmann
Hi,

if the rest.address is different from the jobmanager.rpc.address, then you should specify that in the flink-conf.yaml and Flink will connect to rest.address. Only if rest.address is not specified, the system will fall back to use the jobmanager.rpc.address. Currently, the rest server endpoint runs in the same JVM as the cluster entrypoint and all JobMasters.

Cheers,
Till

On Thu, Jun 21, 2018 at 8:46 AM Sampath Bhat <[hidden email]> wrote:
Hello Till

Thanks for clarification. But I've few questions based on your reply.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname
of the rest server.
why is there dependency on jobmanager.rpc.address to get the hostname rest
server? This holds good only for normal deployments such as on bare metal,
virtual machine where flink cluster runs as another process in a machine.
But if we try deploy flink on kubernetes then there could be possiblity
where jobmanager.rpc.address and rest.address different from each other.

So if rest.address is not provided in flink-conf.yaml then looking for
jobmanager.rpc.address for deriving the hostname of rest server makes
sense, but when the user has already provided the rest.address but flink
still looks into jobmanager.rpc.address for getting hostname of rest server
is an unwanted dependency IMO.

In HA setup the rpc.address is obtained from zookeeper so user need not
worry about unnecessary properties while submitting job.

On Wed, Jun 20, 2018 at 1:25 PM, Till Rohrmann <[hidden email]> wrote:

> It will, but it defaults to jobmanager.rpc.address if no rest.address has
> been specified.
>
> On Wed, Jun 20, 2018 at 9:49 AM Chesnay Schepler <[hidden email]>
> wrote:
>
>> Shouldn't the non-HA case be covered by rest.address?
>>
>> On 20.06.2018 09:40, Till Rohrmann wrote:
>>
>> Hi Sampath,
>>
>> it is no longer possible to not start the rest server endpoint by setting
>> rest.port to -1. If you do this, then the cluster won't start. The comment
>> in the flink-conf.yaml holds only true for the legacy mode.
>>
>> In non-HA setups we need the jobmanager.rpc.address to derive the
>> hostname of the rest server. The jobmanager.rpc.port is no longer needed
>> for the client but only for the other cluster components (TMs). When using
>> the HA mode, then every address will be retrieved from ZooKeeper.
>>
>> I hope this clarifies things.
>>
>> Cheers,
>> Till
>>
>> On Wed, Jun 20, 2018 at 9:24 AM Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> I was worried this might be the case.
>>>
>>> The rest.port handling was simply copied from the legacy web-server,
>>> which explicitly allowed shutting it down.
>>> It may (I'm not entirely sure) also not be necessary for all deployment
>>> modes; for example if the job is baked into the job/taskmanager images.
>>>
>>> I'm not quite sure whether the rpc address is actually required for the
>>> REST job submission, or only since we still rely partly on some legacy
>>> code (ClusterClient). Maybe Till (cc) knows the answer to that.
>>>
>>> > Adding on to this point you made - " the rpc address is still
>>> *required *due
>>> > to some technical implementations; it may be that you can set this to
>>> some
>>> > arbitrary value however."
>>> >
>>> > For job submission to happen successfully we should give specific rpc
>>> > address and not any arbitrary value. If any arbitrary value is given
>>> the
>>> > job submission fails with the following error -
>>> > org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
>>> > retrieve standalone cluster
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:51)
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:31)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.runProgram(
>>> CliFrontend.java:249)
>>> >          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.
>>> java:210)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.parseParameters(
>>> CliFrontend.java:1020)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.lambda$main$9(
>>> CliFrontend.java:1096)
>>> >          at
>>> > org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(
>>> NoOpSecurityContext.java:30)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
>>> > Caused by: java.net.UnknownHostException: flinktest-flink-
>>> jobmanager1233445:
>>> > Name or service not known
>>> >   (Random name flinktest-flink-jobmanager1233445)
>>> >          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>>> >          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.
>>> java:928)
>>> >          at
>>> > java.net.InetAddress.getAddressesFromNameService(
>>> InetAddress.java:1323)
>>> >          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>>> >          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>>> >          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>>> >          at java.net.InetAddress.getByName(InetAddress.java:1076)
>>> >          at
>>> > org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.
>>> getRpcUrl(AkkaRpcServiceUtils.java:171)
>>> >          at
>>> > org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.
>>> getRpcUrl(AkkaRpcServiceUtils.java:136)
>>> >          at
>>> > org.apache.flink.runtime.highavailability.
>>> HighAvailabilityServicesUtils.createHighAvailabilityServices(
>>> HighAvailabilityServicesUtils.java:83)
>>> >          at
>>> > org.apache.flink.client.program.ClusterClient.<init>(
>>> ClusterClient.java:158)
>>> >          at
>>> > org.apache.flink.client.program.rest.RestClusterClient.<init>(
>>> RestClusterClient.java:184)
>>> >          at
>>> > org.apache.flink.client.program.rest.RestClusterClient.<init>(
>>> RestClusterClient.java:157)
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:49)
>>> >          ... 7 more
>>> >
>>> >
>>> > On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <
>>> [hidden email]>
>>> > wrote:
>>> >
>>> >> Hi Chesnay
>>> >>
>>> >> If REST API (i.e. the web server) is mandatory for submitting jobs
>>> then
>>> >> why is there an option to set rest.port to -1? I think it should be
>>> >> mandatory to set some valid port for rest.port and make sure flink job
>>> >> manager does not come up if valid port is not set for rest.port? Or
>>> else
>>> >> there must be some way to submit jobs even if REST API (i.e. the web
>>> >> server) is not instantiated.
>>> >>
>>> >> If jobmanger.rpc.address is not required for flink client then why is
>>> it
>>> >> still looking for that property in flink-conf.yaml? Isn't it not a
>>> bug?
>>> >> Because if we comment out the jobmanger.rpc.address and
>>> jobmanger.rpc.port
>>> >> then flink client will not be able to submit the job.
>>> >>
>>> >>
>>> >> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]
>>> >
>>> >> wrote:
>>> >>
>>> >>> In 1.5 we reworked the job-submission to go through the REST API
>>> instead
>>> >>> of akka.
>>> >>>
>>> >>> I believe the jobmanager rpc port shouldn't be necessary anymore,
>>> the rpc
>>> >>> address is still *required *due to some technical implementations; it
>>> >>> may be that you can set this to some arbitrary value however.
>>> >>>
>>> >>> As a result the REST API (i.e. the web server) must be running in
>>> order
>>> >>> to submit jobs.
>>> >>>
>>> >>>
>>> >>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>> >>>
>>> >>> Hello
>>> >>>
>>> >>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> >>> cluster.
>>> >>>
>>> >>> In flink 1.4.2 only job manager rpc address and job manager rpc port
>>> were
>>> >>> sufficient for flink client to connect to job manager and submit the
>>> job.
>>> >>>
>>> >>> But in flink 1.5.0 the flink client additionally requires the
>>> >>> rest.address and rest.port for submitting the job to job manager.
>>> What is
>>> >>> the advantage of this new method over the 1.4.2 method of submitting
>>> job?
>>> >>>
>>> >>> Moreover if we make rest.port = -1 the web server will not be
>>> >>> instantiated then how should we submit the job?
>>> >>>
>>> >>> Regards
>>> >>> Sampath
>>> >>>
>>> >>>
>>> >>>
>>>
>>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Sampath Bhat
hi
Yes I've specified the rest.address for the flink client to connect to the rest.address and the rest.address is valid and working fine but my question is why am I supposed to give jobmanager.rpc.address for flink client to connect to flink cluster if flink client depends only on rest.address?

On Thu, Jun 21, 2018 at 12:41 PM, Till Rohrmann <[hidden email]> wrote:
Hi,

if the rest.address is different from the jobmanager.rpc.address, then you should specify that in the flink-conf.yaml and Flink will connect to rest.address. Only if rest.address is not specified, the system will fall back to use the jobmanager.rpc.address. Currently, the rest server endpoint runs in the same JVM as the cluster entrypoint and all JobMasters.

Cheers,
Till

On Thu, Jun 21, 2018 at 8:46 AM Sampath Bhat <[hidden email]> wrote:
Hello Till

Thanks for clarification. But I've few questions based on your reply.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname
of the rest server.
why is there dependency on jobmanager.rpc.address to get the hostname rest
server? This holds good only for normal deployments such as on bare metal,
virtual machine where flink cluster runs as another process in a machine.
But if we try deploy flink on kubernetes then there could be possiblity
where jobmanager.rpc.address and rest.address different from each other.

So if rest.address is not provided in flink-conf.yaml then looking for
jobmanager.rpc.address for deriving the hostname of rest server makes
sense, but when the user has already provided the rest.address but flink
still looks into jobmanager.rpc.address for getting hostname of rest server
is an unwanted dependency IMO.

In HA setup the rpc.address is obtained from zookeeper so user need not
worry about unnecessary properties while submitting job.

On Wed, Jun 20, 2018 at 1:25 PM, Till Rohrmann <[hidden email]> wrote:

> It will, but it defaults to jobmanager.rpc.address if no rest.address has
> been specified.
>
> On Wed, Jun 20, 2018 at 9:49 AM Chesnay Schepler <[hidden email]>
> wrote:
>
>> Shouldn't the non-HA case be covered by rest.address?
>>
>> On 20.06.2018 09:40, Till Rohrmann wrote:
>>
>> Hi Sampath,
>>
>> it is no longer possible to not start the rest server endpoint by setting
>> rest.port to -1. If you do this, then the cluster won't start. The comment
>> in the flink-conf.yaml holds only true for the legacy mode.
>>
>> In non-HA setups we need the jobmanager.rpc.address to derive the
>> hostname of the rest server. The jobmanager.rpc.port is no longer needed
>> for the client but only for the other cluster components (TMs). When using
>> the HA mode, then every address will be retrieved from ZooKeeper.
>>
>> I hope this clarifies things.
>>
>> Cheers,
>> Till
>>
>> On Wed, Jun 20, 2018 at 9:24 AM Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> I was worried this might be the case.
>>>
>>> The rest.port handling was simply copied from the legacy web-server,
>>> which explicitly allowed shutting it down.
>>> It may (I'm not entirely sure) also not be necessary for all deployment
>>> modes; for example if the job is baked into the job/taskmanager images.
>>>
>>> I'm not quite sure whether the rpc address is actually required for the
>>> REST job submission, or only since we still rely partly on some legacy
>>> code (ClusterClient). Maybe Till (cc) knows the answer to that.
>>>
>>> > Adding on to this point you made - " the rpc address is still
>>> *required *due
>>> > to some technical implementations; it may be that you can set this to
>>> some
>>> > arbitrary value however."
>>> >
>>> > For job submission to happen successfully we should give specific rpc
>>> > address and not any arbitrary value. If any arbitrary value is given
>>> the
>>> > job submission fails with the following error -
>>> > org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
>>> > retrieve standalone cluster
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:51)
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:31)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.runProgram(
>>> CliFrontend.java:249)
>>> >          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.
>>> java:210)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.parseParameters(
>>> CliFrontend.java:1020)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.lambda$main$9(
>>> CliFrontend.java:1096)
>>> >          at
>>> > org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(
>>> NoOpSecurityContext.java:30)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
>>> > Caused by: java.net.UnknownHostException: flinktest-flink-
>>> jobmanager1233445:
>>> > Name or service not known
>>> >   (Random name flinktest-flink-jobmanager1233445)
>>> >          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>>> >          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.
>>> java:928)
>>> >          at
>>> > java.net.InetAddress.getAddressesFromNameService(
>>> InetAddress.java:1323)
>>> >          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>>> >          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>>> >          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>>> >          at java.net.InetAddress.getByName(InetAddress.java:1076)
>>> >          at
>>> > org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.
>>> getRpcUrl(AkkaRpcServiceUtils.java:171)
>>> >          at
>>> > org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.
>>> getRpcUrl(AkkaRpcServiceUtils.java:136)
>>> >          at
>>> > org.apache.flink.runtime.highavailability.
>>> HighAvailabilityServicesUtils.createHighAvailabilityServices(
>>> HighAvailabilityServicesUtils.java:83)
>>> >          at
>>> > org.apache.flink.client.program.ClusterClient.<init>(
>>> ClusterClient.java:158)
>>> >          at
>>> > org.apache.flink.client.program.rest.RestClusterClient.<init>(
>>> RestClusterClient.java:184)
>>> >          at
>>> > org.apache.flink.client.program.rest.RestClusterClient.<init>(
>>> RestClusterClient.java:157)
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:49)
>>> >          ... 7 more
>>> >
>>> >
>>> > On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <
>>> [hidden email]>
>>> > wrote:
>>> >
>>> >> Hi Chesnay
>>> >>
>>> >> If REST API (i.e. the web server) is mandatory for submitting jobs
>>> then
>>> >> why is there an option to set rest.port to -1? I think it should be
>>> >> mandatory to set some valid port for rest.port and make sure flink job
>>> >> manager does not come up if valid port is not set for rest.port? Or
>>> else
>>> >> there must be some way to submit jobs even if REST API (i.e. the web
>>> >> server) is not instantiated.
>>> >>
>>> >> If jobmanger.rpc.address is not required for flink client then why is
>>> it
>>> >> still looking for that property in flink-conf.yaml? Isn't it not a
>>> bug?
>>> >> Because if we comment out the jobmanger.rpc.address and
>>> jobmanger.rpc.port
>>> >> then flink client will not be able to submit the job.
>>> >>
>>> >>
>>> >> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]
>>> >
>>> >> wrote:
>>> >>
>>> >>> In 1.5 we reworked the job-submission to go through the REST API
>>> instead
>>> >>> of akka.
>>> >>>
>>> >>> I believe the jobmanager rpc port shouldn't be necessary anymore,
>>> the rpc
>>> >>> address is still *required *due to some technical implementations; it
>>> >>> may be that you can set this to some arbitrary value however.
>>> >>>
>>> >>> As a result the REST API (i.e. the web server) must be running in
>>> order
>>> >>> to submit jobs.
>>> >>>
>>> >>>
>>> >>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>> >>>
>>> >>> Hello
>>> >>>
>>> >>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> >>> cluster.
>>> >>>
>>> >>> In flink 1.4.2 only job manager rpc address and job manager rpc port
>>> were
>>> >>> sufficient for flink client to connect to job manager and submit the
>>> job.
>>> >>>
>>> >>> But in flink 1.5.0 the flink client additionally requires the
>>> >>> rest.address and rest.port for submitting the job to job manager.
>>> What is
>>> >>> the advantage of this new method over the 1.4.2 method of submitting
>>> job?
>>> >>>
>>> >>> Moreover if we make rest.port = -1 the web server will not be
>>> >>> instantiated then how should we submit the job?
>>> >>>
>>> >>> Regards
>>> >>> Sampath
>>> >>>
>>> >>>
>>> >>>
>>>
>>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

Till Rohrmann
The reason why you still have to do it is because we still have to support the legacy mode where the client needs to know the JobManager RPC address. Once we remove the legacy mode, we could change the HighAvailabilityServices such that we have client facing HA services which only retrieve the rest server endpoint and cluster internal HA services which need to know the cluster components address at cluster startup.

Cheers,
Till

On Thu, Jun 21, 2018 at 11:38 AM Sampath Bhat <[hidden email]> wrote:
hi
Yes I've specified the rest.address for the flink client to connect to the rest.address and the rest.address is valid and working fine but my question is why am I supposed to give jobmanager.rpc.address for flink client to connect to flink cluster if flink client depends only on rest.address?

On Thu, Jun 21, 2018 at 12:41 PM, Till Rohrmann <[hidden email]> wrote:
Hi,

if the rest.address is different from the jobmanager.rpc.address, then you should specify that in the flink-conf.yaml and Flink will connect to rest.address. Only if rest.address is not specified, the system will fall back to use the jobmanager.rpc.address. Currently, the rest server endpoint runs in the same JVM as the cluster entrypoint and all JobMasters.

Cheers,
Till

On Thu, Jun 21, 2018 at 8:46 AM Sampath Bhat <[hidden email]> wrote:
Hello Till

Thanks for clarification. But I've few questions based on your reply.

In non-HA setups we need the jobmanager.rpc.address to derive the hostname
of the rest server.
why is there dependency on jobmanager.rpc.address to get the hostname rest
server? This holds good only for normal deployments such as on bare metal,
virtual machine where flink cluster runs as another process in a machine.
But if we try deploy flink on kubernetes then there could be possiblity
where jobmanager.rpc.address and rest.address different from each other.

So if rest.address is not provided in flink-conf.yaml then looking for
jobmanager.rpc.address for deriving the hostname of rest server makes
sense, but when the user has already provided the rest.address but flink
still looks into jobmanager.rpc.address for getting hostname of rest server
is an unwanted dependency IMO.

In HA setup the rpc.address is obtained from zookeeper so user need not
worry about unnecessary properties while submitting job.

On Wed, Jun 20, 2018 at 1:25 PM, Till Rohrmann <[hidden email]> wrote:

> It will, but it defaults to jobmanager.rpc.address if no rest.address has
> been specified.
>
> On Wed, Jun 20, 2018 at 9:49 AM Chesnay Schepler <[hidden email]>
> wrote:
>
>> Shouldn't the non-HA case be covered by rest.address?
>>
>> On 20.06.2018 09:40, Till Rohrmann wrote:
>>
>> Hi Sampath,
>>
>> it is no longer possible to not start the rest server endpoint by setting
>> rest.port to -1. If you do this, then the cluster won't start. The comment
>> in the flink-conf.yaml holds only true for the legacy mode.
>>
>> In non-HA setups we need the jobmanager.rpc.address to derive the
>> hostname of the rest server. The jobmanager.rpc.port is no longer needed
>> for the client but only for the other cluster components (TMs). When using
>> the HA mode, then every address will be retrieved from ZooKeeper.
>>
>> I hope this clarifies things.
>>
>> Cheers,
>> Till
>>
>> On Wed, Jun 20, 2018 at 9:24 AM Chesnay Schepler <[hidden email]>
>> wrote:
>>
>>> I was worried this might be the case.
>>>
>>> The rest.port handling was simply copied from the legacy web-server,
>>> which explicitly allowed shutting it down.
>>> It may (I'm not entirely sure) also not be necessary for all deployment
>>> modes; for example if the job is baked into the job/taskmanager images.
>>>
>>> I'm not quite sure whether the rpc address is actually required for the
>>> REST job submission, or only since we still rely partly on some legacy
>>> code (ClusterClient). Maybe Till (cc) knows the answer to that.
>>>
>>> > Adding on to this point you made - " the rpc address is still
>>> *required *due
>>> > to some technical implementations; it may be that you can set this to
>>> some
>>> > arbitrary value however."
>>> >
>>> > For job submission to happen successfully we should give specific rpc
>>> > address and not any arbitrary value. If any arbitrary value is given
>>> the
>>> > job submission fails with the following error -
>>> > org.apache.flink.client.deployment.ClusterRetrieveException: Couldn't
>>> > retrieve standalone cluster
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:51)
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:31)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.runProgram(
>>> CliFrontend.java:249)
>>> >          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.
>>> java:210)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.parseParameters(
>>> CliFrontend.java:1020)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.lambda$main$9(
>>> CliFrontend.java:1096)
>>> >          at
>>> > org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(
>>> NoOpSecurityContext.java:30)
>>> >          at
>>> > org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
>>> > Caused by: java.net.UnknownHostException: flinktest-flink-
>>> jobmanager1233445:
>>> > Name or service not known
>>> >   (Random name flinktest-flink-jobmanager1233445)
>>> >          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>>> >          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.
>>> java:928)
>>> >          at
>>> > java.net.InetAddress.getAddressesFromNameService(
>>> InetAddress.java:1323)
>>> >          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
>>> >          at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>>> >          at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>>> >          at java.net.InetAddress.getByName(InetAddress.java:1076)
>>> >          at
>>> > org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.
>>> getRpcUrl(AkkaRpcServiceUtils.java:171)
>>> >          at
>>> > org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.
>>> getRpcUrl(AkkaRpcServiceUtils.java:136)
>>> >          at
>>> > org.apache.flink.runtime.highavailability.
>>> HighAvailabilityServicesUtils.createHighAvailabilityServices(
>>> HighAvailabilityServicesUtils.java:83)
>>> >          at
>>> > org.apache.flink.client.program.ClusterClient.<init>(
>>> ClusterClient.java:158)
>>> >          at
>>> > org.apache.flink.client.program.rest.RestClusterClient.<init>(
>>> RestClusterClient.java:184)
>>> >          at
>>> > org.apache.flink.client.program.rest.RestClusterClient.<init>(
>>> RestClusterClient.java:157)
>>> >          at
>>> > org.apache.flink.client.deployment.StandaloneClusterDescriptor.
>>> retrieve(StandaloneClusterDescriptor.java:49)
>>> >          ... 7 more
>>> >
>>> >
>>> > On Wed, Jun 20, 2018 at 11:18 AM, Sampath Bhat <
>>> [hidden email]>
>>> > wrote:
>>> >
>>> >> Hi Chesnay
>>> >>
>>> >> If REST API (i.e. the web server) is mandatory for submitting jobs
>>> then
>>> >> why is there an option to set rest.port to -1? I think it should be
>>> >> mandatory to set some valid port for rest.port and make sure flink job
>>> >> manager does not come up if valid port is not set for rest.port? Or
>>> else
>>> >> there must be some way to submit jobs even if REST API (i.e. the web
>>> >> server) is not instantiated.
>>> >>
>>> >> If jobmanger.rpc.address is not required for flink client then why is
>>> it
>>> >> still looking for that property in flink-conf.yaml? Isn't it not a
>>> bug?
>>> >> Because if we comment out the jobmanger.rpc.address and
>>> jobmanger.rpc.port
>>> >> then flink client will not be able to submit the job.
>>> >>
>>> >>
>>> >> On Tue, Jun 19, 2018 at 5:49 PM, Chesnay Schepler <[hidden email]
>>> >
>>> >> wrote:
>>> >>
>>> >>> In 1.5 we reworked the job-submission to go through the REST API
>>> instead
>>> >>> of akka.
>>> >>>
>>> >>> I believe the jobmanager rpc port shouldn't be necessary anymore,
>>> the rpc
>>> >>> address is still *required *due to some technical implementations; it
>>> >>> may be that you can set this to some arbitrary value however.
>>> >>>
>>> >>> As a result the REST API (i.e. the web server) must be running in
>>> order
>>> >>> to submit jobs.
>>> >>>
>>> >>>
>>> >>> On 19.06.2018 14:12, Sampath Bhat wrote:
>>> >>>
>>> >>> Hello
>>> >>>
>>> >>> I'm using Flink 1.5.0 version and Flink CLI to submit jar to flink
>>> >>> cluster.
>>> >>>
>>> >>> In flink 1.4.2 only job manager rpc address and job manager rpc port
>>> were
>>> >>> sufficient for flink client to connect to job manager and submit the
>>> job.
>>> >>>
>>> >>> But in flink 1.5.0 the flink client additionally requires the
>>> >>> rest.address and rest.port for submitting the job to job manager.
>>> What is
>>> >>> the advantage of this new method over the 1.4.2 method of submitting
>>> job?
>>> >>>
>>> >>> Moreover if we make rest.port = -1 the web server will not be
>>> >>> instantiated then how should we submit the job?
>>> >>>
>>> >>> Regards
>>> >>> Sampath
>>> >>>
>>> >>>
>>> >>>
>>>
>>>
>>

sen
Reply | Threaded
Open this post in threaded view
|

Re: Breakage in Flink CLI in 1.5.0

sen
In reply to this post by Till Rohrmann
Hi Till:
       
        So how can we get the right rest address and port when using HA mode
on Yarn?  I get it from the rest api "/jars ". But when I submit a job use
the flink run -m ,it failed .
       
        org.apache.flink.client.program.ProgramInvocationException: Could
not retrieve the execution result.
        at
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:258)
        at
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
        at
org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
        at
org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at
scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
        at scala.App$class.main(App.scala:76)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
        at
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420)
        at
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404)
        at
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:798)
        at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:289)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
        at
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1035)
        at
org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1111)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
        at
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1111)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to
submit JobGraph.
        at
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:371)
        at
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
        at
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
        at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at
org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:203)
        at
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at
org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:795)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException
        ... 8 more




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/