Flink client trying to submit jobs to old session cluster (which was killed)


Pankaj Chand
Hello,

When using Flink on YARN in session mode, each Flink job client automatically knows which YARN cluster to connect to; the documentation mentions this somewhere.

So, I killed the Flink session cluster by simply killing the YARN application with the `yarn application -kill` command. However, after starting a new Flink session cluster and trying to submit Flink jobs to the yarn-session, Flink complained that the old cluster (it gave the port number and YARN application ID) was not available.

It seems that the details of the old cluster were still stored somewhere in Flink, so I had to completely replace the Flink folder with a new one.

Does anyone know the proper way to kill a Flink+YARN session cluster so that it is completely removed and jobs will get submitted to the new Flink session cluster?

Thanks,

Pankaj
Re: Flink client trying to submit jobs to old session cluster (which was killed)

vino yang
Hi Pankaj,

Can you tell us which Flink version you are using? And could you share the Flink client and JobManager logs with us?

This information would help us locate your problem.

Best,
Vino

Pankaj Chand <[hidden email]> wrote on Thursday, Dec 12, 2019 at 7:08 PM:
Re: Flink client trying to submit jobs to old session cluster (which was killed)

Kostas Kloudas-2
Hi Pankaj,

When you start a session cluster with the bin/yarn-session.sh script, Flink will create the cluster and then write a "Yarn properties file" named ".yarn-properties-YOUR_USER_NAME" into a directory: either the one specified by the "yarn.properties-file.location" option in flink-conf.yaml, or your local System.getProperty("java.io.tmpdir"). This file contains the applicationId of the cluster and will be picked up by any future calls to `flink run`. Could you check whether this file exists and whether it is updated every time you create a cluster?
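For reference, a quick way to inspect that file from the shell might look like the sketch below. It assumes the default java.io.tmpdir location (usually /tmp) and that the file stores the id under an `applicationID` key; check your actual file for the exact path and keys.

```shell
# Sketch: print the application id cached in the Yarn properties file.
# Assumes the default tmpdir location and an "applicationID=..." entry.
props_file="${TMPDIR:-/tmp}/.yarn-properties-$USER"
if [ -f "$props_file" ]; then
  # This is the id that the next `flink run` would silently pick up.
  app_id=$(grep '^applicationID=' "$props_file" | cut -d= -f2)
  echo "cached session cluster: $app_id"
else
  echo "no Yarn properties file at $props_file"
fi
```

If the cached id belongs to a killed application, that would explain the client trying to reach the dead cluster.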

Thanks,
Kostas

On Thu, Dec 12, 2019 at 2:22 PM vino yang <[hidden email]> wrote:

>
> Hi Pankaj,
>
> Can you tell us what's Flink version do you use?  And can you share the Flink client and job manager log with us?
>
> This information would help us to locate your problem.
>
> Best,
> Vino
>
> Pankaj Chand <[hidden email]> 于2019年12月12日周四 下午7:08写道:
>>
>> Hello,
>>
>> When using Flink on YARN in session mode, each Flink job client would automatically know the YARN cluster to connect to. It says this somewhere in the documentation.
>>
>> So, I killed the Flink session cluster by simply killing the YARN application using the "yarn kill" command. However, when starting a new Flink session cluster and trying to submit Flink jobs to yarn-session, Flink complains that the old cluster (it gave the port number and YARN application ID) is not available.
>>
>> It seems like the details of the old cluster were still stored somewhere in Flink. So, I had to completely replace the Flink folder with a new one.
>>
>> Does anyone know the proper way to kill a Flink+YARN session cluster to completely remove it so that jobs will get submitted to a new Flink session cluster?
>>
>> Thanks,
>>
>> Pankaj
Re: Flink client trying to submit jobs to old session cluster (which was killed)

Pankaj Chand
Vino and Kostas:

Thank you for the info!

I was using Flink 1.9.1 with Pre-bundled Hadoop 2.7.5.

Cloudlab has quarantined my cluster experiment without notice 😕, so I'll let you know if and when they allow me to access the files in the future.

Regards,

Pankaj

On Thu, Dec 12, 2019 at 8:35 AM Kostas Kloudas <[hidden email]> wrote:
Re: Flink client trying to submit jobs to old session cluster (which was killed)

Yang Wang
Hi Pankaj,

Always using `-yid` to submit a Flink job to an existing YARN session cluster is the safe way. For example: `flink run -d -yid application_1234 examples/streaming/WordCount.jar`. Note that the magic properties file may be removed in the future.
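To make the client forget a killed session, one possibility is to delete the cached properties file by hand after tearing the cluster down. A minimal sketch, assuming the default tmpdir location and a placeholder application id:

```shell
# Sketch: tear down a session cluster and clear the client-side cache.
# application_1234_0001 is a placeholder; substitute your real id.
APP_ID="application_1234_0001"

# Kill the YARN application (needs a running YARN, so shown commented out):
# yarn application -kill "$APP_ID"

# Delete the stale Yarn properties file so the next `flink run` does not
# try to reach the dead cluster (assumes the default tmpdir location).
props_file="${TMPDIR:-/tmp}/.yarn-properties-$USER"
rm -f "$props_file"
echo "cleared $props_file"
```

With the file gone, the client no longer has an old application id to pick up, and `-yid` can point it explicitly at the new session.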

Best,

Yang


Pankaj Chand <[hidden email]> wrote on Friday, Dec 13, 2019 at 1:16 PM: