(DEPRECATED) Apache Flink User Mailing List archive.

Issue with Flink not able to properly read the ResourceManager address for a HA setup

Sai Inampudi

Issue with Flink not able to properly read the ResourceManager address for a HA setup

Hi, I am trying to create a Flink cluster on YARN by running the following command, but the logs [1] show that it is unable to connect to the ResourceManager:

~/flink-1.5.4/bin/yarn-session.sh -n 5 -tm 2048 -s 4 -d -nm flink_yarn

I found a Stack Overflow post [2] where someone mentioned that this can happen when the Hadoop version packaged with Flink differs from the Hadoop version on the node, so that Flink cannot properly read the ResourceManager address for an HA setup. However, I confirmed that the versions match in my case: I downloaded flink-1.5.4-bin-hadoop26-scala_2.11, and running hadoop version on the node returns Hadoop 2.6.0-cdh5.14.0. Does anyone have any ideas about what else the issue could be?

Additional info: the cluster I am running this on is Kerberized, so I am not sure whether that plays into the issue. I set up flink-conf to use the Kerberos ticket cache and ran kinit before trying to stand up the cluster. I verified that the ticket cache was generated by running klist (logs in the gist [1]).

[1] https://gist.github.com/sai-inampudi/9e1e823096d2685ed2282827432ef311
[2] https://stackoverflow.com/questions/32085990/error-with-kerberos-authentication-when-executing-flink-example-code-on-yarn-clu

Paul Lam

Re: Issue with Flink not able to properly read the ResourceManager address for a HA setup

Hi Sai,

It looks like the Hadoop config path is not set correctly. You could set the logging level in log4j-cli.properties to DEBUG to get more information.
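As a rough sketch of how the config path could be checked before starting the session: Flink's YARN client looks for the Hadoop configuration through environment variables such as HADOOP_CONF_DIR, so one option is to verify that the variable points at a directory containing a yarn-site.xml with the ResourceManager HA settings. The helper name check_hadoop_conf and the path /etc/hadoop/conf in the comments are assumptions for illustration, not something taken from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink or Hadoop): reports whether a
# directory looks like a usable Hadoop config dir for a YARN HA setup.
check_hadoop_conf() {
  local dir="$1"
  # No directory given, e.g. HADOOP_CONF_DIR is not exported.
  if [ -z "$dir" ]; then
    echo "unset"
    return 1
  fi
  # The YARN client settings live in yarn-site.xml.
  if [ ! -f "$dir/yarn-site.xml" ]; then
    echo "missing yarn-site.xml"
    return 1
  fi
  # An HA setup lists its ResourceManager ids under this property.
  if grep -qF "yarn.resourcemanager.ha.rm-ids" "$dir/yarn-site.xml"; then
    echo "ha"
  else
    echo "no-ha"
  fi
}

# Check whatever the current shell has exported.
check_hadoop_conf "${HADOOP_CONF_DIR:-}" || true

# If the result is not "ha", one thing to try (the path is an assumption;
# use the node's actual Hadoop config location):
#   export HADOOP_CONF_DIR=/etc/hadoop/conf
#   ~/flink-1.5.4/bin/yarn-session.sh -n 5 -tm 2048 -s 4 -d -nm flink_yarn
```

For the debug logging, the change in conf/log4j-cli.properties is typically to switch the root logger level from INFO to DEBUG; the exact line may differ between Flink versions, so it is worth checking the file itself.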

Best,

Paul Lam

In reply to the message from Sai Inampudi <[hidden email]>, sent December 20, 2018, at 03:18.