UnknownHostException during start

Dominique Rondé-2
Dear all,

I ran into trouble when starting Flink in a YARN container on a
Cloudera cluster. My start script looks like this:

slaxxxx:/applvg/home/flink/mvp $ cat run.sh
export FLINK_HOME_DIR=/applvg/home/flink/mvp/flink-1.2.0/
export FLINK_JAR_DIR=/applvg/home/flink/mvp/cache
export YARN_CONF_DIR=/etc/hadoop/conf
export HADOOP_CONF_DIR=/etc/hadoop/conf


/applvg/home/flink/mvp/flink-1.2.0/bin/yarn-session.sh -n 4 -s 3 -st -jm 2048 -tm 2048 -qu root.mr-spark.avp -d

When I run this script, the output looks like this:

sla09037:/applvg/home/flink/mvp $ ./run.sh
2017-05-11 15:13:24,541 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2017-05-11 15:13:24,542 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2017-05-11 15:13:24,542 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.mb, 256
2017-05-11 15:13:24,543 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.mb, 512
2017-05-11 15:13:24,543 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2017-05-11 15:13:24,543 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.preallocate, false
2017-05-11 15:13:24,543 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2017-05-11 15:13:24,543 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.web.port, 8081
2017-05-11 15:13:24,571 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2017-05-11 15:13:24,572 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2017-05-11 15:13:24,572 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.mb, 256
2017-05-11 15:13:24,572 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.mb, 512
2017-05-11 15:13:24,572 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2017-05-11 15:13:24,572 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.preallocate, false
2017-05-11 15:13:24,572 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2017-05-11 15:13:24,572 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.web.port, 8081
2017-05-11 15:13:25,000 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to [hidden email] (auth:KERBEROS)
2017-05-11 15:13:25,030 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2017-05-11 15:13:25,030 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2017-05-11 15:13:25,030 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.mb, 256
2017-05-11 15:13:25,030 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.mb, 512
2017-05-11 15:13:25,031 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2017-05-11 15:13:25,031 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.preallocate, false
2017-05-11 15:13:25,031 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2017-05-11 15:13:25,031 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.web.port, 8081
2017-05-11 15:13:25,050 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Using values:
2017-05-11 15:13:25,051 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   -   TaskManager count = 4
2017-05-11 15:13:25,051 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   -   JobManager memory = 2048
2017-05-11 15:13:25,051 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   -   TaskManager memory = 2048
2017-05-11 15:13:25,903 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-05-11 15:13:25,962 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - The configuration directory ('/applvg/home/flink/mvp/flink-1.2.0/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2017-05-11 15:13:25,972 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/applvg/home/flink/mvp/flink-1.2.0/lib to hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/lib
2017-05-11 15:13:27,522 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/applvg/home/flink/mvp/flink-1.2.0/conf/log4j.properties to hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/log4j.properties
2017-05-11 15:13:27,552 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/applvg/home/flink/mvp/flink-1.2.0/conf/logback.xml to hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/logback.xml
2017-05-11 15:13:27,584 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/applvg/home/flink/mvp/flink-1.2.0/lib/flink-dist_2.11-1.2.0.jar to hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/flink-dist_2.11-1.2.0.jar
2017-05-11 15:13:28,508 INFO  org.apache.flink.yarn.Utils                                   - Copying from /applvg/home/flink/mvp/flink-1.2.0/conf/flink-conf.yaml to hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/flink-conf.yaml
2017-05-11 15:13:28,553 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Adding delegation token to the AM container..
2017-05-11 15:13:28,563 INFO  org.apache.hadoop.hdfs.DFSClient                              - Created HDFS_DELEGATION_TOKEN token 27247 for flink on ha-hdfs:nameservice1
Error while deploying YARN cluster: Couldn't deploy Yarn cluster
java.lang.RuntimeException: Couldn't deploy Yarn cluster
        at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:421)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:620)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
        at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: lfrar256.srv.company;lfrar257.srv.company
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
        at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:823)
        at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
        at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
        at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
        at org.apache.flink.yarn.Utils.setTokensFor(Utils.java:154)
        at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:753)
        at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:419)
        ... 9 more
Caused by: java.net.UnknownHostException: lfrarXXX1.srv.company;lfrarXXX2.srv.company
        ... 20 more
       
It seems that Flink found these hosts here:
slaxxxxx:/applvg/home/flink/mvp $ grep -r "lfrarXXX1.srv.company;lfrarXXX2.srv.company" /etc/hadoop/conf
/etc/hadoop/conf/core-site.xml:  <value>kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms</value>
/etc/hadoop/conf/hdfs-site.xml:  <value>kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms</value>

So my guess is that Flink takes this connection string from the
Cloudera configuration and "forgets" to split it at the ";". If I ping
each of those hosts individually, both are reachable.
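
To illustrate that guess, here is a minimal shell sketch of what
splitting the value at ";" would yield. The function name
split_kms_hosts is hypothetical and is not part of Flink or Hadoop; the
sketch only assumes the kms://<protocol>@<host>;<host>:<port>/<path>
layout seen in the configuration files.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink or Hadoop): print each host found in
# the authority part of a kms:// URI, one per line, splitting at ";".
split_kms_hosts() {
  local uri="$1"
  local authority="${uri#kms://}"   # drop the scheme prefix
  authority="${authority%%/*}"      # drop the path ("/kms")
  authority="${authority#*@}"       # drop the "https@" protocol marker
  authority="${authority%:*}"       # drop the shared port
  printf '%s\n' "$authority" | tr ';' '\n'
}

split_kms_hosts 'kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms'
```

On Linux, piping the output through `xargs -n1 getent hosts` would look
up each name separately, which corresponds to pinging the hosts one by
one rather than resolving the whole ";"-joined string.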

Do you have any hints on how to avoid this problem?

Best wishes
Dominique

Re: UnknownHostException during start

Till Rohrmann

Hi Dominique,

I’m not exactly sure, but this looks more like a Hadoop or Hadoop configuration problem to me. Could it be that the Hadoop version you’re running does not support specifying multiple KMS servers via kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms?
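
One way to check this might be to look at which KMS client classes the Flink client can load. The following is only a sketch: it assumes that the Hadoop classes come from the flink-dist jar named in the log (they could just as well come from elsewhere on the classpath) and that `unzip` is installed; the function name list_kms_classes is made up for illustration. As far as I know, Hadoop clients that understand a ";"-separated host list ship a class called LoadBalancingKMSClientProvider in this package, so its absence from the listing would be a hint rather than proof.

```shell
#!/usr/bin/env bash
# Path taken from the log output earlier in this thread.
FLINK_DIST_JAR=/applvg/home/flink/mvp/flink-1.2.0/lib/flink-dist_2.11-1.2.0.jar

# Hypothetical helper: print the KMS client classes contained in a jar.
list_kms_classes() {
  local jar="$1"
  # `unzip -l` prints the entry name in the last column of each row.
  unzip -l "$jar" 2>/dev/null \
    | awk '{print $NF}' \
    | grep '^org/apache/hadoop/crypto/key/kms/.*\.class$'
}

# grep exits non-zero when nothing matches, which triggers the fallback message.
list_kms_classes "$FLINK_DIST_JAR" \
  || echo "no KMS client classes found in $FLINK_DIST_JAR"
```

Running the same function against the hadoop-common jar of the cluster installation would show whether the two class lists differ.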

Cheers,
Till


Re: UnknownHostException during start

Ted Yu
Dominique:
Which Hadoop release are you using?

Please pastebin the classpath.
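
Something along these lines should collect both pieces of information; it is a sketch that assumes the `hadoop` command is on the PATH of the machine that runs run.sh, and the function name report_hadoop_env is just an illustration.

```shell
#!/usr/bin/env bash
# Print the Hadoop release and the Hadoop classpath in a form that is easy
# to paste. Assumes the `hadoop` CLI is available on this machine.
report_hadoop_env() {
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop CLI not found on PATH"
    return 1
  fi
  echo "== hadoop version =="
  hadoop version
  echo "== hadoop classpath (one entry per line) =="
  hadoop classpath | tr ':' '\n'
}

report_hadoop_env || true
```

Note that this reports the cluster's Hadoop installation; the Hadoop classes that the Flink client itself uses may differ if they are bundled with the Flink distribution.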

Cheers

Re: UnknownHostException during start

wangpeibin
I'm hitting the same problem, and I'm using Hadoop 2.6.0-cdh5.7.1. Thanks!