Hello!

We are running Flink on Yarn and we are currently getting the following error:

2019-08-23 06:11:01,534 WARN org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:XXXX (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
2019-08-23 06:11:01,535 WARN org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
2019-08-23 06:11:01,536 WARN org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as: XXXX (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
2019-08-23 06:11:01,581 WARN org.apache.hadoop.io.retry.RetryInvocationHandler - Exception while invoking ApplicationMasterProtocolPBClientImpl.allocate over rm0. Not retrying because Invalid or Cancelled Token
org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1564713228886_5299648_000001
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
    at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
    at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:288)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:206)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:188)
    at com.sun.proxy.$Proxy26.allocate(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:277)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1409)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
    at com.sun.proxy.$Proxy25.allocate(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
    ... 9 more

The Flink cluster runs ok for a while but then after a day we get this error again. We haven’t made changes to our code so that’s why it’s hard to understand why all of a sudden we started to see this. We found this issue reported on Yarn: https://issues.apache.org/jira/browse/YARN-3103 but our version of Yarn already has that fix.

Any help will be appreciated.

Thank you,
Juan
|
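Since the error reportedly comes back after about a day, it may help to pull the timestamps of every "Invalid AMRMToken" warning out of the job manager log and see whether they recur at a regular interval. The following is only a minimal sketch: the function name `find_invalid_token_events` is hypothetical, and the regular expression assumes the log layout shown in the post above (timestamp, level, logger, " - ", message).

```python
import re
from datetime import datetime

# Assumed layout, taken from the log lines in the post:
# "<date> <time>,<millis> <LEVEL> <logger> - <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)
ATTEMPT = re.compile(r"appattempt_\d+_\d+_\d+")


def find_invalid_token_events(lines):
    """Return (timestamp, logger, app attempt id) for each log line
    whose message mentions an invalid AMRMToken."""
    events = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m or "Invalid AMRMToken" not in m.group("msg"):
            continue  # stack-trace lines and unrelated messages are skipped
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        attempt = ATTEMPT.search(m.group("msg"))
        events.append((ts, m.group("logger"), attempt.group(0) if attempt else None))
    return events


# Two lines copied from the log in the post.
SAMPLE = [
    "2019-08-23 06:11:01,535 WARN org.apache.hadoop.ipc.Client - Exception "
    "encountered while connecting to the server : "
    "org.apache.hadoop.ipc.RemoteException("
    "org.apache.hadoop.security.token.SecretManager$InvalidToken): "
    "Invalid AMRMToken from appattempt_1564713228886_5299648_000001",
    "2019-08-23 06:11:01,581 WARN org.apache.hadoop.io.retry.RetryInvocationHandler - "
    "Exception while invoking ApplicationMasterProtocolPBClientImpl.allocate over rm0. "
    "Not retrying because Invalid or Cancelled Token",
]

for ts, logger, attempt in find_invalid_token_events(SAMPLE):
    print(ts.isoformat(), logger, attempt)
```

Run over a full log file (for example `find_invalid_token_events(open(path))`, with `path` pointing at the job manager log), the gaps between consecutive timestamps show whether the failures line up with a fixed period.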
Hi Juan,

Have you tried a Flink release built with Hadoop 2.7 or later? If you are using Flink 1.8/1.9, that would be the pre-bundled Hadoop 2.7+ jar, which can be found on the Flink download page. I think YARN-3103 concerns AMRMClientImpl.class, which is in the Flink shaded Hadoop jar, so the fix needs to be in the Hadoop version Flink ships with, not only in your Yarn cluster.

Thanks,
Zhu Zhu

Juan Gentile <[hidden email]> wrote on Fri, Aug 23, 2019 at 7:48 PM:
|
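The suggestion above comes down to checking whether the Hadoop version in use is 2.7 or later. A minimal sketch of that comparison follows, assuming the version string is the first line printed by `hadoop version` (for example "Hadoop 2.7.3"); the function name `hadoop_version_at_least` is hypothetical. Note that this reads the cluster-side version only; the version bundled in the Flink shaded Hadoop jar has to be checked separately.

```python
import re


def hadoop_version_at_least(version_line, minimum=(2, 7)):
    """Return True if the major.minor version found in version_line
    is greater than or equal to `minimum`."""
    m = re.search(r"(\d+)\.(\d+)", version_line)
    if m is None:
        raise ValueError(f"no version number found in {version_line!r}")
    # Tuples compare element by element, so (3, 1) >= (2, 7) holds.
    return (int(m.group(1)), int(m.group(2))) >= minimum


print(hadoop_version_at_least("Hadoop 2.7.3"))
print(hadoop_version_at_least("Hadoop 2.6.0"))
```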
This looks like your Kerberos server is starting to issue invalid tokens to your job manager. Can you share how your Kerberos settings are configured? This might also relate to how your KDC servers are configured.

--
Rong

On Fri, Aug 23, 2019 at 7:00 AM Zhu Zhu <[hidden email]> wrote:
|