flink on yarn ha

flink on yarn ha

lining jing
Flink version: 1.1.3

When I kill the JobManager, the job fails. The HA configuration did not work.

Re: flink on yarn ha

Stephan Ewen
Hi!

Flink 1.1.4 and Flink 1.2 fixed a number of HA issues. Can you try one of those versions?

If these also have issues, could you share the logs of the JobManager?

Thanks!

On Tue, Feb 21, 2017 at 11:41 AM, lining jing <[hidden email]> wrote:
flink version: 1.1.3

kill jobmanager, the job fail. Ha config did not work.

Re: flink on yarn ha

lining jing
Thanks, Stephan! I will try it!

2017-02-21 21:42 GMT+08:00 Stephan Ewen <[hidden email]>:
Hi!

Flink 1.1.4 and Flink 1.2 fixed a bunch of issues with HA, can you try those versions?

If these also have issues, could you share the logs of the JobManager?

Thanks!

Re: flink on yarn ha

lining jing
Hi,
I updated Flink from 1.1.3 to 1.2, but it fails.

This is the JobManager error log:

Failed toString() invocation on an object of type [org.apache.hadoop.yarn.api.records.impl.pb.LocalResourcePBImpl]
java.lang.NoSuchMethodError: org.apache.hadoop.security.proto.SecurityProtos.getDescriptor()Lorg/apache/flink/hadoop/shaded/com/google/protobuf/Descriptors$FileDescriptor;
        at org.apache.hadoop.yarn.proto.YarnProtos.<clinit>(YarnProtos.java:47614)
        at org.apache.hadoop.yarn.proto.YarnProtos$LocalResourceProto.internalGetFieldAccessorTable(YarnProtos.java:11542)
        at org.apache.flink.hadoop.shaded.com.google.protobuf.GeneratedMessage.getAllFieldsMutable(GeneratedMessage.java:105)
        at org.apache.flink.hadoop.shaded.com.google.protobuf.GeneratedMessage.getAllFields(GeneratedMessage.java:153)
        at org.apache.flink.hadoop.shaded.com.google.protobuf.TextFormat$Printer.print(TextFormat.java:272)
        at org.apache.flink.hadoop.shaded.com.google.protobuf.TextFormat$Printer.access$400(TextFormat.java:248)
        at org.apache.flink.hadoop.shaded.com.google.protobuf.TextFormat.shortDebugString(TextFormat.java:88)
        at org.apache.hadoop.yarn.api.records.impl.pb.LocalResourcePBImpl.toString(LocalResourcePBImpl.java:77)
        at org.slf4j.helpers.MessageFormatter.safeObjectAppend(MessageFormatter.java:305)
        at org.slf4j.helpers.MessageFormatter.deeplyAppendParameter(MessageFormatter.java:277)
        at org.slf4j.helpers.MessageFormatter.arrayFormat(MessageFormatter.java:231)
        at org.slf4j.helpers.MessageFormatter.format(MessageFormatter.java:124)
        at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:322)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.createTaskManagerContext(YarnApplicationMasterRunner.java:684)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.runApplicationMaster(YarnApplicationMasterRunner.java:331)
        at org.apache.flink.yarn.YarnApplicationMasterRunner$1.call(YarnApplicationMasterRunner.java:203)
        at org.apache.flink.yarn.YarnApplicationMasterRunner$1.call(YarnApplicationMasterRunner.java:200)
        at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.run(YarnApplicationMasterRunner.java:200)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:124)

2017-02-22 8:44 GMT+08:00 lining jing <[hidden email]>:
Thanks, Stephan !  I will try it!

Re: flink on yarn ha

rmetzger0
Hi,
This looks like a shading issue. Can you post to the mailing list the classpath that the JobManager / AppMaster logs on startup?
It seems that Hadoop loads an unshaded version of SecurityProtos.

Maybe there is a Hadoop version mix-up.

Are you using a Hadoop distribution (like CDH or HDP)?
Which Hadoop version are you using?

Which Flink build are you using?
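
One way to check the shading suspicion directly, as a rough sketch rather than an official Flink or Hadoop tool: the NoSuchMethodError in the log says the caller expected SecurityProtos.getDescriptor() to return the Flink-shaded org.apache.flink.hadoop.shaded.com.google.protobuf.Descriptors$FileDescriptor. The small program below prints which jar a class was loaded from and what getDescriptor() actually returns. The class name ShadingCheck and the helpers locationOf and returnTypeOf are made up for this illustration, and the result is only meaningful if the program runs with the same classpath as the JobManager.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

class ShadingCheck {

    // Return type the caller expected, copied from the NoSuchMethodError in the log.
    static final String EXPECTED_RETURN_TYPE =
            "org.apache.flink.hadoop.shaded.com.google.protobuf.Descriptors$FileDescriptor";

    /** Jar or directory the class was loaded from, or a placeholder if the JVM reports none. */
    static String locationOf(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Return type name of a public no-arg method, or null if it is missing or cannot be linked. */
    static String returnTypeOf(Class<?> clazz, String methodName) {
        try {
            Method method = clazz.getMethod(methodName);
            return method.getReturnType().getName();
        } catch (NoSuchMethodException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error; another class name can be passed as an argument.
        String className = args.length > 0
                ? args[0]
                : "org.apache.hadoop.security.proto.SecurityProtos";
        try {
            // initialize=false loads the class without running its static initializer.
            Class<?> clazz = Class.forName(className, false, ShadingCheck.class.getClassLoader());
            String returnType = returnTypeOf(clazz, "getDescriptor");
            System.out.println("loaded from: " + locationOf(clazz));
            System.out.println("getDescriptor() returns: " + returnType);
            System.out.println("matches shaded type: " + EXPECTED_RETURN_TYPE.equals(returnType));
        } catch (ClassNotFoundException | LinkageError e) {
            System.out.println(className + " could not be loaded: " + e);
        }
    }
}
```

If the reported return type is the plain com.google.protobuf.Descriptors$FileDescriptor, the class came from an unshaded Hadoop jar, and the "loaded from" line shows which jar that is.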


On Wed, Feb 22, 2017 at 4:09 AM, lining jing <[hidden email]> wrote:
Hi,
I update flink from 1.1.3 to 1.2

but fail

this is jobManager error log
