FlinkSQL submit query and then the jobmanager failed.

FlinkSQL submit query and then the jobmanager failed.

yidan zhao
As the title says, my SQL query is very simple: it just selects all columns from a Hive table (version 1.2.1; ORC format). A few seconds after the query is submitted, the JobManager fails. Here is the JobManager's log.
Can anyone help with this problem?
2021-01-24 04:41:24,952 ERROR org.apache.flink.runtime.util.FatalExitExceptionHandler      [] - FATAL: Thread 'flink-akka.actor.default-dispatcher-2' produced an uncaught exception. Stopping the process...

java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Failed to start the operator coordinators
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) ~[?:1.8.0_251]
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) ~[?:1.8.0_251]
at java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:722) ~[?:1.8.0_251]
at java.util.concurrent.CompletableFuture.uniRunStage(CompletableFuture.java:731) ~[?:1.8.0_251]
at java.util.concurrent.CompletableFuture.thenRun(CompletableFuture.java:2023) ~[?:1.8.0_251]
at org.apache.flink.runtime.jobmaster.JobMaster.resetAndStartScheduler(JobMaster.java:935) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.jobmaster.JobMaster.startJobExecution(JobMaster.java:801) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.jobmaster.JobMaster.lambda$start$1(JobMaster.java:357) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleCallAsync(AkkaRpcActor.java:383) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:88) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dists-extended_2.11-1.12.0.jar:?]
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dists-extended_2.11-1.12.0.jar:?]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) [flink-dists-extended_2.11-1.12.0.jar:?]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dists-extended_2.11-1.12.0.jar:?]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.actor.Actor$class.aroundReceive(Actor.scala:517) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dists-extended_2.11-1.12.0.jar:?]
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dists-extended_2.11-1.12.0.jar:?]
Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to start the operator coordinators
at org.apache.flink.runtime.scheduler.SchedulerBase.startAllOperatorCoordinators(SchedulerBase.java:1100) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.scheduler.SchedulerBase.startScheduling(SchedulerBase.java:567) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.jobmaster.JobMaster.startScheduling(JobMaster.java:944) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:719) ~[?:1.8.0_251]
... 27 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
at org.apache.hadoop.hive.common.ValidReadTxnList.readFromString(ValidReadTxnList.java:142) ~[?:?]
at org.apache.hadoop.hive.common.ValidReadTxnList.<init>(ValidReadTxnList.java:57) ~[?:?]
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.<init>(OrcInputFormat.java:421) ~[?:?]
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:983) ~[?:?]
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048) ~[?:?]
at org.apache.flink.connectors.hive.HiveSourceFileEnumerator.createInputSplits(HiveSourceFileEnumerator.java:86) ~[?:?]
at org.apache.flink.connectors.hive.HiveSourceFileEnumerator.enumerateSplits(HiveSourceFileEnumerator.java:57) ~[?:?]
at org.apache.flink.connector.file.src.AbstractFileSource.createEnumerator(AbstractFileSource.java:140) ~[flink-table_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.connectors.hive.HiveSource.createEnumerator(HiveSource.java:115) ~[?:?]
at org.apache.flink.runtime.source.coordinator.SourceCoordinator.start(SourceCoordinator.java:119) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.operators.coordination.RecreateOnResetOperatorCoordinator$DeferrableCoordinator.applyCall(RecreateOnResetOperatorCoordinator.java:308) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.operators.coordination.RecreateOnResetOperatorCoordinator.start(RecreateOnResetOperatorCoordinator.java:72) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder.start(OperatorCoordinatorHolder.java:182) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.scheduler.SchedulerBase.startAllOperatorCoordinators(SchedulerBase.java:1094) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.scheduler.SchedulerBase.startScheduling(SchedulerBase.java:567) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at org.apache.flink.runtime.jobmaster.JobMaster.startScheduling(JobMaster.java:944) ~[flink-dists-extended_2.11-1.12.0.jar:?]
at java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:719) ~[?:1.8.0_251]
... 27 more
2021-01-24 04:41:24,963 INFO org.apache.flink.runtime.blob.BlobServer [] - Stopped BLOB server at 0.0.0.0:13146

Re: FlinkSQL submit query and then the jobmanager failed.

Matthias
Hi,
thanks for reaching out to the community. I'm neither a Hive nor an ORC format expert, but could this be a configuration problem? The error is caused by an ArrayIndexOutOfBoundsException in ValidReadTxnList.readFromString, thrown on an array generated by splitting a String using colons as separators [1]. This method processes the value of the configuration parameter hive.txn.valid.txns. Could it be that this parameter is defined but not set properly (for instance, a value containing no colon at all might cause this exception)? Or is this parameter not set by the user themselves?
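
To illustrate the suspected failure mode, here is a minimal Java sketch. It is not the Hive source code: the class TxnListSketch and the methods parseColonSeparated and failsWithIndexError are hypothetical names, and the assumption that the parser reads at least two colon-separated fields is only inferred from the "ArrayIndexOutOfBoundsException: 1" in the log. It merely shows how a value without a colon could lead to that kind of exception.

```java
public class TxnListSketch {

    // Hypothetical parser, NOT the Hive implementation.
    // Assumption: the value is expected to look like "<first>:<second>[:...]".
    static long[] parseColonSeparated(String value) {
        String[] parts = value.split(":");
        long first = Long.parseLong(parts[0]);
        // If the value contains no colon, parts has length 1 and this
        // access throws ArrayIndexOutOfBoundsException (index 1).
        long second = Long.parseLong(parts[1]);
        return new long[] {first, second};
    }

    // Returns true if parsing the value fails with an index error.
    static boolean failsWithIndexError(String value) {
        try {
            parseColonSeparated(value);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Value with a colon: both fields are present.
        System.out.println(failsWithIndexError("100:5"));
        // Value without a colon: the second field is missing.
        System.out.println(failsWithIndexError("100"));
    }
}
```

If the real parameter value in your setup turns out to have a shape like the second case, that would be consistent with the stack trace, but it would still need to be checked against the Hive version actually on the classpath.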

Best,
Matthias


On Sun, Jan 24, 2021 at 2:48 PM 赵一旦 <[hidden email]> wrote: