The rpc invocation size exceeds the maximum akka framesize when the job was re submitted.

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

The rpc invocation size exceeds the maximum akka framesize when the job was re submitted.

Joshua Fan
hi,

We have a flink job platform which will resubmit the job when the job failed without platform user involvement. Today a resubmit failed because of the error below, I changed the akka.Frameszie, and the resubmit succeed. My question is, there is nothing change to the job, the jar, the program, or the arguments, why the error suddenly happened? 

java.io.IOException: The rpc invocation size exceeds the maximum akka framesize.
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.createRpcInvocationMessage(AkkaInvocationHandler.java:247)
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.invokeRpc(AkkaInvocationHandler.java:196)
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.invoke(AkkaInvocationHandler.java:125)
	at com.sun.proxy.$Proxy28.submitTask(Unknown Source)
	at org.apache.flink.runtime.jobmaster.RpcTaskManagerGateway.submitTask(RpcTaskManagerGateway.java:99)
	at org.apache.flink.runtime.executiongraph.Execution.deploy(Execution.java:614)
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.lambda$scheduleEager$2(ExecutionGraph.java:970)
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
	at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
	at org.apache.flink.runtime.concurrent.FutureUtils$ResultConjunctFuture.handleCompletedFuture(FutureUtils.java:542)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
	at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:774)
	at akka.dispatch.OnComplete.internal(Future.scala:259)
	at akka.dispatch.OnComplete.internal(Future.scala:256)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
	at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)
	at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
	at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
	at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:534)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:19)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:18)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:434)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:433)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
	at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
	at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Thanks
Joshua
Reply | Threaded
Open this post in threaded view
|

Re: The rpc invocation size exceeds the maximum akka framesize when the job was re submitted.

Till Rohrmann
Hi Joshua,

this is hard to tell just from the stack trace. One thing I could imagine is that you regenerate the JobGraph by running the main method of the user code again and that the user code contains some non deterministic component which varies in size and influences what you need to ship as part of the user code functions. But this is just guessing.

Cheers,
Till

On Wed, Aug 19, 2020 at 8:48 AM Joshua Fan <[hidden email]> wrote:
hi,

We have a flink job platform which will resubmit the job when the job failed without platform user involvement. Today a resubmit failed because of the error below, I changed the akka.Frameszie, and the resubmit succeed. My question is, there is nothing change to the job, the jar, the program, or the arguments, why the error suddenly happened? 

java.io.IOException: The rpc invocation size exceeds the maximum akka framesize.
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.createRpcInvocationMessage(AkkaInvocationHandler.java:247)
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.invokeRpc(AkkaInvocationHandler.java:196)
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.invoke(AkkaInvocationHandler.java:125)
	at com.sun.proxy.$Proxy28.submitTask(Unknown Source)
	at org.apache.flink.runtime.jobmaster.RpcTaskManagerGateway.submitTask(RpcTaskManagerGateway.java:99)
	at org.apache.flink.runtime.executiongraph.Execution.deploy(Execution.java:614)
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.lambda$scheduleEager$2(ExecutionGraph.java:970)
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
	at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
	at org.apache.flink.runtime.concurrent.FutureUtils$ResultConjunctFuture.handleCompletedFuture(FutureUtils.java:542)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
	at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:774)
	at akka.dispatch.OnComplete.internal(Future.scala:259)
	at akka.dispatch.OnComplete.internal(Future.scala:256)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
	at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)
	at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
	at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
	at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:534)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:19)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:18)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:434)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:433)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
	at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
	at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Thanks
Joshua
Reply | Threaded
Open this post in threaded view
|

Re: The rpc invocation size exceeds the maximum akka framesize when the job was re submitted.

shravan
Hi,

We are observing the same error as well with regard to "The rpc invocation
size exceeds the maximum akka framesize.", and have follow-up questions on
the same.

Why we face this issue, how can we know the expected size for which it is
failing? The error message does not indicate that. Does the operator state
have any impact on the expected Akka frame size? What is the impact of
increasing it?

Awaiting a response.

Regards,
Shravan



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/