Re: ayncIO & TM akka response
Posted by
Piotr Nowojski on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/ayncIO-TM-akka-response-tp17165p17177.html
Hi,
Please search the task manager logs for the potential reason of failure/disconnecting around the time when you got this error on the job manager. There should be some clearly visible exception.
Thanks, Piotrek
Hi there,
In recent, our production fink jobs observed some weird performance issue. When job tailing kafka source failed and try to catch up, asyncIO after event trigger get much higher load on task thread. Since each TM allocated two virtual CPU in docker, my assumption was akka message between JM and TM shouldn't be impacted.
What I observed was TM get closed and keep restart with same error message below. Any suggestion is appreciated!
org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connection unexpectedly closed by remote task manager 'xxxxxxx
/xxxxxxx
:5841'. This might indicate that the remote task manager was lost.
at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:829)
at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:610)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)