Hi,
I am running a Flink application with parallelism 64. I left the checkpoint timeout at its default value of 10 minutes. The state size is less than 1 MB, and I am using the FsStateBackend. The application triggers checkpoints, but all of them fail with "Checkpoint expired before completing". In the checkpoint history I can see that 63 subtasks acknowledged, but one is shown as n/a, and the alignment duration is quite long, about 5m27s. Why does that one subtask not acknowledge? And what influences the alignment duration?
Thanks a lot.
Best,
Henry
Hi,
In my experience, this is most likely because one subtask is blocked in some long-running operation. Try running the TaskManager under a profiler (such as VisualVM) and look for hot spots.
Regards,
Kien
On 10/24/2018 4:02 PM, 徐涛 wrote:
> [...]
Hi Henry,
@Kien is right. Take a thread dump to see what the TaskManager was doing. Also check whether GC is happening frequently.
Best,
Hequn
On Wed, Oct 24, 2018 at 5:03 PM 徐涛 wrote:
> [...]
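For readers who want to try the thread dump and GC check, here is a minimal sketch using only the JDK's standard `java.lang.management` API. The class name `JvmDiagnostics` and its methods are made up for illustration and are not part of Flink. The sketch inspects the JVM it runs in, so it only shows the kind of information to look for: a subtask thread that stays in the same stack frames across several dumps, or a total GC time that grows quickly.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class for illustration; not a Flink API.
public class JvmDiagnostics {

    /** Returns a jstack-like dump of all live threads in the current JVM. */
    public static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false/false: skip locked-monitor and synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // defensive: skip threads that vanished meanwhile
            }
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    /** Total milliseconds spent in GC so far, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(threadDump());
        System.out.println("Total GC time (ms): " + totalGcMillis());
    }
}
```

On a running cluster the same information is usually collected from outside the process, for example with the JDK tools `jstack <pid>` and `jstat -gcutil <pid>` pointed at the TaskManager JVM.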
Hi Hequn & Kien,
The problem is finally solved. It was caused by slow sink writes. Because the job has only 2 tasks, I checked the backpressure and found that the source had high backpressure, so I worked on improving the sink write performance. After that, the end-to-end checkpoint duration is below 1s and the checkpoint timeouts are gone.
Best,
Henry
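To illustrate why a slow sink can make checkpoints expire even with tiny state, here is a deliberately simplified model; it is a sketch, not Flink code. Checkpoint barriers travel in-band with the records, so a barrier cannot reach the sink until the records buffered ahead of it have been written. The class `BarrierDelayModel`, the record count, and the per-record write cost are all hypothetical numbers chosen for the example, not measurements from this job.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical toy model for illustration; not a Flink API.
public class BarrierDelayModel {

    /** Marker object standing in for a checkpoint barrier. */
    private static final Object BARRIER = new Object();

    /**
     * Virtual time in milliseconds until the sink sees the barrier, given the
     * number of records already buffered ahead of it and a fixed write cost
     * per record at the sink.
     */
    public static long millisUntilBarrierReachesSink(int bufferedRecords,
                                                     long sinkMillisPerRecord) {
        Queue<Object> channel = new ArrayDeque<>();
        for (int i = 0; i < bufferedRecords; i++) {
            channel.add("record-" + i);
        }
        channel.add(BARRIER); // the barrier queues up behind earlier records

        long clock = 0;
        while (!channel.isEmpty()) {
            Object next = channel.poll();
            if (next == BARRIER) {
                return clock;
            }
            clock += sinkMillisPerRecord; // the sink finishes this write first
        }
        throw new IllegalStateException("barrier was never enqueued");
    }

    public static void main(String[] args) {
        // Assumed values: 10,000 buffered records at 60 ms per write.
        System.out.println(millisUntilBarrierReachesSink(10_000, 60)); // prints 600000
        // Same backlog with a 1 ms write.
        System.out.println(millisUntilBarrierReachesSink(10_000, 1));  // prints 10000
    }
}
```

With these assumed numbers the barrier needs 600,000 ms, which is exactly the default 10-minute checkpoint timeout, while the faster sink lets it through in 10 seconds. That is consistent with the fix described above: speeding up the sink shortens the time barriers spend waiting, rather than anything related to state size.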
Hi Henry,
Thanks for letting us know.
On Thu, Oct 25, 2018 at 7:34 PM 徐涛 wrote: