Flink 1.10 memory and backpressure

Steven Nelson
We are working with a process and having some problems with backpressure.

The backpressure seems to be caused by a simple Window operation, which causes our checkpoints to fail.

What would be the recommendations for debugging the backpressure?

Re: Flink 1.10 memory and backpressure

Zhijiang(wangzhijiang999)
For monitoring backpressure, you can refer to the documentation [1].

As for debugging the backpressure, one option is to take jstack dumps of the window task thread that causes it (usually the task whose input queue is almost full of buffers).
If you take the dumps frequently, you may see which operation (e.g. state access) is expensive, and from that you can probably find the bottleneck.

Besides that, release-1.11 introduces unaligned checkpoints, mainly to resolve checkpoint problems under backpressure. You may want to look into this feature and try it for your case.
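
As a rough sketch of trying this after an upgrade to 1.11: as far as I know, unaligned checkpoints are off by default and can be switched on with the configuration option below, or programmatically via `enableUnalignedCheckpoints()` on the job's `CheckpointConfig`. Treat this as an assumption to verify against the 1.11 documentation for your setup, not as a tested configuration.

```yaml
# flink-conf.yaml, Flink 1.11 or later (assumed; not available in 1.10)
execution.checkpointing.unaligned: true
```

Unaligned checkpoints apply to exactly-once checkpointing and store in-flight buffers as part of the checkpoint, so checkpoint size may grow while the job is backpressured.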

Best,
Zhijiang

------------------------------------------------------------------
From:Steven Nelson <[hidden email]>
Send Time: 2020-06-11 (Thursday) 04:35
To:user <[hidden email]>
Subject:Flink 1.10 memory and backpressure

We are working with a process and having some problems with backpressure.

The backpressure seems to be caused by a simple Window operation, which causes our checkpoints to fail.

What would be the recommendations for debugging the backpressure?


Re: Flink 1.10 memory and backpressure

Zhijiang(wangzhijiang999)

------------------------------------------------------------------
From:Zhijiang <[hidden email]>
Send Time: 2020-06-11 (Thursday) 11:32
To:Steven Nelson <[hidden email]>; user <[hidden email]>
Subject:Re: Flink 1.10 memory and backpressure

For monitoring backpressure, you can refer to the documentation [1].

As for debugging the backpressure, one option is to take jstack dumps of the window task thread that causes it (usually the task whose input queue is almost full of buffers).
If you take the dumps frequently, you may see which operation (e.g. state access) is expensive, and from that you can probably find the bottleneck.

Besides that, release-1.11 introduces unaligned checkpoints, mainly to resolve checkpoint problems under backpressure. You may want to look into this feature and try it for your case.

Best,
Zhijiang
