Iterations and back pressure problem

spoganshev
We've tried using the iterations feature, and under significant load the job sometimes stalls and stops processing events because of high back pressure, both in the task that produces records for the iteration and in all the other inputs to this task. It looks like a cycle: the task can't handle all the incoming records, and the iteration sink, which loops back into this task, also gets back pressured. This is basically a "back pressure loop" that stops the job completely.

Is there a way to mitigate this (to guarantee that such an issue does not occur)?
Re: Iterations and back pressure problem

Andrey Zagrebin-2

On Mon, Dec 24, 2018 at 7:16 PM Sergei Poganshev <[hidden email]> wrote:
We've tried using the iterations feature, and under significant load the job sometimes stalls and stops processing events because of high back pressure, both in the task that produces records for the iteration and in all the other inputs to this task. It looks like a cycle: the task can't handle all the incoming records, and the iteration sink, which loops back into this task, also gets back pressured. This is basically a "back pressure loop" that stops the job completely.

Is there a way to mitigate this (to guarantee that such an issue does not occur)?
Re: Iterations and back pressure problem

Ken Krugler
Hi Sergey,

As Andrey noted, it’s a known issue with (currently) no good solution.

I talk a bit about how we worked around it on slide 26 of my Flink Forward talk on a Flink-based web crawler.

Basically we do some cheesy approximate monitoring of in-flight data, and throttle the key producer so that (hopefully) network buffers don’t fill up to the point of deadlock.

— Ken
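The throttling idea described above can be sketched roughly as follows. This is not the code from the talk, only a minimal illustration under stated assumptions: `InFlightThrottle`, `tryEmit`, `recordCompleted`, and `currentInFlight` are hypothetical names, and the counter lives in a single JVM, whereas a real Flink job would have to approximate the in-flight count across parallel subtasks.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of Flink): tracks an approximate count of
 * records currently inside the iteration loop and tells the producer when
 * to back off before the loop fills up.
 */
final class InFlightThrottle {
    private final long maxInFlight;
    private final AtomicLong inFlight = new AtomicLong();

    InFlightThrottle(long maxInFlight) {
        if (maxInFlight <= 0) {
            throw new IllegalArgumentException("maxInFlight must be positive");
        }
        this.maxInFlight = maxInFlight;
    }

    /**
     * Called by the producer before emitting a record into the loop.
     * Returns false when the cap is reached, meaning the producer should
     * wait and retry instead of emitting.
     */
    boolean tryEmit() {
        while (true) {
            long current = inFlight.get();
            if (current >= maxInFlight) {
                return false;
            }
            // Retry if another thread changed the counter in the meantime.
            if (inFlight.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Called when a record leaves the loop (fully processed or dropped). */
    void recordCompleted() {
        // Never go below zero, since the count is only approximate.
        inFlight.updateAndGet(n -> n > 0 ? n - 1 : 0);
    }

    long currentInFlight() {
        return inFlight.get();
    }
}
```

The cap (`maxInFlight`) would need to be tuned so that the records in the loop stay well below what the network buffers can hold. That value depends on the job and is an assumption here, not something stated in the thread.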


On Dec 24, 2018, at 8:46 AM, Andrey Zagrebin <[hidden email]> wrote:


On Mon, Dec 24, 2018 at 7:16 PM Sergei Poganshev <[hidden email]> wrote:
We've tried using the iterations feature, and under significant load the job sometimes stalls and stops processing events because of high back pressure, both in the task that produces records for the iteration and in all the other inputs to this task. It looks like a cycle: the task can't handle all the incoming records, and the iteration sink, which loops back into this task, also gets back pressured. This is basically a "back pressure loop" that stops the job completely.

Is there a way to mitigate this (to guarantee that such an issue does not occur)?
