Hi,

As Flink doesn't provide out-of-the-box support for autoscaling, can backpressure be considered an alternative to it?

Autoscaling allows us to add or remove nodes as load goes up or down. With backpressure, if load goes up, the system signals upstream to release data more slowly, so we don't need to add more hardware horizontally.

Is this correct conceptually and practically?

Manish
I'd say the two can't be considered equivalent because the back pressure does not "reach" back into the source system. It only goes as far back as the Flink source. So if the outside system produces data too fast into the queue from which Flink is reading, this input would keep piling up.

Best,
Aljoscha

On 06.05.20 07:05, Manish G wrote:
> Hi,
>
> As flink doesn't provide out-of-box support for autoscaling, can
> backpressure be considered as an alternative to it?
> Autoscaling allows us to add/remove nodes as load goes up/down.
> With backpressure, if load goes up system would signal upstream to release
> data slowly. So we don't need to add more hardware horizontally.
> Is it correct conceptually and practically?
>
> Manish
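Aljoscha's point can be illustrated with a toy calculation. The sketch below is not Flink code; it only models a queue between an external producer and a Flink source, under the assumption that both rates are constant. The function name and the rates are hypothetical, chosen for illustration.

```python
def simulate_backlog(produce_rate, consume_rate, seconds):
    """Return the queue backlog (in records) at the end of each second.

    produce_rate: records/s written by the external system (assumed constant)
    consume_rate: max records/s the Flink job can read (assumed constant)
    """
    backlog = 0
    history = []
    for _ in range(seconds):
        backlog += produce_rate                # the producer is never slowed down
        backlog -= min(backlog, consume_rate)  # the job drains at most its capacity
        history.append(backlog)
    return history


# Producer writes 1200 records/s, the job can only read 1000 records/s.
print(simulate_backlog(produce_rate=1200, consume_rate=1000, seconds=5))
# → [200, 400, 600, 800, 1000]
```

Backpressure limits how fast the job reads, but the backlog in the queue still grows by the difference between the two rates every second.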
Hi Manish,

While you could use backpressure and the resulting consumer lag to throttle the source and keep processing lag to a minimum, I personally see only very limited value in that. It assumes an architecture in which you can influence the input rate, which is probably only true if you generate the data yourself or apply some kind of sampling.

You can, however, use backpressure to build your own autoscaling if you need a solution right away. That would involve full restarts (from savepoints) whenever resources are added or removed. Some users have implemented this in a general way, but it's quite an effort. See also the talks from Netflix [1] and AWS [2] at the virtual Flink Forward.

There are currently several efforts in the community to introduce autoscaling at various conceptual levels. AFAIK none of them will make it into the upcoming 1.11 release, so I'd guess there will be some Flink solution in fall.

As a workaround, you could define alerts (backpressure over X% for the last Y min) and scale out manually. Depending on how many Flink clusters you have, I'd combine that with some oversizing so that you don't have to scale out too often (or during nighttime). It may simply be cheaper to keep some extra resources running (costing, let's say, $50/day) than to spend a month implementing custom autoscaling when the community will provide a solution in 4-5 months.

I know it's not an ideal situation, but it's probably the more pragmatic way.

On Wed, May 6, 2020 at 4:46 PM Aljoscha Krettek <[hidden email]> wrote:
> I'd say the two can't be considered equivalent because the back pressure

--
Arvid Heise | Senior Java Developer
Ververica GmbH
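The alerting workaround suggested above ("backpressure over X% for the last Y min") could be sketched roughly as follows. This is a minimal illustration, not a Flink API: the class name is hypothetical, and it assumes your monitoring system hands you one backpressure ratio (0.0 to 1.0) per minute from whatever metric you already collect.

```python
from collections import deque


class BackpressureAlert:
    """Fire when every sample in the last `window` observations exceeds `threshold`."""

    def __init__(self, threshold, window):
        self.threshold = threshold           # X, as a ratio between 0.0 and 1.0
        self.samples = deque(maxlen=window)  # Y, assuming one sample per minute

    def observe(self, ratio):
        """Record one sample and return True if the alert should fire."""
        self.samples.append(ratio)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(s > self.threshold for s in self.samples)


# Alert when backpressure stays above 50% for 3 consecutive minutes.
alert = BackpressureAlert(threshold=0.5, window=3)
fired = [alert.observe(r) for r in [0.6, 0.7, 0.4, 0.8, 0.9, 0.95]]
print(fired)
# → [False, False, False, False, False, True]
```

Requiring the whole window to be above the threshold avoids scaling out on short spikes, which fits the idea of combining alerts with some oversizing.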