Hi all,

I'm looking into what happens when messages are ingested with timestamps far into the future (e.g. due to corruption or a wrong clock at the sender).

I'm aware of the effect on watermarking, but another thing I'm concerned about is the performance impact of the extra windows this will create.

If a Flink operator has many (perhaps hundreds or thousands of) windows open but not receiving any data (and never firing), will this degrade performance?

Thanks,
Joe
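To make the scenario concrete, here is a minimal sketch of how one corrupted timestamp ends up in its own far-future window. It assumes tumbling windows aligned to the epoch with no offset; the class and method names are made up for illustration and this is not Flink's own code.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: maps an event timestamp to the start of its tumbling window.
final class FutureWindowSketch {

    // Start of the window containing timestampMs, for epoch-aligned windows of sizeMs.
    public static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - Math.floorMod(timestampMs, sizeMs);
    }

    public static void main(String[] args) {
        long size = Duration.ofMinutes(1).toMillis();

        // A plausible event time and a corrupted one, 100 years ahead.
        long good = Instant.parse("2020-05-21T19:02:30Z").toEpochMilli();
        long corrupt = Instant.parse("2120-05-21T19:02:30Z").toEpochMilli();

        // The corrupted record lands in a window that stays open until the
        // watermark reaches the year 2120, i.e. effectively forever.
        System.out.println(Instant.ofEpochMilli(windowStart(good, size)));
        System.out.println(Instant.ofEpochMilli(windowStart(corrupt, size)));
    }
}
```

Each distinct far-future timestamp (per key) can open one such window, which is where the "hundreds or thousands" of idle windows would come from.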
Hi Joe,

The main effect should be more state that has to be kept until the windows can fire (and their state is purged). This would of course increase the time it takes to checkpoint the operator.

I'm not sure whether there is a significant per-record runtime impact from how the WindowOperator tracks windows in its data structures; maybe Aljoscha (cc'ed) can chime in here.

If it is certain that these windows will never fire (until far into the future) because the event timestamps were corrupted in the first place, then it might make sense to have a way to drop windows based on some criteria. I'm not sure whether that is supported in any way without triggers (since you mentioned that those windows might not receive any data); again, Aljoscha might be able to provide more info here.

Cheers,
Gordon

On Thu, May 21, 2020 at 7:02 PM Joe Malt <[hidden email]> wrote:
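One possible reading of "drop based on some criteria" is to reject implausible timestamps before they ever reach the window operator, so the far-future windows are never created. The sketch below is plain Java with hypothetical names (`FutureTimestampGuard`, `maxAhead`); the threshold and the choice of reference clock are assumptions that would need tuning per job.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard: accepts a timestamp only if it is not too far ahead
// of a reference clock (for example the ingestion time at the source).
final class FutureTimestampGuard {
    private final long maxAheadMs;

    public FutureTimestampGuard(Duration maxAhead) {
        this.maxAheadMs = maxAhead.toMillis();
    }

    // True if eventTimestampMs is at most maxAhead beyond referenceMs.
    // Assumes referenceMs is a realistic clock value, so the sum cannot overflow.
    public boolean accept(long eventTimestampMs, long referenceMs) {
        return eventTimestampMs <= referenceMs + maxAheadMs;
    }

    // Keeps only the timestamps that pass the guard.
    public List<Long> filter(List<Long> timestampsMs, long referenceMs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsMs) {
            if (accept(ts, referenceMs)) {
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        FutureTimestampGuard guard = new FutureTimestampGuard(Duration.ofHours(1));
        long now = System.currentTimeMillis();
        long tenYearsMs = Duration.ofDays(3650).toMillis();
        System.out.println(guard.filter(List.of(now, now + tenYearsMs), now).size());
    }
}
```

In a Flink job the same predicate could sit in a `filter` ahead of the window, or route rejected records to a side output for inspection instead of silently discarding them.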
Hi,
I don't think this will immediately degrade performance. State is essentially stored in a HashMap (for the FsStateBackend) or in RocksDB (for the RocksDB backend). If these data structures don't degrade with size, then your performance shouldn't degrade either.

There are of course some secondary effects that could come into play. For example, if the data grows so much that it negatively affects GC, this would have an effect, but that doesn't come from the data structures themselves.

I hope that helps!

Best,
Aljoscha

On 22.05.20 06:23, Tzu-Li (Gordon) Tai wrote:
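As a rough illustration of the HashMap argument, the sketch below uses a plain `java.util.HashMap` as a stand-in for heap-based window state (it is not Flink's actual state implementation, and all names are hypothetical). Updating the live window touches only its own entry, regardless of how many idle far-future entries sit next to it; the idle entries cost memory and checkpoint size, not per-record lookup work.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for per-window state: a running count per window start.
final class IdleWindowStateSketch {
    private final Map<Long, Long> countByWindowStart = new HashMap<>();

    // Adds one element to the window starting at windowStartMs.
    public void add(long windowStartMs) {
        countByWindowStart.merge(windowStartMs, 1L, Long::sum);
    }

    // Current count for one window (0 if the window has no state).
    public long count(long windowStartMs) {
        return countByWindowStart.getOrDefault(windowStartMs, 0L);
    }

    // Number of windows that currently hold state.
    public int openWindows() {
        return countByWindowStart.size();
    }

    public static void main(String[] args) {
        IdleWindowStateSketch state = new IdleWindowStateSketch();
        long sizeMs = 60_000L;
        long farFutureMs = 4_000_000_000_000L; // arbitrary far-future base

        // 1000 idle windows, each created by a single corrupted record.
        for (int i = 0; i < 1_000; i++) {
            state.add(farFutureMs + i * sizeMs);
        }
        // The live window keeps receiving data; each update is one hash lookup.
        for (int i = 0; i < 5; i++) {
            state.add(0L);
        }
        System.out.println(state.openWindows() + " windows, live count " + state.count(0L));
    }
}
```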