Hi,
I am trying to achieve a stream-to-stream join with big windows and are searching for a way to clean up state of old keys. I am already using a RichCoProcessFunction
I found there is already an existing ticket but I have doubts that a registration of a timer for every incoming event is feasible as the timers seem to reside in an in-memory queue. The task is somewhat similar to the following blog post: http://devblog.mediamath.com/real-time-streaming-attribution-using-apache-flink Is the implementation of a custom window operator a necessity for achieving such functionality Thanks a lot, Johannes |
Looping in Aljoscha and Kostas who are the expert on this. :-)
On Mon, Mar 6, 2017 at 6:06 PM, Johannes Schulte <[hidden email]> wrote: > Hi, > > I am trying to achieve a stream-to-stream join with big windows and are > searching for a way to clean up state of old keys. I am already using a > RichCoProcessFunction > > I found there is already an existing ticket > > https://issues.apache.org/jira/browse/FLINK-3089 > > but I have doubts that a registration of a timer for every incoming event is > feasible as the timers seem to reside in an in-memory queue. > > The task is somewhat similar to the following blog post: > http://devblog.mediamath.com/real-time-streaming-attribution-using-apache-flink > > Is the implementation of a custom window operator a necessity for achieving > such functionality > > Thanks a lot, > > Johannes > > |
Hi Johannes,
I think what you can do is not register a timer for every event but for every key, with a certain granularity. When that timer fires you check what you want to clean up for that key and maybe register another timer for the future. This way, the size of your timer state is bounded by your key cardinality and I think people have used Flink with timers/windows with key cardinalities of several 100 millions. Best, Aljoscha On Wed, Mar 8, 2017, at 14:37, Ufuk Celebi wrote: > Looping in Aljoscha and Kostas who are the expert on this. :-) > > On Mon, Mar 6, 2017 at 6:06 PM, Johannes Schulte > <[hidden email]> wrote: > > Hi, > > > > I am trying to achieve a stream-to-stream join with big windows and are > > searching for a way to clean up state of old keys. I am already using a > > RichCoProcessFunction > > > > I found there is already an existing ticket > > > > https://issues.apache.org/jira/browse/FLINK-3089 > > > > but I have doubts that a registration of a timer for every incoming event is > > feasible as the timers seem to reside in an in-memory queue. > > > > The task is somewhat similar to the following blog post: > > http://devblog.mediamath.com/real-time-streaming-attribution-using-apache-flink > > > > Is the implementation of a custom window operator a necessity for achieving > > such functionality > > > > Thanks a lot, > > > > Johannes > > > > |
Hey Aljoscha,
thank you for your reply. The amount and quality of response on this list are really great to see and a good way to learn. I will try this and see how this works out. Cheers, Johannes On Thu, Mar 9, 2017 at 3:55 PM, Aljoscha Krettek <[hidden email]> wrote: Hi Johannes, |
Free forum by Nabble | Edit this page |