Process event with last 1 hour, 1week and 1 Month data

shashank734
Hi,

I have to process each event against the data from the last 1 hour, 1 week, and 1 month. For example: how many times has the same IP occurred in the last month, relative to that event?

I think windows cover fixed time ranges, so I can't compute over the last 1 hour relative to the current event.

If you have any clue, please advise what I should use: Table, ProcessFunction, or a global window. Or what approach should I take?

--
Thanks Regards

SHASHANK AGARWAL
 ---  Trying to mobilize the things....

Re: Process event with last 1 hour, 1week and 1 Month data

Aljoscha Krettek
Hi,

How would you evaluate such a query? I think the answer is that you have to keep all of that older data around so that you can evaluate the query when a new event arrives. In Flink, you could use a ProcessFunction for that, with a MapState that keeps events bucketed into one-week intervals. This way, you can iterate more efficiently over just the buckets that are required when evaluating a given event, and you can also efficiently delete a complete bucket of older events once you know that they are no longer required.
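The bucketing idea above can be sketched in plain Java. This is only an illustration under stated assumptions, not code from this thread and not Flink API code: the class `BucketedIpHistory` and its methods are hypothetical names, a `HashMap` stands in for a per-key `MapState<Long, List<Long>>` on a stream keyed by IP, "1 month" is taken to be 30 days, and timestamps are event times in milliseconds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-IP history. In a Flink job this logic would sit inside a
// ProcessFunction on a stream keyed by IP, with `buckets` held in MapState.
class BucketedIpHistory {
    static final long HOUR = 60L * 60 * 1000;
    static final long WEEK = 7L * 24 * HOUR;
    static final long MONTH = 30L * 24 * HOUR; // assumption: 1 month = 30 days

    // Bucket index (timestamp / WEEK) -> timestamps of events in that week.
    private final Map<Long, List<Long>> buckets = new HashMap<>();

    // Stores one event timestamp in its one-week bucket.
    void add(long ts) {
        buckets.computeIfAbsent(Math.floorDiv(ts, WEEK), k -> new ArrayList<>()).add(ts);
    }

    // Counts stored events with a timestamp in (now - window, now].
    // Only the buckets that overlap the window are visited.
    int countInLast(long now, long window) {
        long from = now - window;
        int count = 0;
        for (long b = Math.floorDiv(from, WEEK); b <= Math.floorDiv(now, WEEK); b++) {
            List<Long> bucket = buckets.get(b);
            if (bucket == null) {
                continue;
            }
            for (long ts : bucket) {
                if (ts > from && ts <= now) {
                    count++;
                }
            }
        }
        return count;
    }

    // Drops every bucket whose week ends at or before now - MONTH.
    void expire(long now) {
        long cutoff = now - MONTH;
        buckets.keySet().removeIf(b -> (b + 1) * WEEK <= cutoff);
    }

    int bucketCount() {
        return buckets.size();
    }

    public static void main(String[] args) {
        BucketedIpHistory history = new BucketedIpHistory();
        long now = 10 * WEEK;                // arbitrary event time
        history.add(now - 40L * 24 * HOUR);  // 40 days ago: outside every window
        history.add(now - 10L * 24 * HOUR);  // 10 days ago: month only
        history.add(now - 2L * 24 * HOUR);   // 2 days ago: week and month
        history.add(now - 10L * 60 * 1000);  // 10 minutes ago: all three windows
        history.add(now);                    // the current event itself
        System.out.println(history.countInLast(now, HOUR) + " "
                + history.countInLast(now, WEEK) + " "
                + history.countInLast(now, MONTH)); // prints "2 3 4"
        history.expire(now);                 // removes the bucket holding the 40-day-old event
    }
}
```

The counts here include the current event because it is added before counting; count first and add afterwards if it should be excluded. Storing raw timestamps keeps the 1-hour count exact even though the buckets are a week wide. If only counts are needed, finer buckets holding counters would use less state at the cost of precision at the window edges.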

These are the relevant sections of the Flink doc:

Best,
Aljoscha

In reply to shashank agarwal's message of 13. Jun 2017, 15:27.

Re: Process event with last 1 hour, 1week and 1 Month data

shashank734
Thanks, Aljoscha Krettek. I will try the same.

In reply to Aljoscha Krettek's message of Thu, Jun 15, 2017 at 3:11 PM.