How to move event time forward using externally generated watermark message

Manas Kale
Hi,
I have a scenario where I have an input event stream from various IoT devices. Every message on this stream can be of some eventType and has an eventTimestamp. Downstream, some business logic is implemented on this based on event time.
If a device goes offline, what is the best way to tell this system that event time has progressed? Should I:
  • Send a special message that contains only event time information, and write code to handle this message in all downstream operators?
  • Implement some processing time timer in the system that will tick the watermark forward if we don't see any message for some duration? I will still need to write code in downstream operators that handles this timer's trigger message.
I would prefer not to write code that handles special watermark messages. Does Flink provide an API-level call that I can use to tick the watermark forward for all downstream operators when this special message is received or the timer fires?

Re: How to move event time forward using externally generated watermark message

Arvid Heise-3
Hi Manas,

Both are valid options.

I'd probably emit a processing-time timeout event from a process function, triggered only when no event has been received for 1 minute. That way you don't need to know which devices exist; you just register one timer per key (= device ID).

After the process function, you'd need to reapply your watermark assigner, because processing time and event time usually don't mix well and the mismatch has to be resolved explicitly.

After the assigner, you can simply filter out the timeout events, so downstream operators don't need to handle them.
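
The three steps in this reply (per-key timeout, reapplied watermark assigner, filter) can be modelled in plain Java to make the data flow concrete. This is only a sketch and uses no Flink classes: in a real job the steps would presumably map to a keyed process function with processing-time timers, a watermark assigner, and a filter. The `Event` shape, the `TIMEOUT` marker type, and the choice to stamp the timeout event with the last seen event time plus the timeout are all assumptions made for illustration; the thread does not specify them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the idle-timeout idea. It uses no Flink classes. */
class IdleTimeoutSketch {

    static final String TIMEOUT_TYPE = "TIMEOUT"; // assumed marker value
    static final long TIMEOUT_MS = 60_000L;       // the 1 minute from the reply

    /** Hypothetical event shape; the thread only names eventType and eventTimestamp. */
    static final class Event {
        final String deviceId;
        final String eventType;
        final long eventTimestamp;

        Event(String deviceId, String eventType, long eventTimestamp) {
            this.deviceId = deviceId;
            this.eventType = eventType;
            this.eventTimestamp = eventTimestamp;
        }

        boolean isTimeout() {
            return TIMEOUT_TYPE.equals(eventType);
        }
    }

    // Per device: processing time at which its timer fires, and its latest event time.
    private final Map<String, Long> timerAt = new HashMap<>();
    private final Map<String, Long> lastEventTime = new HashMap<>();

    /** Step 1a: a real event re-arms the device's timer and passes through. */
    Event onEvent(Event event, long processingTime) {
        timerAt.put(event.deviceId, processingTime + TIMEOUT_MS);
        lastEventTime.merge(event.deviceId, event.eventTimestamp, Long::max);
        return event;
    }

    /** Step 1b: when processing time advances, fire due timers as timeout events. */
    List<Event> onProcessingTime(long processingTime) {
        List<Event> fired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = timerAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> timer = it.next();
            if (timer.getValue() <= processingTime) {
                String deviceId = timer.getKey();
                // Assumption: stamp the timeout with last seen event time plus the timeout.
                long timestamp = lastEventTime.get(deviceId) + TIMEOUT_MS;
                fired.add(new Event(deviceId, TIMEOUT_TYPE, timestamp));
                it.remove(); // fire once; the next real event re-arms the timer
            }
        }
        return fired;
    }

    /** Step 2: stand-in for the reapplied watermark assigner (no out-of-orderness). */
    static long advanceWatermark(long currentWatermark, List<Event> events) {
        long watermark = currentWatermark;
        for (Event event : events) {
            watermark = Math.max(watermark, event.eventTimestamp);
        }
        return watermark;
    }

    /** Step 3: drop timeout events so downstream logic sees only real events. */
    static List<Event> dropTimeouts(List<Event> events) {
        List<Event> real = new ArrayList<>();
        for (Event event : events) {
            if (!event.isTimeout()) {
                real.add(event);
            }
        }
        return real;
    }

    public static void main(String[] args) {
        IdleTimeoutSketch sketch = new IdleTimeoutSketch();
        Event reading = sketch.onEvent(new Event("dev-1", "reading", 1_000L), 5_000L);
        long watermark = advanceWatermark(Long.MIN_VALUE, List.of(reading));
        List<Event> fired = sketch.onProcessingTime(65_000L); // one minute of silence
        watermark = advanceWatermark(watermark, fired);
        // prints "watermark=61000, downstream=0"
        System.out.println("watermark=" + watermark + ", downstream=" + dropTimeouts(fired).size());
    }
}
```

The point of the ordering is that the timeout event exists only between the timeout step and the filter: it moves the watermark forward and is then dropped, so nothing downstream has to know about it.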

On Mon, Mar 23, 2020 at 11:42 AM Manas Kale <[hidden email]> wrote:
> [...]

Re: How to move event time forward using externally generated watermark message

Manas Kale
Thanks for the help, Arvid!

On Tue, Mar 24, 2020 at 1:30 AM Arvid Heise <[hidden email]> wrote:
> [...]