Control Trigger behavior based on external datasource


Hironori Ogibayashi
Hello,

I am using GlobalWindow with my own custom trigger (similar to
ContinuousProcessingTimeTrigger).
In my trigger, I want to control the TriggerResult based on an external datasource.
That datasource holds a flag for each key that indicates whether the stream for
that key has finished (and so can be purged).

I am considering two approaches. Could you give me some advice about
which is better, or whether there is a better solution?

1. Check the datasource in onProcessingTime()

Query the datasource (e.g. Redis) in onProcessingTime() and return FIRE or
FIRE_AND_PURGE based on the result.
Maybe I would create a Jedis or JedisPool instance in the trigger's constructor?

2. An external program periodically queries the datasource and sends a
special event for the keys of finished streams.

The schema of the event will be the same as that of normal events in the
stream, but with a special value in one field, so the trigger will be able
to handle the event in onElement(). I need to filter that event out
afterward so that it does not affect the computation result.
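
A minimal sketch of approach 2, written so that it runs without Flink on the classpath. The TriggerResult enum here is a hypothetical stand-in for Flink's class of the same name, and KeyedEvent, END_MARKER, and EndOfStreamSketch are illustrative names that do not come from any library. It only shows the two decisions involved: what the trigger would return for each element, and how the marker is dropped before the result is computed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Flink's TriggerResult, so the sketch is self-contained.
enum TriggerResult { CONTINUE, FIRE, FIRE_AND_PURGE }

// Hypothetical event type: same schema as normal events, with one reserved
// value that marks "the stream for this key has finished".
final class KeyedEvent {
    static final String END_MARKER = "__END__"; // assumed sentinel value
    final String key;
    final String value;

    KeyedEvent(String key, String value) {
        this.key = key;
        this.value = value;
    }

    boolean isEndMarker() {
        return END_MARKER.equals(value);
    }
}

final class EndOfStreamSketch {
    // Decision a custom trigger's onElement() could make:
    // emit the final result and purge the window once the marker arrives.
    static TriggerResult onElement(KeyedEvent e) {
        return e.isEndMarker() ? TriggerResult.FIRE_AND_PURGE : TriggerResult.CONTINUE;
    }

    // Periodic firing, as in a continuous processing-time trigger:
    // emit the current result but keep the window contents.
    static TriggerResult onProcessingTime() {
        return TriggerResult.FIRE;
    }

    // Drop marker events so they do not affect the computation result.
    static List<KeyedEvent> withoutMarkers(List<KeyedEvent> windowContents) {
        List<KeyedEvent> out = new ArrayList<>();
        for (KeyedEvent e : windowContents) {
            if (!e.isEndMarker()) {
                out.add(e);
            }
        }
        return out;
    }
}
```

Because the marker carries the same key as the normal events, it is routed to the same keyed window, which is what lets onElement() see it. In a real job the filtering would happen inside the window function or in a filter step after it.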

Thanks,
Hironori Ogibayashi
Re: Control Trigger behavior based on external datasource

Till Rohrmann
Hi Hironori,

I would go with the second approach. Even if the data source says that all events for a given key have been read, it is not guaranteed that the window operator has received all of them: some events might still be in flight. Furthermore, the second approach integrates more nicely with Flink's streaming model.

Cheers,
Till

(In reply to Hironori Ogibayashi's message of Tue, Apr 26, 2016 at 10:16 AM.)

Re: Control Trigger behavior based on external datasource

Hironori Ogibayashi
Till,

Thank you for your answer.
It's true that the window operator may not yet have received
all the data for a key.
I will go with the second idea.

Thanks!

Hironori

(In reply to Till Rohrmann's message of 2016-04-26 17:46 GMT+09:00.)