Is there a window trigger in the Table API?


lu yj
Hello, 

I am using the Table API to do some aggregation based on a time window.

In the DataStream API, there is a trigger to control when the aggregation function is invoked. Is there something similar in the Table API?

I am using a large time window, such as one day, and I want an intermediate result every time a new event is aggregated. Is that possible? Also, does the window hold all the input data until it ends?

Thanks!
Re: Is there a window trigger in the Table API?

jincheng sun
Hi luyj,

Currently, the Table API does not have triggers, because the behavior of each window type (unbounded, tumble, slide, session) is clearly defined. The behavior of each window is as follows:

   - Unbounded Window - Rows are grouped by key, and each event triggers a calculation.

   - Tumble Window - A tumbling window assigns rows to non-overlapping, continuous windows of fixed length. Each window outputs one calculation result.

   - Slide Window - A sliding window has a fixed size and slides by a specified slide interval. If the slide interval is smaller than the window size, the sliding windows overlap. Each window outputs one calculation result.

   - Session Window - Session windows do not have a fixed size; their bounds are defined by an interval of inactivity, i.e., a session window closes if no event appears for a defined gap period. Each window outputs one calculation result.
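To make the tumble and slide descriptions above concrete, here is a small sketch in plain Scala of how a timestamp maps to window starts. It is not Flink code: `WindowAssignment`, `tumbleStart`, and `slideStarts` are hypothetical names for illustration, windows are assumed to be half-open intervals [start, start + size), and timestamps are assumed to be non-negative.

```scala
// Plain Scala sketch (not the Flink API): which windows does a timestamp fall into?
object WindowAssignment {
  // Start of the single tumbling window of length `size` that contains `ts`.
  def tumbleStart(ts: Long, size: Long): Long = ts - (ts % size)

  // Starts of all sliding windows [start, start + size) that contain `ts`,
  // newest window first. With slide < size a timestamp belongs to several windows.
  def slideStarts(ts: Long, size: Long, slide: Long): List[Long] = {
    val lastStart = ts - (ts % slide)
    Iterator.iterate(lastStart)(_ - slide).takeWhile(_ > ts - size).toList
  }

  def main(args: Array[String]): Unit = {
    println(tumbleStart(25L, 10L))      // window [20, 30)
    println(slideStarts(25L, 10L, 5L))  // windows [25, 35) and [20, 30)
  }
}
```

With a size of 10 and a slide of 5, every timestamp lands in two windows, which is the overlap described for the slide window.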

None of these windows holds all the input data; the calculations are incremental.
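As a rough illustration of what incremental calculation means here, the sketch below keeps one small accumulator per window instead of the rows themselves. It is plain Scala rather than Flink internals; `IncrementalSum`, `Acc`, and `aggregate` are hypothetical names, and events are assumed to be (timestamp, value) pairs assigned to tumbling windows.

```scala
// Plain Scala sketch (not Flink internals): per-window state is one accumulator,
// not the list of input rows.
object IncrementalSum {
  // The only state kept per window, no matter how many rows arrive.
  final case class Acc(sum: Long, count: Long) {
    def add(v: Long): Acc = Acc(sum + v, count + 1)
  }

  // Folds (timestamp, value) events into one accumulator per tumbling window,
  // keyed by the window start.
  def aggregate(events: Seq[(Long, Long)], size: Long): Map[Long, Acc] =
    events.foldLeft(Map.empty[Long, Acc]) { case (state, (ts, v)) =>
      val start = ts - (ts % size)
      state.updated(start, state.getOrElse(start, Acc(0L, 0L)).add(v))
    }
}
```

For a one-day window this means the state per key is a constant-size value, so memory does not grow with the number of events in the day.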

For your case, I think you can use an `Unbounded Window` and group by a function of the time field (for example a UDF) that returns the day, e.g.:
// "%Y%m%d" yields one key per calendar day, e.g. 20190326
table.groupBy(dateFormat('time, "%Y%m%d")).select('a.sum)
or
table
.select('a, dateFormat('time, "%Y%m%d").cast(Types.STRING) as 'ts)
.groupBy('ts)
.select('ts, 'a.sum)
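The effect of grouping by a day key instead of using a one-day window can be sketched in plain Scala: every incoming event produces an updated sum for its day. This is only a simulation of the behavior, not Flink code. `RunningDailySum`, `dayKey`, and `run` are hypothetical names, timestamps are assumed to be epoch milliseconds, and days are assumed to be cut in UTC.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

// Plain Scala simulation (not Flink code) of "group by day, emit on every event".
object RunningDailySum {
  private val dayFormat =
    DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC)

  // Maps an epoch-millisecond timestamp to its UTC day key, e.g. "19700101".
  def dayKey(epochMillis: Long): String =
    dayFormat.format(Instant.ofEpochMilli(epochMillis))

  // Returns one (day, runningSum) output per input event, in arrival order.
  def run(events: Seq[(Long, Long)]): Seq[(String, Long)] =
    events.foldLeft((Map.empty[String, Long], Vector.empty[(String, Long)])) {
      case ((sums, emitted), (ts, v)) =>
        val day = dayKey(ts)
        val updated = sums.getOrElse(day, 0L) + v
        (sums.updated(day, updated), emitted :+ (day -> updated))
    }._2
}
```

Each output replaces the previous result for the same day, so a downstream consumer has to treat the results as updates rather than as final per-window values. In Flink, such an updating table would typically be read with a retract-style conversion such as `toRetractStream`.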

Hope this helps!

Best,
Jincheng
