Hi All,
The documentation says: "a window is created as soon as the first element that should belong to this window arrives, and the window is completely removed when the time (event or processing time) passes its end timestamp plus the user-specified allowed lateness (see Allowed Lateness)."

I am still confused.

If the window contains only one element (which triggers the window creation), and no more elements come in during the window size (e.g. 1 minute), then when does the window function get invoked? After 1 minute?

I mean, does the window finish either when an element indicates that the watermark has passed the end of the window, or when the processing time (regardless of whether it is an event-time or processing-time window) has advanced by the window size since the first element?
Hi,

this depends on the window type. Tumbling and sliding windows are (by default) aligned with the epoch time (1970-01-01 00:00:00). For example, a tumbling window of 2 hours starts and ends every two hours, i.e., from 12:00:00 to 13:59:59.999, from 14:00:00 to 15:59:59.999, etc.

The documentation says a window is created when an element arrives. This does not imply that the start time of the window is the time of the first element. So it might happen that the first element of a 2 hour tumbling window arrives at 13:59:59.000 and the window is closed 1 second later.

However, there are also windows for which the first element defines the start time, such as the built-in session window. You can also define custom windows like that.

Best, Fabian

2017-12-12 7:57 GMT+01:00 Jinhua Luo <[hidden email]>:
> Hi All,
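The epoch alignment described above can be illustrated with a minimal sketch in plain Java. It does not use Flink; the class and method names (WindowAlignment, windowStart) are made up for illustration, and it assumes a tumbling window with no offset, with timestamps in UTC milliseconds.

```java
import java.time.Duration;
import java.time.Instant;

public class WindowAlignment {
    // Start of the epoch-aligned tumbling window that contains timestampMs.
    // Assumes no window offset; floorMod keeps the result correct for
    // timestamps before 1970 as well.
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - Math.floorMod(timestampMs, sizeMs);
    }

    public static void main(String[] args) {
        long size = Duration.ofHours(2).toMillis();
        // First element of the window arrives at 13:59:59.000 (UTC).
        long ts = Instant.parse("2017-12-12T13:59:59Z").toEpochMilli();

        long start = windowStart(ts, size);
        long end = start + size; // exclusive end of the window

        System.out.println("window: " + Instant.ofEpochMilli(start)
                + " .. " + Instant.ofEpochMilli(end - 1));
        System.out.println("ms from first element to window end: " + (end - ts));
    }
}
```

The window boundaries depend only on the timestamp and the window size, not on when the first element shows up, which is why a window can close one second after it was created.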
OK, I see.
But what if a window contains no elements? Does it still get fired and invoke the window function?

2017-12-12 15:42 GMT+08:00 Fabian Hueske <[hidden email]>:
> Hi,
>
> this depends on the window type. Tumbling and Sliding Windows are (by
> default) aligned with the epoch time (1970-01-01 00:00:00).
> For example a tumbling window of 2 hour starts and ends every two hours,
> i.e., from 12:00:00 to 13:59:59.999, from 14:00:00 to 15:59:59.999, etc.
>
> The documentation says a window is created when an element arrives. This
> does not imply that the start time of the window is the time of the first
> element.
> So it might happen that the first element of a 2 hour tumbling window
> arrives at 13:59:59.000 and the window is closed 1 second later.
>
> However, there are also windows for which the first element defines the
> start time such as the built-in session window.
> You can also define custom windows like that.
>
> Best, Fabian
No, that's exactly what is meant by "a window is created when the first element arrives". Otherwise, you'd have to fire empty windows for all possible keys (in case of a window operator on a keyed stream), which is obviously not possible.

2017-12-12 9:30 GMT+01:00 Jinhua Luo <[hidden email]>:
> OK, I see.
If the window contains only one element and no more elements come in, then by default (with EventTimeTrigger) the window would be fired by the next element, if that element advances the watermark past the end of the window, correct?

That is, even if the window ends at 12:30, if no more elements come in to advance the watermark past 12:30, the window would not be fired; if the next element appears at 13:30, the window would be fired then, although it has been delayed for 1 hour, correct?

2017-12-12 16:53 GMT+08:00 Fabian Hueske <[hidden email]>:
> No, that's exactly what is mean by "a window is created when the first
> element arrives".
Unless I generate event-time watermarks continuously, regardless of elements?

The docs give an example of how to generate watermarks continuously based on processing time (TimeLagWatermarkGenerator):
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/event_timestamps_watermarks.html#with-periodic-watermarks

But if I use the pre-defined event-time watermark generators, which are purely based on elements, then is what I worried about in my last mail true?
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/event_timestamp_extractors.html

2017-12-13 12:48 GMT+08:00 Jinhua Luo <[hidden email]>:
> If the window contains only one element, no more elements come in,
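To make the difference between the two watermark strategies concrete, here is a minimal sketch in plain Java. It does not implement Flink's watermark interfaces; the class names (ElementDrivenWatermark, TimeLagWatermark), the 5 second lag, and the minutes-of-day timestamps are assumptions chosen for illustration. The time-lag variant only makes sense if event timestamps stay close to the wall clock.

```java
public class WatermarkSketch {
    static final long MINUTE = 60_000L;

    /** Element-driven: the watermark trails the largest event timestamp seen so far. */
    static final class ElementDrivenWatermark {
        private final long maxOutOfOrdernessMs;
        private long maxSeenTimestampMs = Long.MIN_VALUE;

        ElementDrivenWatermark(long maxOutOfOrdernessMs) {
            this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        }

        void onElement(long eventTimestampMs) {
            maxSeenTimestampMs = Math.max(maxSeenTimestampMs, eventTimestampMs);
        }

        long currentWatermark() {
            // Without any element there is no progress in event time.
            if (maxSeenTimestampMs == Long.MIN_VALUE) {
                return Long.MIN_VALUE;
            }
            return maxSeenTimestampMs - maxOutOfOrdernessMs;
        }
    }

    /** Time-lag: the watermark trails the wall clock, so it advances without elements. */
    static final class TimeLagWatermark {
        private final long maxTimeLagMs;

        TimeLagWatermark(long maxTimeLagMs) {
            this.maxTimeLagMs = maxTimeLagMs;
        }

        long currentWatermark(long nowMs) {
            return nowMs - maxTimeLagMs;
        }
    }

    public static void main(String[] args) {
        long windowEnd = 750 * MINUTE; // 12:30, expressed as minutes of the day
        ElementDrivenWatermark byElements = new ElementDrivenWatermark(0);
        TimeLagWatermark byClock = new TimeLagWatermark(5_000);

        byElements.onElement(730 * MINUTE); // the only element, at 12:10
        long now = 751 * MINUTE;            // wall clock reaches 12:31, no new elements

        // The window can fire once the watermark reaches its last millisecond.
        System.out.println("element-driven watermark passed window end: "
                + (byElements.currentWatermark() >= windowEnd - 1));
        System.out.println("time-lag watermark passed window end: "
                + (byClock.currentWatermark(now) >= windowEnd - 1));
    }
}
```

In this sketch the element-driven watermark stays at 12:10 until another element arrives, while the time-lag watermark moves past 12:30 on its own.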
Hi,
Yes, those last two comments about the watermark and window triggering are correct. The watermark has to advance either based on events or based on some continuous generation.

Best, Aljoscha

> On 13. Dec 2017, at 06:08, Jinhua Luo <[hidden email]> wrote:
>
> Unless I generate event-time watermark continuously regardless of elements?
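The delayed firing discussed in this thread can be reproduced with a small simulation in plain Java. This is only a sketch under simplifying assumptions, not Flink's window operator: a single key, 30 minute tumbling windows, no allowed lateness, and a watermark that simply equals the largest element timestamp seen so far. The class name EventTimeFiringSketch and its methods are hypothetical. A window fires once the watermark reaches the window's last millisecond.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EventTimeFiringSketch {
    static final long MINUTE = 60_000L;

    private final long sizeMs;
    // Window start -> number of elements. An entry exists only after the
    // first element for that window has arrived, so empty windows never appear.
    private final TreeMap<Long, Integer> openWindows = new TreeMap<>();
    private long watermark = Long.MIN_VALUE;

    EventTimeFiringSketch(long sizeMs) {
        this.sizeMs = sizeMs;
    }

    int openWindowCount() {
        return openWindows.size();
    }

    /** Adds an element, advances the watermark, and returns the windows fired as a result. */
    List<String> onElement(long timestampMs) {
        long start = timestampMs - Math.floorMod(timestampMs, sizeMs);
        openWindows.merge(start, 1, Integer::sum);

        // Simplification: the watermark is driven purely by element timestamps.
        watermark = Math.max(watermark, timestampMs);

        List<String> fired = new ArrayList<>();
        while (!openWindows.isEmpty()
                && openWindows.firstKey() + sizeMs - 1 <= watermark) {
            Map.Entry<Long, Integer> w = openWindows.pollFirstEntry();
            long s = w.getKey();
            fired.add("[" + (s / MINUTE) + ", " + ((s + sizeMs) / MINUTE)
                    + ") count=" + w.getValue());
        }
        return fired;
    }

    public static void main(String[] args) {
        EventTimeFiringSketch op = new EventTimeFiringSketch(30 * MINUTE);
        // Timestamps are minutes of the day: 730 = 12:10, 810 = 13:30.
        System.out.println("after 12:10 element: " + op.onElement(730 * MINUTE));
        System.out.println("after 13:30 element: " + op.onElement(810 * MINUTE));
        System.out.println("windows still open: " + op.openWindowCount());
    }
}
```

In this simulation the 12:00 to 12:30 window fires only when the 13:30 element arrives, and the window opened by that 13:30 element stays open until something advances the watermark again. No windows are created for the empty interval in between.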