Re: EventTimeSessionWindow firing too soon
Posted by Aljoscha Krettek
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/EventTimeSessionWindow-firing-too-soon-tp35959p35996.html
Sorry, I only now saw that this thread diverged. My mail client didn't pick
it up because someone changed the subject of the thread.
On 16.06.20 14:06, Aljoscha Krettek wrote:
> Hi,
>
> what is the timescale of your data in Kafka? If you have data in there
> that spans more than ~30 minutes, I would expect your windows to fire
> very soon after the job is started. Event time does not depend on a wall
> clock but instead advances with the time in the stream: as Flink reads
> through the data in Kafka, event time advances in step.
>
> Does that explain your situation?
>
> Best,
> Aljoscha
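
As a rough, self-contained illustration of the point above, here is a minimal
sketch (the Event type, the timestamps, and the use of
BoundedOutOfOrdernessTimestampExtractor are assumptions for the example, not
taken from the original job) showing event time being driven purely by the
timestamps in a replayed backlog, so a 30-minute session gap can close long
before 30 minutes of wall-clock time have passed:

import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.function.ProcessWindowFunction
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

// Made-up event type for the sketch; timestamps are epoch milliseconds.
case class Event(sessionId: String, timestampMillis: Long)

object SessionWindowSketch {
  def main(args: Array[String]): Unit = {
    val senv = StreamExecutionEnvironment.getExecutionEnvironment
    senv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // Two sessions for the same key, two hours apart in event time.
    // When this (bounded) backlog is replayed, event time advances with the
    // data, so the windows fire right away; no wall-clock waiting is involved.
    val events = senv.fromElements(
      Event("a", 0L),
      Event("a", 60 * 1000L),          // 1 minute later: same session
      Event("a", 2 * 60 * 60 * 1000L)  // 2 hours later: starts a new session
    )

    events
      // Watermark = highest timestamp seen so far minus 10 minutes; it is
      // driven entirely by the timestamps in the stream.
      .assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor[Event](Time.minutes(10)) {
          override def extractTimestamp(element: Event): Long = element.timestampMillis
        })
      .keyBy(_.sessionId)
      .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
      .process(new ProcessWindowFunction[Event, String, String, TimeWindow] {
        override def process(key: String, context: Context,
                             elements: Iterable[Event], out: Collector[String]): Unit =
          out.collect(s"key=$key window=${context.window} events=${elements.size}")
      })
      .print()

    senv.execute("event-time session window sketch")
  }
}

Running this locally prints both sessions for key "a" as soon as the bounded
input has been read, which mirrors what happens when a job starts on a Kafka
topic that already holds more than 30 minutes' worth of event time.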
>
> On 15.06.20 16:49, Ori Popowski wrote:
>> I'm using Flink 1.10 on YARN, and I have an EventTimeSessionWindow with a
>> gap of 30 minutes.
>>
>> But as soon as I start the job, events are written to the sink (I can see
>> them in S3) even though 30 minutes have not passed.
>>
>> This is my job:
>>
>> val stream = senv
>>   .addSource(new FlinkKafkaConsumer("…", compressedEventDeserializer, properties))
>>   .filter(_.sessionId.nonEmpty)
>>   .flatMap(_ match { case (_, events) => events })
>>   .assignTimestampsAndWatermarks(
>>     new TimestampExtractor[Event](Time.minutes(10)) {
>>       override def extractTimestamp(element: Event): Long =
>>         element.sequence / 1000 // sequence is in microseconds; convert to milliseconds
>>     })
>>   .keyBy(_.sessionId)
>>   .window(EventTimeSessionWindows.withGap(Time.of(30, MINUTES)))
>>   .process(myProcessWindowFunction)
>>
>> AsyncDataStream.unorderedWait(stream, myAsyncS3Writer, 30, SECONDS, 100)
>>
>> Any idea why this is happening?
>>
>