Hello! I have found that even though I am processing using event time and I provide an event time for all my events that the events produced in a RetractStream I create from a Table do not have timestamps. That is to say that I put a ProcessFunction on the RetractStream and ctx.timestamp() always returns null. I went a step further and defined the table to include a rowtime column, but that doesn't seem to make any difference. My SQL on the Table is roughly just:
select user, count(item) from table group by user Am I missing something? Is there any way to get a reasonable event timestamp on the events in the retract stream? <form method="post" target="_blank" onsubmit="try {return window.confirm("You are submitting information to an external page.\nAre you sure?");} catch (e) {return false;}"> Gregory Fee |
Hi Gregory, Event-time timestamps are handled a bit differently in Flink's SQL compared to the DataStream API.However, your case is a bit more complicated. Your query is performing a non-windowed aggregation, which is not a time-based operation but causes updates/retractions. Therefore, it is a bit more tricky to preserve the event-time timestamp because we must ensure that they are still aligned with the watermarks. There would be two queries that would ensure watermark alignment: - SELECT user, COUNT(*), MAX(rowtime) FROM t GROUP BY user; - SELECT user, COUNT(*), LAST_VAL(rowtime) FROM t GROUP BY user; These queries would forward the maximum (or last) event-time timestamp (rowtime) to the result table. However, none of these work in the current version or upcoming version of Flink. We also need to think about how timestamps would interact with retractions, because retractions should not be treated as late records. Best, Fabian 2018-03-23 0:02 GMT+01:00 Gregory Fee <[hidden email]>:
|
Free forum by Nabble | Edit this page |