Hi I have a linear streaming flow with a single source and multiple sinks to publish intermediate results. The time characteristic is Event Time and I am adding one AssignerWithPeriodicWatermarks immediately after the source. I need to add a different assigner, in the middle of the flow, to change the event time - i.e. extracting a different field from the record as event time. I am not sure I completely understand the implications of changing event time and watermark in the middle of a flow. Can anybody give me a hint or direct me to any relevant documentation? Lorenzo |
Hi Lorenzo,
If you want to learn how Flink uses watermarks, it is worth checking [1]. Now in a nutshell, what a watermark will do in a pipeline is that it may fire timers that you may have registered, or windows that you may have accumulated. If you have no time-sensitive operations between the first and the second watermark generators, then I do not think you have to worry (although it would help if you could share a bit more about your pipeline in order to have a more educated estimation). If you have windows, then your windows will fire and the emitted elements will have the timestamp of the end of the window. After the second watermark assigner, the watermarks coming from the previous one are discarded and they are not flowing in the pipeline anymore. You will only have the new watermarks. A problem may arise if, for example, the second watermark generator emits watermarks with smaller values than the first but the timestamps of the elements are assigned based on the progress of the first generator (e.g. windows fired) and now all your elements are considered "late". I hope that the above paint the big picture of what is happening in your pipeline. Again, I may be missing something so feel free to send more details about your pipeline so that we can help a bit more. Cheers, Kostas [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/concepts/timely-stream-processing.html On Wed, Jul 22, 2020 at 9:35 AM Lorenzo Nicora <[hidden email]> wrote: > > Hi > > I have a linear streaming flow with a single source and multiple sinks to publish intermediate results. > The time characteristic is Event Time and I am adding one AssignerWithPeriodicWatermarks immediately after the source. > I need to add a different assigner, in the middle of the flow, to change the event time - i.e. extracting a different field from the record as event time. > > I am not sure I completely understand the implications of changing event time and watermark in the middle of a flow. > > Can anybody give me a hint or direct me to any relevant documentation? > > Lorenzo |
Free forum by Nabble | Edit this page |