ProcessFunction's Event Timer not firing

Fritz Budiyanto
Hi All,

I noticed that if the watermark of one slot is not progressing, it affects the ProcessFunction timers in all slots, and no timers fire.

In my example, the source parallelism is set to 8 and the Kafka topic has 4 partitions. The next operator is a ProcessFunction with a parallelism of 8 that registers event-time timers. I can see from the debug log that the watermark of one slot is not progressing. As a result, the timers in the process function do not fire in any slot. Is this expected behavior or a bug? How do I prevent this condition?

Thanks,
Fritz

Re: ProcessFunction's Event Timer not firing

Hequn Cheng
Hi Fritz,

Watermarks are merged on stream shuffles. If the watermark of one input does not progress, event time does not advance at the downstream operators. I think you should decrease the parallelism of the source and make sure there is data in each of your source partitions.
Note that the Kafka source supports per-partition watermarking, which you can read more about here [1].

Best, Hequn
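
The merging described above can be sketched with a toy model: a downstream operator remembers the latest watermark per input channel and uses the minimum as its own event time. This is plain Java, not Flink code, and the class and method names are hypothetical. With 8 source subtasks reading 4 Kafka partitions, some subtasks presumably have no partition assigned and never emit a watermark, so the minimum stays at its initial value and no event-time timer fires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Toy model (not the Flink API) of an operator merging watermarks from several inputs. */
public class WatermarkMergeSketch {
    // Last watermark seen on each input channel; starts at "no watermark yet".
    private final long[] channelWatermarks;
    // Registered event-time timers, ordered by timestamp.
    private final TreeSet<Long> timers = new TreeSet<>();
    // The operator's own event time: the minimum over all channels.
    private long currentWatermark = Long.MIN_VALUE;

    public WatermarkMergeSketch(int numChannels) {
        channelWatermarks = new long[numChannels];
        Arrays.fill(channelWatermarks, Long.MIN_VALUE);
    }

    public void registerTimer(long timestamp) {
        timers.add(timestamp);
    }

    /** Records a watermark from one channel and returns the timers that fire as a result. */
    public List<Long> onWatermark(int channel, long watermark) {
        channelWatermarks[channel] = Math.max(channelWatermarks[channel], watermark);
        long min = Arrays.stream(channelWatermarks).min().getAsLong();
        List<Long> fired = new ArrayList<>();
        if (min > currentWatermark) {
            currentWatermark = min;
            // A timer fires once the merged watermark reaches its timestamp.
            while (!timers.isEmpty() && timers.first() <= currentWatermark) {
                fired.add(timers.pollFirst());
            }
        }
        return fired;
    }

    public long currentWatermark() {
        return currentWatermark;
    }
}
```

In this model, advancing channel 0 alone fires nothing however far it moves; a timer only fires after every channel has reported a watermark at or beyond the timer's timestamp.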



Re: ProcessFunction's Event Timer not firing

Fritz Budiyanto
Thanks, Hequn, for the pointer.

From what I read, I may also need to emit timestamps regularly for all idle partitions to ensure that the watermark keeps progressing.

Fritz
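
One possible reading of "emit timestamps regularly for idle partitions" is a periodic watermark generator that falls back to the wall clock once its input has been silent for a while. The sketch below is plain Java, not Flink code; the class name, the idle timeout, and the out-of-orderness bound are hypothetical. It assumes event timestamps roughly track wall-clock time, so data arriving after an idle period with older timestamps would be treated as late. Newer Flink releases also offer an idleness setting on the watermark strategy (`WatermarkStrategy.withIdleness`), which marks an input as idle instead of inventing timestamps.

```java
/** Toy periodic watermark generator (not the Flink API) that keeps advancing when idle. */
public class IdleAwareWatermarkSketch {
    private final long maxOutOfOrdernessMs; // assumed bound on event disorder
    private final long idleTimeoutMs;       // silence after which the input counts as idle
    private long maxEventTimestamp = Long.MIN_VALUE;
    private long lastEventWallClockMs;
    private long lastEmitted = Long.MIN_VALUE;

    public IdleAwareWatermarkSketch(long maxOutOfOrdernessMs, long idleTimeoutMs, long nowMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.idleTimeoutMs = idleTimeoutMs;
        this.lastEventWallClockMs = nowMs;
    }

    /** Called for every record: tracks the highest event timestamp and the arrival time. */
    public void onEvent(long eventTimestampMs, long nowMs) {
        maxEventTimestamp = Math.max(maxEventTimestamp, eventTimestampMs);
        lastEventWallClockMs = nowMs;
    }

    /** Called periodically: returns the watermark to emit at wall-clock time nowMs. */
    public long currentWatermark(long nowMs) {
        long candidate = (maxEventTimestamp == Long.MIN_VALUE)
                ? Long.MIN_VALUE
                : maxEventTimestamp - maxOutOfOrdernessMs;
        if (nowMs - lastEventWallClockMs >= idleTimeoutMs) {
            // Idle: advance from the wall clock so downstream event time is not held back.
            candidate = Math.max(candidate, nowMs - maxOutOfOrdernessMs);
        }
        // Watermarks must never move backwards.
        lastEmitted = Math.max(lastEmitted, candidate);
        return lastEmitted;
    }
}
```

The trade-off is explicit here: the idle fallback keeps timers firing downstream, at the cost of possibly dropping or mishandling records that show up later with timestamps behind the wall clock.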
