Hi, I have a problem in Flink and need your help. I tried to use `TimeCharacteristic.EventTime`, but the sink function is never executed.
`public class StreamingJob {`
Here is the CustomWatermarkEmitter. I tried increasing the lag, but that did not work.
`public class CustomWatermarkEmitter implements AssignerWithPeriodicWatermarks<BitRate> {`
Here is the entity BitRate. The logs are generated in time; a sample log is `4281_783_1520047769115`.
`public BitRate(long eventTime, long gameId, long rate, long user) {`
|
Hi,
for periodically generated watermarks, you should use `ExecutionConfig.setAutoWatermarkInterval()` to set an interval. Hope that helps. Best, Xingcan
|
Hi, thanks for your reply. I searched Stack Overflow, and someone there has the same problem. Following your advice, I tried this code: `env.getConfig().setAutoWatermarkInterval(3 * 1000);` Now the getCurrentWatermark function is called every 3 seconds, but still no result comes out. From the output ('water mark1520049229163'), I can see that the add method is called, but there is no result from the sink function.
|
Hi sundy,
1. Some partitions of your input Kafka topic don't have data. Since a window's watermark is the minimum of the watermarks of all its inputs, the window will never be triggered if one of its inputs has no data. You can set the parallelism of your job to 1 to avoid this problem (PS: maybe this bug is fixed now, but it is worth a try).
2. There is only one record in the input. In this case, the window cannot be triggered either. You can think of it as time having stopped. To trigger the window, you need to read more data with a watermark greater than the window end.
Hope it helps you. Best, Hequn
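The first point can be illustrated without Flink at all. The following is a minimal plain-Java sketch, not Flink's actual implementation: the class and method names are hypothetical, and it assumes an input that has produced no data still reports `Long.MIN_VALUE` as its watermark. It shows why one silent partition keeps an event-time window from firing.

```java
import java.util.Arrays;

public class MinWatermarkSketch {

    /** Watermark seen by a downstream operator: the minimum over all of its inputs. */
    public static long combinedWatermark(long[] inputWatermarks) {
        return Arrays.stream(inputWatermarks).min().orElse(Long.MIN_VALUE);
    }

    /**
     * An event-time window [start, end) fires once the watermark reaches
     * its last valid timestamp, which is end - 1.
     */
    public static boolean windowFires(long watermark, long windowEndExclusive) {
        return watermark >= windowEndExclusive - 1;
    }

    public static void main(String[] args) {
        long windowEnd = 10_000L; // window [0, 10000), hypothetical values

        // One partition has advanced past the window end, the other never produced data.
        long stalled = combinedWatermark(new long[] {12_000L, Long.MIN_VALUE});
        System.out.println("idle partition, fires: " + windowFires(stalled, windowEnd));

        // Both partitions have advanced past the window end.
        long advanced = combinedWatermark(new long[] {12_000L, 11_000L});
        System.out.println("all partitions active, fires: " + windowFires(advanced, windowEnd));
    }
}
```

With a single input (parallelism 1), there is no idle sibling to hold the minimum back, which is why that setting is worth trying.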
|
Hi Hequn Cheng, I finally found the problem and, thanks to your advice, a way to define the correct watermark. Thank you very much. The problem is that I set the watermark to waterMark = maxEventTime - lag and the timeWindow is 10 seconds, but I generated the test records too quickly, so all 10000 records fell within a single window (my bad). Flink was therefore waiting for newer records to close the window. Another question: why is getCurrentWatermark called in 4 different threads when I set 'env.setParallelism(1)' and run the code in IDEA (mini Flink cluster)? When is the getCurrentWatermark function called: after the keyBy operation or after the source operation?
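To make the arithmetic concrete, here is a rough plain-Java sketch of the waterMark = maxEventTime - lag rule together with a 10-second tumbling window. It does not use Flink; the class name, the method names, and the 3000 ms lag are assumptions chosen for illustration, and the window alignment assumes non-negative timestamps with no offset.

```java
public class BoundedLagWatermarkSketch {

    public static final long WINDOW_SIZE_MS = 10_000L;

    private final long lagMs;
    private long maxEventTime = Long.MIN_VALUE;

    public BoundedLagWatermarkSketch(long lagMs) {
        this.lagMs = lagMs;
    }

    /** Track the largest event timestamp seen so far. */
    public void observe(long eventTime) {
        maxEventTime = Math.max(maxEventTime, eventTime);
    }

    /** waterMark = maxEventTime - lag, or Long.MIN_VALUE before any record arrives. */
    public long currentWatermark() {
        return maxEventTime == Long.MIN_VALUE ? Long.MIN_VALUE : maxEventTime - lagMs;
    }

    /** Start of the tumbling window that contains the given timestamp. */
    public static long windowStart(long timestamp) {
        return timestamp - (timestamp % WINDOW_SIZE_MS);
    }

    /** True once the watermark has reached the last timestamp of that window. */
    public boolean windowContainingHasFired(long timestamp) {
        long windowEnd = windowStart(timestamp) + WINDOW_SIZE_MS;
        return currentWatermark() >= windowEnd - 1;
    }

    public static void main(String[] args) {
        BoundedLagWatermarkSketch w = new BoundedLagWatermarkSketch(3_000L);

        // Test records generated quickly: all inside one 10-second window.
        w.observe(1520047769115L);
        w.observe(1520047769615L);
        System.out.println("fired after burst: " + w.windowContainingHasFired(1520047769115L));

        // A later record pushes the watermark past the window end.
        w.observe(1520047773000L);
        System.out.println("fired after later record: " + w.windowContainingHasFired(1520047769115L));
    }
}
```

In this sketch the window only closes when a record arrives whose timestamp exceeds the window end by at least the lag, which matches the behaviour described above.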
|
Hi sundy, It is strange that your configuration does not take effect. Did you set the parallelism somewhere else? You could also refer to the Kafka test case [1]. In this test case, line 229 sets the parallelism to 1 and it works fine. Hope it helps you.
|
Thanks a lot. Calling env.setParallelism(1) before defining the source works (I had set it just before env.execute, so it did not take effect).
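For anyone hitting the same issue, the ordering looks roughly like this. It is only a sketch: it assumes a Flink 1.x streaming dependency on the classpath, the class name and job name are made up, and `fromElements` stands in for the Kafka source from the original job, which is not shown in full in this thread.

```java
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismOrderSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        // Set the default parallelism BEFORE any source or operator is defined,
        // because operators pick up the default at the moment they are created.
        env.setParallelism(1);

        // Emit periodic watermarks every 3 seconds, as tried earlier in the thread.
        env.getConfig().setAutoWatermarkInterval(3000L);

        // Placeholder source using the sample log line; the real job reads from Kafka.
        DataStream<String> lines = env.fromElements("4281_783_1520047769115");
        lines.print();

        // Calling env.setParallelism(1) here, right before execute(), would be too late
        // for the operators defined above.
        env.execute("parallelism-order-sketch");
    }
}
```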
|