watermark not advancing when reading kinesis historical data


Fanbin Bu
Hi,

I've been debugging this issue for several days now and still can't get it to work. I need to read Kinesis historical data (7 days) using Flink SQL. Here is my setup:

Flink version: 1.9.1
Kinesis shard count: 32
Flink parallelism: 32
SQL: select * from mytable (I purposely made this trivial to eliminate other variables)
kinesisGetRecordsInterval: 500
kinesisWatermarkLookahead: 3600000
kinesisWatermarkSync: 5000
enableWatermarkTracker: 1
kinesisWatermarkSyncQueueCapacity: 2000000

Here is the Kinesis iterator age metric (millisBehindLatest). Notice that when a shard's millisBehindLatest is at around 88 million milliseconds (roughly 24 hours), the behavior bifurcates: some shards finish in 10 min, while a few shards progress really slowly at a constant rate. I have ruled out a data issue, since I ran the job on different days and it always shows this pattern.

image.png


This is the log for the slowest shard:
2020-04-05 08:19:43,158  local watermark: 1585470569000, global watermark: 1585470545000, delta: 24000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585470821000 size: 0
2020-04-05 08:20:43,196  local watermark: 1585483369000, global watermark: 1585481484000, delta: 1885000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585483620000 size: 0
2020-04-05 08:21:43,236  local watermark: 1585495456000, global watermark: 1585492631000, delta: 2825000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585495466000 size: 12610
2020-04-05 08:22:43,276  local watermark: 1585505759000, global watermark: 1585502892000, delta: 2867000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585505769000 size: 19645
2020-04-05 08:23:43,315  local watermark: 1585512950000, global watermark: 1585511143000, delta: 1807000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585513081000 size: 0
2020-04-05 08:24:43,355  local watermark: 1585518864000, global watermark: 1585516216000, delta: 2648000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585518994000 size: 0
2020-04-05 08:25:43,391  local watermark: 1585526933000, global watermark: 1585524188000, delta: 2745000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585526943000 size: 22815
2020-04-05 08:26:43,426  local watermark: 1585537556000, global watermark: 1585534931000, delta: 2625000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585537566000 size: 24108
2020-04-05 08:27:43,462  local watermark: 1585548298000, global watermark: 1585545795000, delta: 2503000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585548308000 size: 40414
2020-04-05 08:28:43,499  local watermark: 1585560867000, global watermark: 1585558484000, delta: 2383000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585560877000 size: 22790
2020-04-05 08:29:43,533  local watermark: 1585571743000, global watermark: 1585568876000, delta: 2867000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585571753000 size: 30451
2020-04-05 08:30:43,568  local watermark: 1585580311000, global watermark: 1585577325000, delta: 2986000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585580321000 size: 5815
2020-04-05 08:31:43,648  local watermark: 1585587075000, global watermark: 1585583967000, delta: 3108000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585587085000 size: 12759
2020-04-05 08:32:43,681  local watermark: 1585594321000, global watermark: 1585591449000, delta: 2872000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585594331000 size: 11646
2020-04-05 08:33:43,810  local watermark: 1585601229000, global watermark: 1585599260000, delta: 1969000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585601239000 size: 0
2020-04-05 08:34:43,836  local watermark: 1585606304000, global watermark: 1585604370000, delta: 1934000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585606313000 size: 0
2020-04-05 08:35:43,867  local watermark: 1585613635000, global watermark: 1585611013000, delta: 2622000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585613645000 size: 23455
2020-04-05 08:36:43,985  local watermark: 1585624508000, global watermark: 1585621886000, delta: 2622000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585624518000 size: 32461
2020-04-05 08:37:44,227  local watermark: 1585636588000, global watermark: 1585634082000, delta: 2506000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585636598000 size: 42527
2020-04-05 08:38:44,258  local watermark: 1585648863000, global watermark: 1585645998000, delta: 2865000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585648873000 size: 42977
2020-04-05 08:39:44,290  local watermark: 1585660610000, global watermark: 1585657589000, delta: 3021000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585660620000 size: 12105
2020-04-05 08:40:44,320  local watermark: 1585669664000, global watermark: 1585666678000, delta: 2986000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585669674000 size: 7041
2020-04-05 08:41:44,352  local watermark: 1585678843000, global watermark: 1585675855000, delta: 2988000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585678853000 size: 42842
2020-04-05 08:42:44,487  local watermark: 1585687898000, global watermark: 1585684547000, delta: 3351000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585687908000 size: 40974
2020-04-05 08:43:44,517  local watermark: 1585696358000, global watermark: 1585693370000, delta: 2988000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585696368000 size: 64189
2020-04-05 08:44:44,629  local watermark: 1585708314000, global watermark: 1585705567000, delta: 2747000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585708324000 size: 41764
2020-04-05 08:45:44,660  local watermark: 1585721235000, global watermark: 1585718611000, delta: 2624000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585721245000 size: 27921
2020-04-05 08:46:44,691  local watermark: 1585732710000, global watermark: 1585730206000, delta: 2504000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585732720000 size: 40854
2020-04-05 08:47:44,970  local watermark: 1585744665000, global watermark: 1585742037000, delta: 2628000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585744675000 size: 62636
2020-04-05 08:48:45,001  local watermark: 1585754074000, global watermark: 1585751448000, delta: 2626000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585754084000 size: 87895
2020-04-05 08:49:45,031  local watermark: 1585764578000, global watermark: 1585761590000, delta: 2988000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585764588000 size: 92298
2020-04-05 08:50:45,063  local watermark: 1585773519000, global watermark: 1585770772000, delta: 2747000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585773529000 size: 139090
2020-04-05 08:51:45,093  local watermark: 1585782938000, global watermark: 1585780194000, delta: 2744000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585782948000 size: 111515
2020-04-05 08:52:45,124  local watermark: 1585790547000, global watermark: 1585787560000, delta: 2987000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585790557000 size: 103282
2020-04-05 08:53:45,154  local watermark: 1585800211000, global watermark: 1585797827000, delta: 2384000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585800221000 size: 114205
2020-04-05 08:54:45,248  local watermark: 1585813008000, global watermark: 1585810019000, delta: 2989000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585813018000 size: 131043
2020-04-05 08:55:45,279  local watermark: 1585823626000, global watermark: 1585820880000, delta: 2746000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585823636000 size: 139232
2020-04-05 08:56:45,309  local watermark: 1585833413000, global watermark: 1585830547000, delta: 2866000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585833423000 size: 156983
2020-04-05 08:57:45,340  local watermark: 1585842588000, global watermark: 1585839479000, delta: 3109000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585842598000 size: 132193
2020-04-05 08:58:45,370  local watermark: 1585847785000, global watermark: 1585844558000, delta: 3227000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585847795000 size: 54961
2020-04-05 08:59:45,400  local watermark: 1585851485000, global watermark: 1585848244000, delta: 3241000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585851495000 size: 33014
2020-04-05 09:00:45,431  local watermark: 1585855232000, global watermark: 1585852004000, delta: 3228000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585855242000 size: 82978
2020-04-05 09:01:45,461  local watermark: 1585862106000, global watermark: 1585858999000, delta: 3107000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585862116000 size: 86132
2020-04-05 09:02:45,491  local watermark: 1585870684000, global watermark: 1585868060000, delta: 2624000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585870694000 size: 79626
2020-04-05 09:03:45,521  local watermark: 1585883611000, global watermark: 1585881109000, delta: 2502000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585883621000 size: 58627
2020-04-05 09:04:45,550  local watermark: 1585896288000, global watermark: 1585893907000, delta: 2381000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585896298000 size: 124521
2020-04-05 09:05:45,578  local watermark: 1585907473000, global watermark: 1585904848000, delta: 2625000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585907483000 size: 113600
2020-04-05 09:06:45,608  local watermark: 1585918096000, global watermark: 1585915231000, delta: 2865000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585918106000 size: 90430
2020-04-05 09:07:45,637  local watermark: 1585928364000, global watermark: 1585925496000, delta: 2868000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585928374000 size: 106336
2020-04-05 09:08:45,665  local watermark: 1585936991000, global watermark: 1585934364000, delta: 2627000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585937001000 size: 97629
2020-04-05 09:09:45,708  local watermark: 1585947496000, global watermark: 1585944629000, delta: 2867000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585947506000 size: 94141
2020-04-05 09:10:45,736  local watermark: 1585958007000, global watermark: 1585955506000, delta: 2501000 timeouts: 0, emitter: queues: 1, empty: 0 0 timestamp: 1585958017000 size: 73463
2020-04-05 09:11:48,783  local watermark: 1585968527000, global watermark: 1585965331000, delta: 3196000 timeouts: 2, emitter: queues: 1, empty: 0 0 timestamp: 1585968537000 size: 69966
2020-04-05 09:12:48,810  local watermark: 1585982521000, global watermark: 1585980016000, delta: 2505000 timeouts: 2, emitter: queues: 1, empty: 0 0 timestamp: 1585982531000 size: 70872
2020-04-05 09:13:48,971  local watermark: 1585992773000, global watermark: 1585991995000, delta: 778000 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585992792000 size: 0
2020-04-05 09:14:48,997  local watermark: 1585993537000, global watermark: 1585993262000, delta: 275000 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585993557000 size: 0
2020-04-05 09:15:49,024  local watermark: 1585994296000, global watermark: 1585994296000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585994327000 size: 0
2020-04-05 09:16:49,050  local watermark: 1585995062000, global watermark: 1585995062000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585995089000 size: 0
2020-04-05 09:17:49,120  local watermark: 1585995874000, global watermark: 1585995874000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585995902000 size: 0
2020-04-05 09:18:49,147  local watermark: 1585996741000, global watermark: 1585996741000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585996760000 size: 0
2020-04-05 09:19:49,800  local watermark: 1585997551000, global watermark: 1585997551000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585997579000 size: 0
2020-04-05 09:20:49,826  local watermark: 1585998340000, global watermark: 1585998340000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585998350000 size: 0
2020-04-05 09:21:49,852  local watermark: 1585999060000, global watermark: 1585999060000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585999099000 size: 0
2020-04-05 09:22:49,878  local watermark: 1585999949000, global watermark: 1585999949000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1585999968000 size: 0
2020-04-05 09:23:49,903  local watermark: 1586000813000, global watermark: 1586000813000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586000831000 size: 0
2020-04-05 09:24:49,929  local watermark: 1586001616000, global watermark: 1586001616000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586001641000 size: 0
2020-04-05 09:25:49,956  local watermark: 1586002436000, global watermark: 1586002436000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586002463000 size: 0
2020-04-05 09:26:49,982  local watermark: 1586003150000, global watermark: 1586003150000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586003171000 size: 0
2020-04-05 09:27:50,008  local watermark: 1586003865000, global watermark: 1586003865000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586003884000 size: 0
2020-04-05 09:28:50,035  local watermark: 1586004698000, global watermark: 1586004698000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586004717000 size: 0
2020-04-05 09:29:50,061  local watermark: 1586005505000, global watermark: 1586005505000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586005524000 size: 0
2020-04-05 09:30:50,086  local watermark: 1586006249000, global watermark: 1586006249000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586006267000 size: 0
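
To make the log above easier to compare over time, the lines can be parsed and the local watermark's advance rate computed per interval. This is only a minimal Python sketch and not part of the job itself: the regular expression is inferred from the log lines shown here, and the helper names (parse_line, advance_rates) are hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines in this post (an assumption, not an
# official format). The "emitter: ..." part is skipped with ".*".
LINE_RE = re.compile(
    r"^(?P<wall>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"local watermark: (?P<local>\d+), global watermark: (?P<global>\d+), "
    r"delta: (?P<delta>\d+) timeouts: (?P<timeouts>\d+),.*"
    r"timestamp: (?P<ts>\d+) size: (?P<size>\d+)\s*$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "wall": datetime.strptime(m.group("wall"), "%Y-%m-%d %H:%M:%S,%f"),
        "local": int(m.group("local")),
        "global": int(m.group("global")),
        "delta": int(m.group("delta")),
        "timeouts": int(m.group("timeouts")),
        "size": int(m.group("size")),
    }


def advance_rates(samples):
    """For each pair of consecutive samples, return how many milliseconds of
    event time the local watermark gained per millisecond of wall-clock time."""
    rates = []
    for prev, cur in zip(samples, samples[1:]):
        wall_ms = (cur["wall"] - prev["wall"]).total_seconds() * 1000.0
        event_ms = cur["local"] - prev["local"]
        rates.append(event_ms / wall_ms)
    return rates


# First two and last two lines of the log above, copied verbatim.
FIRST_PAIR = [
    "2020-04-05 08:19:43,158  local watermark: 1585470569000, global watermark: 1585470545000, delta: 24000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585470821000 size: 0",
    "2020-04-05 08:20:43,196  local watermark: 1585483369000, global watermark: 1585481484000, delta: 1885000 timeouts: 0, emitter: queues: 1, empty: 1 0 timestamp: 1585483620000 size: 0",
]
LAST_PAIR = [
    "2020-04-05 09:29:50,061  local watermark: 1586005505000, global watermark: 1586005505000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586005524000 size: 0",
    "2020-04-05 09:30:50,086  local watermark: 1586006249000, global watermark: 1586006249000, delta: 0 timeouts: 2, emitter: queues: 1, empty: 1 0 timestamp: 1586006267000 size: 0",
]

first_rates = advance_rates([parse_line(l) for l in FIRST_PAIR])
last_rates = advance_rates([parse_line(l) for l in LAST_PAIR])
```

For the first two log lines this gives a rate of roughly 213 (the watermark gains about 213 ms of event time per ms of wall-clock time), while for the last two lines, where delta is 0, the rate drops to roughly 12.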


Any idea what's going on here? 
Thanks a lot!

Fanbin