Hi everyone!
I'm using Flink 1.12.0 with the SQL API.
My streaming job is basically a join of two dynamic tables (backed by Kafka topics), inserting the result into a PostgreSQL table.
I have enabled watermarking based on the Kafka record timestamp for each table in the join:
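Roughly, the query looks like this (a simplified sketch; the table, column, and sink names here are placeholders, not my actual schema):

```sql
-- Simplified version of the job (names are placeholders):
INSERT INTO postgres_sink
SELECT t1.id, t1.payload, t2.payload
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id;
```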
CREATE TABLE table1 (
....,
kafka_time TIMESTAMP(3) METADATA FROM 'timestamp',
WATERMARK FOR kafka_time AS kafka_time - INTERVAL '2' HOUR
)
WITH ('format' = 'json',
...
But the checkpoint data size for the join task keeps growing despite the watermarks on the tables and the "Low watermark" value shown in the UI.
As far as I understand, outdated records from both tables should be dropped from state after 2 hours, but it looks like the job holds all state accumulated since the beginning.
Should I enable this behavior for outdated records explicitly, or modify the join query?
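To clarify what I mean by modifying the join query: as far as I understand, a regular join keeps state for both sides indefinitely, while an interval join with time bounds on both event-time columns would let Flink clean up state outside the interval. Is a rewrite along these lines (table and column names are placeholders) the intended way to bound the state?

```sql
-- Hypothetical interval join: both sides are bounded by event time,
-- so state older than the interval can be cleaned up as the
-- watermark advances.
INSERT INTO postgres_sink
SELECT t1.id, t1.payload, t2.payload
FROM table1 t1
JOIN table2 t2
  ON t1.id = t2.id
 AND t1.kafka_time BETWEEN t2.kafka_time - INTERVAL '2' HOUR
                       AND t2.kafka_time + INTERVAL '2' HOUR;
```

Or, alternatively, is setting an idle state retention time on the table config the expected approach here?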
Thank you!
Sent from the Apache Flink User Mailing List archive at Nabble.com.