Hi forideal,
I got your point.
Regarding replaying historical Kafka data: if the data arrives in Flink very unevenly across partitions,
the state can grow very large and make disk and memory usage unstable.
And yes, FLIP-27 can help you.
As for the workaround, IMO it may be a little hacky, but it should work. It will not be easy, though.
How about trying to solve this inside Flink, for example with "GlobalAggregateManager", rather than relying on external storage?
Hello everyone,
I have a job with large state in RocksDB. The job's source is Kafka. If I try to replay data, the job crashes.
One of the motivations of FLIP-27 is event time alignment; however, it is not available to me yet.
How can I work around this?
Here is an immature solution; I don't know if it works:
1. I save every partition's event time in external storage, for example Redis.
2. In the source function, I read all partitions' event times periodically.
3. If I find a partition that is running ahead of the others, I make it wait.
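
To make the three steps concrete, here is a rough sketch of the decision logic in plain Java. It is only an illustration under assumptions: the class name EventTimeAlignmentGate, its methods, and the maxDriftMs threshold are all hypothetical, and an in-memory map stands in for Redis. It is not Flink or FLIP-27 API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: decides whether a partition has run too far ahead in event time. */
class EventTimeAlignmentGate {
    // Stand-in for the external store (e.g. Redis): partition id -> latest event time in ms.
    private final Map<Integer, Long> partitionEventTimes = new ConcurrentHashMap<>();
    // Assumed tuning knob: how far ahead of the slowest partition a reader may run.
    private final long maxDriftMs;

    EventTimeAlignmentGate(long maxDriftMs) {
        this.maxDriftMs = maxDriftMs;
    }

    /** Step 1: record the latest event time seen for a partition (never moves backwards). */
    void report(int partition, long eventTimeMs) {
        partitionEventTimes.merge(partition, eventTimeMs, Math::max);
    }

    /** Step 2: read all reported event times and return the slowest one. */
    long slowestEventTime() {
        return partitionEventTimes.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(Long.MIN_VALUE);
    }

    /** Step 3: a partition should wait if it leads the slowest one by more than maxDriftMs. */
    boolean shouldWait(int partition) {
        Long own = partitionEventTimes.get(partition);
        if (own == null) {
            return false; // nothing reported yet, so there is nothing to hold back
        }
        return own - slowestEventTime() > maxDriftMs;
    }
}
```

In a source function, the reader for a partition would call report(...) after emitting records and skip polling that partition while shouldWait(...) returns true. One caveat with this sketch: a partition that has stopped receiving data keeps its old event time and would hold everyone else back, so some idleness timeout is probably needed as well.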
Thank you