Ever increasing key space

Ever increasing key space

burgesschen
Hi everyone,

We are building a Flink job that keys on a dynamic value. Only a few events
share the same key, and events with new keys arrive constantly.

For each key, some keyed state is created the first time the key is seen.
We use a timer to clean up that keyed state if the key has not been seen for
X minutes.

My question is:
If the key space is ever increasing, does that result in an ever-increasing
checkpoint size even if I clean up the keyed state?

Thank you!


Best,
-Chen



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Ever increasing key space

Yun Tang
Hi Chen

From your description, I assume you call keyedState.clear() to clear the state for a key that has not been seen for several minutes.
  • For HeapKeyedStateBackend, clear() removes the related content from memory immediately, so there is no need to worry about an increasing checkpoint size.
  • For RocksDBKeyedStateBackend, clear() records a delete operation for the key bytes in the DB, but the actual removal (the point at which the deleted key no longer occupies any space) generally happens only when a compaction runs. In other words, after you call keyedState.clear() for the current key, you should not expect the checkpoint size to decrease immediately, but it will decrease eventually because RocksDB keeps running compactions in the background. If you are still worried about this, consider increasing the number of background compaction threads for RocksDB by calling DBOptions.setMaxBackgroundCompactions or DBOptions.setIncreaseParallelism.
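To illustrate why the size only shrinks after compaction, here is a minimal toy model in plain Java. It is not RocksDB or Flink code: the class ToyLsmStore and all of its methods are hypothetical names made up for this sketch, and real RocksDB compaction is considerably more involved (for example, it works level by level, so a delete marker may survive several compactions). The only point is that a delete first appends a marker, and space is reclaimed later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only store (hypothetical, NOT RocksDB) showing delete markers and compaction. */
final class ToyLsmStore {
    private static final class Entry {
        final String key;
        final String value; // null marks a tombstone, i.e. a recorded delete

        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private List<Entry> log = new ArrayList<>();

    /** Writes never overwrite in place; they append a new entry. */
    void put(String key, String value) {
        log.add(new Entry(key, value));
    }

    /** A delete only appends a tombstone; nothing is physically removed yet. */
    void delete(String key) {
        log.add(new Entry(key, null));
    }

    /** The newest entry for a key wins; returns null if absent or deleted. */
    String get(String key) {
        for (int i = log.size() - 1; i >= 0; i--) {
            Entry e = log.get(i);
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    /** Number of stored entries, a stand-in for on-disk (and checkpoint) size. */
    int storedEntries() {
        return log.size();
    }

    /** Compaction keeps the newest live value per key and drops tombstones. */
    void compact() {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Entry e : log) {
            latest.put(e.key, e.value); // later entries replace earlier ones
        }
        List<Entry> compacted = new ArrayList<>();
        for (Map.Entry<String, String> e : latest.entrySet()) {
            if (e.getValue() != null) {
                compacted.add(new Entry(e.getKey(), e.getValue()));
            }
        }
        log = compacted;
    }
}
```

In this toy model, putting two keys and then deleting one of them grows the stored entry count from 2 to 3, even though a lookup of the deleted key already returns nothing. Only the call to compact() brings the count down to 1.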
Best,
Yun

From: burgesschen <[hidden email]>
Sent: Monday, July 16, 2018 23:57
To: [hidden email]
Subject: Ever increasing key space
 