Hi,
Do you use incremental checkpoints?
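If not, they are worth trying: with incremental checkpoints, each
checkpoint uploads only the RocksDB files created since the previous
one instead of a full snapshot. A minimal sketch of enabling them (the
S3 URI is a placeholder for your own checkpoint path):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IncrementalCheckpointing {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // The second constructor argument enables incremental checkpoints.
        // "s3://<bucket>/checkpoints" is a placeholder path.
        env.setStateBackend(
                new RocksDBStateBackend("s3://<bucket>/checkpoints", true));
        env.enableCheckpointing(60_000L); // checkpoint every 60 seconds
        // ... define the keyBy/map pipeline and call env.execute() here
    }
}

The same can also be set cluster-wide with state.backend.incremental:
true in flink-conf.yaml.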
RocksDB is an append-only store (it is built on an LSM tree), so you
will see a steady increase in state size until a compaction runs and
the old values of updated keys are garbage-collected.
However, the average state size should stabilize after a while if the
load doesn't change.
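If the growth still bothers you, you can also pass custom RocksDB
options through the state backend to compact more aggressively. A rough
sketch (the preset shown is RocksDB's own level-compaction tuning;
whether it actually helps depends on your workload, and the S3 URI is
again a placeholder):

import org.apache.flink.contrib.streaming.state.OptionsFactory;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

public class CompactionTuning {
    public static RocksDBStateBackend tunedBackend() throws Exception {
        RocksDBStateBackend backend =
                new RocksDBStateBackend("s3://<bucket>/checkpoints", true);
        backend.setOptions(new OptionsFactory() {
            @Override
            public DBOptions createDBOptions(DBOptions currentOptions) {
                // Keep Flink's defaults for the DB-level options.
                return currentOptions;
            }

            @Override
            public ColumnFamilyOptions createColumnOptions(
                    ColumnFamilyOptions currentOptions) {
                // RocksDB preset favoring level-style compaction, so
                // obsolete versions of keys are reclaimed sooner.
                return currentOptions.optimizeLevelStyleCompaction();
            }
        });
        return backend;
    }
}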
Regards,
Kien
On 10/23/2018 7:03 PM, Sameer W wrote:
> Hi,
>
> We are using ValueState to maintain state. It is a pretty simple job
> with a keyBy operator on a stream; the subsequent map operator
> maintains its state in a ValueState instance. The transaction load is
> in the billions of transactions per day. However, the amount of state
> per key is just a list of 18x6 long values that are constantly
> updated. We have about 20 million keys, and transactions are
> uniformly distributed across those keys.
>
> When the job starts, the size of the checkpoints (using RocksDB
> backed by S3) is low (on the order of 500 MB). However, after 12
> hours of operation the checkpoint size has grown to about 4-5 GB. The
> time taken to complete a checkpoint starts at around 15-20 seconds
> and after 12 hours reaches about a minute.
>
> What is the reason behind the increasing size of checkpoints?
>
> Thanks,
> Sameer