Hi
We use RocksDB for storing state and run on Flink 1.10.
We have followed best practice and used map state instead of a map in value state. We have seen problems with OOM exceptions and investigated them by creating a job with n keyBy keys, where each key holds a map stored either in map state or in value state. The job reads and updates random values in the maps; a minimal sketch of both variants follows below.
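
For reference, here is roughly what the two variants of our test function look like. This is a simplified sketch, not our actual job: the class names, the Long types, and the 10_000 sub-key range are placeholders.

    import org.apache.flink.api.common.state.MapState;
    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ThreadLocalRandom;

    // Variant A: map state -- each map entry becomes its own RocksDB key/value pair.
    class MapStateVariant extends KeyedProcessFunction<Long, Long, Long> {
        private transient MapState<Long, Long> mapState;

        @Override
        public void open(Configuration parameters) {
            mapState = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("map-state", Types.LONG, Types.LONG));
        }

        @Override
        public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
            long subKey = ThreadLocalRandom.current().nextLong(10_000); // random individual key
            Long current = mapState.get(subKey);                        // point read from RocksDB
            mapState.put(subKey, current == null ? value : current + value); // point write
        }
    }

    // Variant B: value state -- the whole map is (de)serialized as a single RocksDB value.
    class ValueStateVariant extends KeyedProcessFunction<Long, Long, Long> {
        private transient ValueState<Map<Long, Long>> valueState;

        @Override
        public void open(Configuration parameters) {
            valueState = getRuntimeContext().getState(
                new ValueStateDescriptor<>("value-state", Types.MAP(Types.LONG, Types.LONG)));
        }

        @Override
        public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
            Map<Long, Long> map = valueState.value(); // deserializes the full map
            if (map == null) {
                map = new HashMap<>();
            }
            long subKey = ThreadLocalRandom.current().nextLong(10_000);
            map.merge(subKey, value, Long::sum);
            valueState.update(map);                   // reserializes and rewrites the full map
        }
    }
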
It turns out that the same map stored in map state consumes 3-4 times the memory compared with storing it in value state.
1. Can anyone explain why the overhead is so big?
At the same time we also see that throughput drops compared with value state. If we iterated over all keys in the state it would make sense, but in our test we access random individual keys. We did see very high pressure on RocksDB, and that could be the cause.
2. Can anyone explain why the pressure on RocksDB is higher when using map state compared with value state holding a map?
Med venlig hilsen / Best regards
Lasse Nedergaard