Hi,
Large state is mainly an issue for Flink's fault tolerance mechanism, which is based on periodic checkpoints: the state is copied to a remote storage system at regular intervals.
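For reference, checkpointing is configured on the execution environment; the interval below is just an illustrative value:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Illustrative value: trigger a checkpoint every 60 seconds.
    env.enableCheckpointing(60_000);
    // Require at least 30 seconds between the end of one checkpoint
    // and the start of the next.
    env.getCheckpointConfig().setMinPauseBetweenCheckpoints(30_000);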
In case of a failure, the state copy needs to be loaded, which takes longer as the state grows.
There are a few features of Flink that reduce the cost of large state, like incremental checkpoints and local recovery.
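As a minimal sketch (the checkpoint URI is a placeholder), incremental checkpoints are enabled via the second constructor argument of the RocksDB state backend, and local recovery via flink-conf.yaml:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;

    // true enables incremental checkpoints: only new/changed SST files are
    // uploaded on each checkpoint instead of a full copy of the state.
    env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));

    // Local recovery keeps a copy of the state on the TaskManager's local
    // disk, enabled in flink-conf.yaml:
    //   state.backend.local-recovery: true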
However, in general, large state is more difficult to handle than small state.
If your application needs to persist state forever to run a join with correct semantics, then this can be fine.
However, you should roughly assess how fast your state will grow and prepare your application to scale to more machines (configure max-parallelism) when the limits of your current setup are reached.
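Max-parallelism can be set on the environment (or per operator); 720 below is just an example value chosen because it has many divisors:

    // Defines the number of key groups and thereby the upper bound for
    // rescaling; it cannot be changed once the job has taken state.
    env.setMaxParallelism(720);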
Best, Fabian
On Thu, Jan 16, 2020 at 4:07 PM kant kodali <[hidden email]> wrote:
Hi All,
"However, this operation has an important implication: it requires to keep both sides of the join input in Flink’s state forever. Thus, the resource usage will grow indefinitely as well, if one or both input tables are continuously growing"
I wonder why this would be an issue, especially when the state is stored in RocksDB, which in turn is backed by disk?
I have a use case where I might need to do a stream-stream join (or some emulation of it) across, say, 6 or more tables, and I don't know for sure how long I need to keep the state, because a row today can join with a row a year or two from now. Will that be an issue? Do I need to think about designing a solution in another way, without using a stream-stream join?
Thanks!