[DISCUSS] Proposal to support disk spilling in HeapKeyedStateBackend


Yu Li
Hi All,

As mentioned in our talk[1] given at Flink Forward China 2018, we have improved HeapKeyedStateBackend to support disk spilling and put it into production at Alibaba for last year's Singles' Day. Now we're ready to upstream our work, and the design doc is up for review[2]. Please let us know your thoughts on the feature; any comments are welcome and appreciated.

We plan to keep the discussion open for at least 72 hours, and will create an umbrella JIRA with subtasks if there are no objections. Thanks.

Below is a brief description of the motivation for the work, FYI:

HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink. Since state lives as Java objects on the heap and de/serialization only happens during state snapshot and restore, it outperforms RocksDBKeyedStateBackend when all data can reside in memory.
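For context, here is a minimal sketch of how a job chooses between the two backends in the Flink 1.8-era API; the checkpoint path is a placeholder:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BackendChoice {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // FsStateBackend keeps working state as objects on the heap
        // (HeapKeyedStateBackend) and only serializes on snapshot/restore.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

        // Alternatively, RocksDBStateBackend keeps state serialized
        // off-heap and de/serializes on every state access.
        // env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));
    }
}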
However, HeapKeyedStateBackend also has its shortcomings. The most painful one is the difficulty of estimating the maximum heap size (Xmx) to set, and we suffer from GC impact once the heap cannot hold all state data. There are several (inevitable) causes for such a scenario, including (but not limited to):
* Memory overhead of the Java object representation (tens of times the serialized data size).
* Data floods caused by bursty traffic.
* Data accumulation caused by source malfunction.
To resolve this problem, we propose a solution that spills state data to disk before heap memory is exhausted. We monitor heap usage, choose the coldest data to spill, and automatically reload it once heap memory is regained after data removal or TTL expiration, as the sketch below illustrates.
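To illustrate the idea (not the actual design, which is in the doc[2]), here is a minimal sketch of such a monitor using the standard JMX heap metrics; the watermark values and the spillColdestStateChunk/loadSpilledStateChunk hooks are hypothetical placeholders:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical sketch: watch heap usage and trigger spill/load of state. */
public class HeapSpillMonitor implements Runnable {
    private static final double SPILL_WATERMARK = 0.85; // assumed threshold
    private static final double LOAD_WATERMARK = 0.50;  // assumed threshold

    private final MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();

    @Override
    public void run() {
        MemoryUsage heap = memoryBean.getHeapMemoryUsage();
        // getMax() reflects -Xmx for the heap in typical JVM setups.
        double usedRatio = (double) heap.getUsed() / heap.getMax();

        if (usedRatio > SPILL_WATERMARK) {
            // Heap is under pressure: serialize the coldest state to disk.
            spillColdestStateChunk();
        } else if (usedRatio < LOAD_WATERMARK) {
            // Heap has headroom again (e.g. after data removal or TTL
            // expiration): bring spilled state back on-heap for fast access.
            loadSpilledStateChunk();
        }
    }

    private void spillColdestStateChunk() { /* placeholder */ }

    private void loadSpilledStateChunk() { /* placeholder */ }
}

Such a monitor could be scheduled periodically (e.g. via a ScheduledExecutorService); the actual policies for detecting heap pressure and picking the coldest data are described in the design doc[2].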


Best Regards,
Yu
Re: [DISCUSS] Proposal to support disk spilling in HeapKeyedStateBackend

Stefan Richter
Hi Yu,

Sorry for the late reaction. As already discussed internally, I think this is a very good proposal and design that can help to address a major limitation of the current state backend. I think most of the discussion is happening in the design doc, and I left my comments there. Looking forward to seeing this integrated with Flink soon!

Best,
Stefan
