[DISCUSS] Make window state queryable

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] Make window state queryable

vino yang
Hi folks,

Currently, the queryable state is not widely used in production. IMO, there are two key reasons caused this result. 1) the client of the queryable state is hard to use. Because it requires users to know the address of TaskManager and the port of the proxy. Actually, most business users who do not have good knowledge about the Flink's inner and runtime in production. 2) The benefit of this feature has not been excavated. In Flink DataStream API, State is the first level citizen, it’s Flink key advantage compared with other compute engines. Because the queryable state is the most effective way to pry the latest computing progress.

Three months ago, I started a discussion about improving the queryable state and introducing a proxy component.[1] It brings a lot of attention and discussion. Recently, I have submitted a design document about the proposal.[2] These efforts try to process the first problem.

About the second question, the most essential solution is that we should really make the queryable state work. The window operator is one of the most valuable and most frequently used operators of all Flink operators. And it also uses keyed state which is queryable. So we propose to let the state of the window operator be queried. This is not only for increasing the value of the queryable state but also for the real business needs.

IMO, allowing window state to be queried will provide great value. In many scenarios, we often use large windows for aggregate calculations. A very common example is a day-level window that counts the PV of a day. But usually, the user is not only satisfied to wait until the end of the window to get the result. They want to get "intermediate results" at a smaller time granularity to analyze trends. Because Flink does not provide periodic triggers for fixed windows. We have extended this and implemented an "incremental window". It can trigger a fixed window with a smaller interval period and feedback intermediate results. However, we believe that this approach is still not flexible enough. We should let the user query the current calculation result of the window through the API at any time.

However, I know that if we want to implement it, we still have some details that need to be discussed, such as how to let users know the state descriptors in the window, namespace and so on.

This discussion thread is mainly to listen to the community's opinion on this proposal.

Any feedback and ideas are welcome and appreciated.

Best,
Vino