Hello,
We are using QueryableState in some Nussknacker deployments as a nice addition that lets end users peek inside the job state for a given key (we mostly use custom operators). Judging by the mailing list and the feature radar proposed by Stephan (https://github.com/StephanEwen/flink-web/blob/feature_radar/img/flink_feature_radar.svg), this feature is not widely used or supported. I'd like to ask:

- Are there any alternative ways of accessing state during job execution? The State Processor API is very nice, but it operates on checkpoints, and loading the whole state to look up one key seems a bit heavy.
- Are there any inherent problems in the QueryableState design (e.g. it's not feasible to use it in K8s settings, performance considerations), or is it just a lack of interest/support (in which case we may offer some help)?

thanks,
maciek
Hi Maciek,
Thank you for reaching out. I'll try to answer your questions separately.

- Nothing comparable. You already mention the State Processor API. Besides that, I can only think of a side channel (CoFunction) that receives requests for a certain state, which is then sent to a side output and ultimately to a sink, e.g. Kafka State Request Topic -> Flink -> Kafka State Response Topic. This puts the complexity into the Flink job, though.
- I think it is a combination of both. Queryable State works well within its limitations. In the case of the RocksDBStateBackend, these are mainly the availability of the job and the fact that you might read "uncommitted" state updates. In the case of the heap-backed state backends there are also synchronization issues, e.g. you might read stale values. You also mention the fact that Queryable State has been an afterthought when it comes to more recent deployment options. I am not aware of any committer who currently has the time to work on this to the degree that would be required. So we thought it would be more fair and realistic to mark Queryable State as "approaching end of life", in the sense that there is no active development on that component anymore.

Best,
Konstantin

On Tue, Mar 9, 2021 at 7:08 AM Maciek Próchniak <[hidden email]> wrote:
Hi Konstantin,
thanks for the detailed answer. I also thought about CoFunction, but it is a bit too heavy for us at the moment (each state would need an additional Kafka producer/consumer). I guess we'll use QueryableState for now and try to phase it out slowly...
thanks, maciek
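
The side-channel idea discussed in the two messages above (state request in, state response out) can be sketched in plain Java. This is only a model of the data flow, not Flink code: the class `StateSideChannel`, its methods, and the per-key counter are hypothetical stand-ins for a keyed two-input function, its keyed state, and a side output feeding a response sink. In a real job the two `on...` methods would roughly correspond to the two inputs of a keyed co-function, with Kafka topics on either side.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/**
 * Plain-Java model of the request/response side channel.
 * Hypothetical names throughout; no Flink API is used here.
 */
final class StateSideChannel {

    /** What would be written to the (assumed) state response topic. */
    static final class StateResponse {
        final String key;
        final Optional<Long> value;

        StateResponse(String key, Optional<Long> value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + " -> " + value.map(v -> Long.toString(v)).orElse("<no state>");
        }
    }

    // Stand-in for keyed state: one running sum per key.
    private final Map<String, Long> sumByKey = new HashMap<>();

    // Stand-in for the side output that leads to the response sink.
    private final List<StateResponse> responses = new ArrayList<>();

    /** Main input: a regular event updates the state of its key. */
    void onEvent(String key, long amount) {
        sumByKey.merge(key, amount, Long::sum);
    }

    /** Control input: a state request emits the current state of the key. */
    void onStateRequest(String key) {
        responses.add(new StateResponse(key, Optional.ofNullable(sumByKey.get(key))));
    }

    List<StateResponse> responses() {
        return responses;
    }

    public static void main(String[] args) {
        StateSideChannel job = new StateSideChannel();
        job.onEvent("user-1", 2);
        job.onEvent("user-1", 3);
        job.onStateRequest("user-1"); // key with state
        job.onStateRequest("user-2"); // key without state
        job.responses().forEach(System.out::println);
    }
}
```

Because requests pass through the same operator as the events, a response reflects the state as of the point where the request was processed. The cost is the one already noted in the thread: every queryable state needs its own request and response plumbing inside the job.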
On 09.03.2021 17:42, Konstantin Knauf wrote:
Hi Maciek,
Thanks for reaching out. Only through these interactions do we know how important certain features are to users.

Queryable State has some limitations and makes the whole system rather fragile. Most users who try it out are disappointed that there is actually no SQL support. And if we did support it, expensive queries would slow down the actual application...

So if there is enough interest in the community, we would rather replace Queryable State with some way to replicate state to an external system that supports proper queries and has no influence on the live application. FLIP-158 [1] was just accepted and would make it easier to replicate state onto an external system. Replicating to an external system is not planned yet, but it's one of the ideas that are floating around.

Could you imagine having your Flink state replicated into some key/value store, log stream, or database for your use case? What would be your preference?

On Wed, Mar 10, 2021 at 2:44 PM Maciek Próchniak <[hidden email]> wrote: