We are making few decisions on use cases where Queryable state is a natural fit https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/queryable_state.html
Is Queryable state production ready ? We will go to 1.5 flnk if that helps to make the case for the usage. |
To be more precise, is anything thing similar to https://engineering.linkedin.com/blog/2018/03/air-traffic-controller--member-first-notifications-at-linkedin . done in Samza, can be replicated with production level guarantees with Flink Queryable state ( as it stands currently version 1.5 ) ? On Fri, Mar 16, 2018 at 5:10 PM, Vishal Santoshi <[hidden email]> wrote:
|
Hi Vishal, In general, Queryable State should be ready to use. There are a few things to consider though: - State queries are not synchronized with the application code, i.e., they can happen at the same time. Therefore, the Flink application should not modify objects that have been put into or read from the state if you are not using the RocksDBStatebackend (which creates copies by deserialization). 2018-03-18 16:50 GMT+01:00 Vishal Santoshi <[hidden email]>:
|
Those are understandable. I am more interested in a few things ( and may be more that could be added ) * As far as I can understand JM is the SPOF. Does HA become a necessity ? * If there are 2 or more JM could we theoretically have a LB fronting them ? Thus it is a peer to peer access ( Cassandra ) or a master slave setup for JM HA specifically for Queryable access ( For flink jobs it is master slave ) * Do we replicate state to other TMs for read optimization ( specifically to avoid Hot Node issues ) ? * If the job goes down it seems the state is not accessible. What plans to we have to "separate concerns" for Queryable state. We consider Queryable State significant a feature Flink provides and would do the necessary leg work if there are certain gaps in it being trully considered a Highly Available Key Value store. Regards. On Mon, Mar 19, 2018 at 5:53 AM, Fabian Hueske <[hidden email]> wrote:
|
Queryable state is "just" an additional feature on regular keyed
state. i.e., the only difference is that you can read the state from an
outside application. Besides that it behaves exactly like regular application stateQueryable state is (at the moment) designed to be accessible if a job runs. Best, Fabian 2018-03-19 14:24 GMT+01:00 Vishal Santoshi <[hidden email]>:
|
Are there plans to address all or few of the above apart from the "JM LB not possible" which seems understandable ? On Mon, Mar 19, 2018 at 9:58 AM, Fabian Hueske <[hidden email]> wrote:
|
AFAIK, there have been discussions to replicate state among TMs to speed up recovery (and improve availability). However, I'm not aware of plans to implement that.2018-03-19 15:17 GMT+01:00 Vishal Santoshi <[hidden email]>:
|
Thank you. These do look like show stoppers for us. But again thank you. On Mon, Mar 19, 2018 at 10:31 AM, Fabian Hueske <[hidden email]> wrote:
|
Hi Vishal,
As Fabian said, queryable state is just a feature that exposes the state kept within Flink, and it is not made to replace functionality that would otherwise be made by a sink. In the future the functionality will definitely evolve but for there are no discussions currently, for keeping the state of a job even after the job is done. For being able to do so, with exactly once semantics and all the guarantees provided by Flink, I would recommend to use an external sink. Cheers, Kostas
|
Thank you for the clarification. On Wed, Mar 21, 2018, 4:28 AM Kostas Kloudas <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |