Hi all,

The external database contains a set of rules for each key, and these rules must be applied to every stream element in the Flink job. Because making a DB call for each element to retrieve its rules would be very expensive, I want to fetch the rules from the database at initialization and store them in a local cache. When rules are updated in the external database, a status-change event is published to the Flink job, which should trigger a re-fetch of the rules to refresh this cache. What is the best way to achieve this? I looked into keyed state, but initializing all keys up front and refreshing them on update doesn't seem possible.

Thanks,
Harshith
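The fetch-once-then-refresh-on-event pattern described above can be sketched in plain Java. This is a hypothetical illustration (the class and method names are invented, and the `Supplier` stands in for the real database call), not Flink-specific code:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: a rules cache loaded once at startup and
// replaced wholesale whenever a status-change event arrives.
class RulesCache {
    private final Supplier<Map<String, List<String>>> loader; // stands in for the DB query
    private volatile Map<String, List<String>> rules;

    RulesCache(Supplier<Map<String, List<String>>> loader) {
        this.loader = loader;
        this.rules = Map.copyOf(loader.get()); // initial fetch at job startup
    }

    // Called when a status-change event is received.
    void refresh() {
        this.rules = Map.copyOf(loader.get());
    }

    // Called for each stream element; never touches the database.
    List<String> rulesFor(String key) {
        return rules.getOrDefault(key, List.of());
    }
}
```

Swapping the whole map via a `volatile` reference keeps per-element reads cheap and avoids locking during a refresh.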
Hi,

Perhaps broadcast state is a natural fit for this scenario.

Thanks,
Selvaraj C

On Fri, 22 Jan 2021 at 8:45 PM, Kumar Bolar, Harshith <[hidden email]> wrote:
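To make the suggestion concrete: with broadcast state, a rules stream is broadcast so every parallel subtask holds its own copy of the rule map, while data elements are routed by key to one subtask and read the rules locally. The following is a conceptual model of that behavior in plain Java (it is not the actual Flink API, and all names are invented):

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual model of broadcast state: every parallel subtask keeps its
// own copy of the rules, a broadcast element updates all copies, and a
// keyed element reads the latest rules from its own subtask's copy.
class BroadcastModel {
    private final Map<String, String>[] perTaskRules; // one copy per subtask

    @SuppressWarnings("unchecked")
    BroadcastModel(int parallelism) {
        perTaskRules = new Map[parallelism];
        for (int i = 0; i < parallelism; i++) perTaskRules[i] = new HashMap<>();
    }

    // A broadcast element reaches every subtask.
    void broadcastRule(String key, String rule) {
        for (Map<String, String> copy : perTaskRules) copy.put(key, rule);
    }

    // A keyed element is routed to exactly one subtask by key hash.
    String applyRule(String key) {
        int task = Math.floorMod(key.hashCode(), perTaskRules.length);
        return perTaskRules[task].get(key);
    }
}
```

In real Flink this corresponds to connecting a keyed stream with a broadcast stream and implementing a `KeyedBroadcastProcessFunction`, where the broadcast side updates the shared rule map.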
But then you need a way to consume the database as a DataStream. I found this one: https://github.com/ververica/flink-cdc-connectors. I want to implement a similar use case, but I don't know how to parse the SourceRecord (which comes from the connector) into a POJO for further processing.

Best,
Jan

From: Selvaraj chennappan <[hidden email]>
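On the SourceRecord-to-POJO question: a Debezium change event, once serialized to JSON, carries the new row under its "after" field. The toy extractor below (plain Java string handling only; the POJO and field names are invented, and a real job should use a proper JSON library or the Kafka Connect Struct API instead) shows the shape of the mapping:

```java
// Hedged sketch: pull string fields out of the "after" object of a
// Debezium-style JSON change event and build a POJO from them.
class RulePojo {
    final String key;
    final String rule;
    RulePojo(String key, String rule) { this.key = key; this.rule = rule; }
}

class DebeziumJsonSketch {
    // Extract "field":"value" from the "after" object of the envelope.
    // Not a real JSON parser -- illustration only.
    static String afterField(String json, String field) {
        int after = json.indexOf("\"after\"");
        int f = json.indexOf("\"" + field + "\"", after);
        int colon = json.indexOf(':', f);
        int q1 = json.indexOf('"', colon + 1);
        int q2 = json.indexOf('"', q1 + 1);
        return json.substring(q1 + 1, q2);
    }

    static RulePojo toPojo(String json) {
        return new RulePojo(afterField(json, "key"), afterField(json, "rule"));
    }
}
```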
I provided an answer on Stack Overflow, where I said the following:

A few different mechanisms in Flink may be relevant to this use case, depending on your detailed requirements.

Broadcast State: Jaya Ananthram has already covered the idea of using broadcast state in his answer. This makes sense if the rules should be applied globally, for every key, and if you can find a way to collect and broadcast the updates.

State Processor API: If you want to bootstrap state in a Flink savepoint from a database dump, you can do that with this library. You'll find a simple example of using the State Processor API to bootstrap state in this gist.

Change Data Capture: The Table/SQL API supports Debezium, Canal, and Maxwell CDC streams, as well as Kafka upsert streams. This may be a solution. There's also flink-cdc-connectors.

Lookup Joins: Flink SQL can do temporal lookup joins against a JDBC database, with a configurable cache. Not sure whether this is relevant.

On Fri, Jan 22, 2021 at 7:30 PM Jan Oelschlegel <[hidden email]> wrote:
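The lookup-join option above amounts to a per-key cache with a time-to-live in front of the database. A conceptual model in plain Java (not actual Flink SQL; the names, the injected clock, and the `Function` standing in for the JDBC query are all assumptions for illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Conceptual model of a cached lookup join: a key's row is fetched from
// the database on first use and served from a local cache until its TTL
// expires, at which point the next lookup re-fetches it.
class CachedLookup {
    private record Entry(String value, long loadedAt) {}

    private final Function<String, String> dbLookup; // stands in for the JDBC query
    private final LongSupplier clock;                // injected for testability
    private final long ttlMillis;
    private final Map<String, Entry> cache = new HashMap<>();
    int dbCalls = 0; // exposed so the effect of the cache is observable

    CachedLookup(Function<String, String> dbLookup, long ttlMillis, LongSupplier clock) {
        this.dbLookup = dbLookup;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    String lookup(String key) {
        long now = clock.getAsLong();
        Entry e = cache.get(key);
        if (e == null || now - e.loadedAt() >= ttlMillis) {
            dbCalls++; // cache miss or expired entry: go to the database
            e = new Entry(dbLookup.apply(key), now);
            cache.put(key, e);
        }
        return e.value();
    }
}
```

The trade-off versus broadcast state is that updates only become visible after the TTL elapses, rather than being pushed immediately.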