Hi,
I think this is a very basic use case for the Broadcast State Pattern, but I would like to know the best approaches to this problem.

I have an operator that extends BroadcastProcessFunction. The broadcast element is a JSON message sent via Kafka. It describes processing rules as a key/value mapping: ruleName - ruleValue (both strings).

In the processElement method I delegate to my custom RuleEngineService, a class that holds the "rule engine" logic and accepts the received event and a "set of processing rules" in some form.

Which of these would be the best approach?

1. Keep the original JSON string in broadcast state. Whenever a new set of rules arrives from Kafka, parse the JSON in processBroadcastElement, map it to some RuleParams abstraction, and keep that as a transient field in my BroadcastProcessFunction operator. Save the JSON in broadcast state and pass RuleParams to the rule engine service.

2. Same as 1, but instead of the raw JSON string, keep the already parsed JSON object in broadcast state, something like ObjectNode from the Kafka connector library.

3. Keep each ruleName - ruleValue pair (both strings) as a separate entry in broadcast state. In processBroadcastElement, parse the received JSON and update the state. In processElement, read all rules, build a RuleParams object (basically a map), and pass it to the rule engine.

4. Parse the JSON in processBroadcastElement, map it to the RuleParams abstraction (which keeps the rules in a HashMap), and keep this RuleParams object in broadcast state.

5. Any other...

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi KristoffSC,

The main differences seem to be when you parse your rules and what you put into the broadcast state.

IMO, any of these solutions can work. I prefer option 3. I'd like to parse the rules as early as possible, so that the source already emits a real stream of rule events (not a stream of rule sets), and then do the actual parsing in processBroadcastElement.

In short, that's my personal opinion.

Best,
Vino

KristoffSC <[hidden email]> wrote on Wed, Dec 11, 2019 at 6:26 AM:
> Hi,
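For illustration, option 3 could look roughly like the sketch below. It is a plain-Java model, not Flink code: the HashMap stands in for the broadcast state that a real BroadcastProcessFunction would obtain via ctx.getBroadcastState(...) with a MapStateDescriptor<String, String>. RuleEvent, PerRuleStateSketch, buildRuleParams, and applyRules are hypothetical names, and applyRules is only a placeholder for the RuleEngineService from the original question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical broadcast element: one rule per message, as preferred in option 3. */
final class RuleEvent {
    final String ruleName;
    final String ruleValue;

    RuleEvent(String ruleName, String ruleValue) {
        this.ruleName = ruleName;
        this.ruleValue = ruleValue;
    }
}

/** Plain-Java model of the operator (assumption: not a real Flink class). */
final class PerRuleStateSketch {
    // Stand-in for Flink's broadcast state, keyed by rule name.
    private final Map<String, String> broadcastState = new HashMap<>();

    /** Broadcast side: each incoming rule adds or overwrites exactly one entry. */
    void processBroadcastElement(RuleEvent rule) {
        broadcastState.put(rule.ruleName, rule.ruleValue);
    }

    /** Collects all current entries into a read-only snapshot (the "RuleParams" map). */
    Map<String, String> buildRuleParams() {
        return Collections.unmodifiableMap(new HashMap<>(broadcastState));
    }

    /** Event side: build the rule set from state and hand it to the rule engine. */
    String processElement(String event) {
        Map<String, String> ruleParams = buildRuleParams();
        return applyRules(event, ruleParams);
    }

    /** Placeholder for RuleEngineService; it only reports how many rules were applied. */
    static String applyRules(String event, Map<String, String> ruleParams) {
        return event + " processed with " + ruleParams.size() + " rule(s)";
    }
}
```

Note that this variant rebuilds the rule map for every event. That is the cost it pays for a state layout made only of string pairs, and it is the cost that options 1 and 4 avoid by holding an already built RuleParams object.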
Hi,
I think the question "What data type should I put in state?" should usually be answered with a well-defined data structure that allows for future state upgrades, much like defining a database schema. So I would not put "arbitrary" classes such as Jackson's ObjectNode in there.

Whether to put a JSON string or an object like your RuleParams into state depends on performance. If the JSON format changes frequently, it might be better to just store the string. But reparsing might be expensive too, so keeping a transient variable as a cache for the broadcast state should work.

Regards,
Timo

On 11.12.19 04:21, vino yang wrote:
> Hi KristoffSC,
>
> It seems the main differences are when to parse your rules and what
> could be put into the broadcast state.
>
> IMO, multiple solutions all can take effect. I prefer option 3. I'd like
> to parse the rules ASAP and let them be real rule event stream (not
> ruleset stream) in the source. Then doing the real parse in the
> processBroadcastElement.
>
> In short, it's my personal opinion.
>
> Best,
> Vino
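The "raw string in state, parsed object in a transient cache" idea can be sketched as follows. This is a plain-Java model under stated assumptions, not Flink code: the map stands in for the broadcast state, the parser is injected so the sketch does not depend on any JSON library, and CachedRuleSet, STATE_KEY, currentParams, and parseCount are hypothetical names. The point it tries to show is that a transient field is empty after a restore, so the event side has to rebuild it lazily from state.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java model of an operator that caches parsed rules (assumption: not a real Flink class). */
final class CachedRuleSet {
    // Hypothetical single key under which the raw rule document is stored.
    static final String STATE_KEY = "rules";

    // Stand-in for the broadcast state; only the raw string is ever stored here.
    private final Map<String, String> broadcastState;
    // Injected parser (for example a Jackson-based one) from raw text to rule map.
    private final Function<String, Map<String, String>> parser;
    // Would be a transient field in the real operator, so it is null after a restore.
    private Map<String, String> cachedParams;
    // Counts parser invocations, only to make the caching observable.
    int parseCount = 0;

    CachedRuleSet(Map<String, String> broadcastState,
                  Function<String, Map<String, String>> parser) {
        this.broadcastState = broadcastState;
        this.parser = parser;
    }

    /** Broadcast side: persist the raw string, then refresh the cache once. */
    void processBroadcastElement(String rawRules) {
        broadcastState.put(STATE_KEY, rawRules);
        cachedParams = parse(rawRules);
    }

    /** Event side: reuse the cache, rebuilding it from state only when it is missing. */
    Map<String, String> currentParams() {
        if (cachedParams == null) {
            String raw = broadcastState.get(STATE_KEY);
            cachedParams = (raw == null)
                    ? Collections.<String, String>emptyMap()
                    : parse(raw);
        }
        return cachedParams;
    }

    private Map<String, String> parse(String raw) {
        parseCount++;
        return parser.apply(raw);
    }
}
```

With this layout the state schema stays a plain string, which is easy to keep compatible across upgrades, while the parsing cost is paid once per rule update (plus once after a restore) instead of once per event.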