Hello,
I'm trying to build a stream event correlation engine with Flink, and I have some questions regarding job execution.
In my architecture I need to consume several different sources of data; let's say, for instance:
firewallStream = environment.addSource([FirewallLogsSource]);
proxyStream = environment.addSource([ProxyLogsSource]);
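For concreteness, here is a minimal self-contained sketch of that setup, assuming the two feeds are exposed as custom SourceFunction implementations (the class names and placeholder events below are hypothetical stand-ins for the bracketed pieces above):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class CorrelationJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment environment =
                StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> firewallStream =
                environment.addSource(new FirewallLogsSource());
        DataStream<String> proxyStream =
                environment.addSource(new ProxyLogsSource());

        // Stand-in sinks so the skeleton runs end to end; the real
        // per-rule logic and sinks would be attached to each stream here.
        firewallStream.print();
        proxyStream.print();

        environment.execute("stream-event-correlation");
    }

    // Hypothetical stub; a real source would tail the firewall log feed.
    public static class FirewallLogsSource implements SourceFunction<String> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<String> ctx) throws Exception {
            while (running) {
                ctx.collect("firewall log line"); // placeholder event
                Thread.sleep(1000);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

    // Hypothetical stub; a real source would tail the proxy log feed.
    public static class ProxyLogsSource implements SourceFunction<String> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<String> ctx) throws Exception {
            while (running) {
                ctx.collect("proxy log line"); // placeholder event
                Thread.sleep(1000);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }
}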
and for each of these sources, I need to apply a set of rules.
So, let's say I have a job whose source is the proxy stream and which applies the following rules:
//Abnormal Request Method
stream.[RuleLogic].addSink([output])
//Web Service on Non-Typical Port
stream.[RuleLogic].addSink([output])
//Possible Brute Force
stream.[RuleLogic].addSink([output])
These rules will probably grow to be on the order of 15 to 20.
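For illustration, a hedged sketch of how that per-rule fan-out could look inside a single job, reusing proxyStream from the sketch above; the filter predicates are hypothetical stand-ins for the real [RuleLogic], and print() stands in for the real [output] sinks:

// The same DataStream can feed several independent operator chains,
// so each rule gets its own filter and its own sink.

// Abnormal Request Method
proxyStream
        .filter(event -> event.startsWith("TRACE"))  // hypothetical predicate
        .print();

// Web Service on Non-Typical Port
proxyStream
        .filter(event -> event.contains(":8443"))    // hypothetical predicate
        .print();

// Possible Brute Force
proxyStream
        .filter(event -> event.contains(" 401 "))    // hypothetical predicate
        .print();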
What is the best approach in this case:
1. Should I create two jobs, one per source, with each job containing all 15-20 rules?
2. Should I split the rules into several jobs?
3. Other options?
Thank you and best regards,
Pedro Chaves