Hi All, Is there way to run CEP function parallel. Currently CEP run only sequentially Cheers, Dhanuka -- Nothing Impossible,Creativity is more important than knowledge. |
Hi Dhanuka,
In order to make the CEP operator to run parallel, the input stream should be KeyedStream. You can refer [1] for detailed information. Regards, Dian [1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns > 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email]> 写道: > > Hi All, > > Is there way to run CEP function parallel. Currently CEP run only sequentially > > <flink-CEP.png> > . > > Cheers, > Dhanuka > > -- > Nothing Impossible,Creativity is more important than knowledge. |
Hi Dian, I tried that but then kafkaproducer only produce to single partition and only single flink host working while rest not contribute for processing . I will share the code and screenshot Cheers Dhanuka On Thu, 24 Jan 2019, 12:31 Dian Fu <[hidden email] wrote: Hi Dhanuka, |
Whether using KeyedStream depends on the logic of your job, i.e, whether you are looking for patterns for some partitions, i.e, patterns for a particular user. If so, you should partition the input data before the CEP operator. Otherwise, the input data should not be partitioned. Regards, Dian
|
Hi Dian, Thanks for the explanation. Please find the screen shot and source code for above mention use case. And in main issue is though I use KeyedStream , parallelism not apply properly. Only one host is processing messages. Cheers, Dhanuka On Thu, Jan 24, 2019 at 1:40 PM Dian Fu <[hidden email]> wrote:
-- Nothing Impossible,Creativity is more important than knowledge.
FlinkCEP.java (13K) Download Attachment |
Hi Dhanuka, Does the KeySelector of Event::getTriggerID generate the same key for all the inputs or only generate very few key values and these key values happen to be hashed to the same downstream operator? You can print the results of Event::getTriggerID to check if it's that case. Regards, Dian
|
In this example key will be same. I am using 1 million messages with same key for performance testing. But still I want to process them parallel. Can't I use Split function and get a SplitStream for that purpose? On Thu, Jan 24, 2019 at 2:49 PM Dian Fu <[hidden email]> wrote:
Nothing Impossible,Creativity is more important than knowledge.
|
I'm afraid you cannot do that. The inputs having the same key should be processed by the same CEP operator. Otherwise the results will be nondeterministic and also be wrong. Regards, Dian
|
Free forum by Nabble | Edit this page |