Feasibility Question: Distributed FlinkCEP

Lucas Konstantin Bärenfänger
Hi all,

here's what I want to do: Consider a query such as (A and B) followed_by (C or D). (Pseudo code, obviously.) I want to create a graph of independent processing nodes, each running an independent instance of FlinkCEP. I want each of them to represent an operator of the query above, like so:

  followed_by       <-- Processing node 1
  /         \
 and        or      <-- Processing nodes 2 and 3
/   \      /   \
A   B      C   D


The three nodes would have to process the following (sub-)queries, respectively. (Pseudo code, again.)

Node1: {{node2ResultStream}} followed_by {{node3ResultStream}}
Node2: A and B
Node3: C or D

Long story short: I want to execute the query in a distributed fashion. Is that currently possible using FlinkCEP?

Thank you very much in advance!

Best
Lucas
Re: Feasibility Question: Distributed FlinkCEP

Sameer Wadkar
Could you not define two separate followedBy patterns and then perform a join on the resulting alert streams?

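// Sketch only: Event and Alert are placeholder types, and the state names are arbitrary.
Pattern<Event, ?> p1 = Pattern.<Event>begin("start").followedBy("next");  /* 1st pattern */
Pattern<Event, ?> p2 = Pattern.<Event>begin("start").followedBy("next");  /* 2nd pattern */
// CEP.pattern(...) returns a PatternStream; select(...) turns each match into an alert.
DataStream<Alert> alertStream1 = CEP.pattern(keyedDs, p1).select(/* match -> Alert */);
DataStream<Alert> alertStream2 = CEP.pattern(keyedDs, p2).select(/* match -> Alert */);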

Then just join the two alert streams using a keyBy (on some key common to the alert events) in event time: emit only the keys that have alerts from both sides if you are and'ing, or from either side if you are or'ing. Alternatively, run another CEP operation on the two alert streams after keying them by something the alert events have in common. Or simply union the two streams and apply CEP to the resulting stream.
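
To make the and/or/followed_by combination concrete, here is a minimal plain-Java model of the three processing nodes. It is only a sketch of the combining logic, not FlinkCEP code: the Alert class, the AlertCombiner class and its method names are all made up for illustration, the inputs are finite lists rather than streams, and windowing and out-of-order events are ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical alert type: a join key plus an event-time timestamp. */
final class Alert {
    final String key;
    final long timestamp;

    Alert(String key, long timestamp) {
        this.key = key;
        this.timestamp = timestamp;
    }
}

/** Plain-Java model of the three processing nodes; not FlinkCEP code. */
final class AlertCombiner {

    /** Earliest timestamp seen for each key. */
    private static Map<String, Long> earliestByKey(List<Alert> alerts) {
        Map<String, Long> earliest = new HashMap<>();
        for (Alert a : alerts) {
            earliest.merge(a.key, a.timestamp, Math::min);
        }
        return earliest;
    }

    /** Node 2 (and): one alert per key seen on both sides, stamped at the moment both sides have fired. */
    static List<Alert> and(List<Alert> left, List<Alert> right) {
        Map<String, Long> l = earliestByKey(left);
        Map<String, Long> r = earliestByKey(right);
        List<Alert> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : l.entrySet()) {
            Long other = r.get(e.getKey());
            if (other != null) {
                out.add(new Alert(e.getKey(), Math.max(e.getValue(), other)));
            }
        }
        return out;
    }

    /** Node 3 (or): every alert from either side passes through unchanged. */
    static List<Alert> or(List<Alert> left, List<Alert> right) {
        List<Alert> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    /** Node 1 (followed_by): keys where some left alert strictly precedes some right alert. */
    static Set<String> followedBy(List<Alert> left, List<Alert> right) {
        Map<String, Long> firstLeft = earliestByKey(left);
        Set<String> matched = new TreeSet<>();
        for (Alert a : right) {
            Long t = firstLeft.get(a.key);
            if (t != null && t < a.timestamp) {
                matched.add(a.key);
            }
        }
        return matched;
    }
}
```

With lists a, b, c and d holding the alerts for A, B, C and D, the whole query would then be AlertCombiner.followedBy(AlertCombiner.and(a, b), AlertCombiner.or(c, d)). In a real job each of these steps would be a keyed join, a union, or a further CEP pattern on the alert streams.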

The pattern you mentioned seems possible only if each sub-pattern works on separate keys and you still want to decide whether two separate keys produced an alert.

Sameer

(Sent in reply to Lucas Konstantin Bärenfänger's message of Thu, Oct 20, 2016, 6:27 AM.)