When using Flink for CEP, can the data in Cassandra database be used for state

shyla deshpande
Hello all,

I am new to Flink.

Our data is in a Cassandra database, and we have a use case for CEP.
I am checking whether Flink is a good fit for us.

When processing the event data, I may want to pull data from the Cassandra database, such as the user profile, and join it with the event data.

Is there a way to do that?  I appreciate your help. 

Thanks

Re: When using Flink for CEP, can the data in Cassandra database be used for state

Kostas Kloudas
Hi Shyla,

Happy to hear that you are experimenting with CEP!

To enrich your input stream with data from Cassandra (or any other external storage system), one option, if and only if your whole database fits in memory, is to write a ProcessFunction (https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/process_function.html) that loads the database into memory in its open() method and then looks up the matching record for each incoming event.
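
To make the open()-then-lookup idea concrete, here is a minimal plain-Java sketch. It deliberately does not use the Flink or Cassandra APIs: the class names (EnrichmentSketch, Event, EnrichedEvent, ProfileEnricher) and the helper enrichAll are hypothetical, and an in-memory Map stands in for the Cassandra table. It only mirrors the lifecycle described above, where the table is loaded once in open() and each event is then joined against the local copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EnrichmentSketch {

    /** Hypothetical event that carries only a user id and an action. */
    static final class Event {
        final String userId;
        final String action;

        Event(String userId, String action) {
            this.userId = userId;
            this.action = action;
        }
    }

    /** Hypothetical result: the event joined with the user's profile. */
    static final class EnrichedEvent {
        final Event event;
        final String profile; // null when no profile exists for the user

        EnrichedEvent(Event event, String profile) {
            this.event = event;
            this.profile = profile;
        }
    }

    /** Stand-in for a function with an open()/processElement() lifecycle. */
    static final class ProfileEnricher {
        private final Map<String, String> source; // assumption: stands in for the Cassandra table
        private Map<String, String> profiles;     // in-memory copy, filled by open()

        ProfileEnricher(Map<String, String> source) {
            this.source = source;
        }

        /** Called once before any event: copy the whole table into memory. */
        void open() {
            profiles = new HashMap<>(source);
        }

        /** Called per event: a local map lookup, no call to the external store. */
        EnrichedEvent processElement(Event event) {
            if (profiles == null) {
                throw new IllegalStateException("open() must be called first");
            }
            return new EnrichedEvent(event, profiles.get(event.userId));
        }
    }

    /** Runs the open-then-process lifecycle over a batch of events. */
    static List<EnrichedEvent> enrichAll(Map<String, String> table, List<Event> events) {
        ProfileEnricher enricher = new ProfileEnricher(table);
        enricher.open();
        List<EnrichedEvent> out = new ArrayList<>();
        for (Event event : events) {
            out.add(enricher.processElement(event));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("u1", "premium");

        List<Event> events = new ArrayList<>();
        events.add(new Event("u1", "login"));
        events.add(new Event("u2", "login")); // no profile for this user

        for (EnrichedEvent e : enrichAll(table, events)) {
            System.out.println(e.event.userId + " " + e.event.action + " -> " + e.profile);
        }
    }
}
```

In a real job the copy in open() would be replaced by a query against Cassandra, and the lookup would live in the function's per-element method. Note that the in-memory copy is a snapshot: profile changes made in the database after open() are not seen unless the data is reloaded.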

Afterwards, you can feed the resulting (enriched) DataStream into CEP for further processing.

Hope this helps!
Kostas

On Nov 9, 2017, at 12:08 AM, shyla deshpande <[hidden email]> wrote:

Hello all,

I am new to Flink.

Our data is in a Cassandra database, and we have a use case for CEP.
I am checking whether Flink is a good fit for us.

When processing the event data, I may want to pull data from the Cassandra database, such as the user profile, and join it with the event data.

Is there a way to do that?  I appreciate your help. 

Thanks