Streaming to db question

3 messages

Streaming to db question

Flavio Pompermaier
Hi flinkers,
I'm evaluating whether Flink streaming could fit a use case we have: data comes into the system, gets transformed, and is then written to a database (a very common problem).
In this use case you have to merge existing records as new data comes in. How can you ensure that only one row/entity of the database is updated at a time with Flink?
Is there any example?

Best,
Flavio

Re: Streaming to db question

Stephan Ewen
Hi!

If the sink that writes to the database executes partitioned by the primary key, this should naturally prevent row conflicts: every record with a given key is handled by the same sink instance, so no two parallel writers ever touch the same row.
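A minimal, self-contained sketch of the idea (plain Java, no Flink dependency; the in-memory "tables" are a hypothetical stand-in for the database): hashing the primary key to a fixed partition, the way a keyBy on the primary key would before the sink, means every update for a given key is applied in order by exactly one writer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyedUpsert {
    // Route a record to a sink task by hashing its primary key, analogous
    // to partitioning the stream by key before the sink.
    static int partitionFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 4;
        // One in-memory "table" per sink task (stand-in for the real DB).
        List<Map<String, String>> sinks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) sinks.add(new HashMap<>());

        // Two updates to row1 arrive; both hash to the same partition,
        // so a single writer applies them in order.
        String[][] updates = {{"row1", "a"}, {"row2", "b"}, {"row1", "c"}};
        for (String[] u : updates) {
            sinks.get(partitionFor(u[0], parallelism)).put(u[0], u[1]);
        }

        long holders = sinks.stream().filter(t -> t.containsKey("row1")).count();
        System.out.println("partitions holding row1: " + holders); // 1
    }
}
```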

Greetings,
Stephan


On Mon, Dec 14, 2015 at 11:32 AM, Flavio Pompermaier <[hidden email]> wrote:


Re: Streaming to db question

Flavio Pompermaier
I was thinking of something more like http://www.infoq.com/articles/key-lessons-learned-from-transition-to-nosql, which basically implements what you call out-of-core state at https://cwiki.apache.org/confluence/display/FLINK/Stateful+Stream+Processing. Riak provides some features to handle the eventually consistent nature of that use case... or would you rather go with the currently proposed solution (the one in the Flink wiki)?
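For the eventually consistent side of that approach, one common pattern (a sketch only, not tied to Riak's actual API) is a last-write-wins merge: each value carries a version or timestamp, and an upsert only replaces the stored value when it is newer, so late or replayed writes cannot clobber fresher state.

```java
import java.util.HashMap;
import java.util.Map;

public class LwwUpsert {
    // A value plus a logical timestamp (event time, a sequence number, ...).
    record Versioned(String value, long ts) {}

    // Last-write-wins merge: keep whichever value has the newer timestamp.
    static void upsert(Map<String, Versioned> table, String key, Versioned v) {
        table.merge(key, v, (old, neu) -> neu.ts() > old.ts() ? neu : old);
    }

    public static void main(String[] args) {
        Map<String, Versioned> table = new HashMap<>();
        upsert(table, "row1", new Versioned("fresh", 2));
        upsert(table, "row1", new Versioned("stale", 1)); // late write, ignored
        System.out.println(table.get("row1").value()); // fresh
    }
}
```

The merge function is the only piece that has to be deterministic; as long as replicas agree on it, they converge to the same row regardless of delivery order.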

On Mon, Dec 14, 2015 at 8:18 PM, Stephan Ewen <[hidden email]> wrote: