Consistency guarantees on multiple sinks

Nancy Estrada
Hi,
 
If a job declares more than one sink, what happens when a failure occurs? Are all the sink operations aborted atomically, as in a transactional environment? Or are the exactly-once processing guarantees provided only when a single sink is declared per job? Is it recommended to have more than one sink per job?

Thank you!
Nancy Estrada

Re: Consistency guarantees on multiple sinks

Paris Carbone
Hi Nancy,

Flink’s vanilla rollback recovery mechanism restarts the computation from the last global checkpoint, so duplicates in the sinks (job output) can occur no matter how many sinks are declared; the whole computation in the failed execution graph rolls back.
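
To make the rollback behaviour concrete, here is a minimal sketch in plain Python. It is not Flink code: `run_with_failure`, `checkpoint_every` and `fail_at` are hypothetical names invented for this illustration, and the "sinks" are ordinary lists standing in for external systems that cannot undo a write. It only shows why replaying from the last global checkpoint produces duplicates in every non-transactional sink at once.

```python
def run_with_failure(records, checkpoint_every, fail_at):
    """Feed `records` into two plain sinks, failing once just before index `fail_at`."""
    sink_a, sink_b = [], []   # stand-ins for external systems: writes are never undone
    checkpoint = 0            # source position stored in the last global checkpoint
    pos = 0
    failed = False
    while pos < len(records):
        if pos == fail_at and not failed:
            failed = True
            pos = checkpoint  # global rollback: the whole pipeline restarts from here
            continue
        # Both sinks see every record that is (re)processed.
        sink_a.append(records[pos])
        sink_b.append(records[pos] * 10)
        pos += 1
        if pos % checkpoint_every == 0:
            checkpoint = pos  # a global checkpoint completes
    return sink_a, sink_b


a, b = run_with_failure([1, 2, 3, 4, 5], checkpoint_every=2, fail_at=3)
print(a)  # [1, 2, 3, 3, 4, 5]
print(b)  # [10, 20, 30, 30, 40, 50]
```

Record 3 was written after the checkpoint at position 2 and before the failure, so it is replayed and shows up twice in both sinks. The number of sinks does not change this; each one simply receives the replayed records again.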

cheers
Paris


> On 5 Jan 2017, at 14:24, Nancy Estrada <[hidden email]> wrote:
>
> Hi,
>
> If a job declares more than one sink, what happens when a failure
> occurs? Are all the sink operations aborted atomically, as in a
> transactional environment? Or are the exactly-once processing guarantees
> provided only when a single sink is declared per job? Is it recommended
> to have more than one sink per job?
>
> Thank you!
> Nancy Estrada
>
>
>
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Consistency-guarantees-on-multiple-sinks-tp10877.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.


Re: Consistency guarantees on multiple sinks

Stephan Ewen
Expanding on what Paris said:

If you use exactly-once sinks (like the Rolling/Bucketing file sink or the Cassandra write-ahead sink), then each of them is correctly adjusted on recovery to preserve the exactly-once semantics. That holds regardless of whether the job has one, two, or n sinks.
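
As a rough illustration of why the number of sinks does not matter, the sketch below models a toy write-ahead style sink in plain Python. It is an assumption-laden simplification, not the actual implementation of the file or Cassandra sinks: `BufferingSink`, `on_checkpoint_complete`, `on_rollback` and `run_exactly_once` are hypothetical names. Each sink holds its output back until a checkpoint completes and discards uncommitted output on rollback, so every sink recovers independently.

```python
class BufferingSink:
    """Toy write-ahead sink: output becomes visible only when a checkpoint completes."""

    def __init__(self):
        self.pending = []     # written since the last completed checkpoint
        self.committed = []   # what the external system would actually contain

    def write(self, value):
        self.pending.append(value)

    def on_checkpoint_complete(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    def on_rollback(self):
        self.pending.clear()  # uncommitted output is dropped and later replayed


def run_exactly_once(records, sinks, checkpoint_every, fail_at):
    """Process `records`, failing once just before index `fail_at`."""
    checkpoint, pos, failed = 0, 0, False
    while pos < len(records):
        if pos == fail_at and not failed:
            failed = True
            for sink in sinks:
                sink.on_rollback()
            pos = checkpoint  # restart from the last global checkpoint
            continue
        for sink in sinks:
            sink.write(records[pos])
        pos += 1
        # Checkpoint periodically, and once more at the end of the input.
        if pos % checkpoint_every == 0 or pos == len(records):
            checkpoint = pos
            for sink in sinks:
                sink.on_checkpoint_complete()
    return [sink.committed for sink in sinks]


results = run_exactly_once(
    [1, 2, 3, 4, 5],
    [BufferingSink(), BufferingSink(), BufferingSink()],
    checkpoint_every=2,
    fail_at=3,
)
print(results)  # three identical lists, each [1, 2, 3, 4, 5]
```

The trade-off in this toy model is latency: output only becomes visible at checkpoint boundaries. Note also that the sinks are not committed atomically as a group; each one is individually consistent with the checkpoint it recovers from.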

On Thu, Jan 5, 2017 at 2:47 PM, Paris Carbone <[hidden email]> wrote:
Hi Nancy,

Flink’s vanilla rollback recovery mechanism restarts the computation from the last global checkpoint, so duplicates in the sinks (job output) can occur no matter how many sinks are declared; the whole computation in the failed execution graph rolls back.

cheers
Paris