Checkpoint recovery and state external to flink

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Checkpoint recovery and state external to flink

Aggarwal, Ajay

What happens when the flink job interacts with a user managed database and hence has some state outside of flink? In these situations when a flink job is recovered from last successful checkpoint, this external state will not be in sync with the recovered flink state. In most cases it will be ahead of the recovered flink state. Any recommendations or best practices to follow here? I am assuming it must be very common for flink applications to interact with external systems (databases, message systems etc.)

 

 

 

 

 

Reply | Threaded
Open this post in threaded view
|

Re: Checkpoint recovery and state external to flink

Yun Tang
Hi Ajay

I think two phase commit protocol could solve your concern for the exactly-once external system, Flink already support this feature in some sinks [1], e.g. you could refer to [2] to know which version of Kafaka producer could support exactly-once.



Best
Yun Tang

From: Aggarwal, Ajay <[hidden email]>
Sent: Tuesday, March 5, 2019 4:08
To: [hidden email]
Subject: Checkpoint recovery and state external to flink
 

What happens when the flink job interacts with a user managed database and hence has some state outside of flink? In these situations when a flink job is recovered from last successful checkpoint, this external state will not be in sync with the recovered flink state. In most cases it will be ahead of the recovered flink state. Any recommendations or best practices to follow here? I am assuming it must be very common for flink applications to interact with external systems (databases, message systems etc.)

 

 

 

 

 

Reply | Threaded
Open this post in threaded view
|

Re: Checkpoint recovery and state external to flink

Aggarwal, Ajay

Hi Yun,

 

This is good information. Thank you.

However looks like it only applies to SinkFunction. Any thoughts for when intermediate operators are also interacting with external systems?

 

Thanks.

 

Ajay

 

From: Yun Tang <[hidden email]>
Date: Tuesday, March 5, 2019 at 4:04 AM
To: "Aggarwal, Ajay" <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Checkpoint recovery and state external to flink


Hi Ajay

 

I think two phase commit protocol could solve your concern for the exactly-once external system, Flink already support this feature in some sinks [1], e.g. you could refer to [2] to know which version of Kafaka producer could support exactly-once.

 

 

 

Best

Yun Tang


From: Aggarwal, Ajay <[hidden email]>
Sent: Tuesday, March 5, 2019 4:08
To: [hidden email]
Subject: Checkpoint recovery and state external to flink

 

What happens when the flink job interacts with a user managed database and hence has some state outside of flink? In these situations when a flink job is recovered from last successful checkpoint, this external state will not be in sync with the recovered flink state. In most cases it will be ahead of the recovered flink state. Any recommendations or best practices to follow here? I am assuming it must be very common for flink applications to interact with external systems (databases, message systems etc.)