What happens to state in a Flink TaskManager when it crashes?


Siew Wai Yow
Hello, 

May I know what happens to the state stored in a Flink TaskManager when that TaskManager crashes? Say the state backend is RocksDB: would the data be transferred to the other running TaskManagers so that the complete state is available for processing?

Regards,
Yow

Re: What happen to state in Flink Task Manager when crash?

Jamie Grier-2
Flink is designed so that local state is backed up to a highly available system such as HDFS or S3. When a TaskManager fails, the state is recovered from there.




Re: What happen to state in Flink Task Manager when crash?

Siew Wai Yow
Thanks, but this much I already know. What I would like to know is whether the other TMs take over the crashed TM's state to keep the data complete (say the state is keyed, so the state for different keys is stored on different TMs), or whether the crashed TM needs to be recovered before processing can continue.

For example, 5 records,
rec1:KEYA
rec2:KEYB
rec3:KEYA
rec4:KEYC
rec5:KEYB

TM1 stored state for rec1:KEYA, rec3:KEYA
TM2 stored state for rec2:KEYB, rec5:KEYB <-- crashed; what happens to the state stored in TM2?
TM3 stored state for rec4:KEYC

If TM2 crashes, will rec2 and rec5 be assigned to another TM? Or are those records only recovered once TM2 itself is recovered?

Thanks.




Re: What happen to state in Flink Task Manager when crash?

Congxian Qiu
Hi, Siew Wai Yow

While the job is running, state is kept in the local RocksDB instance. When Flink takes a checkpoint, it copies all the required state to the checkpoint path.
If there is any failure, the job is restored from the last *successful* checkpoint, and the restored state is assigned to the current TMs
(these TMs do not need to be the same as before).
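
The point that restored state need not land on the same TMs can be illustrated with a small model of key groups. The sketch below is not Flink code: the class and method names are made up for illustration, and the hash step is simplified (to my understanding Flink also applies a murmur hash to `hashCode()` before taking the modulo). The subtask formula is modelled on Flink's key-group assignment as far as I understand it. What it shows is that a key is tied to a key group and a subtask index, not to a particular TaskManager process.

```java
/** Simplified model of keyed-state ownership; not Flink's actual classes. */
class KeyGroupSketch {

    /**
     * Maps a key to a key group in [0, maxParallelism).
     * Assumption: Flink additionally scrambles hashCode() with a murmur hash;
     * that step is left out here to keep the sketch dependency-free.
     */
    static int keyGroupFor(String key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Maps a key group to the index of the parallel subtask that owns it. */
    static int subtaskFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // hypothetical setting
        int parallelism = 3;      // one subtask per TM, as in the example in this thread
        for (String key : new String[] {"KEYA", "KEYB", "KEYC"}) {
            int keyGroup = keyGroupFor(key, maxParallelism);
            int subtask = subtaskFor(keyGroup, maxParallelism, parallelism);
            System.out.printf("%s -> key group %d -> subtask %d%n", key, keyGroup, subtask);
        }
    }
}
```

Because ownership is expressed per subtask index, a restarted subtask can be scheduled into a free slot on any TM and load its key groups from the checkpoint there.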



--
Best,
Congxian

Reply: Re: What happen to state in Flink Task Manager when crash?

Siew Wai Yow
Thanks Qiu, but David takes a different view on Stack Overflow. He says the crashed TM must be recovered.

https://stackoverflow.com/questions/54149134/what-happen-to-state-in-flink-task-manager-when-crash/54153686?noredirect=1#comment95144500_54153686

"The crashed TM must be recovered, and state updates since the last checkpoint will be lost. Of course that lost state should be recreated as the job rewinds and resumes."
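
The "rewinds and resumes" part of that quote can be made concrete with a toy model. This is only a sketch under assumptions: the class `CheckpointReplaySketch` and its methods are hypothetical, the "source" is an in-memory list that can be re-read from an offset, and the state is a per-key counter. It is meant to show why updates made after the last checkpoint are lost at the crash and then rebuilt by replaying the input from the checkpointed offset.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of checkpoint, crash and replay; not Flink's actual mechanism. */
class CheckpointReplaySketch {

    /** A completed checkpoint: a copy of the keyed state plus the source offset it belongs to. */
    static final class Checkpoint {
        final Map<String, Integer> counts;
        final int offset;

        Checkpoint(Map<String, Integer> counts, int offset) {
            this.counts = new HashMap<>(counts); // copy stands in for the upload to durable storage
            this.offset = offset;
        }
    }

    /** Counts records per key for offsets [from, to), updating counts in place. */
    static void process(List<String> source, int from, int to, Map<String, Integer> counts) {
        for (int i = from; i < to; i++) {
            counts.merge(source.get(i), 1, Integer::sum);
        }
    }

    /** Checkpoints after checkpointAt records, crashes after crashAt records, then recovers. */
    static Map<String, Integer> runWithCrash(List<String> source, int checkpointAt, int crashAt) {
        Map<String, Integer> local = new HashMap<>();
        process(source, 0, checkpointAt, local);
        Checkpoint checkpoint = new Checkpoint(local, checkpointAt);

        // These updates exist only in local state and are lost when the TM crashes.
        process(source, checkpointAt, crashAt, local);

        // Recovery: start from the checkpointed state and rewind the source to its offset.
        Map<String, Integer> restored = new HashMap<>(checkpoint.counts);
        process(source, checkpoint.offset, source.size(), restored);
        return restored;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("KEYA", "KEYB", "KEYA", "KEYC", "KEYB");
        System.out.println(runWithCrash(source, 2, 4));
    }
}
```

With the five records from the example earlier in the thread, the recovered counts match what an uninterrupted run would produce, provided the source can be rewound to the checkpointed offset.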


Re: Reply: Re: What happen to state in Flink Task Manager when crash?

Congxian Qiu
Hi, Siew Wai Yow

Yes, David is correct: the TM must be recovered, and the number of TMs before and after the crash must be the same.
In my last reply I meant to say that the state may not end up on the same TM after the crash. Sorry for the unclear description.



--
Best,
Congxian

Reply: Re: Reply: Re: What happen to state in Flink Task Manager when crash?

Siew Wai Yow
Hi Qiu, thanks again!
In my experience with Flink 1.3, when one of the TMs crashes the whole cluster needs to be restarted, so I guess this is the recovery you mentioned. But that seems to defeat the purpose of a cluster, since one TM crashing should not bring down the whole cluster. May I know whether this is still the case in Flink 1.7? The restart strategy applies to the job, though, not to a TM failure.

Thanks!


Re: Reply: Re: Reply: Re: What happen to state in Flink Task Manager when crash?

Congxian Qiu
Hi, Yow
I think there is another restart strategy in Flink: region failover [1]. I could not find the documentation for it, though; maybe someone else can help here. For region failover, please take a look at this issue [2] before you use it. You can also take a look at this FLIP [3].




--
Best,
Congxian

Re: Reply: Re: Reply: Re: What happen to state in Flink Task Manager when crash?

Siew Wai Yow
Thanks Qiu,
this is useful information indeed, but this strategy only reduces the chance of re-executing the whole graph. I think it won't help if a TM crashes, in which case the whole cluster needs to restart anyway to redistribute the state. Am I right?


From: Congxian Qiu <[hidden email]>
Sent: Sunday, January 13, 2019 9:39 AM
To: Siew Wai Yow
Cc: Jamie Grier; user
Subject: Re: Reply: Re: Reply: Re: What happen to state in Flink Task Manager when crash?
 
Hi, Yow
I think there is another restart strategy in Flink: region failover [1]. I could not find the documentation for it, though; maybe someone else can help here. For region failover, please take a look at this issue [2] before you use it. You can also take a look at this FLIP [3].

[3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures


Re: What happen to state in Flink Task Manager when crash?

Dawid Wysakowicz-2
In reply to this post by Congxian Qiu

Hi,

This is pretty much just a rephrasing of what others have said. Flink's state is usually backed by some highly available distributed file system, and upon each checkpoint a consistent view of all local state is written there so that it can be restored later. As of now, any failure of a task slot (e.g. if a TM fails, all slots in that TM fail) results in a job restart. If the remaining TMs have enough slots to restart the job, it will be restored onto them. The restoration always starts with the checkpoint as an "entry point", which means that all the state written there will be redistributed to the TMs. With the task-local recovery feature [1], Flink will try to distribute the state/tasks so that the local snapshot can be reused. Hope this clears things up.

Best,

Dawid

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#task-local-recovery
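
The "enough slots" condition above can be sketched as a simple capacity check. This is a simplified model, not Flink's scheduler: the class `SlotCheckSketch`, the TM names and the slot counts are all hypothetical, and it assumes the job needs a fixed number of slots in total.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified capacity check after a TaskManager failure; not Flink's scheduler. */
class SlotCheckSketch {

    /** Returns true if the surviving TaskManagers together offer at least requiredSlots slots. */
    static boolean canRestart(Map<String, Integer> slotsPerTaskManager,
                              String failedTaskManager,
                              int requiredSlots) {
        int available = 0;
        for (Map.Entry<String, Integer> entry : slotsPerTaskManager.entrySet()) {
            if (!entry.getKey().equals(failedTaskManager)) {
                available += entry.getValue(); // all slots of the failed TM are gone
            }
        }
        return available >= requiredSlots;
    }

    public static void main(String[] args) {
        Map<String, Integer> spareCapacity = new LinkedHashMap<>();
        spareCapacity.put("TM1", 2);
        spareCapacity.put("TM2", 2);
        spareCapacity.put("TM3", 2);
        // Job needs 3 slots; TM1 and TM3 still offer 4, so it can restart without TM2.
        System.out.println(canRestart(spareCapacity, "TM2", 3));

        Map<String, Integer> noSpareCapacity = new LinkedHashMap<>();
        noSpareCapacity.put("TM1", 1);
        noSpareCapacity.put("TM2", 1);
        noSpareCapacity.put("TM3", 1);
        // Only 2 slots remain for a job that needs 3, so a replacement TM is required first.
        System.out.println(canRestart(noSpareCapacity, "TM2", 3));
    }
}
```

Under this model, the two answers given earlier in the thread fit together: the crashed TM only has to be replaced when the surviving TMs do not have enough free slots for the whole job.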


Re: What happen to state in Flink Task Manager when crash?

Siew Wai Yow
Thanks Dawid and Qiu!
Both of you have cleared up all my doubts, perfect!

