(DEPRECATED) Apache Flink User Mailing List archive.

Why do we need keyed state and operator state when we already have checkpoints?

大森林

Why do we need keyed state and operator state when we already have checkpoints?

Could you tell me:

Why do we need keyed state and operator state when we already have checkpoints?

When a running JAR crashes, we can resume from the checkpoint automatically or manually.

So why do we still need keyed state and operator state?

Thanks

Shengkai Fang

Re: Why do we need keyed state and operator state when we already have checkpoints?

A checkpoint is a snapshot of the job, and we can resume from it if the job is killed unexpectedly. State is a different thing: it holds the intermediate results of the computation. I don't think checkpoints can replace state.

In reply to the message from 大森林 <[hidden email]>, sent Wed, Oct 7, 2020 at 12:26 PM.

大森林

Re: Why do we need keyed state and operator state when we already have checkpoints?

When the job is killed, its state is lost as well.

So why do we need keyed state? Is keyed state useful when we try to resume the killed job?

In reply to the message from Shengkai Fang <[hidden email]>, sent Wed, Oct 7, 2020 at 12:43 PM (cc: user).

Arvid Heise-3

Re: Why do we need keyed state and operator state when we already have checkpoints?

I think there is a misunderstanding here: a checkpoint IS (a snapshot of) the keyed state and operator state (among a few other things). [1]

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/learn-flink/fault_tolerance.html#definitions
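
To make the "a checkpoint is a snapshot of the state" point concrete, here is a minimal toy sketch in plain Python. It does not use the Flink API: ToyOperator, take_checkpoint, and the operator names "source" and "counter" are all hypothetical, chosen only for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyOperator:
    """Hypothetical stand-in for one operator instance (not a Flink class)."""
    keyed_state: dict = field(default_factory=dict)     # one entry per key
    operator_state: list = field(default_factory=list)  # not tied to any key


def take_checkpoint(operators):
    """Return a deep copy of every operator's state, grouped by operator name."""
    return {
        name: {
            "keyed": copy.deepcopy(op.keyed_state),
            "operator": copy.deepcopy(op.operator_state),
        }
        for name, op in operators.items()
    }


operators = {
    # Assumed example: a source remembering how far it has read.
    "source": ToyOperator(operator_state=[("partition-0", 42)]),
    # Assumed example: a per-word counter.
    "counter": ToyOperator(keyed_state={"flink": 3, "state": 1}),
}
checkpoint = take_checkpoint(operators)

# Later changes to the live state do not affect the snapshot.
operators["counter"].keyed_state["flink"] += 1
print(checkpoint["counter"]["keyed"]["flink"])     # 3
print(operators["counter"].keyed_state["flink"])   # 4
```

In this toy model the state is what the job works with while it runs, and the checkpoint is only a copy of that state taken at one moment.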

In reply to the message from 大森林 <[hidden email]>, sent Wed, Oct 7, 2020 at 6:51 AM.

Arvid Heise | Senior Java Developer

Join Flink Forward - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng

大森林

Re: Why do we need keyed state and operator state when we already have checkpoints?

Thanks for your replies. Here is my understanding so far.

There are two cases:

1. If I use no keyed state in my program, then when it is killed I can only resume from the previous result.

2. If I use keyed state in my program, then when it is killed I can resume from the previous result and the previous variable temporary result.

Am I right?

Thanks for your guidance.

In reply to the message from Arvid Heise <[hidden email]>, sent Wed, Oct 7, 2020 at 2:25 PM (cc: Shengkai Fang, user).

Arvid Heise-3

Re: Why do we need keyed state and operator state when we already have checkpoints?

Hi 大森林,

You can always resume from a checkpoint, regardless of whether your operators use keyed or non-keyed state.

A checkpoint contains the state of all operators at a given point in time. Each operator may have keyed state, raw state, or non-keyed state.

As long as you do not change the operators (too much) before restarting, you can always restart.

During an (automatic) restart of a Flink application, the state from a given checkpoint is restored to the operators, so that it looks as if the operators never failed. However, the operators are reset to the time of the respective checkpoint.

I have no clue what you mean by "previous variable temporary result".
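
The restart behaviour described above can be sketched with a small toy model in plain Python. This is not the Flink API: ToyJob, the events list, and the assumption that the source can be re-read from a stored offset are all made up for illustration.

```python
import copy

# Assumed input: a replayable stream of words.
events = ["a", "b", "a", "c", "a", "b"]


class ToyJob:
    """Hypothetical job: a source with a read offset feeding a per-key counter."""

    def __init__(self):
        self.offset = 0   # plays the role of the source's operator state
        self.counts = {}  # plays the role of the counter's keyed state

    def process_next(self):
        word = events[self.offset]
        self.counts[word] = self.counts.get(word, 0) + 1
        self.offset += 1

    def snapshot(self):
        return {"offset": self.offset, "counts": copy.deepcopy(self.counts)}

    def restore(self, checkpoint):
        self.offset = checkpoint["offset"]
        self.counts = copy.deepcopy(checkpoint["counts"])


job = ToyJob()
for _ in range(3):
    job.process_next()
checkpoint = job.snapshot()   # taken after "a", "b", "a"
job.process_next()            # "c" is processed after the checkpoint

# Simulated failure: a fresh job instance gets its state back from the checkpoint.
job = ToyJob()
job.restore(checkpoint)
state_after_restore = copy.deepcopy(job.counts)  # the post-checkpoint "c" is gone

# Events after the checkpoint are read and counted again.
while job.offset < len(events):
    job.process_next()

print(state_after_restore)  # {'a': 2, 'b': 1}
print(job.counts)           # {'a': 3, 'b': 2, 'c': 1}
```

After the restore, the job is back at the moment of the checkpoint: work done after it is lost and has to be redone, but the final counts are the same as if no failure had happened.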

In reply to the message from 大森林 <[hidden email]>, sent Wed, Oct 7, 2020 at 9:13 AM.

大森林

Re: Why do we need keyed state and operator state when we already have checkpoints?

Thanks for your replies.

When my program contains no state-related code, checkpoints can still be saved and resumed from. ❶

So why do we need keyed state, operator state, or stateful functions? ❷

"the operators are reset to the time of the respective checkpoint."

We have already met the requirement "resume from a checkpoint (the last state of each operator, which stores the result)" ❶, so why do we still need ❷?

Thanks for your help!

In reply to the message from Arvid Heise <[hidden email]>, sent Mon, Oct 12, 2020 at 2:53 PM (cc: Shengkai Fang, user).

Congxian Qiu

Re: Why do we need keyed state and operator state when we already have checkpoints?

As others have said, state is different from a checkpoint. A checkpoint is just a **snapshot** of the state, and you can restore from the previous checkpoint if the job crashes.

State is for stateful computation, and checkpoints are for fault tolerance. [1]

State keeps the information you will need in the future. Take word count as an example: the count emitted for a word depends on how many times we have seen that word so far, so we need to keep "the total count of the word seen so far" somewhere. In Flink, you can keep it in state.
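
The word count example can be sketched in plain Python as follows. This is a toy comparison, not Flink code: StatelessCounter, StatefulCounter, and the totals dictionary (which plays the role of keyed state) are hypothetical names.

```python
class StatelessCounter:
    """Sees each word in isolation, so it can only ever emit a count of 1."""

    def process(self, word):
        return (word, 1)


class StatefulCounter:
    """Keeps a running total per word; `totals` plays the role of keyed state."""

    def __init__(self):
        self.totals = {}

    def process(self, word):
        self.totals[word] = self.totals.get(word, 0) + 1
        return (word, self.totals[word])


words = ["to", "be", "or", "not", "to", "be"]
stateless, stateful = StatelessCounter(), StatefulCounter()

stateless_out = [stateless.process(w) for w in words]
stateful_out = [stateful.process(w) for w in words]

print(stateless_out[-1])  # ('be', 1)
print(stateful_out[-1])   # ('be', 2)
```

Without something like totals, the operator has no way to know that "be" was already seen once, whether or not the job ever fails.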

A checkpoint/savepoint contains the **snapshot** of all the state. If there is no state, the checkpoint will be *empty*: you can restore from it, but it has no content.

PS: Even if you don't create state explicitly, some operators in Flink (such as WindowOperator) keep state internally.
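
As a rough illustration of these two points, the toy sketch below (plain Python, not the Flink API) contrasts a stateless transformation with a count window. ToyMap and ToyCountWindow are hypothetical classes, and the window here is far simpler than Flink's WindowOperator.

```python
class ToyMap:
    """Stateless transformation: nothing to remember between records."""

    def process(self, value):
        return [value * 2]

    def snapshot(self):
        return {}  # no state, so the snapshot is empty


class ToyCountWindow:
    """Emits the sum of every `size` records; the buffer is state the user never declared."""

    def __init__(self, size):
        self.size = size
        self.buffer = []

    def process(self, value):
        self.buffer.append(value)
        if len(self.buffer) < self.size:
            return []  # window not full yet, nothing to emit
        total, self.buffer = sum(self.buffer), []
        return [total]

    def snapshot(self):
        return {"buffer": list(self.buffer)}


mapper, window = ToyMap(), ToyCountWindow(size=3)
for value in [1, 2]:
    for doubled in mapper.process(value):
        window.process(doubled)

print(mapper.snapshot())  # {}
print(window.snapshot())  # {'buffer': [2, 4]}
```

The user code never mentions state, yet the half-filled window has to be part of the snapshot, otherwise the records 2 and 4 would be lost on restart.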

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/concepts/stateful-stream-processing.html

Best,

Congxian

In reply to the message from 大森林 <[hidden email]>, sent Mon, Oct 12, 2020 at 9:26 PM.

大森林

Re: Why do we need keyed state and operator state when we already have checkpoints?

State:

stores the result of some operator (such as keyBy or map).

Checkpoint:

stores the last result from when the program was running normally.

Am I right?

Thanks for your help!

In reply to the message from Congxian Qiu <[hidden email]>, sent Tue, Oct 13, 2020 at 1:32 PM (cc: Arvid Heise, Shengkai Fang, user).
