Upgrade job topology in checkpoint

Upgrade job topology in checkpoint

Padarn Wilson-2
Hi all,

I'm looking for some clarity about changing job topology as described here: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/upgrading/#application-topology

My question is simple: does this only apply to savepoints, or can it also work for checkpoints? (And if not, why?)

Cheers,
Padarn

Re: Upgrade job topology in checkpoint

Yun Gao
Hi Padarn,

By default, checkpoints are disposed of when the job finishes or fails;
they are retained only when this is explicitly configured [1].

From an implementation perspective, I think users should be able to change the topology when restoring
from an externalized checkpoint, but I don't think Flink guarantees this functionality.

Best,
Yun
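
The upgrade documentation linked in the original question describes state being matched to operators by their IDs when a job is restored from a savepoint. As a rough illustration of that matching rule only, here is a toy model in plain Java. It does not use any Flink API, and the class name, method name, operator IDs, and state strings are all hypothetical. Whether a retained checkpoint behaves the same way is, as noted above, not something Flink guarantees.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (not Flink code) of matching snapshot state to operators by ID. */
final class TopologyRestoreSketch {

    /**
     * Assigns each snapshot entry to the operator with the same ID.
     * Operators with no entry in the snapshot start with null (empty) state.
     * Snapshot entries with no matching operator fail the restore unless
     * allowNonRestoredState is true, in which case they are dropped.
     */
    static Map<String, String> restore(Map<String, String> snapshot,
                                       Set<String> newTopology,
                                       boolean allowNonRestoredState) {
        // State whose operator no longer exists in the new topology.
        Set<String> orphaned = new LinkedHashSet<>(snapshot.keySet());
        orphaned.removeAll(newTopology);
        if (!orphaned.isEmpty() && !allowNonRestoredState) {
            throw new IllegalStateException("No operator for state of: " + orphaned);
        }

        Map<String, String> assigned = new HashMap<>();
        for (String operatorId : newTopology) {
            // A null value means the operator is new and starts without state.
            assigned.put(operatorId, snapshot.get(operatorId));
        }
        return assigned;
    }

    public static void main(String[] args) {
        Map<String, String> snapshot = Map.of("source", "offset=42", "old-sink", "txn=7");
        // New topology: one sink was added and the old one was removed.
        Set<String> topology = Set.of("source", "new-sink");
        System.out.println(restore(snapshot, topology, true));
    }
}
```

In this model an added sink is accepted and simply starts with no state, so adding one is not a restore error by itself. The model says nothing about whether records are actually routed to the new sink; that depends on how the job graph is wired.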




Re: Re: Upgrade job topology in checkpoint

Padarn Wilson-2
We added a new sink to the job graph and redeployed, but the new sink did not receive any records, as though it were not connected to the graph. (It's possible it was a code bug, but I was trying to understand whether this makes sense given the implementation.)

Re-including the mailing list, which I excluded by accident.

Padarn

On Wed, Jun 16, 2021 at 10:59 AM Yun Gao <[hidden email]> wrote:
Hi Padarn,

Sorry, I might not have fully understood what you mean by the new topology being ignored.
Do you mean the topology is not the same as expected?

Best,
Yun


------------------Original Mail ------------------
Sender:Padarn Wilson <[hidden email]>
Send Date:Tue Jun 15 21:45:17 2021
Recipients:Yun Gao <[hidden email]>
Subject:Re: Upgrade job topology in checkpoint
Thanks Yun,

Yes, we do indeed retain checkpoints, but we were unable to restore from them with a new topology for some reason. It seemed like the new topology was ignored entirely, which was surprising to me.

Padarn


Re: Re: Re: Upgrade job topology in checkpoint

Yun Gao
Hi Padarn,

From the current description, it seems to me that the issue is not related to
the state. I think we should first check whether the operator logic is right and whether
the upstream tasks have indeed emitted records to the new sink.

Best,
Yun


Re: Re: Re: Upgrade job topology in checkpoint

Padarn Wilson-2
Thanks Yun,

Agreed, it seemed unlikely to be state; I just wanted to confirm that this was unexpected before ruling it out.

Thanks,
Padarn
