Feature request: Removing state from operators


pwestermann

We use the allowNonRestoredState option relatively often to remove stateful operators, and it works great. However, there doesn’t seem to be an equivalent for removing state from an existing operator that we want to keep.

Say my operator defines a MapState and a ValueState. Later on, the ValueState becomes obsolete. In this case, we can remove the actual data for each key by clearing it, but the state itself is still referenced in savepoints even when it is no longer referenced in code. That means, for example, that one cannot remove any class that was previously used in that state.

Would it be possible to add support for completely removing state from an operator if it’s no longer referenced in code and allowNonRestoredState is set? (Or to add an explicit “drop this state” option to KeyedStateStore and OperatorStateStore?)

 

Thanks,

Peter
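
To make the distinction in the post above concrete, here is a minimal toy model in plain Java. It is not Flink code and does not reflect how Flink lays out savepoints. The class ToySavepoint, its fields, and the example class names are all hypothetical. It only illustrates the reported behavior: clearing the values under every key leaves the state's registration, and therefore the class it depends on, in the snapshot.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of one operator's entry in a savepoint. Not Flink code. */
class ToySavepoint {
    // state name -> name of the class needed to read that state (assumed metadata)
    final Map<String, String> registeredStates = new HashMap<>();
    // state name -> (key -> value)
    final Map<String, Map<String, Object>> data = new HashMap<>();

    void register(String stateName, String requiredClass) {
        registeredStates.put(stateName, requiredClass);
        data.put(stateName, new HashMap<>());
    }

    void put(String stateName, String key, Object value) {
        data.get(stateName).put(key, value);
    }

    /** Stands in for clearing the state under every key. */
    void clearAllKeys(String stateName) {
        data.get(stateName).clear();
    }

    /** Classes that must remain available to read this snapshot. */
    Set<String> requiredClasses() {
        return new HashSet<>(registeredStates.values());
    }

    public static void main(String[] args) {
        ToySavepoint sp = new ToySavepoint();
        sp.register("mapState", "com.example.Current");
        sp.register("valueState", "com.example.Legacy");
        sp.put("valueState", "key-1", 42);
        sp.clearAllKeys("valueState");
        // The values are gone, but the registration (and its class) remains.
        System.out.println(sp.data.get("valueState").isEmpty());
        System.out.println(sp.requiredClasses().contains("com.example.Legacy"));
    }
}
```

In this model, only removing the entry from registeredStates would free the legacy class, which is the operation the request asks for.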

 


Re: Feature request: Removing state from operators

rmetzger0
Hi Peter,

I'm adding two committers to this thread who can help answer your question.

In reply to Peter Westermann's message of Mon, Oct 26, 2020 at 3:22 PM.

Re: Feature request: Removing state from operators

Congxian Qiu
Hi Peter
Could applyToAllKeys[1] in KeyedStateBackend help you here? Note that it is currently not exposed to users.


Best,
Congxian


In reply to Robert Metzger's message of Tue, Oct 27, 2020 at 5:51 PM.

Re: Feature request: Removing state from operators

pwestermann

Does that actually allow removing a state completely (vs. just modifying the values stored in state)?

 

Ideally, we would want to interact with state only via KeyedStateStore. Maybe a couple of methods could be added there, for example:

// List all pre-existing states
<S extends State, T> List<StateDescriptor<S, T>> listStates();

// Completely remove a state
<S extends State, T> void dropState(StateDescriptor<S, T> stateDescriptor);

 

 

Thanks,

Peter
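
As a rough illustration of how the two proposed methods might behave, here is a toy in-memory store in plain Java. It is only a sketch under stated assumptions: ToyKeyedStateStore is hypothetical, it identifies states by name rather than by StateDescriptor, and neither listStates nor dropState exists in Flink's KeyedStateStore.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory store. listStates/dropState mirror the proposal and are hypothetical. */
class ToyKeyedStateStore {
    // state name -> (key -> value); insertion-ordered so listings are predictable
    private final Map<String, Map<String, Object>> states = new LinkedHashMap<>();

    void update(String stateName, String key, Object value) {
        states.computeIfAbsent(stateName, n -> new LinkedHashMap<>()).put(key, value);
    }

    /** Hypothetical: names of all states known to the store, including restored ones. */
    List<String> listStates() {
        return new ArrayList<>(states.keySet());
    }

    /** Hypothetical: removes the state's values for all keys and its registration. */
    boolean dropState(String stateName) {
        return states.remove(stateName) != null;
    }

    public static void main(String[] args) {
        ToyKeyedStateStore store = new ToyKeyedStateStore();
        store.update("mapState", "key-1", "a");
        store.update("obsoleteValueState", "key-1", 1);
        store.dropState("obsoleteValueState");
        System.out.println(store.listStates());
    }
}
```

The point of the sketch is that dropState removes the registration itself, whereas clearing values key by key would leave the state listed.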

 

 

 

In reply to Congxian Qiu's message of Thursday, October 29, 2020 at 10:38 AM.

Re: Feature request: Removing state from operators

Steven Wu
Not a solution, but a potential workaround: maybe rename the operator uid so that you can continue to leverage allowNonRestoredState?
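
The workaround relies on savepoint state being matched to operators by uid. The following toy model in plain Java sketches that matching. ToyRestore and its restore method are hypothetical and use no Flink classes. They only mimic the rule that state without a matching operator fails the restore unless allowNonRestoredState is set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model of matching savepoint state to job operators by uid. Not Flink code. */
class ToyRestore {
    /**
     * Returns the uids whose state is restored into the job.
     * Throws if some state has no matching operator and skipping is not allowed.
     */
    static Set<String> restore(Set<String> savepointUids, Set<String> jobUids,
                               boolean allowNonRestoredState) {
        Set<String> restored = new HashSet<>();
        for (String uid : savepointUids) {
            if (jobUids.contains(uid)) {
                restored.add(uid);
            } else if (!allowNonRestoredState) {
                throw new IllegalStateException("No operator for state with uid " + uid);
            }
            // Otherwise the state for this uid is silently skipped.
        }
        return restored;
    }

    public static void main(String[] args) {
        Set<String> savepoint = new HashSet<>(Arrays.asList("source", "enrich-v1"));
        Set<String> job = new HashSet<>(Arrays.asList("source", "enrich-v2"));
        System.out.println(restore(savepoint, job, true));
    }
}
```

In this model the renamed operator ("enrich-v2") starts with empty state, so anything from the old operator that is still needed has to be carried over by other means.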

In reply to Peter Westermann's message of Thu, Oct 29, 2020 at 7:58 AM.

Re: Feature request: Removing state from operators

David Anderson-4
It seems like another option here would be to occasionally use the State Processor API to purge a savepoint of all unnecessary state.
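
Conceptually, such a purge is a read, filter, and write pass over the savepoint. The sketch below shows only that shape in plain Java. ToySavepointPurge is hypothetical, represents a savepoint as a simple map from state name to contents, and makes no State Processor API calls.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of purging unneeded state from a savepoint. Not Flink code. */
class ToySavepointPurge {
    /** Returns a new map holding only the states named in keep. The input is not modified. */
    static <V> Map<String, V> purge(Map<String, V> savepointStates, Set<String> keep) {
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> entry : savepointStates.entrySet()) {
            if (keep.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> old = new LinkedHashMap<>();
        old.put("mapState", 3);
        old.put("obsoleteValueState", 7);
        Set<String> keep = new HashSet<>(Arrays.asList("mapState"));
        System.out.println(purge(old, keep).keySet());
    }
}
```

Writing a new savepoint rather than editing the old one in place keeps the original available as a fallback.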

In reply to Steven Wu's message of Fri, Oct 30, 2020 at 6:57 PM.

Re: Feature request: Removing state from operators

pwestermann

Renaming operators and migrating the state we still need manually is what we have done in the past. I was just hoping for a more convenient solution.

 

Peter

 

In reply to David Anderson's message of Friday, October 30, 2020 at 5:55 PM.