Removing stream in a job having state

ApoorvK
Hi All,

I have multiple streams in a Flink job, each holding different state such as
ValueState or MapState.
Now I need to remove one stream, identified by a specific (UID, NAME), from the
job.

If I remove it, restoring fails with an error stating that the operator does
not exist.

Another example:

I was using BucketingSink to write data to HDFS with, let's say, uid->hdfs-sink
name->hdf-sink.
I have now changed it to write to S3 using StreamingFileSink, keeping
uid->hdfs-sink name->hdf-sink the same because restoration would otherwise
fail. But it still fails when I restore from the savepoint, probably because of
the BucketingSink to StreamingFileSink change.

How can I achieve this? I need to remove the HDFS sink and keep the S3 one.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: Removing stream in a job having state

David Anderson-3
When you modify a job by removing a stateful operator, Flink will complain on the next restore that the snapshot contains state that cannot be restored.

The solution to this is to restart from the savepoint (or retained checkpoint), specifying that you want to allow non-restored state [1]:

./bin/flink run -s <savepointPath> --allowNonRestoredState <jarFile>

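BTW, you may run into further problems if you haven't assigned UIDs to your stateful operators. [2] Without explicit UIDs, Flink generates operator IDs from the job graph, so they can change when the topology changes and the state in the savepoint can no longer be matched to its operator.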
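
To make the restore behaviour concrete, here is a small plain-Java sketch of the matching rule described above. It is a toy model, not Flink code: the class RestoreModel, the method restore, and the uids "enrich" and "s3-sink" are hypothetical names made up for illustration (only "hdfs-sink" comes from the question). It assumes state is matched to operators purely by uid, and that state whose uid no longer exists in the job is either rejected or skipped depending on the allowNonRestoredState flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of matching savepoint state to operators by uid. Not Flink code. */
class RestoreModel {

    /**
     * Returns the uids whose state gets restored into the new job.
     * If the savepoint holds state for a uid that the new job no longer has,
     * this throws unless allowNonRestoredState is true, in which case that
     * state is skipped.
     */
    static List<String> restore(Map<String, String> savepointState,
                                Set<String> newJobUids,
                                boolean allowNonRestoredState) {
        List<String> restored = new ArrayList<>();
        // Iterate over a sorted copy so the result order is deterministic.
        for (String uid : new TreeMap<>(savepointState).keySet()) {
            if (newJobUids.contains(uid)) {
                restored.add(uid);
            } else if (!allowNonRestoredState) {
                throw new IllegalStateException(
                        "Savepoint has state for operator '" + uid
                                + "', which is not part of the new job");
            }
            // Otherwise the orphaned state is dropped.
        }
        return restored;
    }

    public static void main(String[] args) {
        // State captured by the old job (uids are illustrative).
        Map<String, String> savepoint = new HashMap<>();
        savepoint.put("hdfs-sink", "bucket state");
        savepoint.put("enrich", "ValueState");

        // The new job dropped the HDFS sink and added an S3 sink under a new uid.
        Set<String> newJob = new HashSet<>(Arrays.asList("enrich", "s3-sink"));

        try {
            restore(savepoint, newJob, false);
        } catch (IllegalStateException e) {
            System.out.println("restore failed: " + e.getMessage());
        }
        System.out.println("restored: " + restore(savepoint, newJob, true));
    }
}
```

In this model the first call fails because "hdfs-sink" has state but no operator, and the second call restores only "enrich". Under the same assumption, giving the new S3 sink its own uid instead of reusing hdfs-sink would let the old sink's state be skipped rather than handed to an operator that may not be able to read it.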

On Tue, Jul 28, 2020 at 4:11 PM ApoorvK <[hidden email]> wrote:
Hi All,

I have multiple streams in a Flink job, each holding different state such as
ValueState or MapState.
Now I need to remove one stream, identified by a specific (UID, NAME), from the
job.

If I remove it, restoring fails with an error stating that the operator does
not exist.

I was using BucketingSink to write data to HDFS with, let's say, uid->hdfs-sink
name->hdf-sink.
I have now changed it to write to S3 using StreamingFileSink, keeping
uid->hdfs-sink name->hdf-sink the same because restoration would otherwise
fail. But it still fails when I restore from the savepoint, probably because of
the BucketingSink to StreamingFileSink change.

How can I achieve this? I need to remove the HDFS sink and keep the S3 one.


