Flink operator UUID and serialVersionUID

Jayant Ameta
Hi all, I have a few questions regarding serialVersionUID:

1. The production readiness checklist recommends setting UUIDs for operators. How is that different from setting a serialVersionUID on an operator?

2. Which operators need to have a serialVersionUID (or implement the Serializable interface)?

3. If I have a MapState<String, MyObject>, does MyObject need a serialVersionUID, and does it need to implement the Serializable interface?

Thanks,
Jayant

Re: Flink operator UUID and serialVersionUID

bupt_ljy

Hi, Jayant

   1. The UUID is a unique identifier for a specific operator. Flink uses it to match the operator with its state when restoring. It is unrelated to serialVersionUID, which only versions a class for Java serialization.

   2. The operator interfaces already extend Serializable, so you don’t need to implement it explicitly.

   3. The type information for “MyObject” is defined in the MapStateDescriptor, so you don’t need to worry about it.
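
To make the contrast in point 1 concrete: an operator UID is a string you assign on the stream, for example with uid("my-mapper"), while serialVersionUID is a version number that Java serialization attaches to a class. The sketch below is plain Java with no Flink dependency. The class names Pinned and Derived are hypothetical, and it only shows where a serialVersionUID comes from.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidDemo {
    // Hypothetical class with an explicitly pinned serialVersionUID.
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Hypothetical class without one: the JVM derives the value
    // from the class structure (name, fields, methods).
    static class Derived implements Serializable {
        String name;
    }

    // Returns the serialVersionUID that Java serialization would use for the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Pinned:  " + uidOf(Pinned.class));
        System.out.println("Derived: " + uidOf(Derived.class));
    }
}
```

Neither value plays any role in how Flink maps saved state back to an operator; that mapping relies on the operator UID.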


Best,

Jiayi Liao


 Original Message 
Sender: Jayant Ameta<[hidden email]>
Recipient: user<[hidden email]>
Date: Wednesday, Nov 28, 2018 15:09
Subject: Flink operator UUID and serialVersionUID

Re: Flink operator UUID and serialVersionUID

Jayant Ameta
Thanks for clarifying, Jiayi.
If the "MyObject" class changes, would it help to have a serialVersionUID defined?

Thanks,
Jayant


On Wed, Nov 28, 2018 at 12:52 PM bupt_ljy <[hidden email]> wrote:

Re: Flink operator UUID and serialVersionUID

bupt_ljy

Hi Jayant,

   If you change the “MyObject” class in a way that affects its serialized form, the stored “MyObject” instances can’t be deserialized on restore, and the restore fails. You can rely on the default serialVersionUID instead of defining one explicitly (it makes no difference either way).
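
As a rough illustration of why a change to the serialized form breaks a restore, here is a minimal sketch in plain Java. It is not Flink's serializer: writeV1 and readV2 are hypothetical stand-ins for the serializers of the old and the new MyObject, using a fixed binary layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class SchemaChangeDemo {
    // Old layout (hypothetical): a single string field.
    static byte[] writeV1(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // New layout (hypothetical): the string plus an int field added later.
    static String readV2(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            int age = in.readInt(); // the old bytes end before this field
            return name + " (" + age + ")";
        }
    }

    // True when bytes written with the old layout cannot be read with the new one.
    static boolean restoreFails(byte[] oldBytes) {
        try {
            readV2(oldBytes);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] snapshot = writeV1("jayant");
        System.out.println("restore fails: " + restoreFails(snapshot));
    }
}
```

The reader runs out of bytes at the added field, which is the same kind of mismatch as reading old state with a serializer built for the changed class.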


Best,

Jiayi Liao


 Original Message 
Sender: Jayant Ameta<[hidden email]>
Recipient: bupt_ljy<[hidden email]>
Cc: user<[hidden email]>
Date: Wednesday, Nov 28, 2018 15:46
Subject: Re: Flink operator UUID and serialVersionUID

Re: Flink operator UUID and serialVersionUID

Jayant Ameta
If I upgrade my Flink job and add a field to the "MyObject" class, will the restore fail?
If so, how should I handle such scenarios? Should I convert the "MyObject" instance to JSON and store the string?

Jayant Ameta


On Wed, Nov 28, 2018 at 1:26 PM bupt_ljy <[hidden email]> wrote:

Re: Flink operator UUID and serialVersionUID

bupt_ljy

Hi,

    It’ll fail because Flink can’t deserialize the data in the savepoint into the new “MyObject” class. There is no official way to fix this problem. However, you can take a look at the bravo project, https://github.com/king/bravo, which can help reconstruct the savepoint, though currently only with the RocksDBStateBackend.
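
On the JSON idea from the question: a self-describing format, where fields are stored by name, can tolerate an added field because the reader supplies a default for anything missing. Below is a minimal sketch, assuming the state would become MapState<String, String>. java.util.Properties stands in for JSON only because the JDK has no JSON parser, and the names encodeV1 and decodeV2 are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

class SelfDescribingStateDemo {
    // Old version of the object (hypothetical): only a "name" field, stored by key.
    static String encodeV1(String name) {
        Properties fields = new Properties();
        fields.setProperty("name", name);
        StringWriter out = new StringWriter();
        try {
            fields.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // New version (hypothetical): also expects "age", and falls back to "0" when it is absent.
    static String[] decodeV2(String encoded) {
        Properties fields = new Properties();
        try {
            fields.load(new StringReader(encoded));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String[] {fields.getProperty("name"), fields.getProperty("age", "0")};
    }

    public static void main(String[] args) {
        String[] restored = decodeV2(encodeV1("jayant"));
        System.out.println(restored[0] + ", age=" + restored[1]);
    }
}
```

The trade-off is that the job has to convert between the string and the object on every state access.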


Best,
Jiayi Liao

 Original Message 
Sender: Jayant Ameta<[hidden email]>
Recipient: bupt_ljy<[hidden email]>
Cc: user<[hidden email]>
Date: Wednesday, Nov 28, 2018 17:14
Subject: Re: Flink operator UUID and serialVersionUID

Re: Flink operator UUID and serialVersionUID

Jayant Ameta
Thanks, I'll look into the bravo project.

Will it impact only the MapState, or all operators?
If I have a map operator that converts DataStream<MyObject> to DataStream<Tuple2<String, MyObject>>, will it also fail to recover if a field is added to MyObject?
Jayant Ameta


On Wed, Nov 28, 2018 at 3:08 PM bupt_ljy <[hidden email]> wrote:

Re: Flink operator UUID and serialVersionUID

bupt_ljy

Hi, 

It will only affect state. Flink serializes state when taking a snapshot, so we need to make sure that the state can be deserialized when the program is restarted. Stateless operators are not affected because Flink doesn’t snapshot them.


Best,
Jiayi Liao
 Original Message 
Sender: Jayant Ameta<[hidden email]>
Recipient: bupt_ljy<[hidden email]>
Cc: user<[hidden email]>
Date: Wednesday, Nov 28, 2018 17:50
Subject: Re: Flink operator UUID and serialVersionUID

Re: Flink operator UUID and serialVersionUID

Jayant Ameta
Makes sense. Thanks Jiayi!!

Regards,
Jayant


On Wed, Nov 28, 2018 at 3:32 PM bupt_ljy <[hidden email]> wrote:
