Will that be a problem if POJO key type didn't override equals?


Tony Wei
Hi all,

I have defined a POJO class that overrides Object#hashCode and used it as the key in keyBy().
The pipeline looks fine (i.e. there is no exception telling me that it is an unsupported key type), but I'm afraid that elements I consider to have the same key will not get the same state, because I didn't override Object#equals.

Is it necessary for a POJO key type to override Object#equals? Or does PojoTypeInfo not rely on MyClass#equals? Or does keyBy() not rely on equals?

Thank you very much.

Best Regards,
Tony Wei

Re: Will that be a problem if POJO key type didn't override equals?

Timo Walther
Hi Tony,

Not having a proper equals() method might work for keyBy() (a
partitioning operation), but it can lead to unexpected side effects when
dealing with state, if not now, then maybe in the future. For example,
heap-based state uses a hash table data structure, so a key without a
proper equals() might never be found again. I would recommend wrapping
your POJO in another class that implements proper hashCode/equals.
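
To make the hash table concern concrete, here is a minimal sketch in plain Java. It is not Flink code: `UserKey` and `lookupWithFreshInstance` are hypothetical names made up for illustration, and a `java.util.HashMap` stands in for whatever hash table a heap-based backend might use internally. The key overrides hashCode() only, so two instances with the same field value land in the same bucket but are never considered equal.

```java
import java.util.HashMap;
import java.util.Map;

class MissingEqualsDemo {

    // Hypothetical key type: overrides hashCode() but inherits Object#equals,
    // which compares object identity rather than field values.
    static final class UserKey {
        final String id;

        UserKey(String id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    // Stores a value under one key instance, then looks it up with a second
    // instance that carries the same field value.
    static String lookupWithFreshInstance() {
        Map<UserKey, String> table = new HashMap<>();
        table.put(new UserKey("user-1"), "some state");
        return table.get(new UserKey("user-1"));
    }

    public static void main(String[] args) {
        // Both instances hash to the same bucket, but equals() falls back to
        // identity comparison, so the lookup misses.
        System.out.println(lookupWithFreshInstance()); // prints "null"
    }
}
```

Whether a given state backend actually runs into this depends on how it compares keys, which is why the safe choice is to give the key type both methods.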

Regards,
Timo


In reply to Tony Wei's message of 2/6/18, 4:16 AM.



Re: Will that be a problem if POJO key type didn't override equals?

Tony Wei
Hi Timo,

Thanks for your response. I will implement equals for my POJO directly. Would that be okay, instead of wrapping it in another class?
Furthermore, I want to migrate the state from the previous job. Will this lead to state loss? I run my job on Flink 1.4.0, using RocksDBStateBackend and only ValueState as keyed state.

BTW, could you please explain in more detail what heap-based state is? Since I'm not familiar with the details of the state implementations, it would be great if you could share more technical details or some references. Thank you!

Best Regards,
Tony Wei

In reply to Timo Walther's message of 2018-02-06 15:24 GMT+08:00.




Re: Will that be a problem if POJO key type didn't override equals?

Timo Walther
By heap-based state I meant state that is stored using the MemoryStateBackend or FsStateBackend [1]. In general, even if you are just using ValueState, the key might be used internally to store your value state in a hash table.
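
As a rough picture of what "value state in a hash table" means for the key class, the following sketch keeps one value per key in a plain `java.util.HashMap`. This is a simplification for illustration, not the actual backend implementation; `UserKey`, its fields, and `incrementTwiceWithEqualKeys` are hypothetical names. The key implements equals() and hashCode() over the same fields, so two records with the same key values reach the same slot.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyedValueSketch {

    // Hypothetical key class shaped like a POJO, with value-based equals()
    // and a hashCode() computed from the same fields.
    static final class UserKey {
        public String region;
        public int userId;

        public UserKey() {
        }

        UserKey(String region, int userId) {
            this.region = region;
            this.userId = userId;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof UserKey)) {
                return false;
            }
            UserKey that = (UserKey) other;
            return userId == that.userId && Objects.equals(region, that.region);
        }

        @Override
        public int hashCode() {
            return Objects.hash(region, userId);
        }
    }

    // Simplified stand-in for per-key value state: one counter slot per key.
    // Two separate key instances with equal fields update the same slot.
    static long incrementTwiceWithEqualKeys() {
        Map<UserKey, Long> valueByKey = new HashMap<>();
        valueByKey.merge(new UserKey("eu", 7), 1L, Long::sum);
        return valueByKey.merge(new UserKey("eu", 7), 1L, Long::sum);
    }

    public static void main(String[] args) {
        System.out.println(incrementTwiceWithEqualKeys()); // prints "2"
    }
}
```

Implementing both methods directly on the key class, as above, has the same effect on lookups as wrapping the POJO in a separate class that provides them.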

I think the migration should work in your case. Otherwise feel free to let us know.

Regards,
Timo


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/state_backends.html#the-memorystatebackend


In reply to Tony Wei's message of 2/6/18, 8:54 AM.





Re: Will that be a problem if POJO key type didn't override equals?

Tony Wei
Hi Timo,

Thanks a lot. I will try it out.

Best Regards,
Tony Wei

In reply to Timo Walther's message of 2018-02-06 17:25 GMT+08:00.