Hi all,
I have defined a POJO class that override Object#hashCode and used it in keyBy(). The pipeline looks good (i.e. no exception that told me it is UNSUPPORTED key types), but I'm afraid that it will leads to a problem that elements that I think have the same key will not get the same state because I didn't override Object#equals. Is it necessary that POJO key type overrides Object#equals? Or PojoTypeInfo didn't rely on MyClass#equals? Or keyBy() didn't rely on equals? Thank you very much. Best Regards, Tony Wei |
Hi Tony,
not having a proper equals() method might work for a keyBy() (partitioning operation) but it can lead to unexpected side effects when dealing with state. If not now, then maybe in the future. For example, heap-based state uses a hash table data structures such that your key might never be found again. I would recommend to wrap your POJO into another class that implements a proper hashCode/equals. Regards, Timo Am 2/6/18 um 4:16 AM schrieb Tony Wei: > Hi all, > > I have defined a POJO class that override Object#hashCode and used it > in keyBy(). > The pipeline looks good (i.e. no exception that told me it is > UNSUPPORTED key types), but I'm afraid that it will leads to a problem > that elements that I think have the same key will not get the same > state because I didn't override Object#equals. > > Is it necessary that POJO key type overrides Object#equals? Or > PojoTypeInfo didn't rely on MyClass#equals? Or keyBy() didn't rely on > equals? > > Thank you very much. > > Best Regards, > Tony Wei |
Hi Timo, Thanks for your response. I will implement equals for my POJO directly. Is that be okay instead of wrap it into another class? Furthermore, I want to migrate the states from the previous job. Will it lead to state lost? I run my job on Flink 1.4.0. I used RocksDBStateBackend and only ValueState as key state. BTW, could you please give more explanations about what heap-based state is? Since I'm not familiar with the details below the state implementations, it will be great if you can share more technical details or some references to me. Thank you! Best Regards, Tony Wei 2018-02-06 15:24 GMT+08:00 Timo Walther <[hidden email]>: Hi Tony, |
With heap-based state I meant state
that is stored using the MemoryStateBackend or FsStateBackend [1].
In general, even if you are just using ValueState, the key might
be used internally to store your value state in hash table.
I think the migration should work in your case. Otherwise feel free to let us know. Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/state/state_backends.html#the-memorystatebackend Am 2/6/18 um 8:54 AM schrieb Tony Wei:
|
Hi Timo, Thanks a lot. I will try it out. Best Regards, Tony Wei 2018-02-06 17:25 GMT+08:00 Timo Walther <[hidden email]>:
|
Free forum by Nabble | Edit this page |