Avro Serialization and RocksDB Internal State

Biplob Biswas
Hi,

This is somewhat related to my previous query here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Evolving-serializers-and-impact-on-flink-managed-states-td14777.html

I was exploring Avro serialization, and in that regard I forced the use of Avro with:

env.getConfig().enableForceAvro();

My assumption is that all internal data transfers will now be done using the Avro serde. I also assume that the objects stored in my RocksDB state backend will be serialized with Avro.

Is this assumption correct?

If not, is there a way to register my objects (which are sometimes not exactly POJOs as recognized by Flink) with the Avro serializers?

I want to serialize and deserialize my objects using Avro and store them in my RocksDB state backend, but I don't know how to verify which serializer is actually used for that.

Thanks and Regards,
Biplob
Re: Avro Serialization and RocksDB Internal State

Biplob Biswas
Can anyone please shed some light on this?
Re: Avro Serialization and RocksDB Internal State

Tzu-Li (Gordon) Tai
Hi Biplob,

Yes, your assumptions are correct [1]. To be more exact, the `AvroSerializer` will be used to serialize your POJO data types.
That is the case for both data transfers and state serialization, unless you specify a custom state serializer for a given state; see [2].
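
One way to check this locally is to ask Flink which serializer it would create for a type, with and without forced Avro. The following is only a minimal sketch, assuming the Flink 1.3-era API (`TypeInformation.of(...)` and `createSerializer(ExecutionConfig)`); the class `SerializerCheck` and the POJO `SensorReading` are hypothetical names used for illustration.

```java
import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeutils.TypeSerializer;

public class SerializerCheck {

    // Hypothetical POJO: public class, public no-arg constructor, public fields,
    // so that Flink's type extraction should treat it as a POJO type.
    public static class SensorReading {
        public String id;
        public double value;

        public SensorReading() {}
    }

    // Returns the serializer Flink would create for the given type
    // under a fresh ExecutionConfig, optionally with forced Avro.
    static <T> TypeSerializer<T> serializerFor(Class<T> type, boolean forceAvro) {
        ExecutionConfig config = new ExecutionConfig();
        if (forceAvro) {
            config.enableForceAvro();
        }
        TypeInformation<T> info = TypeInformation.of(type);
        return info.createSerializer(config);
    }

    public static void main(String[] args) {
        // Print the serializer classes so the two configurations can be compared.
        System.out.println(serializerFor(SensorReading.class, false).getClass().getName());
        System.out.println(serializerFor(SensorReading.class, true).getClass().getName());
    }
}
```

Printing the class names for your own types shows whether forcing Avro changes the serializer Flink picks for them. Types that Flink does not recognize as POJOs are not expected to change.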

> If not is there a way to register my objects (sometimes not exactly a POJO
> as recognized by Flink) with the avro serializers?

If your objects are not recognized as POJOs (in which case a `GenericTypeInfo` will be extracted instead of a `PojoTypeInfo`), there are two alternative ways to serialize them to RocksDB using Avro:

1. Register a custom serializer for that type [3].
2. Use a custom state serializer for the specific registered state. Please see [2] for instructions on how to do that.
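
Option 2 could look roughly like the sketch below, which passes a serializer directly to the state descriptor. It assumes Flink 1.3, where `AvroSerializer` lives in `org.apache.flink.api.java.typeutils.runtime` and has a constructor taking the type class; later releases moved this class, so treat the import as an assumption. `AvroStateDescriptorSketch`, `Event`, and the state name `"last-event"` are hypothetical.

```java
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeutils.TypeSerializer;
// Assumption: Flink 1.3 package location of AvroSerializer.
import org.apache.flink.api.java.typeutils.runtime.AvroSerializer;

public class AvroStateDescriptorSketch {

    // Hypothetical user type; it needs a public no-arg constructor
    // so that it can be instantiated during deserialization.
    public static class Event {
        public String key;
        public long timestamp;

        public Event() {}
    }

    // Builds a state descriptor that carries an explicit Avro-based
    // serializer instead of relying on Flink's type extraction.
    static ValueStateDescriptor<Event> avroDescriptor() {
        TypeSerializer<Event> serializer = new AvroSerializer<>(Event.class);
        return new ValueStateDescriptor<>("last-event", serializer);
    }

    public static void main(String[] args) {
        ValueStateDescriptor<Event> descriptor = avroDescriptor();
        // The descriptor reports the serializer it will use for this state.
        System.out.println(descriptor.getSerializer().getClass().getName());
    }
}
```

The descriptor would then be passed to `getRuntimeContext().getState(...)` inside a rich function on a keyed stream, so only this one state uses the explicit serializer.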

Hope this helps!

Cheers,
Gordon

[3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program


On 15 August 2017 at 4:36:27 PM, Biplob Biswas ([hidden email]) wrote:

> [...]
Re: Avro Serialization and RocksDB Internal State

Biplob Biswas
Thanks a lot, Gordon, that really helps. :) One last thing: is there any way to verify that an object has been serialized with a specific serializer, other than trying to deserialize it with a different deserializer and seeing it fail?