Hi to all, in my Flink job I create a Dataset<MyThriftObj> using HadoopInputFormat in this way: new ParquetThriftInputFormat<MyThriftObj>(), Void.class, MyThriftObj.class, job); FileInputFormat.addInputPath(job, new org.apache.hadoop.fs.Path(inputPath); DataSet<Tuple2<Void, MyThriftObj>> ds = env.createInput(inputFormat); Flink logs this message:
Indeed MyThriftObj has readObject/writeObject functions and when I print the type of ds I see:
Fom my experience GenericType is a performace killer...what should I do to improve the reading/writing of MyThriftObj? Best, Flavio Flavio Pompermaier Development Department OKKAM S.r.l. Tel. +(39) 0461 1823908 |
Hi Flavio! I believe [1] has what you are looking for. Have you taken a look at that? Cheers, Gordon [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/custom_serializers.html On 15 May 2017 at 9:08:33 PM, Flavio Pompermaier ([hidden email]) wrote:
|
Hi Gordon, thanks for the link. Will the usage ofTBaseSerializer wrt Kryo lead to a performance gain? On Tue, May 16, 2017 at 7:32 AM, Tzu-Li (Gordon) Tai <[hidden email]> wrote:
Flavio Pompermaier Development Department OKKAM S.r.l. Tel. +(39) 0461 1823908 |
If you don’t register the TBaseSerializer for your MyThriftObj (or in general don’t register any serializer for the Thrift class), I think Kryo’s default FieldSerializer will be used for it. The TBaseSerializer basically just uses TBase for de-/serialization as you normally would for the Thrift classes (see [1]), so there should be some specific optimization going on for that compared to Kryo’s generic FieldSerializer. I’m not entirely sure about performance gain between the two as I don’t really know the details of the serialization differences, but I would suggest to use TBaseSerializer if they are Thrift classes. Cheers, Gordon On 16 May 2017 at 3:26:32 PM, Flavio Pompermaier ([hidden email]) wrote:
|
Ok thanks Gordon! It would be nice to have a benchmark also on this ;)
Thanks a lot for the support, Flavio On Tue, May 16, 2017 at 9:41 AM, Tzu-Li (Gordon) Tai <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |