Hi,

I am trying to implement a flatMap collecting duplicate row keys. I thought I could use a simple java.util.List<String>, but I get this exception.

Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: Interfaces and abstract classes are not valid types: interface java.util.List
    at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:871)
    at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:402)
    at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:324)
    at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoFromInputs(TypeExtractor.java:431)
    at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:211)
    at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:147)
    at org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:82)
    at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:199)
    at org.okkam.flink.HBaseEntityNaiveDeduplication.main(HBaseEntityNaiveDeduplication.java:179)

What is the best practice to deal with Lists? To use ListValue?

However, ListValue is Avro-based, and from what Flavio told me, you might be on the way to changing it to Kryo, which would allow the serialization of these objects as well. What is the expected time frame for such serialization to work?

Thanks a lot guys, you are doing great work! :-)

saluti,
Stefano
|
Dear Stefano,

As of now the default solution would be to use a string array instead of the list of strings for sending the data. If you need the list itself in your user-defined code, you can always convert it. I agree that this is a bit inconvenient, but it is the current best practice.

Cheers,
Marton
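A minimal sketch of the conversion Marton suggests, using only the JDK. The class name `RowKeyConversion` and the sample row keys are made up for illustration and are not part of Flink; the idea is simply that the function emits a `String[]`, and the list is rebuilt wherever it is actually needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a Flink class): converts between List<String> and String[]
// so that only the concrete array type has to travel between operators.
final class RowKeyConversion {
    private RowKeyConversion() {}

    // Convert the list to an array before emitting it.
    static String[] toArray(List<String> keys) {
        return keys.toArray(new String[0]);
    }

    // Rebuild a mutable list from the array inside the user code.
    static List<String> toList(String[] keys) {
        return new ArrayList<>(Arrays.asList(keys));
    }

    public static void main(String[] args) {
        List<String> duplicates = Arrays.asList("row-1", "row-7", "row-9");
        String[] emitted = toArray(duplicates);   // what the function would emit
        List<String> restored = toList(emitted);  // what downstream code would work with
        System.out.println(restored);             // prints [row-1, row-7, row-9]
    }
}
```

The copy into a fresh `ArrayList` keeps the restored list independent of the array, so later changes to the list do not write through to the emitted data.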
|
What I did in the end was to implement ListValue<String>, and it's OK for the moment. Looking forward to more flexible serialization, maybe with Kryo. :-)

saluti,
Stefano
|
Hi Stefano,
we are currently working on this problem. We hope to have Kryo integrated by the end of next week.

Cheers,
Till

On Fri, Nov 7, 2014 at 12:36 PM, Stefano Bortoli <[hidden email]> wrote:
> 2014-11-07 12:26 GMT+01:00 Márton Balassi <[hidden email]>:
>> On Thu, Nov 6, 2014 at 3:50 PM, Stefano Bortoli <[hidden email]> wrote:
>>> What is the best practice to deal with Lists? to use ListValue?
|
This is great news! :-)

2014-11-09 16:23 GMT+01:00 Till Rohrmann <[hidden email]>:
> Hi Stefano,
|
Hi,
As of now the fastest Java serialisation library is Blixtser (https://github.com/Mojang/blixtser). It is faster than Kryo.

S
|