FlatMap collecting List<String> gives InvalidTypesException

Stefano Bortoli
Hi,

I am trying to implement a flatMap that collects duplicate row keys. I thought I could use a plain java.util.List<String>, but I get this exception:

Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: Interfaces and abstract classes are not valid types: interface java.util.List
    at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:871)
    at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:402)
    at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:324)
    at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoFromInputs(TypeExtractor.java:431)
    at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:211)
    at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:147)
    at org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:82)
    at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:199)
    at org.okkam.flink.HBaseEntityNaiveDeduplication.main(HBaseEntityNaiveDeduplication.java:179)
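
As the message suggests, the declared type here is the interface java.util.List, and an interface or abstract class cannot be instantiated directly. The sketch below only illustrates that kind of check with plain JDK reflection. The names ConcreteTypeCheck and isInstantiable are hypothetical, and this is not Flink's TypeExtractor code.

```java
import java.lang.reflect.Modifier;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hypothetical check, NOT taken from Flink's TypeExtractor.
final class ConcreteTypeCheck {

    // True only for classes that are neither interfaces nor abstract,
    // i.e. classes that could be created with "new".
    static boolean isInstantiable(Class<?> clazz) {
        return !clazz.isInterface() && !Modifier.isAbstract(clazz.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(isInstantiable(List.class));         // false: interface
        System.out.println(isInstantiable(AbstractList.class)); // false: abstract class
        System.out.println(isInstantiable(ArrayList.class));    // true: concrete class
    }
}
```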

What is the best practice for dealing with lists? Should I use ListValue?

However, as I understand it, ListValue is based on Avro, and from what Flavio told me, you may be moving to Kryo, which would allow the serialization of these objects as well. What is the expected timeline for that serialization support?

Thanks a lot guys, you are doing great work! :-)

saluti,
Stefano

Re: FlatMap collecting List<String> gives InvalidTypesException

Márton Balassi
Dear Stefano,

As of now, the default solution is to use a string array instead of a list of strings for sending the data. If you need the list itself in your user-defined code, you can always convert it. I agree that this is a bit inconvenient, but it is the current best practice.
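
A minimal sketch of that conversion, using only JDK classes. The names DuplicateKeyEmitter, emitDuplicates, and toList are hypothetical, and the Consumer<String[]> merely stands in for the collector that a flatMap function would receive; no Flink API is used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper, not part of Flink: it illustrates emitting a
// String[] instead of a List<String> and converting back afterwards.
final class DuplicateKeyEmitter {

    // Stand-in for a flatMap body: work with collections internally,
    // but hand a concrete String[] to the collector.
    static void emitDuplicates(List<String> rowKeys, Consumer<String[]> out) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>(); // keeps first-seen order
        for (String key : rowKeys) {
            if (!seen.add(key)) {       // add() returns false if the key was already present
                duplicates.add(key);
            }
        }
        if (!duplicates.isEmpty()) {
            out.accept(duplicates.toArray(new String[0]));
        }
    }

    // Downstream user code can turn the array back into a mutable List.
    static List<String> toList(String[] keys) {
        return new ArrayList<>(Arrays.asList(keys));
    }

    public static void main(String[] args) {
        emitDuplicates(Arrays.asList("a", "b", "a", "c", "b"),
                keys -> System.out.println(toList(keys))); // prints [a, b]
    }
}
```

The array is the only type that crosses the function boundary, so the list exists just inside user code on either side.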

Cheers,

Marton

In reply to Stefano Bortoli, Thu, Nov 6, 2014 at 3:50 PM.

Re: FlatMap collecting List<String> gives InvalidTypesException

Stefano Bortoli
What I did in the end was to implement ListValue<String>, and it's OK for the moment. Looking forward to more flexible serialization, maybe with Kryo. :-)

saluti,
Stefano

In reply to Márton Balassi, 2014-11-07 12:26 GMT+01:00.

Re: FlatMap collecting List<String> gives InvalidTypesException

Till Rohrmann
Hi Stefano,

We are currently working on this problem and hope to have Kryo integrated by the end of next week.

Cheers,

Till

In reply to Stefano Bortoli, Fri, Nov 7, 2014 at 12:36 PM.

Re: FlatMap collecting List<String> gives InvalidTypesException

Stefano Bortoli
This is great news! :-)

In reply to Till Rohrmann, 2014-11-09 16:23 GMT+01:00.

Re: FlatMap collecting List<String> gives InvalidTypesException

sirinath
Hi,

As of now the fastest Java serialisation library is Blixtser (https://github.com/Mojang/blixtser). It is faster than Kryo.

S