Serialising null value in case-class

Serialising null value in case-class

Averell
Good day,

I have a case-class defined like this:

    case class MyClass(ts: Long, s1: String, s2: String, i1: Integer, i2: Integer)
    object MyClass {
        val EMPTY = MyClass(0L, null, null, 0, 0)
        def apply(): MyClass = EMPTY
    }

My code has been running fine (I was not aware of the limitation mentioned
in
https://ci.apache.org/projects/flink/flink-docs-stable/dev/types_serialization.html).

But when I tried to create the instance MyClass(0L, null, null, null, 0),
with null for i1, I got the following error:

    org.apache.flink.types.NullFieldException: Field 3 is null, but expected to hold a value.

I am confused. Why is there a difference between a null String and a null
Integer?

Thanks and regards,
Averell



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Serialising null value in case-class

Timo Walther
Hi Averell,

The reason for this lies in the internal serializer implementation. In
general, the composite/wrapping type serializer is responsible for
encoding nulls. The case class serializer does not support nulls, because
Scala discourages the use of nulls and promotes `Option`. Some
serializers, such as the one for `String`, use the binary length field
internally to encode nulls; see [1]. For a full list of Scala types, I
would recommend this class [2].
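As an illustration of the `Option` approach, here is a minimal sketch in plain Scala (no Flink dependency). The names `MyClassOpt` and `fromNullable` are hypothetical and not part of the original code; the sketch assumes the two Integer fields are the only ones that need to be absent, and leaves the String fields as they were in the question.

```scala
// Hypothetical variant of MyClass: the nullable Integer fields become Option[Int].
// The String fields are left exactly as in the original question.
case class MyClassOpt(ts: Long, s1: String, s2: String, i1: Option[Int], i2: Option[Int])

object MyClassOpt {
  val EMPTY: MyClassOpt = MyClassOpt(0L, null, null, Some(0), Some(0))

  // Convert possibly-null java.lang.Integer values at the boundary.
  // Option(null) evaluates to None, so no null ends up in an Int field.
  def fromNullable(ts: Long, s1: String, s2: String, i1: Integer, i2: Integer): MyClassOpt =
    MyClassOpt(ts, s1, s2, Option(i1).map(_.intValue), Option(i2).map(_.intValue))
}
```

With this shape, the instance from the question would be built as `MyClassOpt.fromNullable(0L, null, null, null, 0)`, and the missing value is carried as `None` rather than as a null field.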

Regards,
Timo

[1]
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/types/StringValue.java#L788
[2]
https://github.com/apache/flink/blob/master/flink-scala/src/main/scala/org/apache/flink/api/scala/typeutils/Types.scala
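
To make the length-field idea behind [1] concrete, the following is a simplified sketch in plain Scala. It is not Flink's actual code: `NullableStringCodec` is a hypothetical name, and a fixed 4-byte length is used here for brevity. The point is only that a variable-length value has a spare length marker that can stand for null, whereas a fixed-width integer uses every bit pattern for a real value and so has none.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

// Hypothetical codec illustrating "length + 1" encoding; 0 is reserved for null.
object NullableStringCodec {
  def write(out: DataOutputStream, s: String): Unit =
    if (s == null) out.writeInt(0) // marker 0 means null
    else {
      out.writeInt(s.length + 1)   // shift by one so "" is distinguishable from null
      out.writeChars(s)
    }

  def read(in: DataInputStream): String = {
    val marker = in.readInt()
    if (marker == 0) null
    else {
      val sb = new StringBuilder(marker - 1)
      var i = 0
      while (i < marker - 1) { sb.append(in.readChar()); i += 1 }
      sb.toString
    }
  }

  // Serialize to a byte array and read the value back.
  def roundTrip(s: String): String = {
    val bytes = new ByteArrayOutputStream()
    write(new DataOutputStream(bytes), s)
    read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray)))
  }
}
```

Here `NullableStringCodec.roundTrip(null)` comes back as null and `roundTrip("")` as the empty string, which is the distinction the shifted length makes possible.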

On 26.04.19 at 11:30, Averell wrote:

> [...]



Re: Serialising null value in case-class

Averell
Thank you Timo.

In terms of performance, does using Option[] have an impact? I guess it
does, because there will be one more layer of object handling, won't
there?

I am also unsure how to choose between primitive types (Int, Long) and
object types (Integer, JLong). I have seen many places in the Flink
documentation where Java primitive types are recommended. But what about
Scala types?

Thanks and regards,
Averell




Re: Serialising null value in case-class

Timo Walther
Currently, tuples and case classes are the most efficient data types
because they avoid the need for special null handling. Everything else
is hard to estimate. If you have a very performance-critical use case,
you might need to run micro benchmarks with the serializers you want to
use. Object types vs. primitive types don't make a big difference (also
for Scala), as their values are serialized by the same serializers.

I hope this helps.

Timo


On 26.04.19 at 13:17, Averell wrote:

> [...]



Re: Serialising null value in case-class

Averell