Converting String/boxed-primitive Array columns back to DataStream

Gyula Fóra
Hi All!

I have a Table with ARRAY<STRING> and ARRAY<INT> columns. Is there any way to convert them back to the respective Java arrays, String[] and Integer[]?

The conversion only seems to work for primitive (non-null) types, date, time and decimal.

For String, for instance, I get the following error:
Query schema: [f0: ARRAY<STRING>]
Sink schema: [f0: LEGACY('ARRAY', 'ANY<[Ljava.lang.String;,....

Am I doing something wrong?

Thanks
Gyula
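
A note on reading the error above: the `[Ljava.lang.String;` fragment in the sink schema is the JVM's binary name for `String[]`, which suggests the sink side is carrying the array as an opaque Java class rather than as ARRAY<STRING>. The short sketch below prints these names using only the JDK; the class name `ArrayClassNames` and the helper `binaryName` are made up for illustration and are not part of Flink.

```java
final class ArrayClassNames {

    /** Returns the JVM binary name of a class, as it shows up in type-mismatch messages. */
    static String binaryName(Class<?> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        // Object arrays are encoded as "[L<element class>;", primitive arrays as "[<letter>".
        Class<?>[] types = {String[].class, Integer[].class, int[].class};
        for (Class<?> type : types) {
            System.out.println(type.getSimpleName() + " -> " + binaryName(type));
        }
    }
}
```

Running it prints `[Ljava.lang.String;` for `String[]`, `[Ljava.lang.Integer;` for `Integer[]` and `[I` for `int[]`.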
Re: Converting String/boxed-primitive Array columns back to DataStream

Timo Walther
Hi Gyula,

Are you coming from the DataStream API, or are you trying to implement a
source/sink? It looks like the array is currently serialized with Kryo.
I would recommend taking a look at this class:

org.apache.flink.table.types.utils.LegacyTypeInfoDataTypeConverter

This is the current mapping between the old TypeInformation and the new
DataType system. A back-and-forth conversion should work between those types.

Regards,
Timo

Re: Converting String/boxed-primitive Array columns back to DataStream

Gyula Fóra
Hi Timo,

I am simply trying to convert back to a DataStream, let's say DataStream<Tuple2<String[], Integer[]>>.

I can convert the DataStream into a table without a problem; the problem is getting a DataStream back.

Thanks
Gyula


Re: Converting String/boxed-primitive Array columns back to DataStream

Timo Walther
Hi Gyula,

Does `toAppendStream(Row.class)` work for you? The other methods take
TypeInformation and might cause this problem. It is definitely a bug.

Feel free to open an issue under:
https://issues.apache.org/jira/browse/FLINK-12251

Regards,
Timo
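
If the `Row.class` route is used, each field comes back untyped (`Row.getField(int)` returns `Object`), so the arrays still have to be turned into `String[]` and `Integer[]` by hand. The thread does not say whether the field arrives already typed or as a plain `Object[]`, so the sketch below assumes either can happen and handles both. It uses only the JDK; `RowFieldArrays` and `toTypedArray` are hypothetical names, not Flink API.

```java
import java.util.Arrays;

final class RowFieldArrays {

    /**
     * Converts an untyped field value into an array of the requested type.
     * Assumption: the value is null, already of the target array type, or an Object[].
     */
    static <T> T[] toTypedArray(Object field, Class<T[]> arrayType) {
        if (field == null) {
            return null;
        }
        if (arrayType.isInstance(field)) {
            // Already a String[] / Integer[]: a cast is enough.
            return arrayType.cast(field);
        }
        // Otherwise copy element by element into a new array of the target type.
        // Throws ArrayStoreException if an element has the wrong type.
        Object[] untyped = (Object[]) field;
        return Arrays.copyOf(untyped, untyped.length, arrayType);
    }

    public static void main(String[] args) {
        // Stand-ins for the values of row.getField(0) and row.getField(1).
        Object f0 = new Object[] {"a", null, "c"};
        Object f1 = new Integer[] {1, 2, null};

        String[] strings = toTypedArray(f0, String[].class);
        Integer[] ints = toTypedArray(f1, Integer[].class);

        System.out.println(Arrays.toString(strings));
        System.out.println(Arrays.toString(ints));
    }
}
```

In a job, a helper like this would sit inside a map over the `DataStream<Row>` and build the `Tuple2<String[], Integer[]>` from the two converted fields. Null elements are preserved, which matters here because the original problem concerns nullable (boxed) element types.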

Re: Converting String/boxed-primitive Array columns back to DataStream

Gyula Fóra
Hi Timo,

Row will definitely work at this point, thank you for helping out.

I opened a Jira ticket: https://issues.apache.org/jira/browse/FLINK-17442

Gyula
