How to support the Hive SQL below in Flink


Xiaohua
Hi,

We ran into some issues when migrating from Hive/Spark to Flink. Could you
please help?

Below are the Hive SQL features we use:

DISTRIBUTE BY
named_struct
COALESCE
LATERAL VIEW
row format
delimited fields
STR_TO_MAP
OVERWRITE
FULL OUTER JOIN
Rlike
Array

How can we do the same in Flink SQL?

Thank you~

BR
Xiaohua



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: How to support the Hive SQL below in Flink

Jark Wu-3
Hi Xiaohua,

I'm not very familiar with Hive SQL, but I will try to answer some of them:

COALESCE => Flink also has a built-in COALESCE function [1]. Judging from the documentation, I think the two are identical.
STR_TO_MAP => the Blink planner also has a built-in STR_TO_MAP function [1], but its default delimiter is different from Hive's.
OVERWRITE => the Blink planner supports INSERT OVERWRITE [2].
FULL OUTER JOIN => the Blink planner supports this in both streaming mode and batch mode.
Rlike => the Blink planner has a built-in REGEXP function [1], which I think is similar to Hive's Rlike.
LATERAL VIEW => in Flink this is done with a table function (UDTF); see "Join with Table Function (UDTF)" in the docs [3].
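
To make the mappings in the list above more concrete, here is a minimal Flink SQL sketch. It is an illustration only, not taken from the docs cited above: the tables (logs, orders, daily_summary), their columns, and the table function split_skus are hypothetical, and split_skus would have to be registered as a UDTF before the query can run.

```sql
-- Hypothetical tables: logs(id, msg, kv), orders(id, sku_list), daily_summary(id, cnt).

-- STR_TO_MAP: pass both delimiters explicitly so the result does not depend on
-- the default key-value delimiter (Flink documents '=', Hive documents ':').
SELECT id, STR_TO_MAP(kv, ',', ':') AS attrs
FROM logs;

-- Rlike: REGEXP(str, regex) is TRUE when a substring of str matches the Java regex.
SELECT id, msg
FROM logs
WHERE REGEXP(msg, 'error|timeout');

-- LATERAL VIEW: join each input row with the rows emitted by a table function.
-- split_skus is a hypothetical user-defined table function (UDTF).
SELECT o.id, t.sku
FROM orders AS o, LATERAL TABLE(split_skus(o.sku_list)) AS t(sku);

-- FULL OUTER JOIN: standard syntax, keeps unmatched rows from both sides.
SELECT s.id, s.cnt, o.sku_list
FROM daily_summary AS s
FULL OUTER JOIN orders AS o ON s.id = o.id;

-- OVERWRITE: replace the existing contents of the target table.
INSERT OVERWRITE daily_summary
SELECT id, COUNT(*) AS cnt
FROM logs
GROUP BY id;
```

Spelling out the STR_TO_MAP delimiters is the safer choice when porting Hive queries, since a query that relied on Hive's defaults would otherwise silently split key-value pairs differently.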

I cc'ed Rui Li, who is working on FLIP-123 "DDL and DML compatibility for Hive". He may have more insights on this. Please correct me if any of my answers above are wrong.

Best,
Jark



(In reply to Xiaohua's message of Thu, 9 Apr 2020 at 11:23.)

Re: How to support the Hive SQL below in Flink

Rui Li-2
Hey Xiaohua & Jark,

I'm sorry for overlooking the email. Adding to Jark's answers:

DISTRIBUTE BY => neither the functionality nor the syntax is supported. We can consider this as a candidate feature for 1.12.
named_struct => you should be able to call this function with the Hive module.
LATERAL VIEW => the syntax is not supported. As Jark mentioned, you can rewrite the SQL to achieve the same functionality.
row format => defining the row format in DDL will be supported in FLIP-123.
delimited fields => defining the field delimiter in DDL will be supported in FLIP-123.
STR_TO_MAP => you should be able to call this function with the Hive module, but there's a known issue with this function [1].
Array => you should be able to call this function with the Hive module.
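
As a rough sketch of what calling a Hive function through the Hive module might look like: the query below assumes the Hive module is already loaded in the session (how to load it depends on the Flink version and is not shown here). The table customers and its columns are hypothetical. The second query does not use the Hive module at all; it shows Flink's own ARRAY[...] value constructor as an alternative way to build an array.

```sql
-- Hypothetical table: customers(customer_id, city, phone, email).

-- With the Hive module loaded, Hive built-in functions such as named_struct
-- are resolved by name like any other function.
SELECT named_struct('id', customer_id, 'city', city) AS customer_info
FROM customers;

-- Flink's native array constructor, available without the Hive module.
SELECT customer_id, ARRAY[phone, email] AS contacts
FROM customers;
```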

Feel free to raise questions if anything is still unclear or if you hit any issues with these features.



(In reply to Jark Wu's message of Thu, Apr 9, 2020 at 12:04 PM.)


--
Cheers,
Rui Li