(DEPRECATED) Apache Flink User Mailing List archive.

How to support the Hive SQL below in Flink

Xiaohua

How to support the Hive SQL below in Flink

Hi,

We have run into some issues while migrating from Hive/Spark to Flink. Could
you please help?

Below are the Hive SQL features we use:

DISTRIBUTE BY
named_struct
COALESCE
LATERAL VIEW
row format
delimited fields
STR_TO_MAP
OVERWRITE
FULL OUTER JOIN
Rlike
Array

How can we do the same in Flink SQL?

Thank you~

BR
Xiaohua

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Jark Wu-3

Re: How to support the Hive SQL below in Flink

Hi Xiaohua,

I'm not very familiar with Hive SQL, but I will try to answer some of these:

COALESCE => Flink also has a built-in COALESCE function [1]. From the documentation, I think the two are identical.

STR_TO_MAP => the Blink planner also has a built-in STR_TO_MAP function [1], but its default delimiter is different from Hive's.

OVERWRITE => the Blink planner supports INSERT OVERWRITE [2].

FULL OUTER JOIN => the Blink planner also supports this, in both streaming mode and batch mode.

Rlike => the Blink planner has a built-in REGEXP function [1], which I think is similar to Hive's Rlike.

LATERAL VIEW => in Flink this is done with a table function (UDTF); see "Join with Table Function (UDTF)" in the docs [3].
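
As a rough sketch, a few of the rewrites above might look like this in Flink SQL with the Blink planner. All table and column names (user_events, daily_a, daily_b, user_summary, and so on) are hypothetical, and split_tags stands for a user-defined table function that would have to be registered first. The explicit STR_TO_MAP delimiters assume Hive's defaults of ',' and ':' against Flink's ',' and '='. INSERT OVERWRITE is assumed to run in batch mode against a sink that supports overwriting.

```sql
-- STR_TO_MAP: pass the delimiters explicitly so the result matches
-- what Hive produces with its own defaults (assumed ',' and ':').
SELECT STR_TO_MAP(attrs, ',', ':') AS attr_map
FROM user_events;

-- Rlike: Hive's  name RLIKE '^a.*'  written with the REGEXP function.
SELECT user_id
FROM user_events
WHERE REGEXP(name, '^a.*');

-- LATERAL VIEW: join each row with the rows produced by a table function.
-- split_tags is a hypothetical UDTF, not a Flink built-in.
SELECT e.user_id, T.tag
FROM user_events AS e,
     LATERAL TABLE(split_tags(e.tags)) AS T(tag);

-- INSERT OVERWRITE combined with FULL OUTER JOIN and COALESCE.
INSERT OVERWRITE user_summary
SELECT COALESCE(a.user_id, b.user_id) AS user_id, a.cnt, b.cnt
FROM daily_a AS a
FULL OUTER JOIN daily_b AS b
  ON a.user_id = b.user_id;
```

If the Hive column is a plain array, CROSS JOIN UNNEST(tags) AS T (tag) may be enough, and no UDTF is needed.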

I cc'ed Rui Li, who is working on FLIP-123 "DDL and DML compatibility for Hive". He may have more insights on this. Please correct me if I gave a wrong answer above.

Best,

Jark

[1]: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/functions/systemFunctions.html

[2]: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/insert.html#insert-from-select-queries

[3]: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/queries.html#joins

On Thu, 9 Apr 2020 at 11:23, Xiaohua <[hidden email]> wrote:

Rui Li-2

Re: How to support the Hive SQL below in Flink

Hey Xiaohua & Jark,

I'm sorry for overlooking the email. Adding to Jark's answers:

DISTRIBUTE BY => neither the functionality nor the syntax is supported. We can consider it as a candidate feature for 1.12.

named_struct => you should be able to call this function with the Hive module.

LATERAL VIEW => the syntax is not supported. As Jark mentioned, you can rewrite the SQL to achieve the same functionality.

row format => defining the row format in DDL will be supported in FLIP-123.

delimited fields => defining the field delimiter in DDL will be supported in FLIP-123.

STR_TO_MAP => you should be able to call this function with the Hive module, but there's a known issue with this function [1].

Array => you should be able to call this function with the Hive module.

Feel free to raise questions if anything is still unclear or if you hit any issues with these features.

[1] https://issues.apache.org/jira/browse/FLINK-16732

Cheers,

Rui Li

On Thu, Apr 9, 2020 at 12:04 PM Jark Wu <[hidden email]> wrote: