lack of function and low usability of provided function

徐涛
Hi All,
        I found that Flink lacks some basic functions, for example string split, regular expression support, and JSON parsing and extraction. These functions are used frequently in development, but they are not supported, so users have to write UDFs for them.
        Some of the provided functions also lack usability; for example, log(2, 1.0) and exp(1.0) with double parameters are not supported. I think these are very basic functions and not hard to implement.
        Will Flink enhance the basic functions, maybe in later releases?

Best,
Henry
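
For readers unfamiliar with the UDF workaround mentioned above, here is a minimal sketch of the kind of logic such a function might wrap. It is plain Java so that it compiles on its own; in a Flink job the class would presumably be public and extend org.apache.flink.table.functions.ScalarFunction, with the logic in a public eval method. The class name StringUdfSketch, the extract helper, and the argument order are assumptions made for illustration, not anything Flink ships.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-Java sketch of UDF logic. The Flink dependency is deliberately
// left out; the names below are hypothetical.
final class StringUdfSketch {

    // Returns the index-th piece of input split on the literal delimiter,
    // or null if any argument is missing or the index is out of range.
    public String eval(String input, String delimiter, int index) {
        if (input == null || delimiter == null || index < 0) {
            return null;
        }
        // Pattern.quote makes the delimiter literal; limit -1 keeps trailing empty pieces.
        String[] parts = input.split(Pattern.quote(delimiter), -1);
        return index < parts.length ? parts[index] : null;
    }

    // Returns capture group `group` of the first regex match, or null if there is none.
    public String extract(String input, String regex, int group) {
        if (input == null || regex == null) {
            return null;
        }
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find() && group >= 0 && group <= m.groupCount()) {
            return m.group(group);
        }
        return null;
    }

    public static void main(String[] args) {
        StringUdfSketch f = new StringUdfSketch();
        System.out.println(f.eval("a,b,c", ",", 1));
        System.out.println(f.extract("id=42;", "id=(\\d+)", 1));
    }
}
```

Returning null for bad input rather than throwing is a design choice here, so that a single malformed row would not fail the whole query.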

Re: lack of function and low usability of provided function

vino yang
Hi Henry,

I recently submitted some PRs for scalar functions. Some have been merged, some are still under review, and some may be what you need.

Log2(x): https://issues.apache.org/jira/browse/FLINK-9928, will be released in Flink 1.7
Regular expression support: use SIMILAR TO; also see https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html

The status of most scalar functions can be seen here: https://issues.apache.org/jira/browse/FLINK-6810

Thanks, vino.
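
As a rough illustration of what log(base, x) and Log2(x) compute, here is a sketch using the change-of-base identity log_b(x) = ln(x) / ln(b). This is not the code from FLINK-9928; the class name LogSketch and the base-first argument order (taken from the log(2, 1.0) call in the original question) are assumptions.

```java
// Sketch of logarithm helpers on double arguments. The names are hypothetical.
final class LogSketch {

    // Logarithm of x to an arbitrary base, via change of base.
    static double log(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    // Base-2 logarithm expressed through the general helper.
    static double log2(double x) {
        return log(2.0, x);
    }

    public static void main(String[] args) {
        System.out.println(log(2.0, 1.0));
        System.out.println(log2(8.0));
        System.out.println(Math.exp(1.0));
    }
}
```

Because the result comes from floating-point division, compare it with a small tolerance rather than with exact equality.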


In reply to the message from 徐涛 <[hidden email]> of Thursday, August 23, 2018, 3:16 PM.

Re: lack of function and low usability of provided function

Timo Walther
Hi Henry,

Thanks for giving feedback. Building the set of built-in functions is a continuous effort that will never be considered "done". If you think a function should be supported, you can open an issue under FLINK-6810 and we can discuss its priority.

Flink is an open source project, so also feel free to contribute by opening issues, reviewing existing PRs, or writing code.

Regards,
Timo


In reply to the message from vino yang of August 23, 2018, 10:01.



Re: lack of function and low usability of provided function

Fabian Hueske-2
In reply to this post by 徐涛
Hi Henry,

Flink is an open source project. New built-in functions are constantly contributed to Flink. Right now, there are more than 5 PRs open to add or improve various functions.

If you find that some functions are not working correctly or could be improved, you can open a Jira issue. The same applies to missing functions.
Before opening an issue, it would be good to check if there's another issue for the same problem.
After opening a Jira issue, you can either wait for somebody to pick it up or contribute the function yourself.
The scalar function umbrella Jira issue [1] explains what needs to be done to add a function.

Contributing scalar functions is a good way to get involved with Flink development.

Best, Fabian



Re: lack of function and low usability of provided function

徐涛
I found another function that does not work as its documentation declares: https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#temporal-functions
The function is TIMESTAMP string.

I use the following SQL, where the type of ttt is String.
insert into rslt select word,cast( TIMESTAMPADD(HOUR, 2,   TIMESTAMP ttt    ) as varchar) from xxx

But it throws the following exception. The Flink version is 1.7.2.
Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL parse failed. Encountered "TIMESTAMP ttt" at line 1, column 60.
Was expecting one of:
    "+" ...
    "-" ...
    "NOT" ...
    "EXISTS" ...
    <UNSIGNED_INTEGER_LITERAL> ...
    <DECIMAL_NUMERIC_LITERAL> ...
    <APPROX_NUMERIC_LITERAL> ...
    <BINARY_STRING_LITERAL> ...
    <PREFIXED_STRING_LITERAL> ...
    <QUOTED_STRING> ...
    <UNICODE_STRING_LITERAL> ...
    "TRUE" ...
    "FALSE" ...
    "UNKNOWN" ...
    "NULL" ...
    <LBRACE_D> ...
    <LBRACE_T> ...
    <LBRACE_TS> ...
    "DATE" ...
    "TIME" ...
    "TIMESTAMP" <QUOTED_STRING> ...
    "INTERVAL" ...
    "?" ...
    "CAST" ...
    "EXTRACT" ...
    "POSITION" ...
    "CONVERT" ...
    "TRANSLATE" ...
    "OVERLAY" ...
    "FLOOR" ...
    "CEIL" ...
    "CEILING" ...
    "SUBSTRING" ...
    "TRIM" ...
    "CLASSIFIER" ...
    "MATCH_NUMBER" ...
    "RUNNING" ...
    "PREV" ...
    "NEXT" ...
    <LBRACE_FN> ...
    "MULTISET" ...
    "ARRAY" ...
    "PERIOD" ...
    "SPECIFIC" ...
    <IDENTIFIER> ...
    <QUOTED_IDENTIFIER> ...
    <BACK_QUOTED_IDENTIFIER> ...
    <BRACKET_QUOTED_IDENTIFIER> ...
    <UNICODE_QUOTED_IDENTIFIER> ...
    "ABS" ...
    "AVG" ...
    "CARDINALITY" ...
    "CHAR_LENGTH" ...
    "CHARACTER_LENGTH" ...
    "COALESCE" ...
    "COLLECT" ...
    "COVAR_POP" ...
    "COVAR_SAMP" ...
    "CUME_DIST" ...
    "COUNT" ...
    "CURRENT_DATE" ...
    "CURRENT_TIME" ...
    "CURRENT_TIMESTAMP" ...
    "DENSE_RANK" ...
    "ELEMENT" ...
    "EXP" ...
    "FIRST_VALUE" ...
    "FUSION" ...
    "GROUPING" ...
    "HOUR" ...
    "LAG" ...
    "LEAD" ...
    "LAST_VALUE" ...
    "LN" ...
    "LOCALTIME" ...
    "LOCALTIMESTAMP" ...
    "LOWER" ...
    "MAX" ...
    "MIN" ...
    "MINUTE" ...
    "MOD" ...
    "MONTH" ...
    "NTH_VALUE" ...
    "NTILE" ...
    "NULLIF" ...
    "OCTET_LENGTH" ...
    "PERCENT_RANK" ...
    "POWER" ...
    "RANK" ...
    "REGR_SXX" ...
    "REGR_SYY" ...
    "ROW_NUMBER" ...
    "SECOND" ...
    "SQRT" ...
    "STDDEV_POP" ...
    "STDDEV_SAMP" ...
    "SUM" ...
    "UPPER" ...
    "TRUNCATE" ...
    "USER" ...
    "VAR_POP" ...
    "VAR_SAMP" ...
    "YEAR" ...
    "CURRENT_CATALOG" ...
    "CURRENT_DEFAULT_TRANSFORM_GROUP" ...
    "CURRENT_PATH" ...
    "CURRENT_ROLE" ...
    "CURRENT_SCHEMA" ...
    "CURRENT_USER" ...
    "SESSION_USER" ...
    "SYSTEM_USER" ...
    "NEW" ...
    "CASE" ...
    "CURRENT" ...
    "CURSOR" ...
    "ROW" ...
    "(" ...
    
  at org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:94)
  at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:803)
  at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:777)
  at com.ximalaya.flink.dsl.application.simple.HelloWorldTable$.main(HelloWorldTable.scala:44)
  at com.ximalaya.flink.dsl.application.simple.HelloWorldTable.main(HelloWorldTable.scala)
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "TIMESTAMP ttt" at line 1, column 60.
Was expecting one of:
    ... (same token list as in the first exception above)
    
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:347)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:128)
  at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:137)
  at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:162)
  at org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:90)
  ... 4 more
Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered "TIMESTAMP ttt" at line 1, column 60.
Was expecting one of:
    ... (same token list as in the first exception above)
    
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:23019)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:22836)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3379)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.TimestampAddFunctionCall(SqlParserImpl.java:5317)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.BuiltinFunctionCall(SqlParserImpl.java:5281)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.AtomicRowExpression(SqlParserImpl.java:3474)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3319)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.BuiltinFunctionCall(SqlParserImpl.java:5051)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.AtomicRowExpression(SqlParserImpl.java:3474)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3319)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectExpression(SqlParserImpl.java:1525)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectItem(SqlParserImpl.java:1500)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectList(SqlParserImpl.java:1487)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlSelect(SqlParserImpl.java:912)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQuery(SqlParserImpl.java:552)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQueryOrExpr(SqlParserImpl.java:3030)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.QueryOrExpr(SqlParserImpl.java:2949)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.OrderedQueryOrExpr(SqlParserImpl.java:463)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlInsert(SqlParserImpl.java:1212)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmt(SqlParserImpl.java:847)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:869)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:184)
  at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:130)
  ... 6 more


Best
Henry
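
A note on the error above: the expected-token list contains "TIMESTAMP" <QUOTED_STRING>, which suggests the parser accepts TIMESTAMP only as the prefix of a quoted literal such as TIMESTAMP '2018-08-23 15:16:00', not applied to a column like ttt. For a String column, CAST(ttt AS TIMESTAMP) may be worth trying, although that has not been verified against 1.7.2. Failing that, a UDF could do the conversion. Below is a plain-Java sketch of that logic, assuming the strings look like yyyy-MM-dd HH:mm:ss; the class name TimestampAddSketch and the eval signature are hypothetical, and the Flink ScalarFunction base class is left out so the snippet compiles on its own.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Sketch of "parse a timestamp string, add N hours, format it back".
// The input format is an assumption; adjust the pattern to the real data.
final class TimestampAddSketch {

    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Returns ts shifted by the given number of hours, or null for null input.
    // A string that does not match the pattern throws DateTimeParseException.
    public String eval(String ts, int hours) {
        if (ts == null) {
            return null;
        }
        return LocalDateTime.parse(ts, FMT).plusHours(hours).format(FMT);
    }

    public static void main(String[] args) {
        System.out.println(new TimestampAddSketch().eval("2018-08-23 15:16:00", 2));
    }
}
```

LocalDateTime carries no time zone, so the shift is pure wall-clock arithmetic and rolls over to the next day when needed.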

In reply to the message from Fabian Hueske <[hidden email]> of August 23, 2018, 4:11 PM.