Re: [DISCUSS] Introduction of a Table API Java Expression DSL

Jark Wu-3
Hi Timo,

I'm +1 on the proposal. I like the idea of providing a Java DSL, which is friendlier for programming than the string-based approach.

My concern is if and when we can drop the string-based expression parser. If that takes a very long time, we will have to pay a higher development
cost for the three Table APIs. As far as I know, the string-based API is used in many companies.
We should also get some feedback from users, so I'm CCing this email to the user mailing list.

Best,
Jark



On Wed, 20 Mar 2019 at 08:51, Rong Rong <[hidden email]> wrote:
Thanks for sharing the initiative to improve the Java-side Table expression
DSL.

I agree with the doc's statement that the Java DSL was always a "3rd class citizen";
we've run into many hand-holding scenarios with our Flink developers
trying to get the stringified syntax working.
Overall I am +1 on this. It also helps reduce the development cost of the
Table API, since we no longer need to maintain different DSLs and their
documentation.

I left a few comments in the doc, along with some features that I think will
be beneficial to the final outcome. Please kindly take a look, @Timo.

Many thanks,
Rong

On Mon, Mar 18, 2019 at 7:15 AM Timo Walther <[hidden email]> wrote:

> Hi everyone,
>
> some of you might have already noticed the JIRA issue that I opened
> recently [1] about introducing a proper Java expression DSL for the
> Table API. Instead of using string-based expressions, we should aim for
> a unified, maintainable, programmatic Java DSL.
>
> Some background: The Blink merging efforts and the big refactorings as
> part of FLIP-32 have revealed many shortcomings in the current Table &
> SQL API design. Most of these legacy issues cause problems nowadays in
> making the Table API a first-class API next to the DataStream API. An
> example is the ExpressionParser class[2]. It was implemented in the
> early days of the Table API using Scala parser combinators. During the
> last years, this parser has caused many JIRA issues and user confusion on
> the mailing list, because its exceptions and syntax are not always
> straightforward.
>
> For FLINK-11908, we added a temporary bridge instead of reimplementing
> the parser in Java for FLIP-32. However, this is only an intermediate
> solution until we have made a final decision.
>
> I would like to propose a new, parser-free version of the Java Table API:
>
>
> https://docs.google.com/document/d/1r3bfR9R6q5Km0wXKcnhfig2XQ4aMiLG5h2MTx960Fg8/edit?usp=sharing
>
> I already implemented an early prototype that shows that such a DSL is
> not much implementation effort and integrates nicely with all existing
> API methods.
>
> What do you think?
>
> Thanks for your feedback,
>
> Timo
>
> [1] https://issues.apache.org/jira/browse/FLINK-11890
>
> [2]
>
> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/expressions/PlannerExpressionParserImpl.scala
>
>
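To make the contrast with string-based expressions concrete, here is a minimal, self-contained sketch of what a parser-free expression DSL can look like in plain Java. All names in it (`Expr`, `field`, `lit`, `isGreater`, `and`) are hypothetical and chosen only for illustration; they are not taken from the design document or from Flink. The point is that expressions are built from ordinary method calls, so the Java compiler checks their structure and no parser is involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names are taken from Flink or the design doc.
interface Expr {
    // Evaluate the expression against one row (column name -> value).
    Object eval(Map<String, Object> row);

    // Entry point for a column reference; no string parsing is involved.
    static Expr field(String name) {
        return new Expr() {
            @Override public Object eval(Map<String, Object> row) {
                if (!row.containsKey(name)) {
                    throw new IllegalArgumentException("Unknown field: " + name);
                }
                return row.get(name);
            }
            @Override public String toString() { return name; }
        };
    }

    // Entry point for a literal value.
    static Expr lit(Object value) {
        return new Expr() {
            @Override public Object eval(Map<String, Object> row) { return value; }
            @Override public String toString() { return String.valueOf(value); }
        };
    }

    // Operators are ordinary methods, so a misspelled operator fails at compile time.
    default Expr isGreater(Expr other) {
        final Expr self = this;
        return new Expr() {
            @Override public Object eval(Map<String, Object> row) {
                double left = ((Number) self.eval(row)).doubleValue();
                double right = ((Number) other.eval(row)).doubleValue();
                return left > right;
            }
            @Override public String toString() { return "(" + self + " > " + other + ")"; }
        };
    }

    default Expr and(Expr other) {
        final Expr self = this;
        return new Expr() {
            @Override public Object eval(Map<String, Object> row) {
                return (Boolean) self.eval(row) && (Boolean) other.eval(row);
            }
            @Override public String toString() { return "(" + self + " && " + other + ")"; }
        };
    }
}

final class ExprDemo {
    public static void main(String[] args) {
        // Roughly what a string expression such as "a > 5 && b > a" would express.
        Expr predicate = Expr.field("a").isGreater(Expr.lit(5))
                .and(Expr.field("b").isGreater(Expr.field("a")));

        Map<String, Object> row = new HashMap<>();
        row.put("a", 7);
        row.put("b", 9);
        System.out.println(predicate + " -> " + predicate.eval(row));
    }
}
```

In this sketch a typo in an operator name is a compile error rather than a runtime parser exception, which is the kind of user confusion the proposal aims to avoid. Errors in field names would still surface only at runtime.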

Timo Walther
Thanks for your feedback Rong and Jark.

@Jark: Yes, you are right that the string-based API is used quite a lot. On the other hand, the potential future user base is still bigger than our current user base. Because the Table API will become as important as the DataStream API, we really need to fix some crucial design decisions before it is too late. I would suggest introducing the new DSL in 1.9 and removing the expression parser in either 1.10 or 1.11. From a development point of view, I think we can handle the overhead of maintaining 3 APIs until then, because 2 of the APIs will share the same code base + expression parser.

Regards,
Timo

On 21.03.19 at 05:21, Jark Wu wrote:
Hi Timo,

I'm +1 on the proposal. I like the idea of providing a Java DSL, which is friendlier for programming than the string-based approach.

My concern is if and when we can drop the string-based expression parser. If that takes a very long time, we will have to pay a higher development
cost for the three Table APIs. As far as I know, the string-based API is used in many companies.
We should also get some feedback from users, so I'm CCing this email to the user mailing list.

Best,
Jark




Jark Wu-3
Hi Timo,  

Sounds good to me. 

Do you want to deprecate the string-based API in 1.9, or make the decision in 1.10 after some feedback?


On Thu, 21 Mar 2019 at 21:32, Timo Walther <[hidden email]> wrote:
Thanks for your feedback Rong and Jark.

@Jark: Yes, you are right that the string-based API is used quite a lot. On the other hand, the potential future user base is still bigger than our current user base. Because the Table API will become as important as the DataStream API, we really need to fix some crucial design decisions before it is too late. I would suggest introducing the new DSL in 1.9 and removing the expression parser in either 1.10 or 1.11. From a development point of view, I think we can handle the overhead of maintaining 3 APIs until then, because 2 of the APIs will share the same code base + expression parser.

Regards,
Timo


Dawid Wysakowicz-2

Hi,

I really like the idea of introducing a Java Expression DSL. I think this will solve many problems. For example, right now it's quite tricky how string literals work in Scala (sometimes they go through the ExpressionParser and end up as an UnresolvedFieldReference). Another important problem we could solve with this is the current need for unique column names in tables. We could at some point introduce something like:

Table table = ...

table.field("fieldName")

and so on. A common "entry point" to expressions should simplify a lot.
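
One possible reading of the `table.field("fieldName")` idea is sketched below in plain Java. It assumes that a field reference remembers the table it was created from, so two tables may share a column name without renaming. `SketchTable`, `FieldRef`, `qualifiedName`, and `isEqual` are hypothetical names used only for this illustration; they are not Flink classes or methods.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a table; this is not Flink's Table class.
final class SketchTable {
    private final String name;
    private final List<String> columns;

    SketchTable(String name, String... columns) {
        this.name = name;
        this.columns = Arrays.asList(columns);
    }

    String name() {
        return name;
    }

    // Entry point: the returned reference remembers which table it belongs to.
    FieldRef field(String column) {
        if (!columns.contains(column)) {
            throw new IllegalArgumentException(
                    "Table '" + name + "' has no column '" + column + "'");
        }
        return new FieldRef(this, column);
    }
}

// A column reference bound to its table, so equal column names stay distinguishable.
final class FieldRef {
    private final SketchTable table;
    private final String column;

    FieldRef(SketchTable table, String column) {
        this.table = table;
        this.column = column;
    }

    String qualifiedName() {
        return table.name() + "." + column;
    }

    // Renders a join condition as text; a real DSL would build an expression tree instead.
    String isEqual(FieldRef other) {
        return qualifiedName() + " = " + other.qualifiedName();
    }
}

final class FieldRefDemo {
    public static void main(String[] args) {
        SketchTable orders = new SketchTable("orders", "id", "amount");
        SketchTable customers = new SketchTable("customers", "id", "name");

        // Both tables have an "id" column, yet the references are unambiguous.
        System.out.println(orders.field("id").isEqual(customers.field("id")));
    }
}
```

Because each reference is created through its table, an unknown column is rejected at the point where the reference is built instead of later during planning.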

Therefore I am strongly +1 for introducing this feature.

@Jark I think we could aim to introduce the new Java DSL API in 1.9, and once we do that, we could deprecate the string-based approach.

Best,

Dawid

On 22/03/2019 03:36, Jark Wu wrote:
Hi Timo,  

Sounds good to me. 

Do you want to deprecate the string-based API in 1.9, or make the decision in 1.10 after some feedback?



jincheng sun
Thanks for bringing up this discussion, Timo!

A Java Expression DSL is very useful for Java users. Once we have it, the Java API will become very rich and easy to use!

+1 from my side.

Best,
Jincheng


On Tue, 26 Mar 2019 at 5:08 PM, Dawid Wysakowicz <[hidden email]> wrote:

Hi,

I really like the idea of introducing a Java Expression DSL. I think this will solve many problems. For example, right now it's quite tricky how string literals work in Scala (sometimes they go through the ExpressionParser and end up as an UnresolvedFieldReference). Another important problem we could solve with this is the current need for unique column names in tables. We could at some point introduce something like:

Table table = ...

table.field("fieldName")

and so on. A common "entry point" to expressions should simplify a lot.

Therefore I am strongly +1 for introducing this feature.

@Jark I think we could aim to introduce the new Java DSL API in 1.9, and once we do that, we could deprecate the string-based approach.

Best,

Dawid
