Flink Query Optimizer

albertjonathan
Hello,

I am wondering whether Flink uses Apache Calcite's query optimizer to generate an optimal logical plan for stream queries, or whether it has its own independent query optimizer.
From what I have observed so far, Flink's query optimizer only groups operators together without changing the order of aggregation operators (e.g., join). Did I miss anything?
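
To make the join-order point concrete, here is a minimal sketch in Python of why the order matters. It is a toy cost model, not Flink or Calcite code: the table names, cardinalities, selectivities, and the helper names `plan_cost` and `best_order` are all made up for illustration, and cost is assumed to be the sum of intermediate result sizes of a left-deep join.

```python
from itertools import permutations

def plan_cost(order, card, sel):
    """Sum of estimated intermediate result sizes for a left-deep join order."""
    joined = [order[0]]
    size = card[order[0]]
    cost = 0.0
    for table in order[1:]:
        size *= card[table]
        # Apply the selectivity of every join predicate that links the new
        # table to a table already joined; no predicate means a cross product.
        for other in joined:
            size *= sel.get(frozenset((other, table)), 1.0)
        joined.append(table)
        cost += size
    return cost

def best_order(card, sel):
    """Exhaustively pick the cheapest left-deep order (fine for a few tables)."""
    return min(permutations(card), key=lambda order: plan_cost(order, card, sel))

# Hypothetical statistics: row counts per table and predicate selectivities.
card = {"A": 1000, "B": 1000, "C": 10}
sel = {frozenset(("A", "B")): 0.001, frozenset(("B", "C")): 0.01}

written_order = ("A", "B", "C")
chosen = best_order(card, sel)
print(written_order, plan_cost(written_order, card, sel))
print(chosen, plan_cost(chosen, card, sel))
```

With these made-up numbers, joining the small table `C` early gives a cheaper plan than the order in which the query was written, which is the kind of reordering I have in mind.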

I am thinking of extending Flink to apply query optimization as it is done in a DBMS, either by integrating it with Calcite or by implementing it as a new module.
Any feedback or guidelines will be highly appreciated. 

Thank you,
Albert

Re: Flink Query Optimizer

Hequn Cheng
Hi Albert, 

Apache Flink leverages Apache Calcite to optimize and translate queries. The optimizations currently performed include projection and filter push-down, subquery decorrelation, and other kinds of query rewriting. Flink does not yet optimize the order of joins[1].
I agree with you that it would be valuable for Flink to support changing the order of aggregation operators.
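
As a rough illustration of what filter push-down means, here is a minimal sketch in Python. It is a toy plan tree, not the Calcite or Flink API: the classes `Scan`, `Join`, `Filter` and the functions `push_down` and `execute` are hypothetical names invented for this example, and the data is made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Row = Dict[str, int]

@dataclass
class Scan:
    rows: List[Row]

@dataclass
class Join:
    left: object
    right: object
    key: str  # equi-join column present on both inputs

@dataclass
class Filter:
    child: object
    column: str
    pred: Callable[[int], bool]

def columns(node) -> Set[str]:
    """Column names produced by a plan node."""
    if isinstance(node, Scan):
        return set(node.rows[0]) if node.rows else set()
    if isinstance(node, Join):
        return columns(node.left) | columns(node.right)
    return columns(node.child)

def push_down(node):
    """Move a Filter below a Join when its column comes from one input only."""
    if isinstance(node, Filter):
        child = push_down(node.child)
        if isinstance(child, Join):
            in_left = node.column in columns(child.left)
            in_right = node.column in columns(child.right)
            if in_left and not in_right:
                pushed = push_down(Filter(child.left, node.column, node.pred))
                return Join(pushed, child.right, child.key)
            if in_right and not in_left:
                pushed = push_down(Filter(child.right, node.column, node.pred))
                return Join(child.left, pushed, child.key)
        return Filter(child, node.column, node.pred)
    if isinstance(node, Join):
        return Join(push_down(node.left), push_down(node.right), node.key)
    return node

def execute(node) -> List[Row]:
    """Naive evaluator, used only to check that both plans agree."""
    if isinstance(node, Scan):
        return list(node.rows)
    if isinstance(node, Filter):
        return [r for r in execute(node.child) if node.pred(r[node.column])]
    left, right = execute(node.left), execute(node.right)
    return [{**l, **r} for l in left for r in right if l[node.key] == r[node.key]]

orders = Scan([{"user_id": 1, "amount": 5},
               {"user_id": 2, "amount": 50},
               {"user_id": 1, "amount": 70}])
users = Scan([{"user_id": 1, "age": 30}, {"user_id": 2, "age": 40}])

plan = Filter(Join(orders, users, "user_id"), "amount", lambda a: a > 10)
optimized = push_down(plan)
print(type(optimized).__name__, type(optimized.left).__name__)
```

The rewritten plan filters `orders` before the join, so fewer rows reach the join while the result stays the same. A filter on the join key itself is left above the join in this sketch, since the column appears on both sides.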

Btw, the main code can be found here[2], in the "def optimize(relNode: RelNode, updatesAsRetraction: Boolean)" function.

Best, Hequn




Re: Flink Query Optimizer

vino yang
Hi Albert,

If you want to add more query optimizer features to Flink, I suggest building on Apache Calcite. If Calcite's optimizer does not meet your requirements, you can talk with the Calcite community, or just customize Calcite if you do not want to wait.

Our internal Calcite version has some customizations, but none of them are for the query optimizer.

Just my suggestion.

Thanks
Vino
