Hello-world example of Flink Table API using a edited Calcite rule

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Hello-world example of Flink Table API using a edited Calcite rule

Felipe Gutierrez
Hi,

does someone have a simple example using Table API and a Calcite rule which change/optimize the query execution plan of a query in Flink?

From the official documentation, I know that I have to create a CalciteConfig object [1]. Then, I based my firsts tests on this stackoverflow post [2] and I implemented this piece of code:

// change the current calcite config plan
CalciteConfigBuilder ccb = new CalciteConfigBuilder();
RuleSet ruleSets = RuleSets.ofList(FilterMergeRule.INSTANCE);
ccb.addLogicalOptRuleSet(ruleSets);
TableConfig tableConfig = new TableConfig();
tableConfig.setCalciteConfig(ccb.build());

I suppose that with this I can change the query plan of the Flink Table API. I am also not sure if I will need to use an external catalog like this post assumes to use [3].
In a nutshell, I would like to have a simple example where I can execute a query using Flink Table API and change its query execution plan using a Calcite rule. Does anyone have a Hello world of it? I plan to use it on this example [4].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/common.html#query-optimization
[2] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[3] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[4] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/table/MqttSensorDataAverageTableAPI.java#L40

Kind Regards,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez
Reply | Threaded
Open this post in threaded view
|

Re: Hello-world example of Flink Table API using a edited Calcite rule

JingsongLee
Hi Felipe:
I think your approach is absolutely right. You can try to do some plan test just like [1].
You can find more CalciteConfigBuilder API test in [2].


Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 18:04
To:user <[hidden email]>
Subject:Hello-world example of Flink Table API using a edited Calcite rule

Hi,

does someone have a simple example using Table API and a Calcite rule which change/optimize the query execution plan of a query in Flink?

From the official documentation, I know that I have to create a CalciteConfig object [1]. Then, I based my firsts tests on this stackoverflow post [2] and I implemented this piece of code:

// change the current calcite config plan
CalciteConfigBuilder ccb = new CalciteConfigBuilder();
RuleSet ruleSets = RuleSets.ofList(FilterMergeRule.INSTANCE);
ccb.addLogicalOptRuleSet(ruleSets);
TableConfig tableConfig = new TableConfig();
tableConfig.setCalciteConfig(ccb.build());

I suppose that with this I can change the query plan of the Flink Table API. I am also not sure if I will need to use an external catalog like this post assumes to use [3].
In a nutshell, I would like to have a simple example where I can execute a query using Flink Table API and change its query execution plan using a Calcite rule. Does anyone have a Hello world of it? I plan to use it on this example [4].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/common.html#query-optimization
[2] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[3] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[4] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/table/MqttSensorDataAverageTableAPI.java#L40

Kind Regards,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez

Reply | Threaded
Open this post in threaded view
|

Re: Hello-world example of Flink Table API using a edited Calcite rule

Felipe Gutierrez
Hi JingsongLee,

it is still not very clear to me. I imagine that I can create an InMemoryExternalCatalog and insert some tuples there (which will be in memory). Then I can use Calcite to use the values of my InMemoryExternalCatalog and change my plan. Is that correct?

Do you have an example of how to create an InMemoryExternalCatalog using Flink 1.8? Because I guess the Csv [1] class does not exist anymore.

[1] https://github.com/srapisarda/stypes-flink/blob/feature/sql/src/test/scala/uk/ac/bbk/dcs/stypes/flink/CalciteEmptyConsistencyTest.scala#L73

Kind Regards,
Felipe


--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


On Wed, Jun 26, 2019 at 1:36 PM JingsongLee <[hidden email]> wrote:
Hi Felipe:
I think your approach is absolutely right. You can try to do some plan test just like [1].
You can find more CalciteConfigBuilder API test in [2].


Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 18:04
To:user <[hidden email]>
Subject:Hello-world example of Flink Table API using a edited Calcite rule

Hi,

does someone have a simple example using Table API and a Calcite rule which change/optimize the query execution plan of a query in Flink?

From the official documentation, I know that I have to create a CalciteConfig object [1]. Then, I based my firsts tests on this stackoverflow post [2] and I implemented this piece of code:

// change the current calcite config plan
CalciteConfigBuilder ccb = new CalciteConfigBuilder();
RuleSet ruleSets = RuleSets.ofList(FilterMergeRule.INSTANCE);
ccb.addLogicalOptRuleSet(ruleSets);
TableConfig tableConfig = new TableConfig();
tableConfig.setCalciteConfig(ccb.build());

I suppose that with this I can change the query plan of the Flink Table API. I am also not sure if I will need to use an external catalog like this post assumes to use [3].
In a nutshell, I would like to have a simple example where I can execute a query using Flink Table API and change its query execution plan using a Calcite rule. Does anyone have a Hello world of it? I plan to use it on this example [4].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/common.html#query-optimization
[2] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[3] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[4] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/table/MqttSensorDataAverageTableAPI.java#L40

Kind Regards,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez

Reply | Threaded
Open this post in threaded view
|

Re: Hello-world example of Flink Table API using a edited Calcite rule

JingsongLee
Hi Felipe:

Yeah, you can use InMemoryExternalCatalog and CalciteConfig,
but I don't quite understand what you mean.
InMemoryExternalCatalog provides methods to create, drop, and 
alter (sub-)catalogs or tables. And CalciteConfig is for defining a
 custom Calcite configuration. They are two separate things.

About InMemoryExternalCatalog, You can take a look at [1]. 
Csv has been renamed to OldCsv [2], But recommendation 
using the RFC-compliant `Csv` format in the dedicated
`flink-formats/flink-csv` module instead.



Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 20:58
To:JingsongLee <[hidden email]>
Cc:user <[hidden email]>
Subject:Re: Hello-world example of Flink Table API using a edited Calcite rule

Hi JingsongLee,

it is still not very clear to me. I imagine that I can create an InMemoryExternalCatalog and insert some tuples there (which will be in memory). Then I can use Calcite to use the values of my InMemoryExternalCatalog and change my plan. Is that correct?

Do you have an example of how to create an InMemoryExternalCatalog using Flink 1.8? Because I guess the Csv [1] class does not exist anymore.

[1] https://github.com/srapisarda/stypes-flink/blob/feature/sql/src/test/scala/uk/ac/bbk/dcs/stypes/flink/CalciteEmptyConsistencyTest.scala#L73

Kind Regards,
Felipe


--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


On Wed, Jun 26, 2019 at 1:36 PM JingsongLee <[hidden email]> wrote:
Hi Felipe:
I think your approach is absolutely right. You can try to do some plan test just like [1].
You can find more CalciteConfigBuilder API test in [2].


Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 18:04
To:user <[hidden email]>
Subject:Hello-world example of Flink Table API using a edited Calcite rule

Hi,

does someone have a simple example using Table API and a Calcite rule which change/optimize the query execution plan of a query in Flink?

From the official documentation, I know that I have to create a CalciteConfig object [1]. Then, I based my firsts tests on this stackoverflow post [2] and I implemented this piece of code:

// change the current calcite config plan
CalciteConfigBuilder ccb = new CalciteConfigBuilder();
RuleSet ruleSets = RuleSets.ofList(FilterMergeRule.INSTANCE);
ccb.addLogicalOptRuleSet(ruleSets);
TableConfig tableConfig = new TableConfig();
tableConfig.setCalciteConfig(ccb.build());

I suppose that with this I can change the query plan of the Flink Table API. I am also not sure if I will need to use an external catalog like this post assumes to use [3].
In a nutshell, I would like to have a simple example where I can execute a query using Flink Table API and change its query execution plan using a Calcite rule. Does anyone have a Hello world of it? I plan to use it on this example [4].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/common.html#query-optimization
[2] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[3] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[4] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/table/MqttSensorDataAverageTableAPI.java#L40

Kind Regards,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


Reply | Threaded
Open this post in threaded view
|

Re: Hello-world example of Flink Table API using a edited Calcite rule

Felipe Gutierrez
Hi JingsongLee,

Sorry for not explain very well. I am gonna try a clarification of my idea.
1 - I want to use InMemoryExternalCatalog in a way to save some statistics which I create by listening to a stream.
2 - Then I will have my core application using Table API to execute some aggregation/join.
3 - Because the application on 2 uses Table API, I am able to influence its plan through Calcite configuration rules. So, I am gonna use the statistics from 1 to change the rules dynamic on 2.

Do you think it is clear? and it is a feasible application with the current capabilities of Table API?
ps.: I am gonna look at the links that you mentioned. Thanks for that!

Felipe

--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


On Thu, Jun 27, 2019 at 7:23 AM JingsongLee <[hidden email]> wrote:
Hi Felipe:

Yeah, you can use InMemoryExternalCatalog and CalciteConfig,
but I don't quite understand what you mean.
InMemoryExternalCatalog provides methods to create, drop, and 
alter (sub-)catalogs or tables. And CalciteConfig is for defining a
 custom Calcite configuration. They are two separate things.

About InMemoryExternalCatalog, You can take a look at [1]. 
Csv has been renamed to OldCsv [2], But recommendation 
using the RFC-compliant `Csv` format in the dedicated
`flink-formats/flink-csv` module instead.



Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 20:58
To:JingsongLee <[hidden email]>
Cc:user <[hidden email]>
Subject:Re: Hello-world example of Flink Table API using a edited Calcite rule

Hi JingsongLee,

it is still not very clear to me. I imagine that I can create an InMemoryExternalCatalog and insert some tuples there (which will be in memory). Then I can use Calcite to use the values of my InMemoryExternalCatalog and change my plan. Is that correct?

Do you have an example of how to create an InMemoryExternalCatalog using Flink 1.8? Because I guess the Csv [1] class does not exist anymore.

[1] https://github.com/srapisarda/stypes-flink/blob/feature/sql/src/test/scala/uk/ac/bbk/dcs/stypes/flink/CalciteEmptyConsistencyTest.scala#L73

Kind Regards,
Felipe


--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


On Wed, Jun 26, 2019 at 1:36 PM JingsongLee <[hidden email]> wrote:
Hi Felipe:
I think your approach is absolutely right. You can try to do some plan test just like [1].
You can find more CalciteConfigBuilder API test in [2].


Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 18:04
To:user <[hidden email]>
Subject:Hello-world example of Flink Table API using a edited Calcite rule

Hi,

does someone have a simple example using Table API and a Calcite rule which change/optimize the query execution plan of a query in Flink?

From the official documentation, I know that I have to create a CalciteConfig object [1]. Then, I based my firsts tests on this stackoverflow post [2] and I implemented this piece of code:

// change the current calcite config plan
CalciteConfigBuilder ccb = new CalciteConfigBuilder();
RuleSet ruleSets = RuleSets.ofList(FilterMergeRule.INSTANCE);
ccb.addLogicalOptRuleSet(ruleSets);
TableConfig tableConfig = new TableConfig();
tableConfig.setCalciteConfig(ccb.build());

I suppose that with this I can change the query plan of the Flink Table API. I am also not sure if I will need to use an external catalog like this post assumes to use [3].
In a nutshell, I would like to have a simple example where I can execute a query using Flink Table API and change its query execution plan using a Calcite rule. Does anyone have a Hello world of it? I plan to use it on this example [4].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/common.html#query-optimization
[2] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[3] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[4] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/table/MqttSensorDataAverageTableAPI.java#L40

Kind Regards,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


Reply | Threaded
Open this post in threaded view
|

Re: Hello-world example of Flink Table API using a edited Calcite rule

JingsongLee
Got it, it's clear, TableStats is the important functions of ExternalCatalog. It is right way.

Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月27日(星期四) 14:53
To:JingsongLee <[hidden email]>
Cc:user <[hidden email]>
Subject:Re: Hello-world example of Flink Table API using a edited Calcite rule

Hi JingsongLee,

Sorry for not explain very well. I am gonna try a clarification of my idea.
1 - I want to use InMemoryExternalCatalog in a way to save some statistics which I create by listening to a stream.
2 - Then I will have my core application using Table API to execute some aggregation/join.
3 - Because the application on 2 uses Table API, I am able to influence its plan through Calcite configuration rules. So, I am gonna use the statistics from 1 to change the rules dynamic on 2.

Do you think it is clear? and it is a feasible application with the current capabilities of Table API?
ps.: I am gonna look at the links that you mentioned. Thanks for that!

Felipe

--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


On Thu, Jun 27, 2019 at 7:23 AM JingsongLee <[hidden email]> wrote:
Hi Felipe:

Yeah, you can use InMemoryExternalCatalog and CalciteConfig,
but I don't quite understand what you mean.
InMemoryExternalCatalog provides methods to create, drop, and 
alter (sub-)catalogs or tables. And CalciteConfig is for defining a
 custom Calcite configuration. They are two separate things.

About InMemoryExternalCatalog, You can take a look at [1]. 
Csv has been renamed to OldCsv [2], But recommendation 
using the RFC-compliant `Csv` format in the dedicated
`flink-formats/flink-csv` module instead.



Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 20:58
To:JingsongLee <[hidden email]>
Cc:user <[hidden email]>
Subject:Re: Hello-world example of Flink Table API using a edited Calcite rule

Hi JingsongLee,

it is still not very clear to me. I imagine that I can create an InMemoryExternalCatalog and insert some tuples there (which will be in memory). Then I can use Calcite to use the values of my InMemoryExternalCatalog and change my plan. Is that correct?

Do you have an example of how to create an InMemoryExternalCatalog using Flink 1.8? Because I guess the Csv [1] class does not exist anymore.

[1] https://github.com/srapisarda/stypes-flink/blob/feature/sql/src/test/scala/uk/ac/bbk/dcs/stypes/flink/CalciteEmptyConsistencyTest.scala#L73

Kind Regards,
Felipe


--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez


On Wed, Jun 26, 2019 at 1:36 PM JingsongLee <[hidden email]> wrote:
Hi Felipe:
I think your approach is absolutely right. You can try to do some plan test just like [1].
You can find more CalciteConfigBuilder API test in [2].


Best, JingsongLee

------------------------------------------------------------------
From:Felipe Gutierrez <[hidden email]>
Send Time:2019年6月26日(星期三) 18:04
To:user <[hidden email]>
Subject:Hello-world example of Flink Table API using a edited Calcite rule

Hi,

does someone have a simple example using Table API and a Calcite rule which change/optimize the query execution plan of a query in Flink?

From the official documentation, I know that I have to create a CalciteConfig object [1]. Then, I based my firsts tests on this stackoverflow post [2] and I implemented this piece of code:

// change the current calcite config plan
CalciteConfigBuilder ccb = new CalciteConfigBuilder();
RuleSet ruleSets = RuleSets.ofList(FilterMergeRule.INSTANCE);
ccb.addLogicalOptRuleSet(ruleSets);
TableConfig tableConfig = new TableConfig();
tableConfig.setCalciteConfig(ccb.build());

I suppose that with this I can change the query plan of the Flink Table API. I am also not sure if I will need to use an external catalog like this post assumes to use [3].
In a nutshell, I would like to have a simple example where I can execute a query using Flink Table API and change its query execution plan using a Calcite rule. Does anyone have a Hello world of it? I plan to use it on this example [4].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/common.html#query-optimization
[2] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[3] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink
[4] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/table/MqttSensorDataAverageTableAPI.java#L40

Kind Regards,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez