Hi,
does someone have a simple example using Table API and a Calcite rule which change/optimize the query execution plan of a query in Flink? From the official documentation, I know that I have to create a CalciteConfig object [1]. Then, I based my firsts tests on this stackoverflow post [2] and I implemented this piece of code: // change the current calcite config plan CalciteConfigBuilder ccb = new CalciteConfigBuilder(); RuleSet ruleSets = RuleSets.ofList(FilterMergeRule.INSTANCE); ccb.addLogicalOptRuleSet(ruleSets); TableConfig tableConfig = new TableConfig(); tableConfig.setCalciteConfig(ccb.build()); I suppose that with this I can change the query plan of the Flink Table API. I am also not sure if I will need to use an external catalog like this post assumes to use [3]. In a nutshell, I would like to have a simple example where I can execute a query using Flink Table API and change its query execution plan using a Calcite rule. Does anyone have a Hello world of it? I plan to use it on this example [4]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/common.html#query-optimization [2] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink [3] https://stackoverflow.com/questions/54278508/how-can-i-create-an-external-catalog-table-in-apache-flink [4] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/table/MqttSensorDataAverageTableAPI.java#L40 Kind Regards, Felipe |
Hi Felipe: I think your approach is absolutely right. You can try to do some plan test just like [1]. You can find more CalciteConfigBuilder API test in [2]. Best, JingsongLee
|
Hi JingsongLee, it is still not very clear to me. I imagine that I can create an InMemoryExternalCatalog and insert some tuples there (which will be in memory). Then I can use Calcite to use the values of my InMemoryExternalCatalog and change my plan. Is that correct? Do you have an example of how to create an InMemoryExternalCatalog using Flink 1.8? Because I guess the Csv [1] class does not exist anymore. [1] https://github.com/srapisarda/stypes-flink/blob/feature/sql/src/test/scala/uk/ac/bbk/dcs/stypes/flink/CalciteEmptyConsistencyTest.scala#L73 Kind Regards, Felipe On Wed, Jun 26, 2019 at 1:36 PM JingsongLee <[hidden email]> wrote:
|
Hi Felipe: Yeah, you can use InMemoryExternalCatalog and CalciteConfig, but I don't quite understand what you mean. InMemoryExternalCatalog provides methods to create, drop, and alter (sub-)catalogs or tables. And CalciteConfig is for defining a custom Calcite configuration. They are two separate things. About InMemoryExternalCatalog, You can take a look at [1]. Csv has been renamed to OldCsv [2], But recommendation using the RFC-compliant `Csv` format in the dedicated `flink-formats/flink-csv` module instead. Best, JingsongLee
|
Hi JingsongLee, Sorry for not explain very well. I am gonna try a clarification of my idea. 1 - I want to use InMemoryExternalCatalog in a way to save some statistics which I create by listening to a stream. 2 - Then I will have my core application using Table API to execute some aggregation/join. 3 - Because the application on 2 uses Table API, I am able to influence its plan through Calcite configuration rules. So, I am gonna use the statistics from 1 to change the rules dynamic on 2. Do you think it is clear? and it is a feasible application with the current capabilities of Table API? ps.: I am gonna look at the links that you mentioned. Thanks for that! Felipe On Thu, Jun 27, 2019 at 7:23 AM JingsongLee <[hidden email]> wrote:
|
Got it, it's clear, TableStats is the important functions of ExternalCatalog. It is right way. Best, JingsongLee
|
Free forum by Nabble | Edit this page |