Flink SQL Stream Parser based on calcite


Flink SQL Stream Parser based on calcite

PedroMrChaves
Hello,

I am pretty new to Apache Flink.

I am trying to figure out how Flink translates an Apache Calcite SQL query into its own streaming API, because I might want to extend it: as far as I know, many operations (like TUMBLE windows) are still under development and not currently supported. I need to be able to load rules from a file, like so:

tableEnv.sql([File])..

in order to do that I need a fully functional Streaming SQL parser.
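The file-loading half of this is independent of the parser. A minimal sketch using only JDK I/O (the rule-file path is an argument here, and the hand-off to the table environment is shown only as a comment, since the surrounding table setup is not part of this thread):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.charset.StandardCharsets;

public class SqlFileLoader {

    // Read the whole rule file as one SQL string; UTF-8 is assumed.
    public static String readQuery(Path ruleFile) throws IOException {
        return new String(Files.readAllBytes(ruleFile), StandardCharsets.UTF_8).trim();
    }

    public static void main(String[] args) throws IOException {
        String query = readQuery(Paths.get(args[0]));
        // With a Flink table environment in scope, the loaded string would
        // then be handed to the Table API, e.g.:
        // Table result = tableEnv.sql(query);
        System.out.println(query);
    }
}
```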

I am currently analyzing the StreamTableEnvironment class from GitHub [1] to understand the sql method, but I can't figure out where the parsing happens.

Can someone point me in the right direction?


[1] https://github.com/apache/flink/blob/d7b59d761601baba6765bb4fc407bcd9fd6a9387/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/StreamTableEnvironment.scala

Best Regards,
Pedro Chaves


Re: Flink SQL Stream Parser based on calcite

Fabian Hueske-2
Hi Pedro,

The sql() method calls the Calcite parser on line 129.

Best, Fabian



Re: Flink SQL Stream Parser based on calcite

PedroMrChaves
Thank you for the response.

I don't understand where something like this,

SELECT * WHERE action='denied'

gets translated to something similar in the Flink Stream API,

.filter(new FilterFunction<Event>() {
    public boolean filter(Event event) {
        return event.action.equals("denied");
    }
});


or whether that happens at all. My idea was to extend the library to support currently unsupported calls (e.g., TUMBLE -> timeWindow), but it's probably more complex than I think.
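Outside of Flink, the correspondence is easy to see: the WHERE clause and the FilterFunction body express the same predicate. A plain-Java sketch (the Event type and sample data are invented for illustration; Flink's actual code generation is far more involved):

```java
import java.util.List;
import java.util.stream.Collectors;

public class FilterEquivalence {

    // A stand-in for the Event type from the snippet above.
    public static class Event {
        public final String action;
        public Event(String action) { this.action = action; }
    }

    // The predicate that both the SQL WHERE clause and the
    // FilterFunction body express: action = 'denied'.
    public static List<Event> denied(List<Event> events) {
        return events.stream()
                     .filter(e -> e.action.equals("denied"))
                     .collect(Collectors.toList());
    }
}
```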

Regards.

Re: Flink SQL Stream Parser based on calcite

Fabian Hueske-2
The translation is done in multiple stages.

1. Parsing (syntax check)
2. Validation (semantic check)
3. Query optimization (rule and cost based)
4. Generation of physical plan, incl. code generation (DataStream program)
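The four stages can be caricatured in a few lines of plain Java for the single query discussed in this thread (this is a toy, not Calcite or Flink internals; the mini-grammar, field names, and record model are invented): parse a `WHERE field='value'` clause, "validate" the field against a known schema, and "generate" a predicate.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToyTranslator {

    // Stage 1: parsing (syntax check) -- only SELECT * WHERE field='value'.
    private static final Pattern WHERE =
        Pattern.compile("SELECT \\* WHERE (\\w+)='([^']*)'");

    // Stage 2: validation (semantic check) -- the field must exist.
    private static final Set<String> KNOWN_FIELDS = Set.of("action", "user");

    // Stages 3 and 4 are collapsed here: "generate" a predicate over a
    // record, modeled as a map from field name to value.
    public static Predicate<Map<String, String>> translate(String sql) {
        Matcher m = WHERE.matcher(sql.trim());
        if (!m.matches()) throw new IllegalArgumentException("parse error: " + sql);
        String field = m.group(1), value = m.group(2);
        if (!KNOWN_FIELDS.contains(field))
            throw new IllegalArgumentException("unknown field: " + field);
        return record -> value.equals(record.get(field));
    }
}
```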

The final translation happens in the DataStream nodes, e.g., DataStreamCalc [1].
I'd recommend importing the source code and debugging the translation process.

I recently gave a talk about the high-level translation process [2].

Best,
Fabian

[1] https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/DataStreamCalc.scala
[2] http://www.slideshare.net/fhueske/taking-a-look-under-hood-of-apache-flinks-relational-apis







Re: Flink SQL Stream Parser based on calcite

PedroMrChaves
Thank you.
Great presentation about the high-level translation process.

Regards,
Pedro