Hi Flink users,
I am a big fan of the Table API and we use it extensively on petabytes of data with ad-hoc queries. In our system, nobody writes table-creation DDL by hand; we inherit the table schema dynamically from the schema registry (Avro data in Kafka) and create temporary tables in the session.

I am using methods like `connect` and `registerTableSource` to generate my tables dynamically, and these now give deprecation warnings. It is simply not practical to hand-write SQL for table creation with a schema. I may end up doing a lot of string manipulation to generate SQL statements from the Avro schema and pass them to `executeSql`, but I don't think that will be as clean as using `connect` and `registerTableSource`. I am a little worried about pulling new Flink releases into my application now.

Is there any specific reason that low-level support for creating tables is getting deprecated?

Best regards,

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
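[Editor's note: the string-manipulation route mentioned above can be kept fairly small. A minimal sketch, assuming a simple field-name-to-Avro-type map as input; real code would walk an `org.apache.avro.Schema` instead, and the type mapping and `WITH` options shown are illustrative assumptions, not a complete Avro-to-Flink mapping:]

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class DdlSketch {

    // Illustrative mapping from Avro primitive type names to Flink SQL types.
    // A real implementation would also handle unions, logical types, records, etc.
    static String toFlinkType(String avroType) {
        switch (avroType) {
            case "string":  return "STRING";
            case "long":    return "BIGINT";
            case "int":     return "INT";
            case "double":  return "DOUBLE";
            case "boolean": return "BOOLEAN";
            default: throw new IllegalArgumentException("unsupported type: " + avroType);
        }
    }

    // Builds a CREATE TEMPORARY TABLE statement that can be passed to executeSql().
    static String buildDdl(String tableName, Map<String, String> fields, String topic) {
        String columns = fields.entrySet().stream()
                .map(e -> "  `" + e.getKey() + "` " + toFlinkType(e.getValue()))
                .collect(Collectors.joining(",\n"));
        return "CREATE TEMPORARY TABLE `" + tableName + "` (\n" + columns + "\n) WITH (\n"
                + "  'connector' = 'kafka',\n"
                + "  'topic' = '" + topic + "',\n"
                + "  'format' = 'avro'\n"
                + ")";
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("user_id", "long");
        fields.put("event", "string");
        String ddl = buildDdl("events", fields, "events-topic");
        System.out.println(ddl);
        // In a real session: tableEnv.executeSql(ddl);
    }
}
```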
Hi,
I agree that both `connect` and `registerTableSource` are useful for generating Table API pipelines. It is likely that both API methods will get a replacement in the near future. Let me briefly explain the current status:

connect(): The CREATE TABLE DDL evolved faster than connect(). The method still uses the old, pre-FLIP-95 interface stack. In order to avoid confusion, we decided to deprecate it already (without dropping it for now). There are plans to reintroduce it:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
Some prerequisite work has been done in:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs

registerTableSource(): This is a bit tricky because the new table sources and sinks work on internal data structures. The easiest solution is to use the DataStream API and the new `fromDataStream` or `fromChangelogStream` methods introduced in Flink 1.13. Otherwise, you can implement a helper class that extends `ScanTableSource` and uses a `DataStreamScanProvider`, with a factory to call it via SQL DDL. We might provide such a wrapper class in the near future for use cases like yours.

I hope this helps. Feel free to keep using the deprecated methods until the community provides more helpful alternatives.

Thanks for the feedback,
Timo

On 15.05.21 04:11, lalala wrote:
> [quoted text trimmed]
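[Editor's note: the `fromDataStream` route mentioned in the reply can be sketched as follows. This is a sketch only, assuming Flink 1.13+ on the classpath and a hypothetical `kafkaStream` of Avro-decoded `Row` objects produced elsewhere; the column names are illustrative:]

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class FromDataStreamSketch {

    // Registers a DataStream as a temporary table with a programmatically built schema,
    // e.g. one derived from the schema registry entry for the Kafka topic.
    public static void register(StreamExecutionEnvironment env, DataStream<Row> kafkaStream) {
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        Table table = tEnv.fromDataStream(
                kafkaStream,
                Schema.newBuilder()
                        .column("user_id", DataTypes.BIGINT())
                        .column("event", DataTypes.STRING())
                        .build());

        // Make the table available to subsequent ad-hoc SQL in the session.
        tEnv.createTemporaryView("events", table);
    }
}
```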