Re: OrcTableSource in flink 1.12

Posted by Timo Walther on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/OrcTableSource-in-flink-1-12-tp42410p42466.html

Hi Nikola,

for the ORC source it is fine to use `TableEnvironment#fromTableSource`.
It is true that this method is deprecated, but as I said not all
connectors have been ported to be supported in the SQL DDL via string
properties. Therefore, `TableEnvironment#fromTableSource` is still
accessible until all connectors are supported in the DDL.
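
A minimal sketch of that approach, under several assumptions that are not stated in this thread: Flink 1.12 with the legacy planner's `BatchTableEnvironment` (assumed here because `OrcTableSource` is a DataSet-based batch source), the `flink-orc` and Hadoop dependencies on the classpath, and the path and schema taken from the Javadoc example quoted below. The class name `OrcQuerySketch` and the view name `orcTable` are illustrative only.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.orc.OrcTableSource;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.BatchTableEnvironment;
import org.apache.flink.types.Row;

public class OrcQuerySketch {
    // Path and schema copied from the Javadoc example quoted in this thread.
    static final String ORC_PATH = "file:///my/data/file.orc";
    static final String ORC_SCHEMA =
        "struct<col1:boolean,col2:tinyint,col3:smallint,col4:int>";

    public static void main(String[] args) throws Exception {
        // Assumption: legacy-planner batch environment on top of the DataSet API.
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        BatchTableEnvironment tEnv = BatchTableEnvironment.create(env);

        OrcTableSource orcSrc = OrcTableSource.builder()
            .path(ORC_PATH)
            .forOrcSchema(ORC_SCHEMA)
            .build();

        // Deprecated in 1.12, but still accessible as described above.
        Table orcTable = tEnv.fromTableSource(orcSrc);

        // Register the table under a name so that SQL can refer to it.
        tEnv.createTemporaryView("orcTable", orcTable);

        Table res = tEnv.sqlQuery("SELECT * FROM orcTable");

        // Convert the result back to a DataSet; print() triggers execution.
        DataSet<Row> rows = tEnv.toDataSet(res, Row.class);
        rows.print();
    }
}
```

The temporary view is only there so the SQL string can name the table; if the Table API is enough, the `Table` returned by `fromTableSource` can be queried directly.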

Btw it might also make sense to look into the Hive connector for reading
ORC.

Regards,
Timo

On 22.03.21 18:02, Nikola Hrusov wrote:

> Hi Timo,
>
> I need to read ORC files and run a query on them as in the example
> above. Since the example given in the docs is not recommended, what
> should I use?
>
> I looked into the method you suggest - TableEnvironment#fromTableSource
> - it is shown as deprecated in the docs:
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/api/java/org/apache/flink/table/api/TableEnvironment.html#fromTableSource-org.apache.flink.table.sources.TableSource- 
>
> However, it doesn't say what I should use instead.
>
> I have looked in all the docs available for 1.12 but I cannot find how
> to achieve the same result as it was in some previous versions. In some
> previous versions you could define
> `tableEnv.registerTableSource(tableName, orcTableSource);` but that
> method is not available anymore.
>
> What is the way to go from here? I would like to read from ORC files,
> run a query and transform the result. I do not necessarily need it to be
> with the DataSet API.
>
> Regards,
> Nikola
>
> On Mon, Mar 22, 2021 at 6:49 PM Timo Walther <[hidden email]> wrote:
>
>     Hi Nikola,
>
>
>     the OrcTableSource has not been updated to be used in a SQL DDL. You
>     can define your own table factory [1] that translates properties into
>     an object to create instances, or use
>     `org.apache.flink.table.api.TableEnvironment#fromTableSource`. I
>     recommend the latter option.
>
>     Please keep in mind that we are about to drop DataSet support for Table
>     API in 1.13. Batch and streaming use cases are already possible with
>     the unified TableEnvironment.
>
>     Are you sure that you really need DataSet API?
>
>     Regards,
>     Timo
>
>     [1]
>     https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sourcessinks/
>
>     On 21.03.21 15:42, Nikola Hrusov wrote:
>      > Hello,
>      >
>      > I am trying to find some examples of how to use the OrcTableSource
>      > and query it.
>      > I got to the documentation here:
>      > https://ci.apache.org/projects/flink/flink-docs-release-1.12/api/java/org/apache/flink/orc/OrcTableSource.html
>      > and it says that an OrcTableSource is used as below:
>      >
>      > OrcTableSource orcSrc = OrcTableSource.builder()
>      >     .path("file:///my/data/file.orc")
>      >     .forOrcSchema("struct<col1:boolean,col2:tinyint,col3:smallint,col4:int>")
>      >     .build();
>      > tEnv.registerTableSourceInternal("orcTable", orcSrc);
>      > Table res = tableEnv.sqlQuery("SELECT * FROM orcTable");
>      >
>      >
>      > My question is what should tEnv be so that I can use
>      > the registerTableSourceInternal method?
>      > My end goal is to query the orc source and then return a DataSet.
>      >
>      > Regards,
>      > Nikola
>