Re: OrcTableSource in flink 1.12
Posted by Nikola Hrusov
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/OrcTableSource-in-flink-1-12-tp42410p42442.html
Hi Timo,
I need to read ORC files and run a query on them, as in the example from the docs. Since that example is not recommended, what should I use instead? The documentation doesn't say.
I have looked through all the docs available for 1.12, but I cannot find how to achieve what was possible in earlier versions. Previously you could call `tableEnv.registerTableSource(tableName, orcTableSource);`, but that method is no longer available.
What is the way to go from here? I would like to read from ORC files, run a query, and transform the result. It does not necessarily have to be with the DataSet API.
Hi Nikola,
the OrcTableSource has not been updated to be used in a SQL DDL. You can
define your own table factory [1] that translates properties into an
object to create instances, or use
`org.apache.flink.table.api.TableEnvironment#fromTableSource`. I
recommend the latter option.
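
A minimal sketch of the second option, assuming Flink 1.12 with the legacy (pre-Blink) planner and the flink-orc module on the classpath, since OrcTableSource is a DataSet-based batch source. The file path, schema, and table name are the placeholders from the Javadoc example; the class name `OrcQuerySketch` is hypothetical. This has not been run against a real cluster, so treat it as a starting point rather than a verified recipe:

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.orc.OrcTableSource;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.BatchTableEnvironment;
import org.apache.flink.types.Row;

// Hypothetical class name, used only for this illustration.
public class OrcQuerySketch {
    public static void main(String[] args) throws Exception {
        // DataSet-based table environment (legacy planner); assumed here
        // because OrcTableSource is a legacy batch table source.
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        BatchTableEnvironment tEnv = BatchTableEnvironment.create(env);

        // Placeholder path and schema taken from the Javadoc example.
        OrcTableSource orcSrc = OrcTableSource.builder()
            .path("file:///my/data/file.orc")
            .forOrcSchema("struct<col1:boolean,col2:tinyint,col3:smallint,col4:int>")
            .build();

        // Wrap the source in a Table instead of registering it directly
        // (fromTableSource is deprecated but still present in 1.12).
        Table orcTable = tEnv.fromTableSource(orcSrc);

        // Give the table a name so it can be referenced from SQL.
        tEnv.createTemporaryView("orcTable", orcTable);
        Table res = tEnv.sqlQuery("SELECT * FROM orcTable");

        // Convert back to a DataSet only if the DataSet API is really needed.
        DataSet<Row> rows = tEnv.toDataSet(res, Row.class);
        rows.print(); // print() triggers execution of the batch job
    }
}
```

If the DataSet result is not needed, the last two statements can be dropped and the query result consumed through the Table API instead.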
Please keep in mind that we are about to drop DataSet support for Table
API in 1.13. Batch and streaming use cases are already possible with the
unified TableEnvironment.
Are you sure that you really need DataSet API?
Regards,
Timo
[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sourcessinks/
On 21.03.21 15:42, Nikola Hrusov wrote:
> Hello,
>
> I am trying to find some examples of how to use the OrcTableSource and
> query it.
> I got to the documentation here:
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/api/java/org/apache/flink/orc/OrcTableSource.html
> and it says that an OrcTableSource is used as below:
>
> OrcTableSource orcSrc = OrcTableSource.builder()
>   .path("file:///my/data/file.orc")
>   .forOrcSchema("struct<col1:boolean,col2:tinyint,col3:smallint,col4:int>")
>   .build();
>
> tEnv.registerTableSourceInternal("orcTable", orcSrc);
> Table res = tableEnv.sqlQuery("SELECT * FROM orcTable");
>
>
> My question is what should tEnv be so that I can use
> the registerTableSourceInternal method?
> My end goal is to query the ORC source and then return a DataSet.
>
> Regards,
> Nikola