I was hoping to join a StreamTableSource to a BatchTableSource, but I find it’s not simple. A couple of questions:
1) Other than pushing the DataSet to a Kafka topic (either internally or externally to the application) and reading it back into a DataStream, are there any other means of doing the conversion?
2) Are there any plans for OrcTableSource to become both a StreamTableSource and a BatchTableSource instead of just a BatchTableSource?
Thanks,
James
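For reference, the Kafka bridge mentioned in question 1 could look roughly like the following. This is only a minimal sketch, not an official connector: the class name KafkaOutputFormat, the topic name, and the record type are made up for illustration, and it assumes the kafka-clients and flink-connector-kafka dependencies are on the classpath.

    import java.util.Properties;

    import org.apache.flink.api.common.io.RichOutputFormat;
    import org.apache.flink.configuration.Configuration;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    // Writes each record of a DataSet to a Kafka topic so that a separate
    // streaming job can read the same data back as a DataStream.
    public class KafkaOutputFormat extends RichOutputFormat<String> {

        private final String topic;
        private final Properties props;
        private transient KafkaProducer<String, String> producer;

        public KafkaOutputFormat(String topic, Properties props) {
            // props must contain at least "bootstrap.servers".
            this.topic = topic;
            this.props = props;
        }

        @Override
        public void configure(Configuration parameters) {
        }

        @Override
        public void open(int taskNumber, int numTasks) {
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            producer = new KafkaProducer<>(props);
        }

        @Override
        public void writeRecord(String record) {
            producer.send(new ProducerRecord<>(topic, record));
        }

        @Override
        public void close() {
            producer.close();
        }
    }

    // Batch side (hypothetical topic and record type):
    //   dataSet.map(MyRecord::toCsvLine).output(new KafkaOutputFormat("dim-topic", props));
    // Streaming side:
    //   env.addSource(new FlinkKafkaConsumer011<>("dim-topic", new SimpleStringSchema(), props));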
|
Hi Porritt,
Flink does not currently support joining a stream with a batch dataset; streaming and batch jobs are independent of each other. I guess your use case is joining a stream with a dimension table? Unfortunately, it is not possible for the Flink SQL API to join a stream with a static dataset at the moment.
1) As a workaround, if the table is just a tiny one, you can achieve an inner/left outer join with a user-defined table function (a rough sketch follows below): https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#joins
2) I have not seen any plan about this.
Thanks, vino.
2018-07-20 17:29 GMT+08:00 Porritt, James <[hidden email]>:
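A minimal, hypothetical sketch of that workaround, assuming a Flink 1.4-era Table API; the class name DimLookup, the field names, and the hard-coded lookup data are placeholders, not anything from this thread:

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.table.functions.FunctionContext;
    import org.apache.flink.table.functions.TableFunction;

    // Emits at most one row per lookup key, so the lateral join behaves like an
    // inner join; emitting nothing for a missing key drops the stream row.
    public class DimLookup extends TableFunction<Tuple2<String, String>> {

        // In-memory copy of the small "batch" table, keyed by id.
        private final Map<String, String> dim = new HashMap<>();

        @Override
        public void open(FunctionContext context) {
            // Load the small table here (file, JDBC, ...); hard-coded for the sketch.
            dim.put("1", "red");
            dim.put("2", "blue");
        }

        public void eval(String key) {
            String value = dim.get(key);
            if (value != null) {
                collect(Tuple2.of(key, value));
            }
        }
    }

    // Register the function and join it to the stream in SQL, e.g.:
    //   tableEnv.registerFunction("DimLookup", new DimLookup());
    //   SELECT s.id, s.amount, colour
    //   FROM StreamTable AS s,
    //        LATERAL TABLE(DimLookup(s.id)) AS d(dim_id, colour)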
|
Hi James,
1) Unfortunately, Flink does not support joining a DataSet with a DataStream as of now. If the "batch" table is small enough, you might try the solution suggested by Vino and load it in the UDTF. You can also try implementing the stream version of this table source yourself; you can use org.apache.flink.table.sources.CsvTableSource and org.apache.flink.orc.OrcRowInputFormat as examples (a rough sketch of one possible approach follows below).
2) Providing better out-of-the-box support for multiple sources and formats is high on the roadmap for upcoming releases, so I would guess you can expect streaming support for ORC in the near future.
Best, Dawid
On Fri, 20 Jul 2018 at 11:59, vino yang <[hidden email]> wrote:
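Rather than implementing a full StreamTableSource, a simpler route along the same lines would be to wrap OrcRowInputFormat in a DataStream and register it as a table. A rough sketch, assuming Flink 1.4/1.5-style APIs; the path, the ORC schema string, and the field names are placeholders, and the exact OrcRowInputFormat constructor may differ by version:

    import org.apache.hadoop.conf.Configuration;

    import org.apache.flink.orc.OrcRowInputFormat;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    public class OrcAsStream {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

            // Read the ORC file(s) once; the result is a bounded DataStream of Rows.
            OrcRowInputFormat orcFormat = new OrcRowInputFormat(
                    "hdfs:///path/to/dim.orc",              // placeholder path
                    "struct<id:string,colour:string>",      // placeholder ORC schema
                    new Configuration());

            DataStream<Row> orcStream = env.createInput(orcFormat, orcFormat.getProducedType());

            // Register the stream as a table so it can be used from SQL / Table API.
            Table dim = tEnv.fromDataStream(orcStream, "id, colour");
            tEnv.registerTable("OrcDim", dim);

            // ... define the rest of the streaming job, then:
            env.execute("orc-as-stream");
        }
    }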