Hi Flink Team,
I'm Mariano, and I'm using Apache Flink to process data from Kafka and sink it to Azure Data Lake (ADLS Gen1).
We are having problems writing the sink output in Parquet format to ADLS Gen1; it doesn't work with Gen2 either.
We tried using the StreamingFileSink to store the data as Parquet, but we can't write to ADLS because Hadoop doesn't work correctly with the library and the adls prefix (an HDFS problem: https://stackoverflow.com/questions/62884450/problem-with-flink-streamingfilesinkgenericrecord-azure-datalake-gen-2).
We then switched to the deprecated BucketingSink, which does work for ADLS without .setWriter, but we can't write in Parquet format when using .setWriter.
Do you have any suggestions for sinking to ADLS Gen1 or Gen2, or is there an upcoming feature that would let us use the StreamingFileSink?
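For reference, the StreamingFileSink attempt looked roughly like the sketch below (the abfs:// path, Avro schema, and input stream are placeholders, not our real values). This is the standard bulk-format pattern from the Flink docs; the failure happens when Flink tries to resolve the ADLS URI scheme for checkpointed writes:

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class ParquetSinkSketch {

    // Attaches a Parquet bulk-format sink to an existing stream.
    // The schema and stream come from elsewhere in the job.
    static void attachParquetSink(DataStream<GenericRecord> stream, Schema schema) {
        StreamingFileSink<GenericRecord> sink = StreamingFileSink
            .forBulkFormat(
                // Placeholder path; this is the scheme that fails to resolve
                // (see FLINK-18568).
                new Path("abfs://container@account.dfs.core.windows.net/output"),
                ParquetAvroWriters.forGenericRecord(schema))
            .build();
        stream.addSink(sink);
    }
}
```

Running this requires the flink-parquet and hadoop-azure dependencies on the classpath; the exception is thrown at job submission when the filesystem for the abfs:// (or adl://) scheme cannot be loaded.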
Thank you very much
Best regards
Hi Mariano,
Thanks a lot for your question. The resolution on StackOverflow seems to be that Azure Data Lake is not yet supported by the StreamingFileSink (see https://issues.apache.org/jira/browse/FLINK-18568).
Hi Robert,
Thanks for the answer...
From: Robert Metzger <[hidden email]>
Sent: Tuesday, August 11, 2020, 3:46
To: Mariano González Núñez <[hidden email]>
Cc: [hidden email] <[hidden email]>
Subject: Re: BucketingSink & StreamingFileSink