Hello all,

We are trying to write a DataSet in Parquet format. We use AvroParquetOutputFormat, but it is not compatible with Flink's FileOutputFormat. Is there a way to write a DataSet as Parquet?

-Ebru
In reply to ebru <[hidden email]>, 2017-11-22 15:21 GMT+01:00:

Hi Ebru,

AvroParquetOutputFormat seems to implement Hadoop's OutputFormat interface. Flink provides a wrapper for Hadoop's OutputFormat [1], so you can try to wrap AvroParquetOutputFormat in Flink's HadoopOutputFormat.

Hope this helps,
Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/hadoop_compatibility.html#using-hadoop-outputformats
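A minimal sketch of the wrapping suggested above, not a tested recipe. It assumes the flink-hadoop-compatibility and parquet-avro artifacts are on the classpath, a Parquet version (org.apache.parquet, 1.8 or later) in which AvroParquetOutputFormat is generic, and Avro GenericRecord as the record type. The class name ParquetOutputSketch, the helper names, and the output path are hypothetical. Because ParquetOutputFormat uses Void as its key type, the DataSet has to carry Tuple2<Void, GenericRecord> elements with a null key.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.parquet.avro.AvroParquetOutputFormat;

/** Hypothetical helper class; the names are illustrative only. */
public final class ParquetOutputSketch {

    private ParquetOutputSketch() {
    }

    /**
     * Wraps AvroParquetOutputFormat in Flink's HadoopOutputFormat.
     * The Hadoop Job object only serves as a carrier for the configuration.
     */
    public static HadoopOutputFormat<Void, GenericRecord> parquetOutput(
            Schema schema, String outputDir) throws IOException {
        Job job = Job.getInstance();
        // Tell the Parquet writer which Avro schema the records follow.
        AvroParquetOutputFormat.setSchema(job, schema);
        // Standard Hadoop way of setting the output directory.
        FileOutputFormat.setOutputPath(job, new Path(outputDir));
        return new HadoopOutputFormat<>(new AvroParquetOutputFormat<GenericRecord>(), job);
    }

    /**
     * Attaches the wrapped format as a sink. Each element is (null, record),
     * since the Parquet output format ignores the key.
     */
    public static void writeAsParquet(
            DataSet<Tuple2<Void, GenericRecord>> records,
            Schema schema,
            String outputDir) throws IOException {
        records.output(parquetOutput(schema, outputDir));
    }
}
```

Usage would be along the lines of mapping each record to new Tuple2<>(null, record), calling writeAsParquet(...), and then env.execute(). How Flink serializes GenericRecord between operators depends on the Flink and Avro versions in use, so generated Avro classes may be the safer choice in practice.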
On 22 Nov 2017, at 20:47, Flavio Pompermaier <[hidden email]> wrote (in reply to Fabian Hueske, 22 Nov 2017 18:29):

I usually refer to this:
https://github.com/FelixNeutatz/parquet-flinktacular