Parquet format in Flink 1.11

Flavio Pompermaier
Hi all,
in my current code I use the legacy Hadoop output format to write my Parquet files.
I wanted to switch to the new Parquet format in Flink 1.11, but I can't find how to migrate the following properties:

ParquetOutputFormat.setBlockSize(job, parquetBlockSize);
ParquetOutputFormat.setEnableDictionary(job, true);
ParquetOutputFormat.setCompression(job, CompressionCodecName.SNAPPY);

Is there a way to set those configs?
If not, is there a way to handle them without modifying the source of the Flink connector (i.e. by extending some class)?

Best,
Flavio

Re: Parquet format in Flink 1.11

godfrey he
Hi Flavio,

The Parquet format accepts the configuration options of ParquetOutputFormat. Please refer to [1] for details.


Best,
Godfrey
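
A minimal sketch of what this could look like, assuming the table is written through the SQL filesystem connector with 'format' = 'parquet' and that keys starting with "parquet." are passed through to the writer. The three keys are the configuration names behind ParquetOutputFormat.setBlockSize, setEnableDictionary and setCompression in parquet-hadoop. The class name, table name, columns and path are hypothetical, and the code only builds the DDL string, so it runs without Flink on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ParquetSinkDdl {

    // Keys used by parquet-hadoop's ParquetOutputFormat
    // (BLOCK_SIZE, ENABLE_DICTIONARY, COMPRESSION).
    static Map<String, String> parquetOptions(long blockSizeBytes) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("parquet.block.size", Long.toString(blockSizeBytes));
        options.put("parquet.enable.dictionary", "true");
        options.put("parquet.compression", "SNAPPY");
        return options;
    }

    // Builds a CREATE TABLE statement; the schema (id, name) is a placeholder.
    static String buildDdl(String tableName, String path, Map<String, String> formatOptions) {
        Map<String, String> with = new LinkedHashMap<>();
        with.put("connector", "filesystem");
        with.put("path", path);
        with.put("format", "parquet");
        with.putAll(formatOptions);

        String withClause = with.entrySet().stream()
                .map(e -> "  '" + e.getKey() + "' = '" + e.getValue() + "'")
                .collect(Collectors.joining(",\n"));

        return "CREATE TABLE " + tableName + " (\n"
                + "  id BIGINT,\n"
                + "  name STRING\n"
                + ") WITH (\n" + withClause + "\n)";
    }

    public static void main(String[] args) {
        // 128 MiB block size, chosen only for illustration.
        String ddl = buildDdl("parquet_sink", "file:///tmp/parquet-out",
                parquetOptions(128L * 1024 * 1024));
        System.out.println(ddl);
        // With Flink available, the statement would be handed to
        // TableEnvironment#executeSql(ddl).
    }
}
```

Whether every ParquetOutputFormat key is honored by the connector is worth checking against the documentation referenced as [1] for the Flink version in use.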



In reply to Flavio Pompermaier <[hidden email]>, Wed, Jul 15, 2020 at 8:44 PM

Re: Parquet format in Flink 1.11

Flavio Pompermaier
Ok, thanks Godfrey.

In reply to godfrey he <[hidden email]>, Wed, Jul 15, 2020 at 3:03 PM