Re: Avro Parquet/Flink/Beam

Posted by Jean-Baptiste Onofré on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Avro-Parquet-Flink-Beam-tp10590p10593.html

Hi Billy,

I will push my ParquetIO branch to my GitHub.

Yes, a Beam IO is independent of the runner.
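
To illustrate what runner independence means, here is a minimal plain-Java sketch. It deliberately does not use the Beam API: Source, Sink, Runner, SequentialRunner, ParallelRunner and runPipeline are all hypothetical names made up for this example. The only point is that the read / transform / write steps are declared once, and the runner decides how they get executed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

public class RunnerIndependenceSketch {

    /** Hypothetical stand-in for a source, e.g. an Avro reader. */
    interface Source<T> {
        List<T> read();
    }

    /** Hypothetical stand-in for a sink, e.g. a Parquet writer. */
    interface Sink<T> {
        void write(T element);
    }

    /** Hypothetical stand-in for a runner: it only decides HOW the steps execute. */
    interface Runner {
        <A, B> void run(Source<A> source, Function<A, B> perElement, Sink<B> sink);
    }

    /** Processes elements one after another on the calling thread. */
    static class SequentialRunner implements Runner {
        @Override
        public <A, B> void run(Source<A> source, Function<A, B> perElement, Sink<B> sink) {
            for (A element : source.read()) {
                sink.write(perElement.apply(element));
            }
        }
    }

    /** Processes elements on a parallel stream, so output order is not guaranteed. */
    static class ParallelRunner implements Runner {
        @Override
        public <A, B> void run(Source<A> source, Function<A, B> perElement, Sink<B> sink) {
            source.read().parallelStream().map(perElement).forEach(sink::write);
        }
    }

    /** Declares the same read -> transform -> write steps and hands them to the given runner. */
    static List<String> runPipeline(Runner runner) {
        Source<Integer> source = () -> Arrays.asList(1, 2, 3, 4);
        Function<Integer, String> toRow = id -> "row-" + id;
        // Thread-safe list, because the parallel runner writes from several threads.
        List<String> written = Collections.synchronizedList(new ArrayList<>());
        runner.run(source, toRow, written::add);
        // Sort so that results can be compared regardless of execution order.
        List<String> sorted = new ArrayList<>(written);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(new SequentialRunner()));
        System.out.println(runPipeline(new ParallelRunner()));
    }
}
```

Both calls in main produce the same set of rows. In this toy model the pipeline code in runPipeline never changes when the runner is swapped.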

Regards
JB

On 12/12/2016 05:29 PM, Newport, Billy wrote:

> I don't mind writing one. Is there a fork with the ParquetIO work that's already been done, or is it in trunk?
>
> The ParquetIO is independent of the runner being used, is that right?
>
> Thanks
>
> -----Original Message-----
> From: Jean-Baptiste Onofré [mailto:[hidden email]]
> Sent: Monday, December 12, 2016 11:25 AM
> To: [hidden email]
> Subject: Re: Avro Parquet/Flink/Beam
>
> Hi,
>
> Beam provides an AvroCoder and an AvroIO that you can use, but no
> ParquetIO yet (I created a Jira issue for that and started to work on it).
>
> You can use the Avro reader to populate the PCollection and then use a
> custom DoFn to write the Parquet files (until the ParquetIO is available).
>
> Regards
> JB
>
> On 12/12/2016 05:19 PM, Newport, Billy wrote:
>> Are there any examples showing the use of Beam with Avro/Parquet and the
>> Flink runner? I see an Avro reader for Beam. Is it a matter of writing
>> another one for avro-parquet, or does this need to use the Flink
>> HadoopOutputFormat, for example?
>>
>> Thanks
>>
>> Billy
>>
>

--
Jean-Baptiste Onofré
[hidden email]
http://blog.nanthrax.net
Talend - http://www.talend.com