Are there any examples showing the use of Beam with Avro/Parquet and the Flink runner? I see an Avro reader for Beam. Is it a matter of writing another one for avro-parquet, or does this need to use Flink's HadoopOutputFormat, for example?

Thanks
Billy

Hi,
Beam provides an AvroCoder/AvroIO that you can use, but not yet a ParquetIO (I created a Jira issue for it and have started working on it).

You can use the Avro reader to populate the PCollection and then use a custom DoFn to write the Parquet output while waiting for the ParquetIO.

Regards
JB

(In reply to Newport, Billy, 12/12/2016 05:19 PM)

--
Jean-Baptiste Onofré [hidden email] http://blog.nanthrax.net Talend - http://www.talend.com

I don't mind writing one. Is there a fork with the ParquetIO work that's already been done, or is it in trunk?

The ParquetIO is independent of the runner being used, is that right?

Thanks

(In reply to Jean-Baptiste Onofré, sent Monday, December 12, 2016 11:25 AM, subject "Re: Avro Parquet/Flink/Beam")

Hi Billy,
I will push my branch with ParquetIO to my GitHub.

Yes, a Beam IO is independent of the runner.

Regards
JB

(In reply to Newport, Billy, 12/12/2016 05:29 PM)

Is your ParquetIO going to be accepted into 0.4?

Also, do you have a link to your GitHub?

Thanks

(In reply to Jean-Baptiste Onofré, sent Monday, December 12, 2016 11:49 AM, subject "Re: Avro Parquet/Flink/Beam")

Hi Billy,
No, ParquetIO is at an early stage and won't be included in 0.4.0-incubating (which I will prepare pretty soon).

I will push the branch to my GitHub (I haven't had time yet, sorry about that).

Regards
JB

(In reply to Newport, Billy, 12/13/2016 05:08 PM)

Did you manage to push yet?
Thanks

(In reply to Jean-Baptiste Onofré, sent Tuesday, December 13, 2016 11:12 AM, subject "Re: Avro Parquet/Flink/Beam")