How to use Hadoop InputFormats with Flink shaded S3?

Cliff Resnick
I need to process some Parquet data from S3 as a unioned input in my DataStream pipeline. As far as I know, this requires using the Hadoop AvroParquetInputFormat. The problem I'm running into is that this also requires un-shaded Hadoop classes, which conflict with the Flink shaded hadoop3 FileSystem. The pipeline otherwise runs fine with the shaded fs.

Has anyone successfully read Parquet data using the Flink shaded S3 fs? If so, could you please clue me in?