I need to process some Parquet data from S3 as a unioned input in my DataStream pipeline. From what I know, this requires using the Hadoop AvroParquetInputFormat. The problem I'm running into is that this also requires un-shaded Hadoop classes, which conflict with the Flink shaded hadoop3 FileSystem. The pipeline otherwise runs fine with the shaded fs.
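For reference, my dependencies look roughly like this (the versions and exact artifact set here are illustrative guesses, not my actual pom; the shaded S3 filesystem itself is deployed as the flink-s3-fs-hadoop plugin):

```xml
<!-- Shaded S3 filesystem, dropped into the plugins/ dir: works fine on its own -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-s3-fs-hadoop</artifactId>
  <version>1.15.2</version> <!-- illustrative version -->
</dependency>

<!-- AvroParquetInputFormat comes from here... -->
<dependency>
  <groupId>org.apache.parquet</groupId>
  <artifactId>parquet-avro</artifactId>
  <version>1.12.3</version> <!-- illustrative version -->
</dependency>

<!-- ...but using it pulls in un-shaded Hadoop classes that clash with the shaded ones -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>3.3.4</version> <!-- illustrative version -->
</dependency>
```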
Has anyone successfully read Parquet data using the Flink shaded S3 fs? If so, can you please clue me in?