Hi Dominik,

I think having a single output file is only possible if you set the parallelism of the sink to 1. AFAIK it is not possible to concurrently write to a single HDFS file from multiple clients.
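
In a Flink job this would roughly mean calling something like stream.addSink(new RollingSink<String>(path)).setParallelism(1), so that only the sink runs with a single instance while the rest of the topology keeps its parallelism of 6. To illustrate why the number of files follows the sink parallelism, here is a minimal plain-Java sketch that does not use Flink at all. The class PartFileSketch, the method write and the file names part-<subtask>-0 are made up for this illustration, and the round-robin distribution of records is an assumption, not how Flink necessarily partitions data.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical model of parallel sink instances; not Flink code.
public class PartFileSketch {

    // Spreads the records round-robin over `parallelism` simulated sink subtasks.
    // Each subtask appends only to its own file, since a file has a single writer.
    // Returns the sorted names of the files found in `dir` afterwards.
    static List<String> write(Path dir, List<String> records, int parallelism) throws IOException {
        for (int i = 0; i < records.size(); i++) {
            int subtask = i % parallelism;
            Path part = dir.resolve("part-" + subtask + "-0"); // illustrative naming only
            Files.write(part,
                    (records.get(i) + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
        try (Stream<Path> files = Files.list(dir)) {
            return files.map(p -> p.getFileName().toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            records.add("value-" + i);
        }
        // Six simulated sink instances: one file per instance.
        System.out.println(write(Files.createTempDirectory("sink-p6"), records, 6));
        // One simulated sink instance: a single file holding every record.
        System.out.println(write(Files.createTempDirectory("sink-p1"), records, 1));
    }
}
```

With one writer, every record ends up in the same file, at the cost of funnelling all output through a single sink instance.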
Cheers,
Aljoscha

Hi everyone,
although this question might sound trivial, I've been curious about the following. Given a Flink topology with parallelism set to 6, for example, that writes its data stream to HDFS through a RollingSink instance, how is the output structured? By "structured" I mean that this setup results in 6 distinct block files, whereas I would like to have a single file containing all of the output values from the DataStream.
Regards,
Dominik