Hi Dan,
SQL adds the uuid by default for the case where users execute
multiple bounded SQL jobs that append to the same directory (Hive table);
the uuid is attached to avoid overwriting the previous output.
DataStream can be viewed as providing the low-level API, so
it does not add the uuid automatically. And as you have pointed out,
users can implement the same functionality with OutputFileConfig.
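For illustration, here is a minimal sketch of how a job could generate such a prefix itself. The class and helper names (UuidPartPrefix, newPartPrefix, partFileName) are hypothetical and only mirror the "part-{uuid string}-0-0" shape from your mail; the Flink wiring is left as a comment so the snippet runs without Flink on the classpath, and it assumes the prefix is generated once per job submission.

```java
import java.util.UUID;

public class UuidPartPrefix {

    // Hypothetical helper: builds a prefix such as "part-<uuid>".
    // Call it once per job so that all part files of one run share the uuid.
    public static String newPartPrefix() {
        return "part-" + UUID.randomUUID();
    }

    // Illustrates the resulting name shape: <prefix>-<subtask index>-<part counter>.
    // This only mimics the naming described in this thread, not Flink internals.
    public static String partFileName(String prefix, int subtaskIndex, long partCounter) {
        return prefix + "-" + subtaskIndex + "-" + partCounter;
    }

    public static void main(String[] args) {
        String prefix = newPartPrefix();
        System.out.println(partFileName(prefix, 0, 0));

        // With Flink on the classpath, the prefix would be passed along like this
        // (outputPath and encoder are placeholders for your own values):
        //
        // OutputFileConfig config = OutputFileConfig.builder()
        //         .withPartPrefix(prefix)
        //         .build();
        // StreamingFileSink<String> sink = StreamingFileSink
        //         .forRowFormat(outputPath, encoder)
        //         .withOutputFileConfig(config)
        //         .build();
    }
}
```

Since each submission gets a fresh uuid, two bounded jobs writing to the same directory should not collide on "part-0-0".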
Best,
Yun
------------------Original Mail ------------------
Send Date:Mon Feb 8 07:40:36 2021
Subject:UUID in part files
Hi.
Context
I'm migrating my Flink SQL job to DataStream. When switching to StreamingFileSink, I noticed that the part files no longer have a uuid in them: "part-0-0" vs "part-{uuid string}-0-0". This is easy to add with OutputFileConfig.
Question
Is there a reason why the base OutputFileConfig doesn't add the uuid automatically? Is this just a legacy issue? Or do most people not have the uuid in the file outputs?