Hello,
StreamingFileSink's part file naming convention is not adjustable. Part files always have the form part-<integer>-<integer>, with no extension.
My use case for StreamingFileSink is a Kafka -> S3 pipeline, where the files are then read from S3 and processed with Spark. In almost all cases, I want to compress the raw data before writing it to S3 using the BulkFormat.
Spark relies on filename extensions to infer compression, so with the current naming scheme it does not recognize the files as compressed and reads them as gibberish. I see that 1.10 adds the ability to customize the part file prefix/suffix, but I need a solution as soon as possible. Can this be backported to 1.9, or are there alternatives?
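
To illustrate the problem, here is a minimal sketch of how suffix-based codec inference behaves. This is a simplified model for illustration only, not Spark's or Hadoop's actual code; the class name SuffixCodecLookup, the method inferCodec, and the suffix table are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of extension-based compression inference.
// It only shows why a part file without an extension cannot be matched.
final class SuffixCodecLookup {
    // Assumed suffix-to-codec table; real readers support more codecs.
    private static final Map<String, String> CODEC_BY_SUFFIX = new LinkedHashMap<>();

    static {
        CODEC_BY_SUFFIX.put(".gz", "gzip");
        CODEC_BY_SUFFIX.put(".bz2", "bzip2");
        CODEC_BY_SUFFIX.put(".snappy", "snappy");
    }

    // Returns the codec whose suffix the file name ends with, or empty if
    // no suffix matches (the file would then be read as uncompressed).
    static Optional<String> inferCodec(String fileName) {
        for (Map.Entry<String, String> entry : CODEC_BY_SUFFIX.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }
}
```

With this model, inferCodec("part-0-0") returns an empty result, so the compressed bytes would be treated as plain data, while inferCodec("part-0-0.gz") resolves to gzip. That is why being able to set a suffix on the part files would solve my problem.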