Hi All,
Is there a way to specify the batch size and compression properties when using StreamingFileSink, just like we did with BucketingSink? The only parameters it accepts are the inactivity bucket check interval and the Avro schema. We have numerous Flink jobs pulling data from the same Kafka topics but performing different operations, and each job writes files of a different size; we would like to make that consistent.

Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
Can someone please help with this?

On Fri 25 Jan, 2019, 1:47 PM Taher Koitawala <[hidden email]> wrote:
Hi,

Generally speaking, you can pass the batch size through a RollingPolicy [1]. Unfortunately, bulk formats use OnCheckpointRollingPolicy, and AFAIK it does not allow adjusting its behavior based on part size. Maybe Kostas has an idea how to do that in the least invasive way. For how to do it with non-bulk formats, you can have a look at [2]; it comes down to something like the sketch below.
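(A minimal row-format sketch, not from the original mail: it assumes a DataStream<String>, and the path, size, and intervals are arbitrary placeholders.)

    import java.util.concurrent.TimeUnit;

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
    import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.DefaultRollingPolicy;

    // Row formats accept a custom rolling policy, so the part size can be
    // controlled directly. The values below are arbitrary examples.
    StreamingFileSink<String> sink = StreamingFileSink
            .forRowFormat(new Path("hdfs:///tmp/row-out"),
                    new SimpleStringEncoder<String>("UTF-8"))
            .withRollingPolicy(
                    DefaultRollingPolicy.create()
                            .withMaxPartSize(128 * 1024 * 1024)                  // roll at ~128 MB
                            .withRolloverInterval(TimeUnit.MINUTES.toMillis(15))
                            .withInactivityInterval(TimeUnit.MINUTES.toMillis(5))
                            .build())
            .build();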
I assume that you were using AvroParquetWriter. You can specify compression on the ParquetWriter, I guess the same way as before. The code for doing it can look something like this:
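(A rough sketch, untested: it assumes flink-parquet and parquet-avro on the classpath, an Avro Schema named `schema`, and a DataStream<GenericRecord> named `stream`; the output path and the SNAPPY codec are placeholders.)

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;

    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.ParquetWriterFactory;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    import org.apache.parquet.avro.AvroParquetWriter;
    import org.apache.parquet.hadoop.metadata.CompressionCodecName;

    // Keep the schema as a String so the factory stays serializable
    // (the same trick ParquetAvroWriters uses internally).
    final String schemaString = schema.toString();

    // ParquetAvroWriters.forGenericRecord(schema) does not expose the
    // compression codec, so build the ParquetWriterFactory directly and
    // set the codec on the AvroParquetWriter builder.
    ParquetWriterFactory<GenericRecord> writerFactory = new ParquetWriterFactory<>(
            out -> AvroParquetWriter.<GenericRecord>builder(out)
                    .withSchema(new Schema.Parser().parse(schemaString))
                    .withCompressionCodec(CompressionCodecName.SNAPPY)
                    .build());

    StreamingFileSink<GenericRecord> sink = StreamingFileSink
            .forBulkFormat(new Path("hdfs:///tmp/parquet-out"), writerFactory)
            .build();

    stream.addSink(sink);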
Best,
Dawid

On 26/01/2019 08:18, Taher Koitawala wrote:
Thanks, I'll check it out.

On Sat 26 Jan, 2019, 10:48 PM Dawid Wysakowicz <[hidden email]> wrote: