Re: StreamingFileSink only writes data to MINIO during savepoint

Posted by David Anderson-4 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/StreamingFileSink-only-writes-data-to-MINIO-during-savepoint-tp44043p44089.html

The StreamingFileSink requires that you have checkpointing enabled. I'm guessing that you don't have checkpointing enabled, since that would explain the behavior you are seeing.

The relevant section of the docs [1] explains:

Checkpointing needs to be enabled when using the StreamingFileSink. Part files can only be finalized on successful checkpoints. If checkpointing is disabled, part files will forever stay in the in-progress or the pending state, and cannot be safely read by downstream systems.
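
As a rough sketch of what enabling checkpointing could look like, the fragment below sets a periodic checkpoint interval in the cluster configuration. The 10-second value is an arbitrary assumption for illustration, not a recommendation; choose an interval that matches how quickly downstream readers need finalized part files. The same thing can be done in the job code by calling enableCheckpointing(10000) on the StreamExecutionEnvironment.

```yaml
# flink-conf.yaml (illustrative fragment; the interval value is an assumption)
# Part files written by the StreamingFileSink can only be finalized
# when a checkpoint completes, so a checkpoint interval must be set.
execution.checkpointing.interval: 10s
```

With an interval like this in place, finalized files should show up in the bucket after each successful checkpoint rather than only after a manually triggered savepoint.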

Regards,
David

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/datastream/streamfile_sink/#streaming-file-sink

On Fri, May 28, 2021 at 5:26 PM Robert Cullen <[hidden email]> wrote:
On my Kubernetes cluster, when I set the StreamingFileSink to write to a local instance of S3 (MINIO - 500 GB), it only writes the data after I execute a savepoint.

The expected behavior is to write the data in real time. I'm guessing the memory requirements have not been met or a configuration in MINIO is missing? Any ideas?

--
Robert Cullen
240-475-4490