Some users are running into issues when using Azure Blob Storage with the StreamingFileSink.
The issue arises because certain packages are relocated in the POM file and some classes are dropped from the final shaded jar. I have attempted to comment out the relocations and recompile the source, but I keep hitting roadblocks from other relocations and filters each time I update a specific POM file. How can this be addressed so that these users can be unblocked? Why are the classes filtered out? What is the workaround? I can work on the patch if I have some guidance. This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the same issue, but I have yet to confirm. Thanks. |
You can assign the task to me, and I would like to collaborate with someone to fix it. On Wed, May 27, 2020 at 5:52 PM Israel Ekpo <[hidden email]> wrote:
|
Hi, I think the StreamingFileSink does not support Azure currently. You can find more detailed info here [1]. On Thu, May 28, 2020 at 6:04 AM Israel Ekpo <[hidden email]> wrote:
|
Guowei, What do we need to do to add support for it? How do I get started on that? On Wed, May 27, 2020 at 8:53 PM Guowei Ma <[hidden email]> wrote:
|
Hi Israel, thanks for reaching out to the Flink community. As Guowei said, the StreamingFileSink can currently only recover from faults if it writes to HDFS or S3. Other file systems are currently not supported if you need fault tolerance. Maybe Klou can tell you more about the background and what is needed to make it work with other file systems. He is one of the original authors of the StreamingFileSink. Cheers, Till On Thu, May 28, 2020 at 4:39 PM Israel Ekpo <[hidden email]> wrote:
|
Hi Till, Thanks for your feedback and guidance. It seems similar work was done for the S3 file systems, where relocations were removed from those file system plugins. It appears the same needs to be done for the Azure file systems. I will try to connect with Klou today to see what level of effort is needed to add this support. Thanks. On Thu, May 28, 2020 at 11:54 AM Till Rohrmann <[hidden email]> wrote:
|
I think what needs to be done is to implement an org.apache.flink.core.fs.RecoverableWriter for the respective file system, similar to HadoopRecoverableWriter and S3RecoverableWriter. Cheers, Till On Thu, May 28, 2020 at 6:00 PM Israel Ekpo <[hidden email]> wrote:
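To make the idea of a recoverable writer a little more concrete, here is a minimal, self-contained Java sketch of the general contract: write to an in-progress file, record a durable offset at checkpoint time, drop anything written after that offset on recovery, and publish the result with a rename on commit. This is a toy model on the local file system and not Flink's actual org.apache.flink.core.fs.RecoverableWriter interface. The names ToyRecoverableWriter, Resumable, Stream, persist, recover and closeAndCommit, as well as the ".inprogress" naming, are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Toy model of a recoverable writer (hypothetical; NOT Flink's real interface). */
final class ToyRecoverableWriter {

    /** State a checkpoint would store: where the data lives and how much of it is durable. */
    static final class Resumable {
        final Path tempFile;
        final Path targetFile;
        final long validLength;

        Resumable(Path tempFile, Path targetFile, long validLength) {
            this.tempFile = tempFile;
            this.targetFile = targetFile;
            this.validLength = validLength;
        }
    }

    /** Output stream over the in-progress file. */
    static final class Stream implements AutoCloseable {
        private final RandomAccessFile file;
        private final Path tempFile;
        private final Path targetFile;

        Stream(Path tempFile, Path targetFile, long startOffset) throws IOException {
            this.tempFile = tempFile;
            this.targetFile = targetFile;
            this.file = new RandomAccessFile(tempFile.toFile(), "rw");
            // Discard every byte beyond the last durable offset, then append from there.
            file.setLength(startOffset);
            file.seek(startOffset);
        }

        void write(String text) throws IOException {
            file.write(text.getBytes(StandardCharsets.UTF_8));
        }

        /** Flushes to disk and returns the state needed to resume after a failure. */
        Resumable persist() throws IOException {
            file.getFD().sync();
            return new Resumable(tempFile, targetFile, file.getFilePointer());
        }

        /** Makes the data visible under its final name. */
        void closeAndCommit() throws IOException {
            file.getFD().sync();
            file.close();
            Files.move(tempFile, targetFile, StandardCopyOption.ATOMIC_MOVE);
        }

        @Override
        public void close() throws IOException {
            file.close();
        }
    }

    /** Starts a new in-progress file next to the target. */
    Stream open(Path target) throws IOException {
        Path temp = target.resolveSibling("." + target.getFileName() + ".inprogress");
        return new Stream(temp, target, 0L);
    }

    /** Resumes writing from the offset recorded by persist(). */
    Stream recover(Resumable resumable) throws IOException {
        return new Stream(resumable.tempFile, resumable.targetFile, resumable.validLength);
    }
}
```

The sketch relies on truncation and atomic rename, which a local or HDFS-like file system offers. An object store may not provide those primitives in the same form, which is presumably why the S3 writer is built around multipart uploads instead; a writer for Azure Blob Storage would likely need an analogous mechanism of its own.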
|
Thanks Till. I will take a look at that tomorrow and let you know if I hit any roadblocks. On Thu, May 28, 2020 at 12:11 PM Till Rohrmann <[hidden email]> wrote:
|