When running Flink 1.7 on EMR 5.21 with the StreamingFileSink, we see java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are only supported for HDFS and for Hadoop version 2.7 or newer. EMR reports Hadoop version 2.8.5. Is anyone else seeing this issue?
Hi Kevin, could you check what's on the class path of the Flink cluster? You should see this at the top of the jobmanager.log. It seems as if there is a Hadoop dependency with a lower version on the class path. Which Hadoop version was your Flink 1.7 built against? You should make sure that you use either the Hadoop-free version or a version where the bundled Hadoop version is >= 2.7. I'm not sure what options EMR offers here. Cheers, Till
Hi Till,
The only potential issue I see on the class path is `/usr/share/aws/emr/emrfs/lib/emrfs-hadoop-assembly-2.29.0.jar`. I double-checked my pom; the project is Hadoop-free. The JM log also shows `INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Hadoop version: 2.8.5-amzn-1`. Best, Kevin
Hmm, good question. I've pulled in Kostas, who worked on the StreamingFileSink. He might be able to tell you more in case there is some special behaviour with respect to the Hadoop file systems. Cheers, Till
Hi Kevin, I cannot find anything obviously wrong in what you describe. Just to eliminate the obvious: you are specifying "hdfs" as the scheme for your file path, right? Cheers, Kostas
Hi, I am having the same issue, and it is related to what Kostas is pointing out. I was trying to stream to the "s3" scheme rather than "hdfs", and was getting that exception. I have realised that I somehow need to reach the S3RecoverableWriter, and found out that it lives in a different library, flink-s3-fs-hadoop. I am still trying to figure out how to make it work, though. I am aiming for code such as:

    val sink = StreamingFileSink
      .forBulkFormat(new Path("s3://...."), ...)
      .build()

Cheers, Bruno
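For later readers: a fuller, self-contained sketch of what Bruno is aiming for, assuming flink-s3-fs-hadoop is on the cluster classpath (see Bruno's follow-up below) and flink-parquet is a project dependency. The Event case class, bucket path, and checkpoint interval are hypothetical placeholders, not from this thread:

    import org.apache.flink.core.fs.Path
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
    import org.apache.flink.streaming.api.scala._

    // Hypothetical record type; any Avro-reflectable class works here.
    case class Event(id: String, timestamp: Long)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Checkpointing must be enabled: the StreamingFileSink finalizes
    // its in-progress files on checkpoints.
    env.enableCheckpointing(60000L)

    val events: DataStream[Event] =
      env.fromElements(Event("a", 1L), Event("b", 2L))

    // The s3:// scheme is served by flink-s3-fs-hadoop, which provides
    // the recoverable writer the exception above is complaining about.
    val sink: StreamingFileSink[Event] = StreamingFileSink
      .forBulkFormat(
        new Path("s3://my-bucket/output"),
        ParquetAvroWriters.forReflectRecord(classOf[Event]))
      .build()

    events.addSink(sink)
    env.execute("s3-streaming-file-sink")

Note that with bulk formats the sink rolls a new part file on every checkpoint, so the checkpoint interval effectively controls output file size.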
Hi Bruno,
Thanks for verifying. We are aiming for the same. Best, Kevin
Hey, Got it working. Basically, you need to copy the flink-s3-fs-hadoop-1.7.2.jar library from the /opt folder of the Flink distribution into /usr/lib/flink/lib. That did the trick for me. Cheers, Bruno
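For later readers, the copy step as shell commands (a sketch assuming the 1.7.2 distribution was unpacked under /tmp and EMR's Flink install lives at /usr/lib/flink, as in this thread):

    # Copy the S3 filesystem jar from the distribution's opt/ folder
    # into the lib/ folder on the cluster's classpath.
    sudo cp /tmp/flink-1.7.2/opt/flink-s3-fs-hadoop-1.7.2.jar /usr/lib/flink/lib/
    # Restart Flink so the JobManager and TaskManagers pick up the jar.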
Hi,
So the 1.7.2 jar has the fix? Thanks, Kevin
Hi, That jar should exist for all the 1.7 versions, but I was replacing the libs of the Flink provided by AWS EMR (1.7.0) with the more recent ones. You could instead download the 1.7.0 distribution and copy the flink-s3-fs-hadoop-1.7.0.jar from there into the /usr/lib/flink/lib folder. But since there is a more recent 1.7 release out there, I prefer to replace the EMR one with it: we basically replace the libs in /usr/lib/flink/lib with the ones from the most recent distribution. Cheers, Bruno
Thanks! This fixed it.