
Re: Integrate Flink with S3 on EMR cluster

Posted by Stephan Ewen on Mar 07, 2017; 6:23pm
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Integrate-Flink-with-S3-on-EMR-cluster-tp5894p12083.html

@vinay patil - Can you check whether the same problem occurs with Flink 1.1? That would tell us whether this is a regression in Flink 1.2.



On Tue, Mar 7, 2017 at 6:43 PM, Shannon Carey <[hidden email]> wrote:
Generally, using the S3 filesystem on EMR with Flink has worked pretty well for me in Flink < 1.2 (unless you run out of connections in your HTTP pool). What do you mean by "using Hadoop File System class"? In my experience, it's sufficient to use the "s3://" filesystem protocol; Flink's Hadoop integration (plus the S3 filesystem classes provided by EMR) will then do the right thing.
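
To make that concrete, here is a minimal sketch of the one setting Flink's Hadoop integration relies on in Flink 1.x: a pointer to the cluster's Hadoop configuration. The path /etc/hadoop/conf is an assumption about a typical EMR node, not something confirmed for your cluster. If Flink was installed through EMR itself, this is probably already set for you.

```yaml
# flink-conf.yaml (sketch; the directory below is an assumed EMR default, adjust to your cluster)
# Tells Flink where to find the Hadoop configuration, which is where EMR
# registers its S3 filesystem classes for the s3:// scheme.
fs.hdfs.hadoopconf: /etc/hadoop/conf
```

With that in place, input and output paths can be written as s3://<your-bucket>/<prefix> (placeholder names), and the job itself should not need any explicit Hadoop FileSystem code.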

-Shannon