Processing S3 data with Apache Flink

Processing S3 data with Apache Flink

Kostiantyn Kudriavtsev
Hi guys,

I'm trying to get Apache Flink 0.9.1 working on EMR, basically to read
data from S3. I tried the following path for the data,
s3://mybucket.s3.amazonaws.com/folder, but it throws the following
exception:

java.io.IOException: Cannot establish connection to Amazon S3: 
com.amazonaws.services.s3.model.AmazonS3Exception: The request signature 
we calculated does not match the signature you provided. Check your key 
and signing method. (Service: Amazon S3; Status Code: 403;

I added the access and secret keys, so the problem is not there. I'm using
the standard region and gave read permission to everyone.

Any ideas how this can be fixed?

Thank you in advance,
Kostia

Re: Processing S3 data with Apache Flink

rmetzger0
Hi Kostia,

thank you for writing to the Flink mailing list. I actually started to try out our S3 File system support after I saw your question on StackOverflow [1].
I found that our S3 connector is very broken. I had to resolve two more issues with it before I was able to get the same exception you reported.

Another Flink committer looked into the issue as well (and confirmed it), but there was no solution [2].

So for now, I would say we have to assume that our S3 connector is not working. I will start a separate discussion on the developer mailing list about removing our S3 connector.

The good news is that you can just use Hadoop's S3 File System implementation with Flink.

I used this Flink program to verify it's working:

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class S3FileSystem {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment ee = ExecutionEnvironment.createLocalEnvironment();
        // read the file through Hadoop's NativeS3FileSystem via the s3n:// scheme
        DataSet<String> myLines = ee.readTextFile("s3n://my-bucket-name/some-test-file.xml");
        myLines.print();
    }
}
Also, you need to make a Hadoop configuration file available to Flink.
When running Flink locally in your IDE, just create a "core-site.xml" in the src/main/resources folder with the following content:

<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>putKeyHere</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>putSecretHere</value>
  </property>
  <property>
    <name>fs.s3n.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>
</configuration>
If you are running on a cluster instead, re-use the existing core-site.xml file (i.e. edit it) and point Flink to the directory containing it using the fs.hdfs.hadoopconf configuration option.

With these two things in place, you should be good to go.
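
For the cluster case, a minimal sketch of the relevant line in conf/flink-conf.yaml (the directory path is only an example and depends on where the Hadoop configuration lives on your machines):

# point Flink at the directory containing core-site.xml (and the other *-site.xml files)
fs.hdfs.hadoopconf: /etc/hadoop/conf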




Re: Processing S3 data with Apache Flink

Kostiantyn Kudriavtsev
Hi Robert,

thank you very much for your input!

Have you tried that?
With org.apache.hadoop.fs.s3native.NativeS3FileSystem I moved forward, and now got a new exception:


Caused by: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/***.csv' - ResponseCode=403, ResponseMessage=Forbidden

It's really strange, since I gave full permissions to authenticated users and can fetch the target file with s3cmd or an S3 browser from the same PC... I realize this question isn't really for you, but perhaps you have faced the same issue.

Thanks in advance!
Kostia

Thank you,
Konstantin Kudryavtsev



Re: Processing S3 data with Apache Flink

rmetzger0
Mh. I tried out the code I posted yesterday and it was working immediately.
The security settings of AWS are sometimes a bit complicated.
I think there are access logs for S3 buckets; maybe they contain some more information.

Maybe there are other users facing the same issue. Since the S3FileSystem class is from Hadoop, I suspect the code is widely used, and you can probably find answers to the most common problems on Google.




Re: Processing S3 data with Apache Flink

Kostiantyn Kudriavtsev
Hi Robert,

you are right, I just misspelled the name of the file :(  Everything works fine!

Basically, I'd suggest moving this workaround into the official docs and marking the custom S3FileSystem as @Deprecated...
In fact, I like the idea of marking all untested functionality with a specific annotation, for example @Beta, simply because big enterprises won't want to use a product where documented features don't work. For example, it would be difficult for me to advocate using Flink on a project while the S3FileSystem was broken, and my opponents could argue "who knows what else is broken". If some functionality is marked as not properly tested, it's much easier to make decisions because of the better visibility.

WBR,
Kostia 

Thank you,
Konstantin Kudryavtsev




Re: Processing S3 data with Apache Flink

rmetzger0
Hi Kostia,

I understand your concern. I am going to propose to the Flink developers that we remove the S3 file system support in Flink.
Also, regarding these annotations, we are actually planning to add them for the 1.0 release so that users know which interfaces they can rely on.

Which other components of Flink are you planning to use?
I can give you some information on how stable/well tested they are.

Usually, everything in Flink is very well tested, but in the case of the S3 connector it's hard to do that automatically, because it concerns an external service outside of our control.


Regards,
Robert





Re: Processing S3 data with Apache Flink

snntr
In reply to this post by rmetzger0
Hey everyone,

I was having the same problem with S3 and found this thread very useful. Everything works fine now when I start Flink from my IDE, but when I run the jar in local mode I keep getting:

java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively).

I have set fs.hdfs.hadoopconf to point to a core-site.xml on my local machine with the required properties. What am I missing?

Any advice is highly appreciated ;)

Cheers,

Konstantin


Re: Processing S3 data with Apache Flink

Ufuk Celebi

This looks like a problem with picking up the Hadoop config. Can you look into the logs to check whether the configuration is picked up? Change the log settings to DEBUG in log/log4j.properties for this. And can you provide the complete stack trace?

– Ufuk
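
As a concrete sketch of the DEBUG change (assuming the default Flink log4j.properties, where the root logger is set to INFO with a file appender named "file"), the root logger line in conf/log4j.properties would become:

log4j.rootLogger=DEBUG, file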


Re: Processing S3 data with Apache Flink

Stephan Ewen
@Konstantin (2) : Can you try the workaround described by Robert, with the "s3n" file system scheme?

We are removing the custom S3 connector now, simply reusing Hadoop's S3 connector for all cases.

@Kostia:
You are right, there should be no broken stuff that is not clearly marked as "beta". For the S3 connector, that was a problem in the testing on our side and should not have happened.
In general, you can assume that stuff in "flink-contrib" is in beta status, as well as the stuff in "flink-staging" (although much of the staging stuff will graduate with the next release). All code not in these projects should be well functioning. We test a lot, so there should not be many broken cases like the S3 connector.
 
Greetings,
Stephan
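
To make the s3n workaround concrete, here is a hedged sketch of the two ways credentials can be supplied when reading with the s3n scheme (bucket name, path and key values are placeholders; "ee" is the ExecutionEnvironment from Robert's example above; embedding keys in the URL is the alternative the IllegalArgumentException above refers to, though keeping them in core-site.xml is usually preferable):

// credentials picked up from fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey in core-site.xml
DataSet<String> lines = ee.readTextFile("s3n://my-bucket-name/path/to/file.csv");

// alternatively, placeholder credentials embedded directly in the s3n URL
DataSet<String> lines2 = ee.readTextFile("s3n://ACCESS_KEY:SECRET_KEY@my-bucket-name/path/to/file.csv");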





Re: Processing S3 data with Apache Flink

snntr
In reply to this post by Ufuk Celebi
Hi Ufuk,

sorry for not getting back to you for so long, and thanks for your
answer. The problem persists unfortunately. Running the job from the IDE
works (with core-site.xml on the classpath), but running it in local standalone
mode does not. The AccessKeyID and SecretAccessKey are not found.

Attached the jobmanager log on DEBUG level. The core-site.xml is
definitely at the configured location.

I am now on version 0.10.0 and using the binaries for Hadoop 1.2.1 to
run the jar in local mode. Do I have to use the Hadoop 2.x version for
this to work? I have put hadoop-common-2.3.jar into the flink lib folder.

I don't know if it is relevant (but it seems to be related): when I run
the job from my IDE I get the warning:

2015-11-21 12:43:11 WARN  NativeCodeLoader:62 - Unable to load
native-hadoop library for your platform... using builtin-java classes
where applicable

Cheers and thank you,

Konstantin


--
Konstantin Knauf * [hidden email] * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082

Attachment: jobmanager.log (28K)

Re: Processing S3 data with Apache Flink

rmetzger0
Hi,

It seems that you've set the "fs.hdfs.hadoopconf" configuration parameter to a file. I think you have to set it to the directory containing the configuration.
Sorry, I know that's not very intuitive, but in Hadoop the settings live in different files: (hdfs|yarn|core)-site.xml.
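
To illustrate the difference (the path below is just an example for a local setup):

# wrong: points at a single file
# fs.hdfs.hadoopconf: /path/to/hadoop-conf/core-site.xml
# right: points at the directory that contains core-site.xml
fs.hdfs.hadoopconf: /path/to/hadoop-conf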



Re: Processing S3 data with Apache Flink

snntr
Hi Robert,

thanks a lot, it's working now. Actually, it also says "directory" in
the description. So I should have known :/

One additional question though: if I use the Flink binary for Hadoop
1.2.1 and run Flink in standalone mode, should I use the *-hadoop1
dependencies even if I am not interacting with HDFS 1.x?

Cheers,

Konstantin

--
Konstantin Knauf * [hidden email] * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082

Re: Processing S3 data with Apache Flink

rmetzger0
Hi,

great to hear that it's working. I've updated the documentation (for 1.0) and made the word directory bold ;)

You should try to match your Hadoop version as closely as possible.
If you are not using HDFS at all, it doesn't matter which version of Flink you download.
When using Hadoop 2.x, I'd recommend at least the Flink build for Hadoop 2.3.0.



Re: Processing S3 data with Apache Flink

snntr
Hi Robert,

I am basically only reading from Kafka and S3 and writing to S3 in this
job. So I am using the Hadoop S3 FileSystem classes, but that's it.

Cheers,

Konstantin


--
Konstantin Knauf * [hidden email] * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082

Re: Processing S3 data with Apache Flink

rmetzger0
Ah, I see. Maybe it would make sense then for you to use the latest Hadoop version we are supporting. This way, you get the most recent Hadoop S3 file system implementation.

Note that there might be an issue with starting Flink 0.10.0 for Hadoop 2.7.0. We'll fix it with Flink 0.10.1.
But if everything is working fine ... it might make sense not to change it now ("never change a running system").



Re: Processing S3 data with Apache Flink

snntr
I see, thank you, Robert.

--
Konstantin Knauf * [hidden email] * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082