Running Flink on an Amazon Elastic MapReduce cluster


Running Flink on an Amazon Elastic MapReduce cluster

Hanen Borchani

Hi all,

I tried to start a YARN session on an Amazon EMR cluster running Hadoop 2.6.0, using Flink 0.9.1 built for Hadoop 2.6.0 and following the instructions at this link:

https://ci.apache.org/projects/flink/flink-docs-release-0.9/setup/yarn_setup.html

Running the command ./bin/yarn-session.sh -n 2 -jm 1024 -tm 2048 generated the following error message:

 ------------

12:53:47,633 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032

12:53:47,805 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

12:53:48,226 WARN  org.apache.flink.yarn.FlinkYarnClient                         - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.

12:53:48,227 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Using values:

12:53:48,228 INFO  org.apache.flink.yarn.FlinkYarnClient                         - TaskManager count = 2

12:53:48,229 INFO  org.apache.flink.yarn.FlinkYarnClient                         - JobManager memory = 1024

12:53:48,229 INFO  org.apache.flink.yarn.FlinkYarnClient                         - TaskManager memory = 2048

12:53:48,580 WARN  org.apache.flink.yarn.FlinkYarnClient                         - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the sytem is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system

12:53:48,593 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/home/hadoop/flink-0.9.1/lib/flink-dist-0.9.1.jar to file:/home/hadoop/.flink/application_1444046049303_0008/flink-dist-0.9.1.jar

12:53:49,245 INFO  org.apache.flink.yarn.Utils                                   - Copying from /home/hadoop/flink-0.9.1/conf/flink-conf.yaml to file:/home/hadoop/.flink/application_1444046049303_0008/flink-conf.yaml

12:53:49,251 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/home/hadoop/flink-0.9.1/lib/flink-python-0.9.1.jar to file:/home/hadoop/.flink/application_1444046049303_0008/flink-python-0.9.1.jar

12:53:49,278 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/home/hadoop/flink-0.9.1/conf/logback.xml to file:/home/hadoop/.flink/application_1444046049303_0008/logback.xml

12:53:49,285 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/home/hadoop/flink-0.9.1/conf/log4j.properties to file:/home/hadoop/.flink/application_1444046049303_0008/log4j.properties

12:53:49,304 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Submitting application master application_1444046049303_0008

12:53:49,347 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1444046049303_0008

12:53:49,347 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Waiting for the cluster to be allocated

12:53:49,349 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Deploying cluster, current state ACCEPTED

12:53:50,351 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Deploying cluster, current state ACCEPTED

Error while deploying YARN cluster: The YARN application unexpectedly switched to state FAILED during deployment.

Diagnostics from YARN: Application application_1444046049303_0008 failed 1 times due to AM Container for appattempt_1444046049303_0008_000001 exited with  exitCode: -1000

For more detailed output, check application tracking page:http://ip-172-31-10-16.us-west-2.compute.internal:20888/proxy/application_1444046049303_0008/Then, click on links to logs of each attempt.

Diagnostics: File file:/home/hadoop/.flink/application_1444046049303_0008/flink-conf.yaml does not exist

java.io.FileNotFoundException: File file:/home/hadoop/.flink/application_1444046049303_0008/flink-conf.yaml does not exist

         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:539)

         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:752)

         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:529)

         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:419)

         at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)

         at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)

         at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)

         at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)

         at java.security.AccessController.doPrivileged(Native Method)

         at javax.security.auth.Subject.doAs(Subject.java:415)

         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)

         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)

         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)

         at java.util.concurrent.FutureTask.run(FutureTask.java:262)

         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

         at java.util.concurrent.FutureTask.run(FutureTask.java:262)

         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

         at java.lang.Thread.run(Thread.java:745)

 

Failing this attempt. Failing the application.

If log aggregation is enabled on your cluster, use this command to further invesitage the issue:

yarn logs -applicationId application_1444046049303_0008

org.apache.flink.yarn.FlinkYarnClient$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.

Diagnostics from YARN: Application application_1444046049303_0008 failed 1 times due to AM Container for appattempt_1444046049303_0008_000001 exited with  exitCode: -1000

For more detailed output, check application tracking page:http://ip-172-31-10-16.us-west-2.compute.internal:20888/proxy/application_1444046049303_0008/Then, click on links to logs of each attempt.

Diagnostics: File file:/home/hadoop/.flink/application_1444046049303_0008/flink-conf.yaml does not exist

java.io.FileNotFoundException: File file:/home/hadoop/.flink/application_1444046049303_0008/flink-conf.yaml does not exist

         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:539)

         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:752)

         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:529)

         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:419)

         at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)

         at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)

         at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)

         at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)

         at java.security.AccessController.doPrivileged(Native Method)

         at javax.security.auth.Subject.doAs(Subject.java:415)

         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)

         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)

         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)

         at java.util.concurrent.FutureTask.run(FutureTask.java:262)

         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

         at java.util.concurrent.FutureTask.run(FutureTask.java:262)

         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

         at java.lang.Thread.run(Thread.java:745)

 

Failing this attempt. Failing the application.

If log aggregation is enabled on your cluster, use this command to further invesitage the issue:

yarn logs -applicationId application_1444046049303_0008

         at org.apache.flink.yarn.FlinkYarnClient.deployInternal(FlinkYarnClient.java:627)

         at org.apache.flink.yarn.FlinkYarnClient.deploy(FlinkYarnClient.java:335)

         at org.apache.flink.client.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:403)

            at org.apache.flink.client.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:344)

------------

Could anyone help me to solve this?

Thanks & Best Regards,

Hanen


Re: Running Flink on an Amazon Elastic MapReduce cluster

Maximilian Michels
Hi Hanen,

It appears that the environment variables are not set, so Flink cannot pick up the Hadoop configuration. Could you please paste the output of "echo $HADOOP_HOME" and "echo $HADOOP_CONF_DIR" here?

In any case, your problem looks similar to the one discussed here: http://stackoverflow.com/questions/31991934/cannot-use-apache-flink-in-amazon-emr Please execute

export HADOOP_CONF_DIR=/etc/hadoop/conf

and you should be good to go.
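For anyone scripting this, the check-and-fix can be done in one short snippet before launching the session. This is only a sketch: /etc/hadoop/conf is the usual location on EMR nodes, but verify the path on your cluster.

```shell
#!/bin/sh
# Point Flink at the Hadoop configuration if neither variable is set yet.
# Assumption: /etc/hadoop/conf is where EMR keeps the Hadoop config files.
if [ -z "${HADOOP_CONF_DIR:-}" ] && [ -z "${YARN_CONF_DIR:-}" ]; then
    export HADOOP_CONF_DIR=/etc/hadoop/conf
fi
echo "HADOOP_CONF_DIR=${HADOOP_CONF_DIR}"
# Then start the session as before:
# ./bin/yarn-session.sh -n 2 -jm 1024 -tm 2048
```

With the variable set, the "file system scheme is 'file'" warning from the log above should disappear, because the client now finds the HDFS configuration and stages its files in a distributed file system instead of the local one.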

Cheers,
Max

On Mon, Oct 5, 2015 at 3:37 PM, Hanen Borchani <[hidden email]> wrote:




Re: Running Flink on an Amazon Elastic MapReduce cluster

Hanen Borchani
Hi Max,

You are right, the problem was related to the Hadoop configuration: both the HADOOP_HOME and HADOOP_CONF_DIR environment variables were empty.

Executing export HADOOP_CONF_DIR=/etc/hadoop/conf solved the problem, and everything works fine now!
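One follow-up note for future readers: a plain export only lasts for the current shell session. To make it survive new logins, one option is to append it to the login profile. A sketch, assuming bash is the login shell (the profile file name depends on your shell):

```shell
#!/bin/sh
# Persist the setting across logins. Assumption: ~/.bashrc is read at login.
PROFILE="${HOME}/.bashrc"
LINE='export HADOOP_CONF_DIR=/etc/hadoop/conf'
# Append only if the exact line is not already present, so re-running is safe.
grep -qxF "$LINE" "$PROFILE" 2>/dev/null || echo "$LINE" >> "$PROFILE"
```

Running the snippet twice leaves a single copy of the line, since grep -qxF matches the exact line before appending.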

Many thanks for your help :)

Best regards,
Hanen