(DEPRECATED) Apache Flink User Mailing List archive.

Streaming job on YARN - ClassNotFoundException

Classic

List

Threaded

3 messages Options

Theofilos Kakantousis

Streaming job on YARN - ClassNotFoundException

Hi everyone,

Flink 1.0.2
Hadoop 2.4.0

I am running Flink on Yarn by using FlinkYarnClient to launch a Flink cluster and Flink Client to submit a PackagedProgram. To keep it simple, for batch jobs I use the WordCount example and for streaming the IterateExample and IncrementalLearning ones without args.

Batch job executes successfully. However, the streaming ones fail with ClassNotFoundException.
For example the IncrementalLearning job throws this exception:
Caused by: java.lang.RuntimeException: Could not look up the main(String[]) method from the class org.apache.flink.streaming.examples.ml.IncrementalLearningSkeleton: org/apache/flink/streaming/api/functions/source/SourceFunction
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:479)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:216)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:106)
    [..]
Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
                                           org/apache/flink/streaming/api/functions/source/SourceFunction.class
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
    at java.lang.Class.getMethod0(Class.java:2856)
    at java.lang.Class.getMethod(Class.java:1668)
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:473)
    ... 45 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.api.functions.source.SourceFunction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 50 more

The taskmanager classpath includes the following:
Classpath: /srv/hadoop-2.4.0/tmp/nm-local-dir/usercache/myuser/appcache/application_1462487692793_0012/container_1462487692793_0012_01_000002/flink-dist_2.10-1.0.2.jar

It could be my pom Yarn dependency which I am not so sure about if I'm using the proper version:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.10</artifactId>
<version>1.0.2</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-yarn_2.10</artifactId>
<version>1.1-SNAPSHOT</version>
</dependency>

Thanks for you help!

Cheers,
Theo

rmetzger0

Re: Streaming job on YARN - ClassNotFoundException

Hi Theo,

you can't mix different Flink versions in your dependencies. Please use 1.0.2 for the flink_yarn client as well or 1.1-SNAPSHOT everywhere.

On Fri, May 6, 2016 at 7:02 PM, Theofilos Kakantousis <[hidden email]> wrote:

Hi everyone,
Flink 1.0.2
Hadoop 2.4.0

I am running Flink on Yarn by using FlinkYarnClient to launch a Flink cluster and Flink Client to submit a PackagedProgram. To keep it simple, for batch jobs I use the WordCount example and for streaming the IterateExample and IncrementalLearning ones without args.

Batch job executes successfully. However, the streaming ones fail with ClassNotFoundException.
For example the IncrementalLearning job throws this exception:
Caused by: java.lang.RuntimeException: Could not look up the main(String[]) method from the class org.apache.flink.streaming.examples.ml.IncrementalLearningSkeleton: org/apache/flink/streaming/api/functions/source/SourceFunction
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:479)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:216)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:106)
    [..]
Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
                                           org/apache/flink/streaming/api/functions/source/SourceFunction.class
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
    at java.lang.Class.getMethod0(Class.java:2856)
    at java.lang.Class.getMethod(Class.java:1668)
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:473)
    ... 45 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.api.functions.source.SourceFunction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 50 more

The taskmanager classpath includes the following:
Classpath: /srv/hadoop-2.4.0/tmp/nm-local-dir/usercache/myuser/appcache/application_1462487692793_0012/container_1462487692793_0012_01_000002/flink-dist_2.10-1.0.2.jar

It could be my pom Yarn dependency which I am not so sure about if I'm using the proper version:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.10</artifactId>
<version>1.0.2</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-yarn_2.10</artifactId>
<version>1.1-SNAPSHOT</version>
</dependency>

Thanks for you help!

Cheers,
Theo

Theofilos Kakantousis

Re: Streaming job on YARN - ClassNotFoundException

Hi Robert,

Thank you for the prompt reply. You're right, it was a left over from a previous build. With the fixed dependencies, I get the same error though.

A have a question on job submission as well. I use the following code to submit the job:

InetSocketAddress jobManagerAddress = cluster.getJobManagerAddress();
org.apache.flink.configuration.Configuration clientConf = new org.apache.flink.configuration.Configuration();
clientConf.setInteger(ConfigConstants.JOB_MANAGER_IPC_PORT_KEY, jobManagerAddress.getPort());
clientConf.setString(ConfigConstants.JOB_MANAGER_IPC_ADDRESS_KEY, jobManagerAddress.getHostName());
Client client = new Client(clientConf);
PackagedProgram program = new PackagedProgram(file,args);
JobSubmissionResult res = client.runDetached(program, parallelism);

Is there a way to submit the job to the cluster and avoid setting the parallelism explicitly?

Thanks,
Theo

On 2016-05-06 19:50, Robert Metzger wrote:

Hi Theo,

you can't mix different Flink versions in your dependencies. Please use 1.0.2 for the flink_yarn client as well or 1.1-SNAPSHOT everywhere.

On Fri, May 6, 2016 at 7:02 PM, Theofilos Kakantousis <[hidden email]> wrote:

Hi everyone,
Flink 1.0.2
Hadoop 2.4.0

I am running Flink on Yarn by using FlinkYarnClient to launch a Flink cluster and Flink Client to submit a PackagedProgram. To keep it simple, for batch jobs I use the WordCount example and for streaming the IterateExample and IncrementalLearning ones without args.

Batch job executes successfully. However, the streaming ones fail with ClassNotFoundException.
For example the IncrementalLearning job throws this exception:
Caused by: java.lang.RuntimeException: Could not look up the main(String[]) method from the class org.apache.flink.streaming.examples.ml.IncrementalLearningSkeleton: org/apache/flink/streaming/api/functions/source/SourceFunction
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:479)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:216)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:106)
    [..]
Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
                                           org/apache/flink/streaming/api/functions/source/SourceFunction.class
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
    at java.lang.Class.getMethod0(Class.java:2856)
    at java.lang.Class.getMethod(Class.java:1668)
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:473)
    ... 45 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.api.functions.source.SourceFunction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 50 more

The taskmanager classpath includes the following:
Classpath: /srv/hadoop-2.4.0/tmp/nm-local-dir/usercache/myuser/appcache/application_1462487692793_0012/container_1462487692793_0012_01_000002/flink-dist_2.10-1.0.2.jar

It could be my pom Yarn dependency which I am not so sure about if I'm using the proper version:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.10</artifactId>
<version>1.0.2</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-yarn_2.10</artifactId>
<version>1.1-SNAPSHOT</version>
</dependency>

Thanks for you help!

Cheers,
Theo