Streaming job on YARN - ClassNotFoundException

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Streaming job on YARN - ClassNotFoundException

Theofilos Kakantousis

Hi everyone,

Flink 1.0.2
Hadoop 2.4.0

I am running Flink on Yarn by using FlinkYarnClient to launch a Flink cluster and Flink Client to submit a PackagedProgram. To keep it simple, for batch jobs I use the WordCount example and for streaming the IterateExample and IncrementalLearning ones without args.

Batch job executes successfully. However, the streaming ones fail with ClassNotFoundException.
For example the IncrementalLearning job throws this exception:
Caused by: java.lang.RuntimeException: Could not look up the main(String[]) method from the class org.apache.flink.streaming.examples.ml.IncrementalLearningSkeleton: org/apache/flink/streaming/api/functions/source/SourceFunction
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:479)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:216)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:106)
    [..]
Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
                                           org/apache/flink/streaming/api/functions/source/SourceFunction.class
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
    at java.lang.Class.getMethod0(Class.java:2856)
    at java.lang.Class.getMethod(Class.java:1668)
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:473)
    ... 45 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.api.functions.source.SourceFunction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 50 more

The taskmanager classpath includes the following:
Classpath: /srv/hadoop-2.4.0/tmp/nm-local-dir/usercache/myuser/appcache/application_1462487692793_0012/container_1462487692793_0012_01_000002/flink-dist_2.10-1.0.2.jar

It could be my pom Yarn dependency which I am not so sure about if I'm using the proper version:
<dependency>
<groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.10</artifactId>
  <version>1.0.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-yarn_2.10</artifactId>
  <version>1.1-SNAPSHOT</version>
</dependency>


Thanks for you help!

Cheers,
Theo
Reply | Threaded
Open this post in threaded view
|

Re: Streaming job on YARN - ClassNotFoundException

rmetzger0
Hi Theo,

you can't mix different Flink versions in your dependencies. Please use 1.0.2 for the flink_yarn client as well or 1.1-SNAPSHOT everywhere.

On Fri, May 6, 2016 at 7:02 PM, Theofilos Kakantousis <[hidden email]> wrote:

Hi everyone,

Flink 1.0.2
Hadoop 2.4.0

I am running Flink on Yarn by using FlinkYarnClient to launch a Flink cluster and Flink Client to submit a PackagedProgram. To keep it simple, for batch jobs I use the WordCount example and for streaming the IterateExample and IncrementalLearning ones without args.

Batch job executes successfully. However, the streaming ones fail with ClassNotFoundException.
For example the IncrementalLearning job throws this exception:
Caused by: java.lang.RuntimeException: Could not look up the main(String[]) method from the class org.apache.flink.streaming.examples.ml.IncrementalLearningSkeleton: org/apache/flink/streaming/api/functions/source/SourceFunction
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:479)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:216)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:106)
    [..]
Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
                                           org/apache/flink/streaming/api/functions/source/SourceFunction.class
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
    at java.lang.Class.getMethod0(Class.java:2856)
    at java.lang.Class.getMethod(Class.java:1668)
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:473)
    ... 45 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.api.functions.source.SourceFunction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 50 more

The taskmanager classpath includes the following:
Classpath: /srv/hadoop-2.4.0/tmp/nm-local-dir/usercache/myuser/appcache/application_1462487692793_0012/container_1462487692793_0012_01_000002/flink-dist_2.10-1.0.2.jar

It could be my pom Yarn dependency which I am not so sure about if I'm using the proper version:
<dependency>
<groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.10</artifactId>
  <version>1.0.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-yarn_2.10</artifactId>
  <version>1.1-SNAPSHOT</version>
</dependency>


Thanks for you help!

Cheers,
Theo

Reply | Threaded
Open this post in threaded view
|

Re: Streaming job on YARN - ClassNotFoundException

Theofilos Kakantousis
Hi Robert,

Thank you for the prompt reply. You're right, it was a left over from a previous build. With the fixed dependencies, I get the same error though.

A have a question on job submission as well. I use the following  code to submit the job:

InetSocketAddress jobManagerAddress = cluster.getJobManagerAddress();
org.apache.flink.configuration.Configuration clientConf = new org.apache.flink.configuration.Configuration();
clientConf.setInteger(ConfigConstants.JOB_MANAGER_IPC_PORT_KEY, jobManagerAddress.getPort());
clientConf.setString(ConfigConstants.JOB_MANAGER_IPC_ADDRESS_KEY, jobManagerAddress.getHostName());
Client client = new Client(clientConf);
PackagedProgram program = new PackagedProgram(file,args);
JobSubmissionResult res =  client.runDetached(program, parallelism);
 
Is there a way to submit the job to the cluster and avoid setting the parallelism explicitly?

Thanks,
Theo

On 2016-05-06 19:50, Robert Metzger wrote:
Hi Theo,

you can't mix different Flink versions in your dependencies. Please use 1.0.2 for the flink_yarn client as well or 1.1-SNAPSHOT everywhere.

On Fri, May 6, 2016 at 7:02 PM, Theofilos Kakantousis <[hidden email]> wrote:

Hi everyone,

Flink 1.0.2
Hadoop 2.4.0

I am running Flink on Yarn by using FlinkYarnClient to launch a Flink cluster and Flink Client to submit a PackagedProgram. To keep it simple, for batch jobs I use the WordCount example and for streaming the IterateExample and IncrementalLearning ones without args.

Batch job executes successfully. However, the streaming ones fail with ClassNotFoundException.
For example the IncrementalLearning job throws this exception:
Caused by: java.lang.RuntimeException: Could not look up the main(String[]) method from the class org.apache.flink.streaming.examples.ml.IncrementalLearningSkeleton: org/apache/flink/streaming/api/functions/source/SourceFunction
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:479)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:216)
    at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:106)
    [..]
Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
                                           org/apache/flink/streaming/api/functions/source/SourceFunction.class
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
    at java.lang.Class.getMethod0(Class.java:2856)
    at java.lang.Class.getMethod(Class.java:1668)
    at org.apache.flink.client.program.PackagedProgram.hasMainMethod(PackagedProgram.java:473)
    ... 45 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.api.functions.source.SourceFunction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 50 more

The taskmanager classpath includes the following:
Classpath: /srv/hadoop-2.4.0/tmp/nm-local-dir/usercache/myuser/appcache/application_1462487692793_0012/container_1462487692793_0012_01_000002/flink-dist_2.10-1.0.2.jar

It could be my pom Yarn dependency which I am not so sure about if I'm using the proper version:
<dependency>
<groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.10</artifactId>
  <version>1.0.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-yarn_2.10</artifactId>
  <version>1.1-SNAPSHOT</version>
</dependency>


Thanks for you help!

Cheers,
Theo