Re: problems starting the training exercise TaxiRideCleansing on local cluster

Posted by Günter Hipler on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/problems-starting-the-training-exercise-TaxiRideCleansing-on-local-cluster-tp14160p14390.html

Hi Nico,

thanks for looking into it. The root cause on my system: I had two
different JDK versions installed (OpenJDK and Oracle JDK), which I
wasn't aware of because I generally prefer the Oracle JDK. The two
versions were used in different scenarios (I didn't analyze this in
greater depth), which apparently caused the error. After removing
OpenJDK completely from my system, I could use the local Flink cluster
with my application jar.
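For anyone hitting the same thing: a quick way to check which JDK actually answers the `java` command is sketched below. The `update-alternatives` part applies to Debian/Ubuntu-style systems only and is shown as comments.

```shell
# Show which 'java' binary the shell resolves, and the real path behind
# any symlink chain - two installed JDKs can hide behind one command name.
JAVA_BIN=$(command -v java || true)
if [ -n "$JAVA_BIN" ]; then
    readlink -f "$JAVA_BIN"
    "$JAVA_BIN" -version 2>&1 | head -n 1
else
    echo "no 'java' on PATH"
fi

# On Debian/Ubuntu, list the registered JDKs and switch the default
# (interactive, so shown here as comments only):
#   update-alternatives --list java
#   sudo update-alternatives --config java
```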

Sorry for the inconvenience!

Günter


On 10.07.2017 15:43, Nico Kruber wrote:

> Hi Günter,
> unfortunately, I cannot reproduce your error. This is what I did (following
> http://training.data-artisans.com/devEnvSetup.html):
>
> * clone and build the flink-training-exercises project:
> git clone https://github.com/dataArtisans/flink-training-exercises.git
> cd flink-training-exercises
> mvn clean install
>
> * downloaded & extracted flink 1.3.1 (hadoop 2.7, Scala 2.10 - but that should
> not matter)
> * <path-to-flink-1.3.1>/bin/start-local.sh
>
> * create a development project:
> mvn archetype:generate                             \
>      -DarchetypeGroupId=org.apache.flink            \
>      -DarchetypeArtifactId=flink-quickstart-java    \
>      -DarchetypeVersion=1.3.1                       \
>      -DgroupId=org.apache.flink.quickstart          \
>      -DartifactId=flink-java-project                \
>      -Dversion=0.1                                  \
>      -Dpackage=org.apache.flink.quickstart          \
>      -DinteractiveMode=false
> * add flink-training-exercises 0.10.0 dependency
> <dependency>
>    <groupId>com.data-artisans</groupId>
>    <artifactId>flink-training-exercises</artifactId>
>    <version>0.10.0</version>
> </dependency>
> * implement the task (http://training.data-artisans.com/exercises/rideCleansing.html)
> * <path-to-flink-1.3.1>/bin/flink run -c org.apache.flink.quickstart.TaxiStreamCleansingJob ./flink-java-project/target/flink-java-project-0.1.jar
>
> What I noticed though is that my dependency tree only contains
> joda-time-2.7.jar, not 2.9.9 as in your case - did you change the
> dependencies somehow?
> mvn clean package
> ...
> [INFO] Including joda-time:joda-time:jar:2.7 in the shaded jar.
> ...
>
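To see which joda-time version Maven actually resolves (and hence what the shade plugin bundles), the dependency tree can be filtered from the project directory. The grep step below is demonstrated on a sample of the build output, since the exact log text depends on the shade-plugin version; the file name `build.log` is illustrative.

```shell
# Inside the quickstart project, ask Maven which joda-time it resolves
# and which dependency pulls it in (not executed here):
#   mvn dependency:tree -Dincludes=joda-time

# The shaded-jar build log can be filtered for the bundled version.
# Sample log line written to an illustrative file:
printf '%s\n' '[INFO] Including joda-time:joda-time:jar:2.7 in the shaded jar.' > build.log
grep -o 'joda-time:joda-time:jar:[0-9.]*' build.log
# -> joda-time:joda-time:jar:2.7
```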
> Could you try with a new development project set up the way above and
> copy your code into it?
>
> If that doesn't help, try with a freshly extracted unaltered flink archive.
>
>
> Nico
>
>
> On Sunday, 9 July 2017 17:32:25 CEST Günter Hipler wrote:
>> Thanks for the response.
>>
>> My classpath contains this version:
>>
>>    mvn dependency:build-classpath
>> [INFO] Scanning for projects...
>> [INFO]
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Building Flink Quickstart Job 0.1
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO]
>> [INFO] --- maven-dependency-plugin:2.8:build-classpath (default-cli) @
>> flink-java-project ---
>> [INFO] Dependencies classpath:
>> <snip>
>>
>> togram/2.1.6/HdrHistogram-2.1.6.jar:/home/swissbib/.m2/repository/com/twitter/jsr166e/1.1.0/jsr166e-1.1.0.jar:/home/swissbib/.m2/repository/joda-time/joda-time/2.9.9/joda-time-2.9.9.jar:
>>
>> <snip>
>>
>> which definitely contains the required method
>> (http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormatter.html#withZoneUTC--)
>>
>> Something else is going wrong. I suspect it's the way I started (or
>> configured) the local cluster, but that was done as described in the
>> training setup (http://training.data-artisans.com/devEnvSetup.html),
>> which is very straightforward.
>>
>> Günter
>>
>> On 09.07.2017 16:17, Ted Yu wrote:
>>> Since the exception was about a missing method (withZoneUTC) instead
>>> of class not found, it was likely due to a conflicting joda time jar
>>> being on the classpath.
>>>
>>> Cheers
>>>
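Ted's diagnosis can be checked mechanically: split the classpath on `:` and look for more than one joda-time entry, or an unexpected version. A minimal sketch; the classpath string below is illustrative, and against a real project it would come from `mvn dependency:build-classpath`.

```shell
# Split a Java classpath on ':' and list joda-time entries; more than one,
# or an unexpected version, would explain the NoSuchMethodError.
# Illustrative classpath - a real one comes from e.g.:
#   mvn dependency:build-classpath -Dmdep.outputFile=cp.txt
CP='/home/u/.m2/repository/joda-time/joda-time/2.9.9/joda-time-2.9.9.jar:/opt/flink/lib/flink-dist.jar:/old/joda-time-2.1.jar'
printf '%s\n' "$CP" | tr ':' '\n' | grep 'joda-time'
```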
>>> On Sun, Jul 9, 2017 at 1:22 AM, Günter Hipler <[hidden email]> wrote:
>>>      Hi,
>>>      
>>>      sorry for this newbie question...
>>>      
>>>      I'm following the data artisans exercises and wanted to run the
>>>      TaxiRide Cleansing job on my local cluster (version 1.3.1)
>>>      (http://training.data-artisans.com/exercises/rideCleansing.html)
>>>      
>>>      While this works within my IDE, the cluster throws an
>>>      exception because of a missing method, although the type in
>>>      question is part of the application jar the cluster is provided
>>>      with.
>>>      
>>>      swissbib@ub-sbhp02:~/environment/code/flink_einarbeitung/training/flink-java-project/target$ jar tf flink-java-project-0.1.jar | grep DateTimeFormatter
>>>      org/elasticsearch/common/joda/FormatDateTimeFormatter.class
>>>      org/joda/time/format/DateTimeFormatter.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$CharacterLiteral.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$Composite.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$FixedNumber.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$Fraction.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$MatchingParser.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$NumberFormatter.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$PaddedNumber.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$StringLiteral.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$TextField.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$TimeZoneId.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$TimeZoneName.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$TimeZoneOffset.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$TwoDigitYear.class
>>>      org/joda/time/format/DateTimeFormatterBuilder$UnpaddedNumber.class
>>>      org/joda/time/format/DateTimeFormatterBuilder.class
>>>      
>>>      Any advice? Thanks!
>>>      
>>>      Günter
>>>      
>>>      
>>>      swissbib@ub-sbhp02:/usr/local/swissbib/flink$ bin/flink run -c org.apache.flink.quickstart.StreamingJob /home/swissbib/environment/code/flink_einarbeitung/training/flink-java-project/target/flink-java-project-0.1.jar --input /home/swissbib/environment/code/flink_einarbeitung/training/flink-java-project/data/nycTaxiRides.gz
>>>      
>>>      SLF4J: Class path contains multiple SLF4J bindings.
>>>      SLF4J: Found binding in [jar:file:/usr/local/swissbib/flink-1.3.1/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>      SLF4J: Found binding in [jar:file:/home/swissbib/environment/tools/hbase-1.2.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>      SLF4J: Found binding in [jar:file:/home/swissbib/environment/tools/hadoop-2.5.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>      SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>      Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
>>>      Using address localhost:6123 to connect to JobManager.
>>>      JobManager web interface address http://localhost:8081
>>>      Starting execution of program
>>>      Submitting job with JobID: 32c7f2d0bbcac4d8c0367639ea928014.
>>>      Waiting for job completion.
>>>      Connected to JobManager at
>>>      Actor[akka.tcp://flink@localhost:6123/user/jobmanager#-1464375722]
>>>      with leader session id 00000000-0000-0000-0000-000000000000.
>>>      07/09/2017 09:31:51    Job execution switched to status RUNNING.
>>>      07/09/2017 09:31:51    Source: Custom Source -> Filter -> Sink: Unnamed(1/1) switched to SCHEDULED
>>>      07/09/2017 09:31:51    Source: Custom Source -> Filter -> Sink: Unnamed(1/1) switched to DEPLOYING
>>>      07/09/2017 09:31:51    Source: Custom Source -> Filter -> Sink: Unnamed(1/1) switched to RUNNING
>>>      07/09/2017 09:31:51    Source: Custom Source -> Filter -> Sink: Unnamed(1/1) switched to FAILED
>>>      java.lang.NoSuchMethodError: org.joda.time.format.DateTimeFormatter.withZoneUTC()Lorg/joda/time/format/DateTimeFormatter;
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.datatypes.TaxiRide.<clinit>(TaxiRide.java:43)
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.sources.TaxiRideSource.generateUnorderedStream(TaxiRideSource.java:142)
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.sources.TaxiRideSource.run(TaxiRideSource.java:113)
>>>          at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
>>>          at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
>>>          at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
>>>          at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
>>>          at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>>          at java.lang.Thread.run(Thread.java:745)
>>>      
>>>      07/09/2017 09:31:51    Job execution switched to status FAILING.
>>>      java.lang.NoSuchMethodError: org.joda.time.format.DateTimeFormatter.withZoneUTC()Lorg/joda/time/format/DateTimeFormatter;
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.datatypes.TaxiRide.<clinit>(TaxiRide.java:43)
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.sources.TaxiRideSource.generateUnorderedStream(TaxiRideSource.java:142)
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.sources.TaxiRideSource.run(TaxiRideSource.java:113)
>>>          at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
>>>          at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
>>>          at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
>>>          at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
>>>          at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>>          at java.lang.Thread.run(Thread.java:745)
>>>      
>>>      07/09/2017 09:31:51    Job execution switched to status FAILED.
>>>      
>>>      ------------------------------------------------------------
>>>      
>>>       The program finished with the following exception:
>>>      org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
>>>          at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
>>>          at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
>>>          at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
>>>          at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:73)
>>>          at org.apache.flink.quickstart.StreamingJob.main(StreamingJob.java:81)
>>>          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>          at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
>>>          at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
>>>          at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
>>>          at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
>>>          at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>>>          at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
>>>          at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
>>>          at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
>>>          at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>>          at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>>>          at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
>>>      Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>>>          at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply$mcV$sp(JobManager.scala:933)
>>>          at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
>>>          at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
>>>          at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>>>          at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>>>          at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>>>          at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>>>          at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>          at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>          at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>          at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>      Caused by: java.lang.NoSuchMethodError: org.joda.time.format.DateTimeFormatter.withZoneUTC()Lorg/joda/time/format/DateTimeFormatter;
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.datatypes.TaxiRide.<clinit>(TaxiRide.java:43)
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.sources.TaxiRideSource.generateUnorderedStream(TaxiRideSource.java:142)
>>>          at com.dataartisans.flinktraining.exercises.datastream_java.sources.TaxiRideSource.run(TaxiRideSource.java:113)
>>>          at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
>>>          at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
>>>          at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
>>>          at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
>>>          at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>>          at java.lang.Thread.run(Thread.java:745)

--
Universität Basel
Universitätsbibliothek
Günter Hipler
Projekt SwissBib
Schoenbeinstrasse 18-20
4056 Basel, Schweiz
Tel.: +41 (0)61 267 31 12 Fax: +41 61 267 3103
E-Mail [hidden email]
URL: www.swissbib.org  / http://www.ub.unibas.ch/