okkam@okkamVM:~/git/flink-batch-processor/flink-batch-processor/scripts$ ./mongo_quality_analysis.sh ./../ mongodb://localhost:27017/tagcloud.entitonstmp
moving to folder ./../ to analyze quality of MongoDB collection mongodb://localhost:27017/tagcloud.entitonstmp
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Flink Batch Processor 1.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.3:clean (default-clean) @ flink-batch-processor ---
[INFO] Deleting file set: /home/okkam/git/flink-batch-processor/flink-batch-processor/target (included: [**], excluded: [])
[INFO]
[INFO] --- maven-resources-plugin:2.3:resources (default-resources) @ flink-batch-processor ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ flink-batch-processor ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 15 source files to /home/okkam/git/flink-batch-processor/flink-batch-processor/target/classes
[WARNING] /home/okkam/git/flink-batch-processor/flink-batch-processor/src/main/java/org/tagcloud/persistence/batch/quality/operator/QualityCompletenessAnalyzerFlatMap.java: Some input files use unchecked or unsafe operations.
[WARNING] /home/okkam/git/flink-batch-processor/flink-batch-processor/src/main/java/org/tagcloud/persistence/batch/quality/operator/QualityCompletenessAnalyzerFlatMap.java: Recompile with -Xlint:unchecked for details.
[INFO]
[INFO] --- maven-resources-plugin:2.3:testResources (default-testResources) @ flink-batch-processor ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ flink-batch-processor ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-surefire-plugin:2.10:test (default-test) @ flink-batch-processor ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ flink-batch-processor ---
[INFO] Building jar: /home/okkam/git/flink-batch-processor/flink-batch-processor/target/flink-batch-processor-1.1-SNAPSHOT.jar
[INFO]
[INFO] --- maven-install-plugin:2.3:install (default-install) @ flink-batch-processor ---
[INFO] Installing /home/okkam/git/flink-batch-processor/flink-batch-processor/target/flink-batch-processor-1.1-SNAPSHOT.jar to /home/okkam/.m2/repository/org/tagcloud/persistence/flink-batch-processor/1.1-SNAPSHOT/flink-batch-processor-1.1-SNAPSHOT.jar
[INFO] Installing /home/okkam/git/flink-batch-processor/flink-batch-processor/pom.xml to /home/okkam/.m2/repository/org/tagcloud/persistence/flink-batch-processor/1.1-SNAPSHOT/flink-batch-processor-1.1-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4.973s
[INFO] Finished at: Fri Jul 24 09:31:46 CEST 2015
[INFO] Final Memory: 39M/286M
[INFO] ------------------------------------------------------------------------
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Flink Batch Processor 1.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.4.0:java (default-cli) @ flink-batch-processor ---
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/okkam/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/okkam/.m2/repository/org/slf4j/slf4j-jdk14/1.7.7/slf4j-jdk14-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2015-07-24 09:31:51 INFO deprecation:840 - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2015-07-24 09:31:51 INFO deprecation:840 - mapred.child.tmp is deprecated. Instead, use mapreduce.task.tmp.dir
2015-07-24 09:31:52 INFO ExecutionEnvironment:975 - The job has 0 registered types and 0 default Kryo serializers
2015-07-24 09:31:53 INFO Slf4jLogger:80 - Slf4jLogger started
2015-07-24 09:31:53 INFO BlobServer:83 - Created BLOB server storage directory /tmp/blobStore-dd411ea6-59e0-4b7a-b6ad-a907fb6ca1ec
2015-07-24 09:31:53 INFO BlobServer:122 - Started BLOB server at 0.0.0.0:60553 - max concurrent requests: 50 - max backlog: 1000
2015-07-24 09:31:53 INFO JobManager:128 - Starting JobManager at akka://flink/user/jobmanager#1433586880.
2015-07-24 09:31:53 INFO TaskManager:128 - Messages between TaskManager and JobManager have a max timeout of 100000 milliseconds
2015-07-24 09:31:53 INFO TaskManager:128 - Temporary file directory '/tmp': total 61 GB, usable 9 GB (14,75% usable)
2015-07-24 09:31:53 INFO NetworkBufferPool:101 - Allocated 64 MB for network buffer pool (number of memory segments: 2048, bytes per segment: 32768).
2015-07-24 09:31:53 INFO TaskManager:128 - Using 781 MB for Flink managed memory.
2015-07-24 09:31:54 INFO IOManager:97 - I/O manager uses directory /tmp/flink-io-921839fb-dc76-4a0f-a49b-6d86656f2a71 for spill files.
2015-07-24 09:31:54 INFO FileCache:88 - User file cache uses directory /tmp/flink-dist-cache-e633b2f6-6090-4a59-8617-4632ef1ca43a
2015-07-24 09:31:54 INFO TaskManager:128 - Starting TaskManager actor at akka://flink/user/taskmanager_1#-1381605794.
2015-07-24 09:31:54 INFO TaskManager:128 - TaskManager data connection information: localhost (dataPort=43675)
2015-07-24 09:31:54 INFO TaskManager:128 - TaskManager has 16 task slot(s).
2015-07-24 09:31:54 INFO TaskManager:128 - Memory usage stats: [HEAP: 880/1450/2380 MB, NON HEAP: 32/53/130 MB (used/committed/max)]
2015-07-24 09:31:54 INFO TaskManager:128 - Trying to register at JobManager akka://flink/user/jobmanager (attempt 1, timeout: 500 milliseconds)
2015-07-24 09:31:54 INFO InstanceManager:161 - Registered TaskManager at localhost (akka://flink/user/taskmanager_1) as 9371bef5251426005ca8f21c6f93a56d. Current number of registered hosts is 1.
2015-07-24 09:31:54 INFO TaskManager:128 - Successful registration at JobManager (akka://flink/user/jobmanager), starting network stack and library cache.
2015-07-24 09:31:54 INFO TaskManager:128 - Determined BLOB server address to be localhost/127.0.0.1:60553. Starting BLOB cache.
2015-07-24 09:31:54 INFO BlobCache:70 - Created BLOB cache storage directory /tmp/blobStore-f2a87292-4818-4da9-a6d9-84978f8379bb
2015-07-24 09:31:54 INFO JobClient:79 - Sending message to JobManager akka://flink/user/jobmanager to submit job Mongodb Entiton Quality Analysis (d388ce29c65a0cfba361aafcc396e06d) and wait for progress
2015-07-24 09:31:54 INFO JobManager:128 - Received job d388ce29c65a0cfba361aafcc396e06d (Mongodb Entiton Quality Analysis).
2015-07-24 09:31:54 ERROR JobManager:116 - Failed to submit job d388ce29c65a0cfba361aafcc396e06d (Mongodb Entiton Quality Analysis)
org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'DataSink (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891)': Deserializing the OutputFormat (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891) failed: Could not read the user code wrapper: unexpected block data
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$4.apply(JobManager.scala:523)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$4.apply(JobManager.scala:507)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:507)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:190)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:92)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
    at akka.dispatch.Mailbox.run(Mailbox.scala:221)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.Exception: Deserializing the OutputFormat (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891) failed: Could not read the user code wrapper: unexpected block data
    at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:63)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$4.apply(JobManager.scala:520)
    ... 25 more
Caused by: org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could not read the user code wrapper: unexpected block data
    at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:286)
    at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:60)
    ... 26 more
Caused by: java.io.StreamCorruptedException: unexpected block data
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1364)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:302)
    at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:264)
    at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:282)
    ... 27 more
2015-07-24 09:31:54 INFO FlinkMiniCluster:148 - Stopping FlinkMiniCluster.
2015-07-24 09:31:54 INFO TaskManager:128 - Stopping TaskManager akka://flink/user/taskmanager_1#-1381605794.
2015-07-24 09:31:54 INFO JobManager:128 - Stopping JobManager akka://flink/user/jobmanager#1433586880.
2015-07-24 09:31:54 INFO TaskManager:128 - Disassociating from JobManager
2015-07-24 09:31:55 INFO IOManager:127 - I/O manager removed spill file directory /tmp/flink-io-921839fb-dc76-4a0f-a49b-6d86656f2a71
2015-07-24 09:31:55 INFO TaskManager:128 - Task manager akka://flink/user/taskmanager_1 is completely shut down.
[WARNING] java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'DataSink (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891)': Deserializing the OutputFormat (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891) failed: Could not read the user code wrapper: unexpected block data
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$4.apply(JobManager.scala:523)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$4.apply(JobManager.scala:507)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:507)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:190)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:92)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
    at akka.dispatch.Mailbox.run(Mailbox.scala:221)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.Exception: Deserializing the OutputFormat (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891) failed: Could not read the user code wrapper: unexpected block data
    at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:63)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$4.apply(JobManager.scala:520)
    ... 25 more
Caused by: org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could not read the user code wrapper: unexpected block data
    at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:286)
    at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:60)
    ... 26 more
Caused by: java.io.StreamCorruptedException: unexpected block data
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1364)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:302)
    at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:264)
    at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:282)
    ... 27 more
[WARNING] thread Thread[Timer-0,5,org.tagcloud.persistence.batch.quality.FlinkMongoQualityUpdate] was interrupted but is still alive after waiting at least 14999msecs
[WARNING] thread Thread[Timer-0,5,org.tagcloud.persistence.batch.quality.FlinkMongoQualityUpdate] will linger despite being asked to die via interruption
[WARNING] thread Thread[ForkJoinPool-1-worker-1,5,org.tagcloud.persistence.batch.quality.FlinkMongoQualityUpdate] will linger despite being asked to die via interruption
[WARNING] thread Thread[Timer-1,5,org.tagcloud.persistence.batch.quality.FlinkMongoQualityUpdate] will linger despite being asked to die via interruption
[WARNING] NOTE: 3 thread(s) did not finish despite being asked to via interruption. This is not a problem with exec:java, it is a problem with the running code. Although not serious, it should be remedied.
[WARNING] Couldn't destroy threadgroup org.codehaus.mojo.exec.ExecJavaMojo$IsolatedThreadGroup[name=org.tagcloud.persistence.batch.quality.FlinkMongoQualityUpdate,maxpri=10]
java.lang.IllegalThreadStateException
    at java.lang.ThreadGroup.destroy(ThreadGroup.java:775)
    at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:328)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 21.002s
[INFO] Finished at: Fri Jul 24 09:32:10 CEST 2015
[INFO] Final Memory: 21M/1486M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java (default-cli) on project flink-batch-processor: An exception occured while executing the Java class. null: InvocationTargetException: Cannot initialize task 'DataSink (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891)': Deserializing the OutputFormat (org.tagcloud.persistence.batch.MongoHadoopOutputFormat@19b88891) failed: Could not read the user code wrapper: unexpected block data -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException