Hello all,
I was trying the examples provided in the Flink 0.6 bundle and found an interesting jar file: flink-java-examples-0.6-incubating-WebLogAnalysis.jar. However, the command to display the command-line arguments,

$ ./bin/flink info -d ./examples/flink-java-examples-0.6-incubating-WebLogAnalysis.jar

does not give any useful information. Could anyone share the details of the arguments and the formats required?

Also, where can I find the source code (with all dependencies) for the example jar files, and where are the instructions to compile them (pom.xml or Makefile)? We are working from the command line only. IMHO, the examples are a good starting point for modifying and writing complex workflows for Flink.

Thanks in advance,
Anirvan
Hi Anirvan,
sorry for the late response. You've posted the question to Nabble, which is only a mirror of our actual mailing list at user@flink.incubator.apache.org. Sadly, the message is not automatically forwarded to the Apache list because the Apache server rejects the mails coming from Nabble. I've already asked, and there is no way to change this behavior, so I actually saw the two messages you posted here by accident.

Regarding your actual questions:

- The command-line arguments for the WebLogAnalysis example are: "WebLogAnalysis <documents path> <ranks path> <visits path> <result path>".
- Regarding the "info -d" command: I think it's an artifact from our old Java API. I've filed an issue in JIRA: https://issues.apache.org/jira/browse/FLINK-1095. Let's see how we resolve it. Thanks for reporting this!

You can find the source code of all of our examples in the source release of Flink (in the flink-examples/flink-java-examples project). You can also access the source (and hence the examples) through GitHub: https://github.com/apache/incubator-flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/example/java/relational/WebLogAnalysis.java

To rebuild the examples, run "mvn clean package -DskipTests" in the "flink-examples/flink-java-examples" directory.

If you don't want to import the whole Flink project just for playing around with the examples, you can also create an empty Maven project. This script:

curl https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart.sh | bash

will automatically set everything up for you. Just import the "quickstart" project into Eclipse or IntelliJ; it will download all dependencies and package everything correctly. If you want to use an example there, just copy the Java file into the "quickstart" project.

The examples are indeed a very good way to learn how to write Flink jobs. Please continue asking if you have further questions!

Best,
Robert
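For reference, a minimal command-line sketch of the build-and-run steps described above. The input/output paths are placeholders to be replaced with your own, and the jar name assumes the 0.6-incubating binary layout:

    # rebuild the example jars from the Flink source tree
    cd flink-examples/flink-java-examples
    mvn clean package -DskipTests

    # run WebLogAnalysis with the four required arguments (placeholder paths)
    ./bin/flink run ./examples/flink-java-examples-0.6-incubating-WebLogAnalysis.jar \
        file:///path/to/documents file:///path/to/ranks \
        file:///path/to/visits file:///path/to/result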
Hello Robert,
Thanks as usual for all your help with the information. I'm trying in vain to create the different input files from the program source code but am running into difficulties. Could you (or anyone else) please post here samples of the input files required to run this program?

Thanks in advance,
Anirvan
Hi,

you have to use the "WebLogDataGenerator" found here: https://github.com/apache/incubator-flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/util/WebLogDataGenerator.java

It accepts two arguments, the number of documents and the number of visits. The generated files are located in /tmp/documents, /tmp/ranks and /tmp/visits. I've generated some sample data for you, located here: https://github.com/rmetzger/scratch/tree/weblogdataexample/weblog

Best,
Robert
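A hedged sketch of how the generator might be invoked from the command line. It assumes the WebLogDataGenerator class is available on the classpath of your examples build (the jar name and the document/visit counts below are purely illustrative; alternatively, run the class from your IDE or the source tree):

    # illustrative: generate 1000 documents and 10000 visits
    java -cp flink-java-examples-0.6-incubating.jar \
        org.apache.flink.examples.java.relational.util.WebLogDataGenerator 1000 10000

    # the generated files end up under /tmp, as described above
    ls /tmp/documents /tmp/ranks /tmp/visits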
Thanks so much, Robert, you rock! I'll get back to you later with similar questions on the ML, graph and clustering examples (not so soon!). BTW, I like the (Java) coding style of the Stratosphere team: smart and with excellent commenting, very easy to learn and follow!

Anirvan
Hello Robert,
Apologies for getting back to you on this one :-( I'm really sorry! I've been trying this programme several times today, in vain, and the following is the error output:

[QUOTE]
abasu@edel-4:~$ ./flink/bin/flink run -v flink/examples/flink-java-examples-0.6-incubating-WebLogAnalysis.jar file:///home/abasu/examples/Weblogs/documents file:///home/abasu/examples/Weblogs/ranks file:///home/abasu/examples/Weblogs/visits file:///home/abasu/examples/Weblogs/results

Error: The main method caused an error.
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:404)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:307)
    at org.apache.flink.client.program.Client.run(Client.java:244)
    at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:332)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:319)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:930)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:954)
Caused by: java.lang.IllegalArgumentException
    at org.apache.flink.compiler.dag.OptimizerNode.setDegreeOfParallelism(OptimizerNode.java:414)
    at org.apache.flink.compiler.PactCompiler$GraphCreatingVisitor.preVisit(PactCompiler.java:772)
    at org.apache.flink.compiler.PactCompiler$GraphCreatingVisitor.preVisit(PactCompiler.java:616)
    at org.apache.flink.api.common.operators.base.GenericDataSinkBase.accept(GenericDataSinkBase.java:287)
    at org.apache.flink.api.common.Plan.accept(Plan.java:281)
    at org.apache.flink.compiler.PactCompiler.compile(PactCompiler.java:511)
    at org.apache.flink.compiler.PactCompiler.compile(PactCompiler.java:460)
    at org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:196)
    at org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:209)
    at org.apache.flink.client.program.Client.run(Client.java:285)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:54)
    at org.apache.flink.example.java.relational.WebLogAnalysis.main(WebLogAnalysis.java:148)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:389)
    ... 6 more
abasu@edel-4:~$
[/QUOTE]

I'm using Flink version 0.6 on a cluster of 10 nodes; the job is launched from the JobManager node. As per your instructions, I've used the set of <<documents>> <<ranks>> <<visits>> <<results>> that I downloaded from your scratch tree branch. It seems that the main method fails for some other reason (even before entering the programme logic). I'm racking my brain over the exact cause. Could you help me in this direction?

Thanks a million in advance,
Anirvan
Hi,

this looks like a minor issue in your Flink configuration. Can you have a look into your config (in the conf/flink-conf.yaml file) and tell me the value of the "parallelization.degree.default" property? If it is set to 0 or a negative value, that's the source of the issue. Set it to something higher (I recommend numOfNodes * numOfCoresPerNode).

But I have to admit that our error reporting here is not very helpful; we'll improve that...

Cheers,
Robert
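A minimal sketch of checking and changing that property, assuming the standard start/stop scripts in bin/ and an illustrative value of 10 nodes x 4 cores; Flink has to be restarted for the change to take effect:

    # inspect the current value on the JobManager node
    grep parallelization.degree.default ./flink/conf/flink-conf.yaml

    # edit conf/flink-conf.yaml so that the line reads, for example:
    #   parallelization.degree.default: 40

    # restart the cluster so the new value is picked up
    ./flink/bin/stop-cluster.sh
    ./flink/bin/start-cluster.sh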
I get a similar error when I use a negative value for the default parallelism in flink-conf.yaml. I'll push a patch that gives better error messages, sanity-checks the config value, prints a warning, and falls back to a DOP of 1 if the config is invalid and no manual value is provided.

Stephan
Robert, Stephan and the "powers that be" ...
Thanks for your valuable inputs. I tried them a few times today; here are the output and the config details. Please also see my suggestion at the end. Looking forward to your helpful replies.

The failure output:

[QUOTE]
abasu@edel-59:~$ ./flink/bin/flink run -v ./flink/examples/flink-java-examples-0.6-incubating-WebLogAnalysis.jar file:///home/abasu/examples/Weblogs/documents file:///home/abasu/examples/Weblogs/ranks file:///home/abasu/examples/Weblogs/visits file:///home/abasu/examples/Weblogs/result

Error: The main method caused an error.
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:404)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:307)
    at org.apache.flink.client.program.Client.run(Client.java:244)
    at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:332)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:319)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:930)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:954)
Caused by: java.lang.IllegalArgumentException
    at org.apache.flink.compiler.dag.OptimizerNode.setDegreeOfParallelism(OptimizerNode.java:414)
    at org.apache.flink.compiler.PactCompiler$GraphCreatingVisitor.preVisit(PactCompiler.java:772)
    at org.apache.flink.compiler.PactCompiler$GraphCreatingVisitor.preVisit(PactCompiler.java:616)
    at org.apache.flink.api.common.operators.base.GenericDataSinkBase.accept(GenericDataSinkBase.java:287)
    at org.apache.flink.api.common.Plan.accept(Plan.java:281)
    at org.apache.flink.compiler.PactCompiler.compile(PactCompiler.java:511)
    at org.apache.flink.compiler.PactCompiler.compile(PactCompiler.java:460)
    at org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:196)
    at org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:209)
    at org.apache.flink.client.program.Client.run(Client.java:285)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:54)
    at org.apache.flink.example.java.relational.WebLogAnalysis.main(WebLogAnalysis.java:148)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:389)
    ... 6 more
abasu@edel-59:~$
[/QUOTE]

The config file flink-conf.yaml (changed only in the lines commented with "abasu"):

[QUOTE]
#==============================================================================
# Common
#==============================================================================

# abasu : modified for master node
jobmanager.rpc.address: edel-59.grenoble.grid5000.fr

jobmanager.rpc.port: 6123

jobmanager.heap.mb: 256

taskmanager.heap.mb: 512

# abasu : modified from -1 according to configuration pages
taskmanager.numberOfTaskSlots: 1

# abasu : modified from 1 according to advice of Robert Metzger on configuration
parallelization.degree.default: 20

#==============================================================================
# Web Frontend
#==============================================================================

jobmanager.web.port: 8081

webclient.port: 8080

#==============================================================================
# Advanced
#==============================================================================

# The number of buffers for the network stack.
#
# taskmanager.network.numberOfBuffers: 2048

# Directories for temporary files.
#
# Add a delimited list for multiple directories, using the system directory
# delimiter (colon ':' on unix) or a comma, e.g.:
# /data1/tmp:/data2/tmp:/data3/tmp
#
# Note: Each directory entry is read from and written to by a different I/O
# thread. You can include the same directory multiple times in order to create
# multiple I/O threads against that directory. This is for example relevant for
# high-throughput RAIDs.
#
# If not specified, the system-specific Java temporary directory (java.io.tmpdir
# property) is taken.
#
# taskmanager.tmp.dirs: /tmp

# Path to the Hadoop configuration directory.
#
# This configuration is used when writing into HDFS. Unless specified otherwise,
# HDFS file creation will use HDFS default settings with respect to block-size,
# replication factor, etc.
#
# You can also directly specify the paths to hdfs-default.xml and hdfs-site.xml
# via keys 'fs.hdfs.hdfsdefault' and 'fs.hdfs.hdfssite'.
#
# abasu : uncommented
fs.hdfs.hadoopconf: /opt/hadoop/conf/
[/QUOTE]

A suggestion: for reading parameters from the conf file, especially "parallelization.degree.default" and perhaps also "taskmanager.numberOfTaskSlots", would it be possible to have the following behaviour (if not already implemented)? If the value is set to a positive number, the JobManager uses it exactly as set by the user. If it is set to -1, the JobManager reads the number of nodes used and the number of cores on each node, then does simple arithmetic to calculate the degree of parallelisation, and perhaps also the number of tasks. Of course, the user remains free to use a different configuration via the "-m" option. What do you think?

Thanks in advance,
Anirvan
Hi,

the configuration file looks correct now. Did you restart Flink after you updated the configuration? Also, the config needs to be changed on all cluster nodes (we use an NFS-shared folder; you can also use rsync to synchronize the files from the "master" (JobManager) to all the "worker" nodes (TaskManagers)).

Are there other configuration files in the conf/ directory? I think our code loads all configuration files by their extension, so if there is a flink-conf-defaults.yaml in the same dir, it might still contain the erroneous values.

Regarding your suggestion: we already implemented part of it. If the user sets the DOP to -1, we print a warning and set the DOP to 1. We discussed setting "numberOfTaskSlots" to the number of CPUs by default, but that would basically mean that Flink grabs all the resources on the cluster. We therefore decided to be conservative and use only one core by default.

By the way, I would recommend giving the Flink TaskManagers (and also the JobManager) a bit more memory. Your current settings are:

jobmanager.heap.mb: 256
taskmanager.heap.mb: 512

These are MBs, so the TaskManagers will only have 512 MB. If your cluster nodes have, say, 20 GB, set the TaskManager heap space to 15000. This will lead to huge performance improvements for your jobs!

I hope restarting resolves the issue.

Best,
Robert
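A hedged sketch of the synchronization and heap settings described above; the worker hostnames, paths and sizes are placeholders, and the start/stop scripts are assumed to be the standard ones in bin/:

    # push the updated config from the JobManager node to the workers (illustrative hostnames)
    for host in worker1 worker2 worker3; do
        rsync -av ./flink/conf/ "${host}":flink/conf/
    done

    # example heap settings in conf/flink-conf.yaml for nodes with roughly 20 GB of RAM:
    #   jobmanager.heap.mb: 1024
    #   taskmanager.heap.mb: 15000

    # restart so the new values take effect
    ./flink/bin/stop-cluster.sh
    ./flink/bin/start-cluster.sh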
Thanks Robert for all the helpful tips!
Yes, of course, I did restart Flink every time. Generally, on Grid'5000 we reserve some nodes and then run our experiments (including installing Hadoop, Flink, etc.). During the reservation, all nodes are NFS-mounted from the access site, so all files and folders are mirrored on each node; there is no need to scp or rsync. I usually reserve 10 nodes, install Flink on the same nodes as Hadoop, and run the JobManager on the Hadoop namenode (master). I then ssh to the JobManager node to launch my jobs.

I'm not sure if there was some issue in the configuration after the last overhaul of Grid'5000 this Monday. Anyway, I cleaned up everything, downloaded Flink 0.6 again and installed it on Hadoop 1.2.1. The JobManager and TaskManagers start fine. The problem then appears when I run a simple command such as:

abasu@adonis-10:~$ ./flink/bin/flink run -v ./flink/examples/flink-java-examples-0.6-incubating-WordCount.jar file:///home/abasu/examples/wordcount/input/gutenberg.txt file:///home/abasu/examples/wordcount/output/

Error: The program execution failed: org.apache.flink.runtime.jobmanager.scheduler.SchedulingException: Not enough slots to schedule job 72eda81dfd32200015712eaafd609000
    at org.apache.flink.runtime.jobmanager.scheduler.DefaultScheduler.scheduleJob(DefaultScheduler.java:155)
    at org.apache.flink.runtime.jobmanager.JobManager.submitJob(JobManager.java:510)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flink.runtime.ipc.RPC$Server.call(RPC.java:422)
    at org.apache.flink.runtime.ipc.Server$Handler.run(Server.java:958)

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: org.apache.flink.runtime.jobmanager.scheduler.SchedulingException: Not enough slots to schedule job 72eda81dfd32200015712eaafd609000
    at org.apache.flink.runtime.jobmanager.scheduler.DefaultScheduler.scheduleJob(DefaultScheduler.java:155)
    at org.apache.flink.runtime.jobmanager.JobManager.submitJob(JobManager.java:510)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flink.runtime.ipc.RPC$Server.call(RPC.java:422)
    at org.apache.flink.runtime.ipc.Server$Handler.run(Server.java:958)
    at org.apache.flink.client.program.Client.run(Client.java:325)
    at org.apache.flink.client.program.Client.run(Client.java:291)
    at org.apache.flink.client.program.Client.run(Client.java:285)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:54)
    at org.apache.flink.example.java.wordcount.WordCount.main(WordCount.java:82)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:389)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:307)
    at org.apache.flink.client.program.Client.run(Client.java:244)
    at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:332)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:319)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:930)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:954)

The conf file is as follows (with your suggestions). There are no default conf files in the conf folder.

#==============================================================================
# Common
#==============================================================================

# abasu : modified for master node
jobmanager.rpc.address: adonis-10.grenoble.grid5000.fr

jobmanager.rpc.port: 6123

jobmanager.heap.mb: 1024

taskmanager.heap.mb: 8192

# abasu : modified from -1 according to configuration pages
taskmanager.numberOfTaskSlots: 1

# abasu : modified from 1 according to advice of Robert Metzger on configuration
parallelization.degree.default: 20

#==============================================================================
# Web Frontend
#==============================================================================

jobmanager.web.port: 8081

webclient.port: 8080

#==============================================================================
# Advanced
#==============================================================================

# The number of buffers for the network stack.
#
# taskmanager.network.numberOfBuffers: 2048

# Directories for temporary files.
#
# Add a delimited list for multiple directories, using the system directory
# delimiter (colon ':' on unix) or a comma, e.g.:
# /data1/tmp:/data2/tmp:/data3/tmp
#
# Note: Each directory entry is read from and written to by a different I/O
# thread. You can include the same directory multiple times in order to create
# multiple I/O threads against that directory. This is for example relevant for
# high-throughput RAIDs.
#
# If not specified, the system-specific Java temporary directory (java.io.tmpdir
# property) is taken.
#
# taskmanager.tmp.dirs: /tmp

# Path to the Hadoop configuration directory.
#
# This configuration is used when writing into HDFS. Unless specified otherwise,
# HDFS file creation will use HDFS default settings with respect to block-size,
# replication factor, etc.
#
# You can also directly specify the paths to hdfs-default.xml and hdfs-site.xml
# via keys 'fs.hdfs.hdfsdefault' and 'fs.hdfs.hdfssite'.
#
# abasu : uncommented
fs.hdfs.hadoopconf: /opt/hadoop/conf/

Thanks in advance for your help. It seems like I'm going a few steps backward now. FYI, Hadoop runs well; I tested TeraSort.

Best!
Anirvan
Hi,

Flink has so-called "slots" to execute tasks (smaller parts of a job) in. Every TaskManager has n slots; n is set by the configuration value "taskmanager.numberOfTaskSlots". In your case, every TaskManager provides one slot. Since you have 10 TaskManagers, your cluster provides a total of 10 slots when started.

The bin/flink tool allows users to pass the "-p" argument for the parallelism. So if you start a WordCount with -p 5, it will use 5 slots. If the user does not specify the "-p" argument, we use the "parallelization.degree.default" value. In your case, you've set it to 20 and provided no "-p" argument; hence, your job failed with the exception.

To resolve the issue, either set parallelization.degree.default to "10" or pass "-p 10" to bin/flink.

Cheers,
Robert
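Either fix is essentially a one-liner; a sketch, with the paths taken from the failing command above:

    # option 1: in conf/flink-conf.yaml (followed by a cluster restart)
    #   parallelization.degree.default: 10

    # option 2: pass an explicit parallelism for this job only
    ./flink/bin/flink run -p 10 ./flink/examples/flink-java-examples-0.6-incubating-WordCount.jar \
        file:///home/abasu/examples/wordcount/input/gutenberg.txt \
        file:///home/abasu/examples/wordcount/output/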
Hello Robert, Stephan and others,

Hope you had a pleasurable weekend! Thanks for your helpful instructions and guidance on the operational and configuration aspects of Flink. Based on them, I did several "Flinking" experiments this weekend, and here are a few points to note:

1. For the config parameter "parallelization.degree.default", when using a cluster of n nodes, it can be set to a maximum of n-1. When we reach n, it gives the "insufficient slots" error message. So perhaps the nth node is reserved for the JobManager's administrative tasks (??) (something similar to Hadoop's number of reducers set by the option -D mapred.reduce.tasks=n). It would be helpful if, following the Hadoop MapReduce framework, the value were capped at n-1 when this parameter is wrongly set to >= n (where n is the number of nodes).

2. This follows our previous discussion (part of it is already implemented, as I understand). Here in Grid'5000 (as well as in some other grid-based systems I have seen), each new reservation is assigned different types of nodes (unless otherwise explicitly requested). Meaning, in one reservation of 10 nodes I could have 16-core nodes, while in another reservation a mix of 4-core and 8-core nodes, and so on. In such circumstances, would it be possible to either use a configured value or calculate the total number of available cores automatically? If set to a positive value, the JobManager uses the value explicitly set by the user; if set to -1, the JobManager reads the number of nodes used and the number of cores on each node, then does simple arithmetic to calculate the degree of parallelisation, and perhaps also the number of tasks. Of course, I agree with you that you do not want a single Flink job to swallow that many resources (processor power); in that case, use an additional percentage value that limits the maximum usage of CPU cores. (Again, this idea can also go in circles, because somebody could argue for a node-specific percentage value instead of a global one for all CPUs. This scenario arises when some nodes in the cluster are serving other frameworks/jobs: Hadoop, Spark, etc.) I'll let you debate which would be a quick and easy solution.

3. In the Flink runtime, is there any protocol for backing up "straggler" tasks? For example, in a workflow where subsequent join tasks are waiting on a "straggling" reduce task on a particular node, how (based on what protocol) does the JobManager back up the straggler task and free the bottleneck? In Hadoop, there are different policies for culling or backing up (some implemented with additional modules like "Mantri"; I'm sure you are familiar with the literature on this topic).

Thanks in advance and Happy Flinking!
Anirvan
Hi Anirvan!

Thanks for your observations and questions. Here are some comments:

1) Flink's resource management is based on slots, like Hadoop's slots for mappers and reducers (we have only one type of slot, though). The number of slots that each machine offers is defined in the config under "taskmanager.numberOfTaskSlots". So in total you have #machines x #slotsPerMachine slots available. You can always check the number of registered TaskManagers and the number of available slots in the web frontend (by default http://<jobmanager>:8081). You can set the parallelism as high as the number of slots you have. No slot is reserved for the JobManager; maybe one machine did not come up properly.

2) We do not reserve heterogeneous resources. A job occupies a number of slots (some, all, whatever), that is all. The managed memory is distributed by each TaskManager among its slots. You can define a heterogeneous cluster, though: the config on some nodes may say that the TaskManager offers 4 slots, while on other machines it offers 8 slots.

3) We do not yet have a "straggler" resolution, like speculative execution or so.

Did that answer your questions? Let us know if you have more questions!

Greetings,
Stephan
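To make the slot arithmetic from point 1) concrete, a small sketch; the counts and the hostname are illustrative, borrowed from the cluster described earlier in the thread:

    # total slots = #TaskManagers x taskmanager.numberOfTaskSlots
    # e.g. 10 TaskManagers x 1 slot each:
    echo "available slots: $((10 * 1))"   # a parallelism above this number cannot be scheduled

    # registered TaskManagers and free slots are also visible in the web frontend
    JOBMANAGER_HOST=adonis-10.grenoble.grid5000.fr   # illustrative hostname
    curl "http://${JOBMANAGER_HOST}:8081"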
Hello Robert, Stephan et al, Hope you are doing fine in Berlin. I am getting back to you on my previous problem on the WebLogAnalysis example, after a long time. We are currently using Flink 0.7.0 over a 10-node cluster in Manager-Worker configuration. We ran the following command: $ ./flink/bin/flink run flink/examples/flink-java-examples-0.7.0-incubating-WebLogAnalysis.jar file:///home/abasu/examples/Weblogs/documents file:///home/abasu/examples/Weblogs/ranks file:///home/abasu/examples/Weblogs/visits file:///home/abasu/examples/Weblogs/result For the documents, rank and visits files, we used the data generated by you from this link: The program executed with the following output: 11/04/2014 14:58:12: Job execution switched to status RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (2/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (2/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/9) 
switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) 
(5/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (5/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (6/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (6/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter 
(org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (4/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (4/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/9) switched to SCHEDULED 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (2/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV 
Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (5/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (6/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/9) switched to SCHEDULED 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/9) switched to SCHEDULED 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (3/9) switched to SCHEDULED 11/04/2014 14:58:12: 
Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (3/9) switched to DEPLOYING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/9) switched to SCHEDULED 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/9) switched to DEPLOYING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (4/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/9) switched to RUNNING 11/04/2014 14:58:12: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (3/9) switched to RUNNING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/9) switched to SCHEDULED 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/9) switched to SCHEDULED 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/9) switched to SCHEDULED 11/04/2014 14:58:12: 
Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/9) switched to RUNNING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/9) switched to SCHEDULED 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/9) switched to RUNNING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/9) switched to SCHEDULED 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/9) switched to SCHEDULED 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/9) switched to RUNNING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/9) switched to SCHEDULED 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/9) switched to DEPLOYING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/9) switched to SCHEDULED 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/9) switched to DEPLOYING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/9) switched to RUNNING 11/04/2014 14:58:12: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/9) switched to RUNNING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/9) switched to SCHEDULED 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/9) switched to DEPLOYING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/9) switched to RUNNING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/9) switched to SCHEDULED 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/9) switched to DEPLOYING 11/04/2014 14:58:12: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/9) switched to RUNNING 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/9) switched to SCHEDULED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/9) switched to DEPLOYING 11/04/2014 14:58:13: CoGroup 
(org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/9) switched to RUNNING 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/9) switched to SCHEDULED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/9) switched to DEPLOYING 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/9) switched to RUNNING 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/9) switched to SCHEDULED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/9) switched to DEPLOYING 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/9) switched to RUNNING 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/9) switched to SCHEDULED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/9) switched to DEPLOYING 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to SCHEDULED 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to DEPLOYING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to SCHEDULED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to DEPLOYING 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to SCHEDULED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to DEPLOYING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to SCHEDULED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to DEPLOYING 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/9) switched to FINISHED 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/9) switched to FINISHED 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to SCHEDULED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to DEPLOYING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to 
SCHEDULED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to DEPLOYING 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (3/9) switched to FINISHED 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to SCHEDULED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to DEPLOYING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to SCHEDULED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to DEPLOYING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to RUNNING 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to RUNNING 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (5/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (4/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter 
(org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (6/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/9) switched to FINISHED 11/04/2014 14:58:13: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to SCHEDULED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to DEPLOYING 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (2/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter 
(org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/9) switched to FINISHED 11/04/2014 14:58:13: CHAIN DataSource (CSV Input (|) file:/home/abasu/examples/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to RUNNING 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: 
file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/9) switched to FINISHED 11/04/2014 14:58:13: DataSink(CsvOutputFormat (path: file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to FINISHED 11/04/2014 14:58:13: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/9) switched to FINISHED 11/04/2014 14:58:13: Job execution switched to status FINISHED
The result directory (file:/home/abasu/examples/Weblogs/result) was created with 9 files (named 1 to 9), and all of these files are empty! Hence my naive question: is this the expected output? Or what should the output of an error-free run look like? Please let me know where we are going wrong. If possible, do you have other generated data to try the WebLogAnalysis example with? Thanks in advance for your advice and help, Anirvan
Hi Anirvan, you specify input and output as files in the local file system (file:///). Each worker needs access to all input files, which means that each worker needs (a copy of) these files in its local file system. The common setup for running Flink on a distributed cluster is to use a distributed data store such as HDFS (or any other data store that can be accessed by each node). Using a shared file system (like NFS) that is mounted on each worker would also work, but remember that all nodes will then concurrently read from and write to the shared system. Have you checked the local file systems on all workers for output? Did the job process any data at all? The job finishes within 1 second (which is still possible for very small input data). You can change the example program to write its output to stdout by replacing writeAsCsv() with print(). The stdout of all workers is redirected to the ./log/*.out files. Best, Fabian 2014-11-04 16:08 GMT+01:00 Anirvan BASU <[hidden email]>:
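To make the suggested change concrete, here is a minimal sketch of the swap inside the example's main() method. This is only a fragment, and the names result, env and the exact writeAsCsv() arguments are assumptions based on the example's structure; check WebLogAnalysis.java for the actual ones.

// Original sink of the example (shape assumed): write the result as '|'-delimited CSV.
// result.writeAsCsv(outputPath, "\n", "|");

// Debug variant: send the result to stdout instead. On the cluster, each
// TaskManager's stdout is redirected to its log/*.out file, as described above.
result.print();

// In Flink 0.6/0.7 print() is a lazy sink, so the program still needs env.execute();
// in later releases print() triggers execution on its own and this call must be dropped.
env.execute("WebLogAnalysis with stdout output");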
Hello Fabian, Thank you for your prompt reply. I did as you explained, this time using the HDFS store instead of our NFS.
1. I copied all the input files from NFS to the HDFS store using the copyFromLocal option. As you can see, the input files are present in the HDFS store:
$ hadoop dfs -ls flink/Weblogs
Found 3 items
-rw-r--r-- 2 abasu hadoop 21534 2014-11-05 12:09 /user/abasu/flink/Weblogs/documents
-rw-r--r-- 2 abasu hadoop 1333 2014-11-05 12:09 /user/abasu/flink/Weblogs/ranks
-rw-r--r-- 2 abasu hadoop 840961 2014-11-05 12:09 /user/abasu/flink/Weblogs/visits
2. I ran the Flink WebLogAnalysis example jar with the following command:
$ ./flink/bin/flink run flink/examples/flink-java-examples-0.7.0-incubating-WebLogAnalysis.jar hdfs:///user/abasu/flink/Weblogs/documents hdfs:///user/abasu/flink/Weblogs/ranks hdfs:///user/abasu/flink/Weblogs/visits hdfs:///user/abasu/flink/Weblogs/result
3. Below is part of the output (please scroll down to see the end of my email :-) ):
... 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (5/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (5/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection
[0]) (6/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (6/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (10/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (10/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) 
(4/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (4/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (10/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (10/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (2/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection 
[0]) (2/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (10/10) switched to SCHEDULED 11/05/2014 11:13:43: 
CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (10/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (5/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (6/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (10/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (4/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/10) switched 
to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (10/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (2/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (3/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) 
(3/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (3/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (10/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup 
(org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (10/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (10/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (10/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (10/10) switched to SCHEDULED 11/05/2014 11:13:43: CoGroup 
(org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (10/10) switched to DEPLOYING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/10) switched to RUNNING 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (10/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (4/10) switched to SCHEDULED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (4/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (7/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (2/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (2/10) switched to FINISHED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (4/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (2/10) switched to DEPLOYING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (7/10) switched to SCHEDULED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (1/10) switched to SCHEDULED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (3/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (1/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (1/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (3/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (3/10) switched to DEPLOYING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (4/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (5/10) switched to SCHEDULED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (7/10) switched to DEPLOYING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (5/10) switched to DEPLOYING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (6/10) switched to SCHEDULED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (5/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (2/10) switched to RUNNING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (6/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (1/10) switched to 
RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (6/10) switched to DEPLOYING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (3/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (7/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (5/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (6/10) switched to RUNNING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (2/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (5/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (1/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (6/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (9/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (7/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (3/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (8/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (4/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (9/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (2/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/documents) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords) -> Map (Projection [0]) (10/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter 
(org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (3/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (1/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (6/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (5/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (7/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (4/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (8/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/ranks) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank) (10/10) switched to FINISHED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (8/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (8/10) switched to SCHEDULED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (8/10) switched to DEPLOYING 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (9/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (9/10) switched to SCHEDULED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (9/10) switched to DEPLOYING 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (10/10) switched to FINISHED 11/05/2014 11:13:43: Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction) (10/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (2/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (5/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (3/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (10/10) switched to SCHEDULED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input 
(|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (1/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (8/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (7/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (4/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (6/10) switched to FINISHED 11/05/2014 11:13:43: CHAIN DataSource (CSV Input (|) hdfs:/user/abasu/flink/Weblogs/visits) -> Filter (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate) -> Map (Projection [0]) (9/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (10/10) switched to DEPLOYING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (8/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (9/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (10/10) switched to RUNNING 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (4/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (4/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (3/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (1/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (2/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (3/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (1/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (2/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (5/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (7/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (5/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (7/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: 
hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (6/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (6/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (8/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (8/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (9/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (9/10) switched to FINISHED 11/05/2014 11:13:43: DataSink(CsvOutputFormat (path: hdfs:/user/abasu/flink/Weblogs/result, delimiter: |)) (10/10) switched to FINISHED 11/05/2014 11:13:43: CoGroup (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits) (10/10) switched to FINISHED 11/05/2014 11:13:43: Job execution switched to status FINISHED
4. However, the result files created for the workers' output still remain empty:
$ hadoop dfs -copyToLocal flink/Weblogs/result examples/Weblogs/
$ ls -al examples/Weblogs/result/
total 8
drwxr-xr-x 2 abasu users 4096 Nov 5 13:49 .
drwxr-xr-x 3 abasu users 4096 Nov 5 13:49 ..
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 1
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 10
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 2
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 3
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 4
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 5
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 6
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 7
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 8
-rw-r--r-- 1 abasu users 0 Nov 5 13:49 9
$ cat examples/Weblogs/result/1
$
So I'm wondering what I am missing ...? Thanks in advance for all your help and suggestions, Anirvan
From: "Fabian Hueske" <[hidden email]>
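A quick way to check Fabian's earlier question - whether the job reads any data at all from HDFS - is a small stand-alone counting job. The sketch below is only a diagnostic idea, not part of the Flink distribution; the paths are the ones used above, and the imports follow the 0.7 Java API (package names may differ slightly in other releases).

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class WebLogInputCount {

    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        countLines(env, "hdfs:///user/abasu/flink/Weblogs/documents", "documents");
        countLines(env, "hdfs:///user/abasu/flink/Weblogs/ranks", "ranks");
        countLines(env, "hdfs:///user/abasu/flink/Weblogs/visits", "visits");

        // In Flink 0.6/0.7 print() is a lazy sink, so the job only runs here.
        // In later releases print() executes eagerly and this call must be removed.
        env.execute("WebLog input line count");
    }

    private static void countLines(ExecutionEnvironment env, String path, final String name) {
        DataSet<String> lines = env.readTextFile(path);
        // Turn every line into (input name, 1) and add the counts up per input.
        lines.map(new MapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> map(String line) {
                    return new Tuple2<String, Integer>(name, 1);
                }
            })
            .groupBy(0)
            .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> reduce(Tuple2<String, Integer> a, Tuple2<String, Integer> b) {
                    return new Tuple2<String, Integer>(a.f0, a.f1 + b.f1);
                }
            })
            // On that Flink version the printed tuples show up in the TaskManagers' log/*.out files.
            .print();
    }
}

If all three counts come out as expected, the inputs are readable from the workers, and the empty result has to come from the program logic rather than from HDFS access.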
Hello Fabian and everyone, In my previous post I missed some of your questions from your last email. Here are my replies:
"Have you checked the local file systems on all workers for output?" Yes, I did (in the case of using "file:///address/to/local/file" over NFS). They contained the same empty files.
"Did the job process any data at all? The job finishes within 1 second (which is still possible for very small input data)." The data that was used was provided to me by Robert Metzger; please see the link here: https://github.com/rmetzger/scratch/tree/weblogdataexample/weblog Actually, the first lines in the "ranks" file had some problem with the '|' separators. It may be due to an encoding difference between Linux machines ... the program would always end with an error. So I deleted the top few lines, and then the program finished with status FINISHED but produced empty files :-((
"You can change the example program to write its output to stdout by replacing writeAsCsv() with print(). The stdout of all workers is redirected to the ./log/*.out files." Question to you: What is the location of these stdout ./log/* files? I could not find them anywhere - neither in my local directories nor in the system root. Question to you: Is it possible to change the location of the stdout by changing the conf file flink-conf.yaml? Which exact parameter should I change?
Thanks in advance for all your help, Anirvan
From: "Fabian Hueske" <[hidden email]>
Hi Anirvan, I just checked the data. The data you use and the WebLogAnalysis example program do not work well together and do not give you any results: all tuples are removed by the filters or joins. 2014-11-05 19:42 GMT+01:00 Fabian Hueske <[hidden email]>:
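To see at which step the records disappear, one can temporarily attach a print() sink to each intermediate DataSet inside WebLogAnalysis.java. A minimal sketch, assuming the intermediate DataSets are reachable under names like the ones below (the actual variable names in your copy of the example may differ):

// Temporary debug sinks inside WebLogAnalysis.java - remove them again afterwards.
// The names filterDocs, filterRanks and filterVisits are assumptions for the
// DataSets produced by FilterDocByKeyWords, FilterByRank and FilterVisitsByDate.
filterDocs.print();    // documents that contain the required keywords
filterRanks.print();   // ranks that pass the rank filter
filterVisits.print();  // visits that pass the date filter

The keyword list, rank threshold and date used by these filters are defined inside the example's filter classes, and the final result is only non-empty if some documents pass the keyword and rank filters and also survive the anti-join against the filtered visits. By default, the printed records end up in the log/*.out files under the Flink installation directory on each TaskManager machine.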
Thanks, Fabian, for your deep analysis. So that explains why none of the worker nodes have any data in their result files. Now my question is: do you have any dataset that will yield a non-empty result dataset? I want to use (and modify) it for a demo at EIT ICT Labs, using Flink. Thanks in advance, Anirvan
From: "Fabian Hueske" <[hidden email]>