I just realized that Flink programs take a long time to run. For example, the simple word count example in 0.9 takes 18s to run on my laptop (MacBook Pro, Mac OS 10.9, i5, 8 GB RAM, SSD).
Can anyone explain this or suggest a workaround? |
Hi, that depends. How are you executing the program? Inside an IDE? By starting a local cluster? And then, how big is your input data?
Cheers, Aljoscha
On Wed, 15 Jul 2015 at 23:45 Vinh June <[hidden email]> wrote:
> I just realized that Flink program takes a lot of time to run, for example,
|
I ran it locally, from the terminal.
And it's the Word Count example, so the input is small. |
Hi Vinh,
If you run your program locally, Flink uses the local execution mode, which allocates only a small amount of managed memory. Managed memory is what Flink uses to perform operations on serialized data. These operations can get slow if too little memory is allocated, because data then needs to be spilled to disk. That would of course be different in a cluster environment, where you configure the memory explicitly.
10:12:37,655 INFO org.apache.flink.runtime.taskmanager.TaskManager - Using 1227 MB for Flink managed memory.
On Thu, Jul 16, 2015 at 8:54 AM, Vinh June <[hidden email]> wrote:
> I ran it on local, from terminal.
|
Hi Max,
When I call 'flink run', it doesn't show any information like that |
Hey Vinh,
you have to look into the logs folder and find the log of the TaskManager (something like *taskmanager*.log)
– Ufuk
On 16 Jul 2015, at 11:35, Vinh June <[hidden email]> wrote:
> Hi Max,
> When I call 'flink run', it doesn't show any information like that
>
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2083.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.
|
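If digging through the log folder by hand is tedious, the lookup can be scripted. The following is only a sketch, not part of Flink: it assumes the log line has the format quoted in this thread ("Using N MB for Flink managed memory") and that the logs sit in a folder named `log` relative to where the script runs. The function names are made up for illustration. It scans every `*.log` file rather than only the TaskManager's, in case the line ends up in a different file.

```python
import re
from pathlib import Path

# Matches the line format quoted in this thread, e.g.
# "... TaskManager - Using 25 MB for Flink managed memory."
MANAGED_MEMORY = re.compile(r"Using (\d+) MB for Flink managed memory")


def managed_memory_mb(log_text):
    """Return every managed-memory size (in MB) reported in the given log text."""
    return [int(mb) for mb in MANAGED_MEMORY.findall(log_text)]


def scan_log_dir(log_dir):
    """Map each *.log file in log_dir to the managed-memory sizes it reports."""
    results = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        sizes = managed_memory_mb(path.read_text(errors="replace"))
        if sizes:
            results[path.name] = sizes
    return results


if __name__ == "__main__":
    # "log" is an assumed location; point this at your Flink log folder.
    for name, sizes in scan_log_dir("log").items():
        print(name, sizes)
```

A very small number here would be consistent with the spilling problem described above; it does not prove that memory is the cause of a slow run.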
In reply to this post by Vinh June
Vinh,
Are you using the sample data built into the example, or are you using your own data?
On Thu, Jul 16, 2015 at 8:54 AM, Vinh June <[hidden email]> wrote:
> I ran it on local, from terminal.
|
In reply to this post by Ufuk Celebi
I found it in the JobManager log:
"21:16:54,986 INFO org.apache.flink.runtime.taskmanager.TaskManager - Using 25 MB for Flink managed memory."
Is there a way to set this explicitly for local execution? |
You can increase Flink's managed memory by increasing the TaskManager JVM heap size (taskmanager.heap.mb) in flink-conf.yaml.
The options are explained in the Flink documentation [1].
Regards, Chiwan Park
[1] https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#common-options
> On Jul 16, 2015, at 7:23 PM, Vinh June <[hidden email]> wrote:
>
> I found it in JobManager log
>
> "21:16:54,986 INFO org.apache.flink.runtime.taskmanager.TaskManager
> - Using 25 MB for Flink managed memory."
>
> is there a way to explicitly assign this for local ?
|
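For reference, a minimal sketch of what that setting could look like. The key name is the one given in the post above. The value 1024 is only an illustrative assumption, not a recommendation from this thread, and the file is assumed to be in the conf folder of the Flink installation.

```yaml
# conf/flink-conf.yaml (assumed location)
# TaskManager JVM heap size in MB; 1024 is an example value only.
taskmanager.heap.mb: 1024
```

The configuration is read when the TaskManager starts, so a local cluster that is already running has to be restarted before the new value shows up in the log.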
In reply to this post by Stephan Ewen
@Stephan: I use the sample data that comes with the example
|
If you use the sample data from the example, there must be an issue with the setup. In Flink's standalone mode, it runs in 100ms on my machine.
It may be that the command line client takes a long time to start up, so the program run time only appears to be long. If startup takes that long, one reason may be slow DNS resolution. You can check that by looking at the logs of the client process (in the "log" folder).
Stephan
On Thu, Jul 16, 2015 at 2:06 PM, Vinh June <[hidden email]> wrote:
> @Stephan: I use the sample data comes with the sample
|
Here are my logs:
http://pastebin.com/AJwiy2D8
http://pastebin.com/K05H3Qur
From the client log, it seems to take ~2s, but with "time flink run ...", the actual time is ~18s. |
Is it possible that it takes a long time to spawn JVMs on your system, and that this takes up all the time?
On Thu, Jul 16, 2015 at 3:34 PM, Vinh June <[hidden email]> wrote:
> Here are my logs
|
I just checked the JobManager web interface. It says the runtime of the Flink job is 349ms, but the run actually takes 18s according to the "time" command in the terminal.
Should I care more about the latter timing? |
The 349ms is how long it takes to run the job. The 18s is how long it takes the command line client to submit the job.
Like I said before, maybe there are very long delays on your system when you spawn JVMs, or in your DNS resolution. That way, connecting to the cluster to submit the job will take a long time...
On Thu, Jul 16, 2015 at 5:53 PM, Vinh June <[hidden email]> wrote:
> I just checked on web job manager, it says that runtime for flink job is
|
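One way to narrow down the two suspects named above (JVM spawn time and name resolution) is to time each of them outside of Flink. The script below is a rough sketch under stated assumptions: it assumes a `java` executable is on the PATH, and it uses a lookup of the machine's own hostname as a stand-in for whatever resolution the client performs, which may not be the same call. The helper names are invented for this example.

```python
import socket
import subprocess
import time


def timed(action):
    """Run action() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = action()
    return result, time.perf_counter() - start


def resolve_local_hostname():
    """Resolve this machine's own hostname (a stand-in for the client's lookup)."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError as exc:
        return f"lookup failed: {exc}"


def start_jvm():
    """Start a JVM that exits right away; assumes `java` is on the PATH."""
    try:
        return subprocess.run(["java", "-version"], capture_output=True).returncode
    except FileNotFoundError:
        return None  # no `java` executable was found


if __name__ == "__main__":
    checks = [("hostname lookup", resolve_local_hostname), ("JVM startup", start_jvm)]
    for label, action in checks:
        result, seconds = timed(action)
        print(f"{label}: {seconds:.3f}s -> {result}")
```

If either number is in the range of seconds, that step is a plausible contributor to the gap between the 349ms job runtime and the 18s wall-clock time. If both are fast, the delay is probably somewhere else in the client startup.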
That sounds unreasonable to me, because I'm also working on other Java projects, and none of them takes that long to fire up a JVM. Strange!
Do you have any suggestions to fix this? |
Hi,
actually the same happens to me on my MacBook Pro when it is running on battery rather than plugged in, and doubly so if I am using HDFS.
In my case it seems that in power saving mode JVM commands have a very high latency, i.e. a simple "hdfs dfs -ls /" takes about 20 seconds when only on battery, so it is not related to Flink.
Cheers
> On 18 Jul 2015, at 23:22, Vinh June <[hidden email]> wrote:
>
> it sounds unreasonable for me, because I'm working on other Java projects
> also, non of them takes that long to fire up JVM. Strange !
> Do you have any suggestion to fix this ?
|