I'm currently analyzing some algorithms that I use in Flink, and I'm interested in the space and time they take to execute. For the time I used getNetRuntime() on the JobExecutionResult returned by the ExecutionEnvironment, but I have no idea how to analyze the amount of space an algorithm uses.
Space can mean different things here, such as heap space, disk space, overall memory or allocated memory. I would like to analyze some of these. |
Hi, the heap memory usage should be available via Flink's metrics system. [...] taskmanager.tmp.dirs [1]).
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.1/setup/config.html#jobmanager-amp-taskmanager
2016-12-09 14:12 GMT+01:00 otherwise777 <[hidden email]>:
|
We do not measure how much data we are spilling to disk.
On 09.12.2016 14:43, Fabian Hueske wrote:
|
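Since spilled bytes are not measured, one workaround is to look at the temp directories yourself. Below is a minimal sketch in plain Java (no Flink API involved) that sums the sizes of all regular files under a directory. The class name `TempDirUsage` and the idea of polling the directory are assumptions made for illustration; the directory to pass in would be whatever taskmanager.tmp.dirs is set to on the machine in question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper: best-effort size of everything below a directory. */
final class TempDirUsage {

    /** Sums the sizes of all regular files below dir, skipping unreadable entries. */
    static long bytesUnder(Path dir) {
        final long[] total = {0L};
        try {
            Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    if (attrs.isRegularFile()) {
                        total[0] += attrs.size();
                    }
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException exc) {
                    // Temp files may vanish or be unreadable while we walk; skip them.
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        // Assumption: the first argument is the directory configured as taskmanager.tmp.dirs.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(dir + ": " + bytesUnder(dir) + " bytes");
    }
}
```

Calling this before, during and after a job gives only a rough picture: files written and deleted between two calls are never seen, and other processes using the same directory are counted too.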
This does sound like a nice feature: bytes written to and read from disk, both per job and per TaskManager.
On Fri, Dec 9, 2016 at 8:51 AM, Chesnay Schepler <[hidden email]> wrote:
|
Hey Fabian,
Thanks for the quick reply. I was looking through the Flink metrics [1], but I couldn't find anything there on how to analyze the environment from start to finish, only for functions that extend RichMapFunction.
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/metrics.html#list-of-all-variables
|
The system metrics [1] are only available at the system level, i.e. not for an individual job. The reason is that multiple jobs might run concurrently in the same TaskManager JVM process, so it would not be possible to separate their heap usage.
2016-12-16 9:08 GMT+01:00 otherwise777 <[hidden email]>:
|
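To illustrate why heap usage is a per-process figure, here is a small sketch using only the JDK's standard `java.lang.management` API. It is not Flink code, and the class name `JvmHeapSnapshot` is made up for this example. The numbers it reads cover everything running in the JVM, so two jobs sharing one process cannot be told apart this way.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: reads heap figures for the whole JVM process, not for one job. */
final class JvmHeapSnapshot {

    /** Heap bytes currently used by everything running in this JVM. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Heap bytes the JVM currently has committed for its own use. */
    static long committedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getCommitted();
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %d bytes, committed: %d bytes%n",
                usedHeapBytes(), committedHeapBytes());
    }
}
```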
Thank you for your reply.
I'm afraid I still don't understand it. The part I don't understand is how to actually analyze it. It's OK if I can only analyze the system instead of the actual job, but how would I actually do that? I don't have any function in my program that extends RichFunction, AFAIK, so how would I call getRuntimeContext() to print or store it? |
Your functions do not need to implement RichFunction (although each function can be a RichFunction, and it should not be a problem to adapt the job). The system metrics are collected automatically. Metrics are exposed via a reporter [1], so you do not need to take care of the collection but only specify where the collected metrics should be reported to.
2016-12-19 9:59 GMT+01:00 otherwise777 <[hidden email]>:
|
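For quick local experiments, a rough stand-in for a reporter is to sample the JVM heap periodically and keep the peak. The sketch below does that with plain JDK classes. It is not Flink's reporter API, and `PeakHeapSampler` is a hypothetical name. It assumes the job runs in the same JVM as the sampler, as with a local execution environment.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: tracks the highest JVM-wide heap usage seen while it runs. */
final class PeakHeapSampler {
    private final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
    private final AtomicLong peakBytes = new AtomicLong();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "peak-heap-sampler");
                t.setDaemon(true); // never keeps the JVM alive on its own
                return t;
            });

    PeakHeapSampler(long periodMillis) {
        sample(); // one reading right away, then one per period
        scheduler.scheduleAtFixedRate(this::sample, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    private void sample() {
        long used = memory.getHeapMemoryUsage().getUsed();
        peakBytes.accumulateAndGet(used, Math::max);
    }

    /** Highest sampled heap usage so far, in bytes. */
    long peakBytes() {
        return peakBytes.get();
    }

    /** Stops sampling and takes one final reading. */
    void close() {
        scheduler.shutdownNow();
        sample();
    }

    public static void main(String[] args) throws InterruptedException {
        PeakHeapSampler sampler = new PeakHeapSampler(50);
        try {
            // Placeholder workload; in a local run the job execution would go here.
            byte[][] blocks = new byte[64][];
            for (int i = 0; i < blocks.length; i++) {
                blocks[i] = new byte[1 << 20];
                Thread.sleep(5);
            }
        } finally {
            sampler.close();
        }
        System.out.println("peak heap: " + sampler.peakBytes() + " bytes");
    }
}
```

Sampling can miss short spikes between two readings, and on a cluster this would only see the client JVM, not the TaskManagers. For those, a configured reporter remains the way to go.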