How to analyze space usage of Flink algorithms


otherwise777
Currently I'm doing some analysis of algorithms that I use in Flink. I'm interested in the space and time it takes to execute them. For the time I used getNetRuntime() on the result returned by the ExecutionEnvironment, but I have no idea how to analyze the amount of space an algorithm uses.
Space can mean different things here, like heap space, disk space, overall memory or allocated memory. I would like to analyze some of these.

Re: How to analyze space usage of Flink algorithms

Fabian Hueske-2
Hi,

the heap memory usage should be available via Flink's metrics system.
Not sure if that also captures spilled data. Chesnay (in CC) should know that.

If the spilled data is not available as a metric, you can try to write a small script that monitors the directories to which Flink spills (config parameter: taskmanager.tmp.dirs [1]).
The script would repeatedly list all files and keep the maximum size seen for each file (files are deleted once they are no longer used). This is not super precise but might be good enough.

Hope this helps,
Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.1/setup/config.html#jobmanager-amp-taskmanager
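
The directory-polling script described in this reply could look roughly like the sketch below. It is only an illustration: the function names, the polling interval and the idea of passing the spill directories in as a list are assumptions, not part of Flink. The directories would be whatever taskmanager.tmp.dirs is set to on the machine being measured.

```python
import os
import time


def scan_once(dirs, max_sizes):
    """Walk each directory once and remember the largest size seen per file."""
    for root_dir in dirs:
        for parent, _subdirs, files in os.walk(root_dir):
            for name in files:
                path = os.path.join(parent, name)
                try:
                    size = os.path.getsize(path)
                except OSError:
                    # File was deleted between listing and stat; skip it.
                    continue
                max_sizes[path] = max(size, max_sizes.get(path, 0))
    return max_sizes


def monitor(dirs, interval_s=1.0, rounds=None):
    """Poll `dirs` every `interval_s` seconds.

    Stops after `rounds` polls, or on Ctrl-C when `rounds` is None.
    Returns a dict mapping file path to the maximum size observed.
    """
    max_sizes = {}
    done = 0
    try:
        while rounds is None or done < rounds:
            scan_once(dirs, max_sizes)
            done += 1
            time.sleep(interval_s)
    except KeyboardInterrupt:
        pass
    return max_sizes
```

Calling `monitor(["/tmp"], interval_s=0.5)` while the job runs and summing the returned values gives the total size of all spill files that were observed, not the peak disk usage at a single moment. Files that are created and deleted between two polls are missed, which is one reason the approach is imprecise.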

2016-12-09 14:12 GMT+01:00 otherwise777 <[hidden email]>:
Currently I'm doing some analysis of algorithms that I use in Flink.
I'm interested in the space and time it takes to execute them. For the time
I used getNetRuntime() on the result returned by the ExecutionEnvironment,
but I have no idea how to analyze the amount of space an algorithm uses.
Space can mean different things here, like heap space, disk space, overall
memory or allocated memory. I would like to analyze some of these.

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-analyze-space-usage-of-Flink-algorithms-tp10555.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.


Re: How to analyze space usage of Flink algorithms

Chesnay Schepler
We do not measure how much data we are spilling to disk.

On 09.12.2016 14:43, Fabian Hueske wrote:
Not sure if that also captures spilled data. Chesnay (in CC) should know that.

Re: How to analyze space usage of Flink algorithms

Greg Hogan
This does sound like a nice feature, both per-job and per-taskmanager bytes written to and read from disk.

On Fri, Dec 9, 2016 at 8:51 AM, Chesnay Schepler <[hidden email]> wrote:
We do not measure how much data we are spilling to disk.



Re: How to analyze space usage of Flink algorithms

otherwise777
In reply to this post by Fabian Hueske-2
Hey Fabian,

Thanks for the quick reply.
I was looking through the Flink metrics [1], but I couldn't find anything there on how to analyze the environment from start to finish, only for functions that extend RichMapFunction.

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/metrics.html#list-of-all-variables

Re: How to analyze space usage of Flink algorithms

Fabian Hueske-2
The system metrics [1] are only available on a system level, i.e. not for an individual job.
The reason is that multiple jobs might run concurrently in the same task manager JVM process, so it would not be possible to separate their heap usage.
The same would be true for the approach that monitors the task manager tmp directory.

You would need to correlate your measurements with the time range in which a job is executed.

Best, Fabian
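
One way to do the correlation mentioned above is to keep timestamped samples of a system-level measurement (heap usage from a reporter, or directory sizes from a polling script) and then look only at the samples that fall inside the job's execution window. The sketch below assumes the samples are already available as (timestamp, value) pairs and that the job's start and end times were recorded separately, for example just before and after the call that executes the job. All names are hypothetical.

```python
def usage_during_job(samples, job_start, job_end):
    """Return (peak, mean) of the sampled values taken while the job ran.

    samples: iterable of (timestamp, value) pairs, in any order.
    job_start, job_end: timestamps in the same unit as the samples.
    Returns None if no sample falls inside the window.
    """
    in_range = [value for t, value in samples if job_start <= t <= job_end]
    if not in_range:
        return None
    return max(in_range), sum(in_range) / len(in_range)
```

For example, `usage_during_job([(0, 100), (1, 300), (2, 500), (3, 200)], 1, 2)` returns `(500, 400.0)`. If other jobs ran on the same task manager during that window, their usage is included in these numbers as well.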

2016-12-16 9:08 GMT+01:00 otherwise777 <[hidden email]>:
Hey Fabian,

Thanks for the quick reply.
I was looking through the Flink metrics [1], but I couldn't find anything
there on how to analyze the environment from start to finish, only for
functions that extend RichMapFunction.

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/metrics.html#list-of-all-variables




Re: How to analyze space usage of Flink algorithms

otherwise777
Thank you for your reply.
I'm afraid I still don't understand. The part I don't understand is how to actually analyze it. It's OK if I can only analyze the system instead of the actual job, but how would I actually do that?
I don't have any function in my program that extends RichFunction as far as I know, so how would I call getRuntimeContext() to print or store it?

Re: How to analyze space usage of Flink algorithms

Fabian Hueske-2
Your functions do not need to implement RichFunction (although each function can be a RichFunction, and it should not be a problem to adapt the job).
The system metrics are collected automatically. Metrics are exposed via a Reporter [1].
So you do not need to take care of the collection, only specify where the collected metrics should be reported to.

Best, Fabian

2016-12-19 9:59 GMT+01:00 otherwise777 <[hidden email]>:
Thank you for your reply.
I'm afraid I still don't understand. The part I don't understand is how
to actually analyze it. It's OK if I can only analyze the system instead of
the actual job, but how would I actually do that?
I don't have any function in my program that extends RichFunction as far as
I know, so how would I call getRuntimeContext() to print or store it?


