The best way to get processing time of each operator?

Folani
I am going to study the effect of parallelism for different operators on
heterogeneous machines.
I need to know the processing time of each operator instance, as well as the
overall processing time of all instances of each specific operator.
I think there are several ways to do this.
However, what is the best way to measure these times as precisely as possible?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: The best way to get processing time of each operator?

Hequn Cheng
Hi Folani,

One option is to use metrics [1].
Each operator can report its processing time as a metric. These metrics can then be gathered and queried later. For example, you can get a metric for a specific TaskManager, or get the max/min/avg value across all TaskManagers.
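
To make the idea concrete, here is a minimal sketch in plain Java, not Flink code. It assumes each parallel operator instance wraps its per-record logic in a timer, and that the per-instance totals are later combined into min/max/avg. The class names `TimedOperatorInstance` and `OperatorTimingSketch` are hypothetical. In a real job the measured values would be registered as metrics through the metric group of the function's runtime context rather than kept in plain fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.function.Function;

/**
 * Hypothetical stand-in for one parallel instance of an operator.
 * Assumes process() is only called from a single thread per instance.
 */
final class TimedOperatorInstance<I, O> {
    private final Function<I, O> userFunction;
    private long totalNanos;   // time spent inside userFunction so far
    private long recordCount;  // number of records processed so far

    TimedOperatorInstance(Function<I, O> userFunction) {
        this.userFunction = userFunction;
    }

    /** Applies the user function and adds the elapsed time to the running total. */
    O process(I record) {
        long start = System.nanoTime();
        try {
            return userFunction.apply(record);
        } finally {
            totalNanos += System.nanoTime() - start;
            recordCount++;
        }
    }

    long totalNanos() { return totalNanos; }

    long recordCount() { return recordCount; }

    double avgNanosPerRecord() {
        return recordCount == 0 ? 0.0 : (double) totalNanos / recordCount;
    }
}

final class OperatorTimingSketch {
    /** Combines per-instance totals into count/min/max/avg/sum for the whole operator. */
    static LongSummaryStatistics summarize(List<? extends TimedOperatorInstance<?, ?>> instances) {
        return instances.stream().mapToLong(i -> i.totalNanos()).summaryStatistics();
    }

    public static void main(String[] args) {
        TimedOperatorInstance<String, Integer> first = new TimedOperatorInstance<>(String::length);
        TimedOperatorInstance<String, Integer> second = new TimedOperatorInstance<>(String::length);
        for (String s : Arrays.asList("a", "bb", "ccc")) {
            first.process(s);
        }
        second.process("dddd");

        LongSummaryStatistics stats = summarize(Arrays.asList(first, second));
        System.out.printf("instances=%d min=%dns max=%dns avg=%.1fns%n",
                stats.getCount(), stats.getMin(), stats.getMax(), stats.getAverage());
    }
}
```

The sum of the per-instance totals gives the overall processing time of the operator, while min and max show how unevenly the instances are loaded, which is the interesting part on heterogeneous machines.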

Best, Hequn



On Mon, Oct 15, 2018 at 10:26 PM Folani <[hidden email]> wrote:
> I am going to study the effect of parallelism for different operators on
> heterogeneous machines.
> I need to know the processing time of each operator instance, as well as the
> overall processing time of all instances of each specific operator.
> I think there are several ways to do this.
> However, what is the best way to measure these times as precisely as possible?



Re: The best way to get processing time of each operator?

Kostas Kloudas
Hi Folani,

Metrics are definitely one way. Another option, depending on your job: if you
have, e.g., ProcessFunctions, you can attach different timestamps to the records
(depending on what you want to measure) and do the computations you need based
on them. This way you can, for example, compute the per-record latency.
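
A rough sketch of the timestamp approach, again in plain Java rather than Flink code: a record is stamped where the measurement should start, and the latency is computed where it should end. The names `Stamped` and `RecordLatencySketch` are hypothetical, and the clock is passed in as a parameter. Note that `System.nanoTime()` values are only comparable within a single JVM, so when the two points sit on different machines a wall clock and synchronized hosts are assumed instead.

```java
import java.util.function.LongSupplier;

/** A record plus the time at which it was stamped (hypothetical wrapper type). */
final class Stamped<T> {
    final T value;
    final long stampNanos;

    Stamped(T value, long stampNanos) {
        this.value = value;
        this.stampNanos = stampNanos;
    }
}

/** Tracks per-record latency between a stamp point and an observe point. */
final class RecordLatencySketch {
    private final LongSupplier clock;  // e.g. System::nanoTime within one JVM
    private long count;
    private long sumNanos;
    private long maxNanos;

    RecordLatencySketch(LongSupplier clock) {
        this.clock = clock;
    }

    /** Call where the measurement should start, e.g. when the record enters an operator. */
    <T> Stamped<T> stamp(T value) {
        return new Stamped<>(value, clock.getAsLong());
    }

    /** Call where the measurement should end; returns this record's latency. */
    long observe(Stamped<?> record) {
        long latency = clock.getAsLong() - record.stampNanos;
        count++;
        sumNanos += latency;
        maxNanos = Math.max(maxNanos, latency);
        return latency;
    }

    long count() { return count; }

    long maxNanos() { return maxNanos; }

    double meanNanos() {
        return count == 0 ? 0.0 : (double) sumNanos / count;
    }

    public static void main(String[] args) {
        RecordLatencySketch tracker = new RecordLatencySketch(System::nanoTime);
        Stamped<String> record = tracker.stamp("event-1");
        String upper = record.value.toUpperCase();  // stand-in for the operator's work
        long latency = tracker.observe(record);
        System.out.println(upper + " took " + latency + " ns");
    }
}
```

Passing the clock in keeps the measurement logic independent of where the timestamps come from, so the same code can run against a fake clock in a test.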

For the overall latency of an operator (all tasks) you have to be
more creative, but I am not sure what the value of measuring it is, since in streaming,
more often than not, you are dealing with infinite streams of incoming data.

Cheers,
Kostas

On Oct 16, 2018, at 3:11 AM, Hequn Cheng <[hidden email]> wrote:

> Hi Folani,
>
> One option is to use metrics [1].
> Each operator can report its processing time as a metric. These metrics can then be gathered and queried later. For example, you can get a metric for a specific TaskManager, or get the max/min/avg value across all TaskManagers.
>
> Best, Hequn


