Metrics for received records per TaskManager

Benjamin Burkhardt-2
Hi all,

I’m looking for a metric that lets me keep track of the records or bytes each TaskManager has received or processed for the current task.

Can anyone help me get this?

Thanks.

Benjamin

Re: Metrics for received records per TaskManager

Yun Tang
Hi Benjamin

I think 'numBytesInLocalPerSecond' and 'numBytesInRemotePerSecond' could help you. They report 'The number of bytes this task reads from a local source per second' and 'The number of bytes this task reads from a remote source per second', respectively. If you want to track this information per TaskManager, group the metrics by the tag 'tm_id'.


Best
Yun Tang


From: Benjamin Burkhardt <[hidden email]>
Sent: Tuesday, April 2, 2019 15:00
To: [hidden email]
Subject: Metrics for received records per TaskManager
 

Re: Metrics for received records per TaskManager

Benjamin Burkhardt-2
Hi Yun,

Thanks for the hint. I tried to access the metric through the REST API by calling http://localhost:8081/taskmanagers/2264f296385854f2d1fb4d121495822a/metrics?get=numBytesInRemotePerSecond.

Unfortunately, the metric is not available.

Only these are available:
[{"id":"Status.Network.AvailableMemorySegments"},{"id":"Status.JVM.Memory.NonHeap.Committed"},{"id":"Status.JVM.Memory.Mapped.TotalCapacity"},{"id":"Status.JVM.Memory.NonHeap.Used"},{"id":"Status.JVM.GarbageCollector.G1_Old_Generation.Count"},{"id":"Status.Network.TotalMemorySegments"},{"id":"Status.JVM.Memory.Direct.MemoryUsed"},{"id":"Status.JVM.Memory.Mapped.MemoryUsed"},{"id":"Status.JVM.CPU.Time"},{"id":"Status.JVM.GarbageCollector.G1_Young_Generation.Count"},{"id":"Status.JVM.Threads.Count"},{"id":"Status.JVM.GarbageCollector.G1_Old_Generation.Time"},{"id":"Status.JVM.Memory.Direct.TotalCapacity"},{"id":"Status.JVM.Memory.Heap.Committed"},{"id":"Status.JVM.ClassLoader.ClassesLoaded"},{"id":"Status.JVM.Memory.Mapped.Count"},{"id":"Status.JVM.Memory.Direct.Count"},{"id":"Status.JVM.CPU.Load"},{"id":"Status.JVM.Memory.Heap.Used"},{"id":"Status.JVM.Memory.Heap.Max"},{"id":"Status.JVM.ClassLoader.ClassesUnloaded"},{"id":"Status.JVM.GarbageCollector.G1_Young_Generation.Time"},{"id":"Status.JVM.Memory.NonHeap.Max"}]
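
A listing like the one above can be checked mechanically by parsing the JSON and testing for the metric name. The following is only a minimal Python sketch added for illustration; the helper name `available_metric_ids` and the shortened sample payload are assumptions, not something taken from Flink itself.

```python
import json

def available_metric_ids(response_body: str) -> set:
    """Return the set of metric ids found in a metrics listing response."""
    return {entry["id"] for entry in json.loads(response_body)}

# Shortened sample of the listing returned for the TaskManager scope.
sample = '[{"id":"Status.JVM.CPU.Load"},{"id":"Status.JVM.Memory.Heap.Used"}]'

ids = available_metric_ids(sample)
# Only Status.* entries are present, so the task-level metric is missing.
print("numBytesInRemotePerSecond" in ids)  # False
```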


How do I enable it? Maybe in the flink-conf?

Thanks.

Benjamin
On 2 Apr 2019, 10:37 +0200, Yun Tang <[hidden email]> wrote:

Re: Metrics for received records per TaskManager

Yun Tang
Hi Benjamin

Try this:
http://localhost:8081/jobs/{job-id}/vertices/{vertices-id}/subtasks/{subtask-index}/metrics?get=numBytesInLocalPerSecond

You can GET http://localhost:8081/jobs/ to list the running jobs, and similarly GET http://localhost:8081/jobs/{job-id}/vertices/ to list all vertices.

However, AFAIK, if you query through the REST API you cannot directly get the received records per TaskManager; you have to gather these metrics per task.
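
To gather the metric per task as described, one URL per subtask has to be queried. The sketch below, in Python, only illustrates the idea: the helper names, the placeholder ids, and the assumed response shape (a JSON list of objects with "id" and "value" fields) are assumptions for the example and should be checked against the Flink version in use.

```python
import json

BASE_URL = "http://localhost:8081"  # host and port as used in this thread

def subtask_metric_urls(job_id, vertex_id, parallelism, metric):
    """Build one metrics URL per subtask index, following the pattern above."""
    return [
        f"{BASE_URL}/jobs/{job_id}/vertices/{vertex_id}"
        f"/subtasks/{index}/metrics?get={metric}"
        for index in range(parallelism)
    ]

def parse_metric_value(response_body, metric):
    """Extract one metric value from a response body.

    Assumes the body is a JSON list of {"id": ..., "value": ...} objects.
    Returns None if the metric is not present.
    """
    for entry in json.loads(response_body):
        if entry.get("id") == metric:
            return float(entry["value"])
    return None

# JOB_ID and VERTEX_ID are placeholders, not real identifiers.
urls = subtask_metric_urls("JOB_ID", "VERTEX_ID", 2, "numBytesInLocalPerSecond")
print(urls[0])

# Hypothetical response body for a single subtask.
body = '[{"id":"numBytesInLocalPerSecond","value":"12.5"}]'
print(parse_metric_value(body, "numBytesInLocalPerSecond"))  # 12.5
```

Each URL would be fetched with any HTTP client, and the parsed values then combined per task.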

Best
Yun Tang

From: Benjamin Burkhardt <[hidden email]>
Sent: Tuesday, April 2, 2019 21:56
To: [hidden email]; Yun Tang
Subject: Re: Metrics for received records per TaskManager
 

Re: Metrics for received records per TaskManager

Benjamin Burkhardt-2
Hi Yun,

Thank you for the advice, but how would you suggest getting the metrics for each TaskManager as well?
I don't strictly need to use REST, because I’m running my code within Flink. Maybe there is another way to access them?

Thanks a lot.

Benjamin
On 2 Apr 2019, 18:26 +0200, Yun Tang <[hidden email]> wrote:

Re: Metrics for received records per TaskManager

Yun Tang
Hi Benjamin

Flink can report its metrics to external systems such as Prometheus and Graphite [1]. You can then use a web front end such as Grafana to query those systems. Take the `numBytesInLocalPerSecond` metric, for example: it carries many tags, one of which is `tm_id` (the TaskManager ID). If you group this metric by `tm_id` and select a specific TaskManager, you can see the bytes received from local sources at that TaskManager.
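
The grouping itself is simple once the samples carry a `tm_id` tag. The Python sketch below illustrates it on hand-written data; the sample layout (a dict with "name", "tags" and "value") and the ids "tm-a" and "tm-b" are assumptions made for the example and do not correspond to any particular reporter's format.

```python
from collections import defaultdict

def group_by_tm_id(samples, metric_name):
    """Sum the values of one metric per `tm_id` tag."""
    totals = defaultdict(float)
    for sample in samples:
        if sample["name"] == metric_name:
            totals[sample["tags"]["tm_id"]] += sample["value"]
    return dict(totals)

# Hypothetical samples: one per subtask, each tagged with its TaskManager.
samples = [
    {"name": "numBytesInLocalPerSecond", "tags": {"tm_id": "tm-a"}, "value": 100.0},
    {"name": "numBytesInLocalPerSecond", "tags": {"tm_id": "tm-a"}, "value": 50.0},
    {"name": "numBytesInLocalPerSecond", "tags": {"tm_id": "tm-b"}, "value": 25.0},
    {"name": "numBytesInRemotePerSecond", "tags": {"tm_id": "tm-b"}, "value": 70.0},
]

per_tm = group_by_tm_id(samples, "numBytesInLocalPerSecond")
print(per_tm["tm-a"])  # 150.0
```

In a monitoring system the same aggregation would normally be expressed in that system's query language rather than in application code.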


Best
Yun Tang

From: Benjamin Burkhardt <[hidden email]>
Sent: Wednesday, April 3, 2019 0:21
To: [hidden email]; Yun Tang
Subject: Re: Metrics for received records per TaskManager
 