TimerService Troubleshooting/Metrics

Sayat Satybaldiyev-2
Dear Flink Community,

Is there a way to troubleshoot the timer service? The docs say that the service might degrade job performance significantly. Is there a way to expose and view timer service metrics, e.g. the length of the priority queue, how many times the service fires, etc.?

Re: TimerService Troubleshooting/Metrics

Andrey Zagrebin
Hi Sayat,

As far as I know, there are no timer service metrics exposed at the moment.
I am pulling Stefan into the thread; maybe he can add more.

If you use the RocksDB state backend, you can try enabling the RocksDB internal (native) metrics [1].
With this backend, the timer service queues its timers in RocksDB, in a dedicated column family.
The metrics of that column family might help, for example 'state.backend.rocksdb.metrics.estimate-num-keys'.
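
As a rough sketch of what that could look like in flink-conf.yaml (assuming the state backend is configured there and not in the job code, and that your Flink version ships these option names; please check them against [1]):

```yaml
# flink-conf.yaml (sketch only; verify option names for your Flink version)

# Assumption: the state backend is selected here and not in the job code.
state.backend: rocksdb

# Keep timers in RocksDB instead of on the JVM heap, so that they appear
# in RocksDB's own statistics. Whether this line is needed depends on the
# default of your Flink version.
state.backend.rocksdb.timer-service.factory: ROCKSDB

# Expose RocksDB's estimated number of keys per column family as a metric.
state.backend.rocksdb.metrics.estimate-num-keys: true
```

The reported value is an estimate from RocksDB, not an exact timer count, so it is probably more useful for watching the trend over time. Native metrics can also add some overhead, so it may be worth enabling only the ones you need.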

Best,
Andrey

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#rocksdb-native-metrics

> On 14 Dec 2018, at 16:01, sayat <[hidden email]> wrote:
>
> Dear Flink Community,
>
> Is there a way to troubleshoot the timer service? The docs say that the service might degrade job performance significantly. Is there a way to expose and view timer service metrics, e.g. the length of the priority queue, how many times the service fires, etc.?