Retain metrics counters across task restarts

Peter Zende
Hi all

We're exposing metrics from our Flink (v1.7.1) pipeline to Prometheus, e.g. the total number of processed records. This works fine until any of the tasks is restarted within the YARN application. Then the counter is reset and starts incrementing from 0 again.
How can we retain such a counter through the entire lifetime of the YARN application, similarly to Hadoop counters?

Thanks
Peter

Re: Retain metrics counters across task restarts

Zhijiang(wangzhijiang999)
Hi Peter,

The lifecycle of these metrics is coupled with the lifecycle of the task, so the metrics are re-initialized after the task is restarted. One possible option is to store the required metric values in state; they would then be restored from the state backend after the task is restarted.
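
A minimal sketch of this idea, written without any Flink dependency so it can be run on its own. The class `RestorableCounter` and its `demo()` method are hypothetical. The method names `snapshotState` and `initializeState` mirror the two callbacks of Flink's `CheckpointedFunction`, but the signatures are simplified stand-ins: in a real job the value would presumably be kept in operator state and the restored value used to seed the newly registered metric.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Dependency-free model of a counter whose value survives a task restart.
 * Assumption: this only imitates the snapshot/restore contract; it is not
 * Flink's API and does not talk to a real state backend.
 */
final class RestorableCounter {
    // Plays the role of the metric that is created anew on every task start.
    private final AtomicLong count = new AtomicLong();

    /** Called once per processed record. */
    void inc() {
        count.incrementAndGet();
    }

    /** The value a metrics reporter would read. */
    long getCount() {
        return count.get();
    }

    /** Checkpoint time: hand the current value to the (simulated) backend. */
    long snapshotState() {
        return count.get();
    }

    /** Task (re)start: seed the fresh counter; null means "no checkpoint yet". */
    void initializeState(Long restored) {
        count.set(restored == null ? 0L : restored);
    }

    /** Simulates: 3 records, checkpoint, 1 more record, restart, 1 record. */
    static long demo() {
        RestorableCounter first = new RestorableCounter();
        first.initializeState(null);       // first start, nothing to restore
        first.inc();
        first.inc();
        first.inc();
        long checkpoint = first.snapshotState();

        first.inc();                       // counted after the last checkpoint

        // The task fails here; the new instance starts from the checkpoint.
        RestorableCounter restarted = new RestorableCounter();
        restarted.initializeState(checkpoint);
        restarted.inc();
        return restarted.getCount();
    }
}
```

In this model the restarted counter continues from the last checkpointed value rather than from 0. Increments made after the last checkpoint are not part of the restored value, so the reported total can step back to the checkpointed count after a restart.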

Best,
Zhijiang
------------------------------------------------------------------
From: Peter Zende <[hidden email]>
Send Time: Sunday, 14 April 2019, 00:25
To: user <[hidden email]>
Subject: Retain metrics counters across task restarts

Hi all

We're exposing metrics from our Flink (v1.7.1) pipeline to Prometheus, e.g. the total number of processed records. This works fine until any of the tasks is restarted within the YARN application. Then the counter is reset and starts incrementing from 0 again.
How can we retain such a counter through the entire lifetime of the YARN application, similarly to Hadoop counters?

Thanks
Peter

Re: Retain metrics counters across task restarts

Peter Zende
Hi Zhijiang
Thanks for the clarification. We were thinking about the very same solution, so we'll go in this direction.

Best
Peter

zhijiang <[hidden email]> wrote (on Mon, 15 Apr 2019, 4:28):
Hi Peter,

The lifecycle of these metrics is coupled with the lifecycle of the task, so the metrics are re-initialized after the task is restarted. One possible option is to store the required metric values in state; they would then be restored from the state backend after the task is restarted.

Best,
Zhijiang