Hi, I want to monitor Flink streaming jobs using Prometheus. My first goal is to send alerts when a Flink job has failed. Looking at the documentation, I haven't found a metric that helps me define an alerting rule. As a starting point I thought the metric flink_jobmanager_job_downtime could help, since the docs say it emits -1 for a completed job. But when I tested it, I found that this doesn't work: the metric always emits 0, and after the job is completed the metric is no longer reported at all. Has anyone managed to alert on a failed Flink job with Prometheus? Thanks for your help.
You could use “flink_jobmanager_numRunningJobs” to check the number of running jobs. Thanks
The thing about the numRunningJobs metric is that I have to configure the Prometheus rules in advance with the number of jobs I expect to be running in order to alert, and I need the rule to alert on individual jobs. I initially thought of flink_jobmanager_job_downtime{job_id=~".*"} == -1, but it turned out that the metric just emits 0 for running jobs and doesn't emit -1 for failed jobs. On Mon, Dec 16, 2019 at 7:01 PM, PoolakkalMukkath, Shakir <[hidden email]> wrote:
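Since the downtime series is simply no longer reported once a job is gone, one per-job approach is to compare which job IDs report the metric between two scrapes. Below is a minimal Python sketch of that idea, assuming the job IDs from each scrape have already been collected into plain sets; the function name and the job IDs are made up for illustration and are not part of Flink or Prometheus.

```python
def vanished_jobs(previous_ids, current_ids):
    """Return the job IDs seen in the previous scrape but missing from the current one."""
    return sorted(set(previous_ids) - set(current_ids))


# Hypothetical job IDs taken from the job_id label of two consecutive scrapes.
previous = {"job-a", "job-b", "job-c"}
current = {"job-a", "job-c"}

print(vanished_jobs(previous, current))
```

Note that a job that finished normally disappears in the same way, so this check alone cannot tell a failed job from a completed one.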
Hi Jesús, If your job has checkpointing enabled, you can monitor 'numberOfCompletedCheckpoints' to see whether the job is still alive and healthy. Thanks, Zhu Zhu On Tue, Dec 17, 2019 at 2:43 AM, Jesús Vásquez <[hidden email]> wrote:
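The checkpoint suggestion can be sketched per job as a check that the completed-checkpoint counter keeps increasing over some window. The Python below is only an illustration under assumptions: the counter samples are assumed to be already fetched and grouped by job ID (oldest first), and the function name, job IDs, and values are hypothetical.

```python
def stalled_jobs(samples_by_job):
    """Return job IDs whose completed-checkpoint counter did not increase over the window.

    samples_by_job maps a job ID to its counter samples, ordered oldest to newest.
    """
    stalled = []
    for job_id, values in samples_by_job.items():
        if len(values) < 2:
            continue  # not enough samples to judge this job
        if values[-1] <= values[0]:
            stalled.append(job_id)
    return sorted(stalled)


# Hypothetical samples of numberOfCompletedCheckpoints for three jobs.
samples = {
    "job-a": [10, 11, 12],  # still checkpointing
    "job-b": [7, 7, 7],     # no new checkpoint in the window
    "job-c": [3],           # only one sample, skipped
}

print(stalled_jobs(samples))
```

The window would need to be comfortably longer than the job's checkpoint interval, otherwise a healthy job between two checkpoints would be flagged as stalled.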