Hi all,
we are using Flink 1.5.2 in batch mode with Prometheus monitoring. We noticed that a few metrics do not get unregistered after a job has finished:

    flink_taskmanager_job_task_operator_numRecordsIn
    flink_taskmanager_job_task_operator_numRecordsInPerSecond
    flink_taskmanager_job_task_operator_numRecordsOut
    flink_taskmanager_job_task_operator_numRecordsOutPerSecond

Those metrics stay in the taskmanager metrics list until the task manager is restarted.

Our metrics config is:

    metrics.reporters: prom
    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    metrics.reporter.prom.port: 7000-7001
    metrics.scope.jm: flink.<host>.jobmanager
    metrics.scope.tm: flink.<host>.taskmanager.<tm_id>
    metrics.scope.jm.job: flink.<host>.jobmanager.<job_name>
    metrics.scope.tm.job: flink.<host>.taskmanager.<tm_id>.<job_name>
    metrics.scope.task: flink.<host>.taskmanager.<tm_id>.<job_name>.<task_name>.<subtask_index>
    metrics.scope.operator: flink.<host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>

Since we run many batch jobs, this makes Prometheus monitoring unusable for us. Is this a known issue?

Best,
Helmut
Hi Helmut,

Are the metrics of all subtask instances of a job left registered, or only some of them? Are there any exceptions in the logs? Please feel free to create a JIRA issue and describe the problem clearly.

Thanks,
vino.

On Fri, Aug 17, 2018 at 11:14 PM, Helmut Zechmann <[hidden email]> wrote:
> Hi all,
Hi Vino,
The log shows no problems. The problem can be reproduced easily. I created https://issues.apache.org/jira/browse/FLINK-10300.

Best,
Helmut
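
For anyone who wants to check for leftover series on their own task manager, here is a minimal sketch in Python. It counts how many sample lines the reporter still exposes for each of the four metrics named above. The host (localhost) and the /metrics path are assumptions; port 7000 is the first port of the range in the config above. The helper names count_series and scrape are made up for this illustration and are not part of Flink.

```python
import urllib.request

# The four metric names reported in this thread.
WATCHED = (
    "flink_taskmanager_job_task_operator_numRecordsIn",
    "flink_taskmanager_job_task_operator_numRecordsInPerSecond",
    "flink_taskmanager_job_task_operator_numRecordsOut",
    "flink_taskmanager_job_task_operator_numRecordsOutPerSecond",
)


def count_series(exposition_text, names=WATCHED):
    """Count sample lines per watched metric in Prometheus text format."""
    counts = {name: 0 for name in names}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or space (value).
        metric = line.split("{", 1)[0].split(" ", 1)[0]
        if metric in counts:
            counts[metric] += 1
    return counts


def scrape(url="http://localhost:7000/metrics"):
    """Fetch the reporter output. The URL is an assumption, adjust as needed."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

Calling count_series(scrape()) once before and once after a batch job has finished should show whether the counts drop back. If they keep growing with every finished job, the series are not being unregistered.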