Hi,
I'm trying to add some custom metrics to a Flink job, but I've run into an issue with the PrometheusReporter. When multiple instances of the same job run on the same TaskManager, the second instance fails with the following error when it tries to register a metric with the same name:
2018-06-13 11:17:42,512 ERROR org.apache.flink.runtime.metrics.MetricRegistry - Error while registering metric.
java.lang.IllegalArgumentException: Collector already registered that provides name: flink_taskmanager_job_task_operator_myMetric
This prevents the metric from being registered for the second instance. I can work around it by putting the task_attempt_id or some other UUID in the metric name to avoid the collision, but that adds clutter and leaves orphaned metrics behind whenever the job restarts. Has anyone else run into this? Is there a better approach for handling it?
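To make the collision and the workaround concrete, here is a minimal sketch of what I believe is going on. It is a toy model only, not Flink or Prometheus client code: ToyRegistry and uniqueName are hypothetical names invented for illustration, and the only assumption is that the registry keys collectors by metric name and rejects duplicates, as the error message suggests.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class MetricNameCollision {

    // Hypothetical stand-in for a registry that rejects duplicate metric names.
    // This is NOT the real Prometheus or Flink registry class.
    static final class ToyRegistry {
        private final Set<String> names = new HashSet<>();

        void register(String name) {
            // Set.add returns false when the name is already present.
            if (!names.add(name)) {
                throw new IllegalArgumentException(
                        "Collector already registered that provides name: " + name);
            }
        }

        int size() {
            return names.size();
        }
    }

    // Workaround from the post: append a per-attempt id to the metric name.
    // Hyphens are replaced because Prometheus metric names do not allow them.
    static String uniqueName(String base, String attemptId) {
        return base + "_" + attemptId.replace('-', '_');
    }

    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        String base = "flink_taskmanager_job_task_operator_myMetric";

        // The first job instance registers the metric without trouble.
        registry.register(base);

        // The second instance uses the same name and is rejected.
        try {
            registry.register(base);
        } catch (IllegalArgumentException e) {
            System.out.println("second registration failed: " + e.getMessage());
        }

        // With a unique suffix each instance registers, but every restart
        // produces a new name, which is where the orphaned metrics come from.
        registry.register(uniqueName(base, UUID.randomUUID().toString()));
        registry.register(uniqueName(base, UUID.randomUUID().toString()));
        System.out.println("registered names: " + registry.size());
    }
}
```

The suffix avoids the exception, but nothing in this sketch ever removes the old names, which mirrors the clutter problem I described above.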
Thanks,
Russell