Re: Using Prometheus Client Metrics in Flink

Posted by Rion Williams on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Using-Prometheus-Client-Metrics-in-Flink-tp41768p41783.html

Hi Prasanna,

Thanks for that. It’s what I was doing previously as a workaround; however, I was curious whether there was any Flink-specific functionality to handle this before the metrics reach Prometheus.

Additionally, from the docs on metrics [0], it seems there’s a pattern in place for using supported third-party metrics, such as those from Codahale/Dropwizard, via a Maven package (flink-metrics-dropwizard). I do see a similarly named package for Prometheus (flink-metrics-prometheus), which may be what I’m looking for, so I may give that a try.

Thanks,

Rion

[0]: https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html

On Feb 28, 2021, at 12:20 AM, Prasanna kumar <[hidden email]> wrote:


Rion,

Regarding the second question, you can aggregate by using the sum function: sum(metric_name{job_name="JOBNAME"}). This works if you are using a counter metric.
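
To make the arithmetic concrete, here is a minimal plain-Java sketch of what that sum computes. It is not Flink or Prometheus API: the class SumByLabel, the Sample type, and the sample values are all made up for illustration, and the label names (job_name, subtask_index) are only assumed to match what your reporter emits. It treats each per-task series as a (name, labels, value) sample and adds up the ones whose labels match the selector.

```java
import java.util.List;
import java.util.Map;

public class SumByLabel {
    /** Hypothetical model of one scraped sample: metric name, labels, current value. */
    static final class Sample {
        final String name;
        final Map<String, String> labels;
        final double value;

        Sample(String name, Map<String, String> labels, double value) {
            this.name = name;
            this.labels = labels;
            this.value = value;
        }
    }

    /**
     * Adds up every sample with the given metric name whose labels contain
     * all key/value pairs of the selector (extra labels are ignored).
     */
    static double sum(List<Sample> samples, String name, Map<String, String> selector) {
        double total = 0;
        for (Sample s : samples) {
            boolean matches = s.name.equals(name)
                    && s.labels.entrySet().containsAll(selector.entrySet());
            if (matches) {
                total += s.value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two parallel tasks of the same job, plus one series from another job.
        List<Sample> samples = List.of(
                new Sample("total_messages", Map.of("job_name", "JOBNAME", "subtask_index", "0"), 120),
                new Sample("total_messages", Map.of("job_name", "JOBNAME", "subtask_index", "1"), 80),
                new Sample("total_messages", Map.of("job_name", "OTHER", "subtask_index", "0"), 50));

        // Only the two JOBNAME series are added together.
        System.out.println(sum(samples, "total_messages", Map.of("job_name", "JOBNAME")));
    }
}
```

The per-task series stay separate in storage; the sum only collapses them at query time, which is why it suits counters that each task increments independently.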

Prasanna.

On Sat, Feb 27, 2021 at 9:01 PM Rion Williams <[hidden email]> wrote:
Hi folks,

I’ve just recently started working with Flink and I was in the process of adding some metrics to my existing pipeline in the hope of building some Grafana dashboards with them to help with observability.

Initially I looked at the built-in Flink metrics that were available, but I didn’t see an easy mechanism for setting/using labels with them. Essentially, I have two properties for my messages coming through the pipeline that I’d like to be able to keep track of (tenant/source) across several metrics (e.g. total_messages with tenant / source labels, etc.). I didn’t see an easy way to adjust this out of the box, or wasn’t aware of a good pattern for handling these.

I had previously used the Prometheus Client metrics [0] to accomplish this, but I wasn’t entirely sure how they would/could mesh with Flink. Does anyone have experience working with these, or know if they are supported?

Secondly, when using the Flink metrics, I noticed I was receiving a separate metric for each task that was being spun up. Is there an “easy button” to handle aggregating these to ensure that a single metric (e.g. total_messages) reflects the total processed across all of the tasks instead of each individual one?

Any recommendations / resources / advice would be greatly appreciated!

Thanks,

Rion