How to get Task metrics with StatsD metric reporter?

John Smith
Hi, I'm running Flink 1.8.

I declare my metric like this:

invalidList = getRuntimeContext()
      .getMetricGroup()
      .addGroup("MyMetrics")
      .meter("invalidList", new DropwizardMeterWrapper(new com.codahale.metrics.Meter()));

Then in my code I call:

invalidList.markEvent();

On the task nodes I enabled the Influx Telegraf StatsD server, and I configured the StatsD reporter on the task nodes with:

metrics.reporter.stsd.class: org.apache.flink.metrics.statsd.StatsDReporter
metrics.reporter.stsd.host: localhost
metrics.reporter.stsd.port: 8125

The metrics are being pushed to Elasticsearch. So far I only see the Status_JVM_* metrics.

Do the task-specific metrics come from the Job nodes? I have not enabled reporting on the Job nodes yet.








Re: How to get Task metrics with StatsD metric reporter?

John Smith
I think I figured it out. I used netcat to debug, and it looks like the Telegraf StatsD server doesn't support spaces in the stats names.
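
For anyone who wants to repeat that check without netcat, here is a minimal sketch of a UDP listener that prints whatever the reporter sends and flags names containing whitespace. The names `StatsdDump`, `receiveOne` and `nameHasWhitespace` are made up for this example. It assumes the reporter sends plain-text `name:value|type` lines over UDP and that nothing else is bound to the port, so stop Telegraf first or pass a different port as the first argument.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flink or Telegraf.
final class StatsdDump {

    /** Blocks until one UDP datagram arrives and returns its payload as text. */
    static String receiveOne(DatagramSocket socket) throws Exception {
        byte[] buf = new byte[8192];
        DatagramPacket packet = new DatagramPacket(buf, buf.length);
        socket.receive(packet);
        return new String(packet.getData(), packet.getOffset(),
                packet.getLength(), StandardCharsets.UTF_8);
    }

    /** True if the metric name (the part before the first ':') contains whitespace. */
    static boolean nameHasWhitespace(String line) {
        int colon = line.indexOf(':');
        String name = colon >= 0 ? line.substring(0, colon) : line;
        return name.chars().anyMatch(Character::isWhitespace);
    }

    public static void main(String[] args) throws Exception {
        // 8125 matches the reporter port configured in the original post.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8125;
        try (DatagramSocket socket = new DatagramSocket(port)) {
            while (true) {
                // A datagram may carry several newline-separated lines.
                for (String line : receiveOne(socket).split("\n")) {
                    if (line.isEmpty()) {
                        continue;
                    }
                    String flag = nameHasWhitespace(line) ? "[SPACE] " : "        ";
                    System.out.println(flag + line);
                }
            }
        }
    }
}
```

Lines marked `[SPACE]` are the ones a strict StatsD server is likely to reject or misparse.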

(In reply to John Smith's message of Mon, 20 Jan 2020 at 12:19.)

Re: How to get Task metrics with StatsD metric reporter?

Chesnay Schepler
I presume your job/task names contain a space, which is included in the metric scope?

You can either configure the metric scope so that the job/task ID is used instead of the name, or create a modified version of the StatsDReporter that filters out additional characters (i.e., override #filterCharacters).

When it comes to automatically filtering characters, the StatsDReporter is in a bit of a pickle: different backends have different rules for which characters are allowed, and those rules also differ from StatsD's own.
I'm not sure yet what the best solution for this is.
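
On the first option, the scope formats are set in flink-conf.yaml. The fragment below is a sketch, assuming the default formats use `<job_name>`, `<task_name>` and `<operator_name>` (as described in Flink's metric system documentation) and swapping in the corresponding ID variables. Metrics registered through getRuntimeContext().getMetricGroup() fall under the operator scope, so that line is likely the one that matters for the meter in the original post.

```
# flink-conf.yaml: use IDs instead of names in the task and operator scopes
metrics.scope.task: <host>.taskmanager.<tm_id>.<job_id>.<task_id>.<subtask_index>
metrics.scope.operator: <host>.taskmanager.<tm_id>.<job_id>.<operator_id>.<subtask_index>
```

As far as I know these keys apply to the whole cluster rather than to a single job, and the IDs are much harder to read on a dashboard than names, so this trades readability for names that contain no spaces.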

(In reply to John Smith's message of 21/01/2020 17:18.)

Re: How to get Task metrics with StatsD metric reporter?

John Smith
Hi,

1- Yes, I have spaces in the job and task names. How do you configure the metric scope for a particular job?
2- I opted for the second solution: I forked my own StatsD reporter and squashed all spaces. Here: https://github.com/javadevmtl/flink/blob/statsd-spaces/flink-metrics/flink-metrics-statsd/src/main/java/org/apache/flink/metrics/statsd/StatsDReporter.java
3- Maybe filterCharacters, or an additional function, could take a configurable RegEx and remove any characters that match the pattern?
4- Another idea I explored but didn't get around to implementing is a configurable drop-event-by-RegEx or keep-event-by-RegEx filter. It's not tied to the above, but it would be a good option to have as a feature.
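
Points 3 and 4 can be sketched together without touching any Flink classes. This is only an illustration of the idea, not what the fork linked above does: `MetricNameRules`, its constructor arguments and the example patterns are all hypothetical, and a real reporter would have to read them from its own configuration and call them from wherever it builds and sends metric names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: regex-driven name cleanup plus keep/drop filtering.
final class MetricNameRules {
    private final Pattern illegal;     // characters to replace in metric names
    private final String replacement;  // literal text substituted for each match
    private final Pattern keep;        // if non-null, only names matching this are reported
    private final Pattern drop;        // if non-null, names matching this are suppressed

    MetricNameRules(String illegalRegex, String replacement, String keepRegex, String dropRegex) {
        this.illegal = Pattern.compile(illegalRegex);
        // quoteReplacement keeps '$' and '\' in the replacement from being treated as group references
        this.replacement = Matcher.quoteReplacement(replacement);
        this.keep = keepRegex == null ? null : Pattern.compile(keepRegex);
        this.drop = dropRegex == null ? null : Pattern.compile(dropRegex);
    }

    /** Replaces every match of the "illegal" pattern with the replacement text. */
    String sanitize(String name) {
        return illegal.matcher(name).replaceAll(replacement);
    }

    /** Drop rules win over keep rules; with no keep rule, everything not dropped is reported. */
    boolean shouldReport(String name) {
        if (drop != null && drop.matcher(name).find()) {
            return false;
        }
        return keep == null || keep.matcher(name).find();
    }

    public static void main(String[] args) {
        // Replace runs of whitespace, ':', '|' and '@' with '_'; drop JVM status metrics.
        MetricNameRules rules = new MetricNameRules("[\\s:|@]+", "_", null, "\\.Status\\.JVM\\.");
        String raw = "host.taskmanager.tm1.My Job.My Task.0.MyMetrics.invalidList";
        if (rules.shouldReport(raw)) {
            // prints host.taskmanager.tm1.My_Job.My_Task.0.MyMetrics.invalidList
            System.out.println(rules.sanitize(raw));
        }
    }
}
```

Keeping the pattern configurable sidesteps the problem that different backends allow different characters: each deployment can supply the pattern its own backend needs instead of the reporter hard-coding one rule set.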




(In reply to Chesnay Schepler's message of Wed, 22 Jan 2020 at 03:55.)