How to reduce number of metrics pushed to Prometheus Push Gateway

Alexander Filipchik
Hi,
 
Is there a way to reduce the cardinality of (i.e. pre-aggregate) the metrics that are emitted to the Prometheus Push Gateway?

Our metrics infrastructure is struggling to digest per-task stats. Is there any way to configure Flink to emit per-stage aggregates instead?

Our current config:
metrics.scope.tm: flink.taskmanager
metrics.scope.operator: flink.operator
metrics.scope.jm: flink.jobmanager
metrics.scope.jm.job: flink.jobmanager.job
metrics.scope.task: flink.task

But the metrics still look like this:
{ job_id="b12e2",
job_name="kafka_",
subtask_index="2",
task_attempt_num="14",
task_id="00f9d",
task_name="Source:_Read_from_kafka",
tm_id="17e3c"
}

Am I changing the right config?

Alex

Re: How to reduce number of metrics pushed to Prometheus Push Gateway

Chesnay Schepler
There is no way to reduce the number of metrics.

The only thing you can do is exclude specific variables (e.g., task_name), like this:
metrics.reporter.<reporter_name>.scope.variables.excludes: task_name[;<any_other_variables_to_exclude>]
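
A concrete version of that line might look like the sketch below. The reporter name `promgateway` is an assumption (use whatever name is listed in your `metrics.reporters` setting), and the excluded variables are simply the two name labels from the example in the original question. As far as I can tell from the Flink documentation, the option is spelled `scope.variables.excludes` and takes a semicolon-separated list.

```yaml
# flink-conf.yaml (sketch; "promgateway" is a hypothetical reporter name)
# Drop the human-readable names; the IDs remain and still keep series distinct.
metrics.reporter.promgateway.scope.variables.excludes: job_name;task_name
```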

On 12/9/2020 7:15 PM, Alexander Filipchik wrote:
[...]



Re: How to reduce number of metrics pushed to Prometheus Push Gateway

Alexander Filipchik
Thank you for replying!

Will exclusion produce proper aggregates? If I drop subtask_index, will the resulting metric be the sum over all subtasks, or will it just be the value from whichever subtask reported last?

Alex

On Thu, Dec 10, 2020 at 4:28 AM Chesnay Schepler wrote:
[...]



Re: How to reduce number of metrics pushed to Prometheus Push Gateway

Chesnay Schepler
It will not produce aggregates. But it may reduce the load a bit without affecting correctness: some variables, such as the job and task names, are not required to prevent metrics from overriding each other, because the IDs are sufficient.
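
To illustrate the difference between overriding and aggregating, here is a small Python sketch. It is only a toy model, not Flink's or Prometheus's actual code: the sample values are made up, the helper names are hypothetical, and "the last write wins" is an assumption about which of the colliding values survives.

```python
from collections import defaultdict

# Hypothetical per-subtask samples for one task: (labels, value). Values are made up.
samples = [
    ({"task_id": "00f9d", "subtask_index": "0"}, 10),
    ({"task_id": "00f9d", "subtask_index": "1"}, 20),
    ({"task_id": "00f9d", "subtask_index": "2"}, 30),
]


def series_key(labels, excluded):
    """Return the label set without `excluded`, as a hashable series key."""
    return tuple(sorted((k, v) for k, v in labels.items() if k != excluded))


def overriding_store(samples, excluded):
    """Model a store keyed by label set: series with identical labels collide."""
    store = {}
    for labels, value in samples:
        # Assumption: the most recent write replaces the previous value.
        store[series_key(labels, excluded)] = value
    return store


def summed_store(samples, excluded):
    """What an explicit aggregation step would compute instead."""
    totals = defaultdict(int)
    for labels, value in samples:
        totals[series_key(labels, excluded)] += value
    return dict(totals)


key = (("task_id", "00f9d"),)
overridden = overriding_store(samples, "subtask_index")
aggregated = summed_store(samples, "subtask_index")

print(overridden[key])  # 30: only one subtask's value is left
print(aggregated[key])  # 60: the sum over all three subtasks
```

In this model, excluding subtask_index collapses three series into one that holds a single subtask's value, which is why a variable that distinguishes otherwise identical metrics should stay. If per-task sums are needed, they would have to be computed downstream, for example with a PromQL query of the form `sum without (subtask_index) (...)`.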

On 12/10/2020 6:37 PM, Alexander Filipchik wrote:
[...]