Hi team,
I have two queries:

Query 1: I am using PrometheusReporter to expose metrics to the Prometheus server. What is the minimum recommended scrape interval to configure on the Prometheus server? Is there an interval at which Flink reports metrics?

Query 2: Is there a way to fetch the metrics of all vertices (including subtasks) of a job through a single call to Flink's monitoring REST API? So far I have been finding the vertices first and then querying each vertex for its metrics:

Step 1: Find the job ID (http://<IP>:<Port>/jobs)
Step 2: Find the vertex IDs (http://<IP>:<Port>/jobs/<jobId>)
Step 3: Fetch the aggregated metrics (including parallelism) of a vertex (http://<IP>:<Port>/jobs/<jobId>/vertices/<vertexId>/subtasks/metrics?get=<metric1>,<metric2>)

This means I have to invoke a separate REST call for each vertex ID. Is there a more efficient way to get the metrics of all vertices?

Thanks & Regards,
Ashutosh
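For readers following along, the three-step walk described in this question can be scripted so that the per-vertex calls at least happen in one place. This is only a rough sketch, not an official client: the function names are made up for illustration, and the response shapes (a "jobs" list and a "vertices" list, each entry carrying an "id" field) are assumptions about what the endpoints return, so check them against your Flink version.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode its JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def collect_vertex_metrics(base_url, metric_names, fetch=fetch_json):
    """Return {job_id: {vertex_id: metrics}} using one REST call per vertex.

    base_url is the http://<IP>:<Port> prefix; fetch can be swapped out for testing.
    """
    query = ",".join(metric_names)
    result = {}
    # Step 1: list the jobs (assumed shape: {"jobs": [{"id": ...}, ...]}).
    for job in fetch(f"{base_url}/jobs")["jobs"]:
        job_id = job["id"]
        result[job_id] = {}
        # Step 2: list the vertices of the job (assumed shape: {"vertices": [{"id": ...}, ...]}).
        for vertex in fetch(f"{base_url}/jobs/{job_id}")["vertices"]:
            vertex_id = vertex["id"]
            # Step 3: fetch the aggregated subtask metrics of this vertex.
            url = (f"{base_url}/jobs/{job_id}/vertices/{vertex_id}"
                   f"/subtasks/metrics?get={query}")
            result[job_id][vertex_id] = fetch(url)
    return result
```

Calling collect_vertex_metrics("http://<IP>:<Port>", ["<metric1>", "<metric2>"]) still issues one request per vertex; it only hides the loop, it does not reduce the number of calls.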
Hi Ashutosh,

You can set the metrics update interval through metrics.fetcher.update-interval [1]. Unfortunately, there is no single endpoint that collects all the metrics more efficiently than the metrics endpoints listed in [2]. I hope that helps.

Best,
Matthias

On Wed, May 26, 2021 at 2:01 PM Ashutosh Uttam <[hidden email]> wrote:
Thanks Matthias.

We are using Prometheus to fetch the metrics. Is there a recommended scrape interval? Also, is there any impact if lower scrape intervals are used?

Regards,
Ashutosh

On Fri, May 28, 2021 at 7:17 PM Matthias Pohl <[hidden email]> wrote:
There is no recommended scrape interval because it largely depends on your requirements. For example, if you're fine with reacting to problems within an hour, then a 5s scrape interval doesn't make sense.

The lower the interval, the more resources must be spent on serving the Prometheus requests; you will need to experiment to see whether this incurs an unacceptable performance impact.
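To make that trade-off concrete, here is a back-of-the-envelope sketch. It assumes you have already measured how long one scrape of the reporter endpoint takes; the 0.2 s figure and the function name below are hypothetical, not measurements from a real Flink setup.

```python
def scrape_budget(scrape_seconds, intervals_seconds):
    """Map each candidate interval to (scrapes per hour, share of the interval spent serving one scrape)."""
    rows = {}
    for interval in intervals_seconds:
        rows[interval] = (3600 / interval, scrape_seconds / interval)
    return rows


# Hypothetical measurement: serving one scrape takes 0.2 s.
for interval, (per_hour, share) in scrape_budget(0.2, [5, 15, 60]).items():
    print(f"{interval:>3}s interval: {per_hour:.0f} scrapes/hour, "
          f"{share:.1%} of each interval spent serving the scrape")
```

The numbers only bound the cost of serving the request; whether that cost matters still has to be checked against the job's actual throughput and latency.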
On 6/8/2021 3:07 PM, Ashutosh Uttam wrote: