(DEPRECATED) Apache Flink User Mailing List archive.

EOF on scraping flink metrics

Vishal Santoshi

EOF on scraping flink metrics

A simple query: does a request to /metrics just read from an in-memory registry of collected stats, or does it contend with access from the JM or perform some expensive access or computation? Our Prometheus scrape occasionally fails with the error pasted below. We have had the scraper do much more elaborate scrapes on other systems we maintain, so I was curious. The server did not have any logs related to the exception. The scraper is a ServiceMonitor from k8s, and of course these TMs are hosted on k8s as well.

Get http://10.246.254.84:9610/metrics: EOF
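
To tell whether the intermittent failure also occurs outside Prometheus, the endpoint can be polled directly. The following is a minimal sketch, not part of the original thread: the helper name `probe` and its parameters are hypothetical, and it uses only the Python standard library. It fetches the URL several times and records whether each attempt succeeded, how long it took, and either the body size or the exception.

```python
import time
import urllib.request


def probe(url, attempts=5, timeout=5.0, pause=0.0):
    """Fetch `url` repeatedly and record the outcome of every attempt.

    Returns a list of (ok, seconds, detail) tuples, where detail is the
    number of bytes read on success or the exception text on failure.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                body = resp.read()
            results.append((True, time.monotonic() - start, len(body)))
        except Exception as exc:  # connection resets, timeouts, HTTP errors
            results.append((False, time.monotonic() - start, repr(exc)))
        time.sleep(pause)
    return results


# Example call (address taken from the error message in this thread):
# for ok, seconds, detail in probe("http://10.246.254.84:9610/metrics"):
#     print("ok" if ok else "FAILED", f"{seconds:.3f}s", detail)
```

If direct requests also fail now and then, the scraper is unlikely to be the cause. If they never fail, the path between the scraper and the pod is a more likely place to look.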

Chesnay Schepler

Re: EOF on scraping flink metrics

Since you're using Prometheus, I would recommend setting up a PrometheusReporter as described in the metrics documentation and scraping each JM/TM individually. Scraping through the REST API is more expensive and you lose out on a lot of features.
The REST API calls are primarily aimed at the WebUI.

Regardless, as of right now I doubt that this is a Flink issue, and would recommend heading to the Prometheus mailing lists.

Vishal Santoshi

Re: EOF on scraping flink metrics

Thank you,

This is following https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#prometheus-orgapacheflinkmetricsprometheusprometheusreporter .

What might I be doing wrong?

metrics.reporters: prom
metrics.reporter.prom.port: 9610
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
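
One thing that can be checked mechanically in a reporter configuration like this is the port value. The Flink documentation describes the PrometheusReporter `port` option as a single port or a range such as 9250-9260. The sketch below is an illustration under that assumption. `parse_port_spec` is a hypothetical helper, not a Flink API, and it deliberately rejects anything beyond a single port or one simple range.

```python
def parse_port_spec(value):
    """Return the ports described by '9610' or '9250-9260'.

    Raises ValueError for any other form, including stray trailing
    characters after the number.
    """
    text = value.strip()
    low, sep, high = text.partition("-")
    if not low.isdigit() or (sep and not high.isdigit()):
        raise ValueError(f"not a port or port range: {value!r}")
    first = int(low)
    last = int(high) if sep else first
    if not (0 < first <= last <= 65535):
        raise ValueError(f"port out of range: {value!r}")
    return list(range(first, last + 1))
```

A range matters when several reporter instances share a host, for example a TaskManager colocated with the JobManager, since each instance needs its own port.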

The scrape is configured through a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor

endpoints:
  - port: metrics  # named port exposed in the k8s service; it is 9610
    scheme: http
    path: /metrics
    interval: 60s
    scrapeTimeout: 59s 
selector:....
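
The interval and scrapeTimeout values above are close together (60s and 59s). Prometheus requires the scrape timeout to be no larger than the scrape interval, so a quick consistency check can be sketched as follows. The helpers `to_millis` and `timeout_fits` are hypothetical, and only single-unit durations such as `59s` or `1m` are handled here, which is a simplification of the Prometheus duration format.

```python
import re

# Milliseconds per unit; integers keep the arithmetic exact.
_UNIT_MILLIS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}


def to_millis(duration):
    """Convert a single-unit duration such as '59s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", duration.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(match.group(1)) * _UNIT_MILLIS[match.group(2)]


def timeout_fits(interval, scrape_timeout):
    """True when the scrape timeout does not exceed the scrape interval."""
    return to_millis(scrape_timeout) <= to_millis(interval)
```

With the values from the ServiceMonitor, `timeout_fits("60s", "59s")` holds, so the pairing itself is valid. An EOF is also a different symptom from a timeout: it usually indicates that the connection was closed before a complete response arrived.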

Regards.

On Fri, Mar 22, 2019 at 3:05 PM Chesnay Schepler <[hidden email]> wrote:

Since you're using Prometheus, I would recommend setting up a PrometheusReporter as described in the metrics documentation and scraping each JM/TM individually. Scraping through the REST API is more expensive and you lose out on a lot of features.
The REST API calls are primarily aimed at the WebUI.

Regardless, as of right now I doubt that this is a Flink issue, and would recommend heading to the Prometheus mailing lists.

On 22/03/2019 17:55, Vishal Santoshi wrote:
A simple query: does a request to /metrics just read from an in-memory registry of collected stats, or does it contend with access from the JM or perform some expensive access or computation? Our Prometheus scrape occasionally fails with the error pasted below. We have had the scraper do much more elaborate scrapes on other systems we maintain, so I was curious. The server did not have any logs related to the exception. The scraper is a ServiceMonitor from k8s, and of course these TMs are hosted on k8s as well.
Get http://10.246.254.84:9610/metrics: EOF