(DEPRECATED) Apache Flink User Mailing List archive.

Re: Flink Datadog Timeout

Posted by Chesnay Schepler on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-Datadog-Timeout-tp41205p41207.html

The reported exception looks quite similar to the one in this thread, which was supposedly caused by Datadog rate limits, although I don't think that was ever thoroughly investigated.

(Bear in mind that each container has its own reporter; with the default reporting interval of 10 seconds, you quickly reach a fairly high rate of reports per second.)
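
If rate limiting is the cause, one thing to try is a longer reporting interval, so that each container sends fewer requests. As a purely illustrative figure, 30 containers reporting every 10 seconds send 3 reports per second in aggregate; at 60 seconds that drops to 0.5. The lines below are a minimal sketch, assuming the reporter is named dghttp as in the configuration quoted further down. The value of 60 seconds is an arbitrary example, not a recommendation, and whether it removes the timeouts is untested.

# Assumption: reporter name "dghttp" matches the existing flink-conf.yaml entries.
# Report every 60 seconds instead of the default 10 seconds.
metrics.reporter.dghttp.interval: 60 SECONDS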

Alternatively, it could simply be a plain connectivity issue.

However, if the issues do not persist for long, no metrics should be lost, so you may be able to ignore them.

On 2/2/2021 7:31 PM, Claude M wrote:

Hello,

I have a Flink jobmanager and taskmanagers deployed in a Kubernetes cluster. I integrated it with Datadog by having the following specified in the flink-conf.yaml.

metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
metrics.reporter.dghttp.apikey: <DD_API_KEY>

However, I'm seeing random timeouts in the log, and I don't know why they occur or how to resolve them. Please see the attached file showing the error.

Thanks