(DEPRECATED) Apache Flink User Mailing List archive.

Seeing Rocks Native Metrics in Data Dog

Rex Fenley

Seeing Rocks Native Metrics in Data Dog

Hi,

I need a deeper look at how RocksDB is performing, so I turned on the RocksDB native metrics. However, I don't see any of them in DataDog (though I do have other Flink metrics there). I only see the RocksDB metrics under the operator metrics in the Flink UI, and unfortunately I can't zoom in or out on those or compare multiple operators at a time, which makes it really difficult to get an overview of how RocksDB is doing.

Is there any way to get the RocksDB native metrics forwarded to DataDog?

Thanks!

Rex Fenley | Software Engineer - Mobile and Backend

Remind.com | BLOG | FOLLOW US | LIKE US
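
For readers following along, here is a minimal sketch of the kind of setup described in the message above, assuming a Flink 1.11-style flink-conf.yaml. The reporter name `dghttp`, the API key placeholder, and the tag values are illustrative assumptions, and the four RocksDB properties are just a sample of the available options, not the poster's actual configuration.

```yaml
# Datadog HTTP reporter. "dghttp" is an arbitrary reporter name (assumption).
metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
# Placeholder, replace with a real API key.
metrics.reporter.dghttp.apikey: <your-datadog-api-key>
# Optional global tags, hypothetical values.
metrics.reporter.dghttp.tags: myflinkapp,prod

# RocksDB native metrics are disabled by default and are enabled one by one.
state.backend.rocksdb.metrics.estimate-num-keys: true
state.backend.rocksdb.metrics.estimate-live-data-size: true
state.backend.rocksdb.metrics.cur-size-all-mem-tables: true
state.backend.rocksdb.metrics.num-running-compactions: true
```

Native metrics are registered per operator and per state, so they are reported under the operator scope. Enabling many of them may add overhead, so it is probably worth starting with a small set.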

Chesnay Schepler

Re: Seeing Rocks Native Metrics in Data Dog

Any metric that is shown in the Flink UI should also appear in DataDog.

If this is not the case, then something is going wrong within the reporter.

Is there anything suspicious in the Flink logs?

Can you give some examples of metrics that do show up in DataDog?

(In reply to Rex Fenley's message of 1/26/2021 6:32 PM.)

Rex Fenley

Re: Seeing Rocks Native Metrics in Data Dog

All taskmanager and jobmanager metrics show up. Anything specific to an operator does not.

For example, flink.taskmanager.Status.JVM.Memory.Heap.Used shows up, but I can't see stats on an individual operator.

I mostly followed a combination of https://docs.datadoghq.com/integrations/flink/#metric-collection and https://ci.apache.org/projects/flink/flink-docs-release-1.11/monitoring/metrics.html#datadog-orgapacheflinkmetricsdatadogdatadoghttpreporter, since Datadog's documentation was slightly out of date.

Thanks

(In reply to Chesnay Schepler's message of Tue, Jan 26, 2021 at 10:28 AM.)

Chesnay Schepler

Re: Seeing Rocks Native Metrics in Data Dog

It is good to know that something from the task executors arrives at Datadog.

Do you see any metrics for a specific job, like the numRestarts metric of the JobManager?

Are you using the default scope formats, or have you modified them?

Could you try these instead and report back? (I replaced all job/task/operator names with their IDs, in case some special character is messing with Datadog.)

metrics.scope.jm: <host>.jobmanager
metrics.scope.jm.job: <host>.jobmanager.<job_id>
metrics.scope.tm: <host>.taskmanager.<tm_id>
metrics.scope.tm.job: <host>.taskmanager.<tm_id>.<job_id>
metrics.scope.task: <host>.taskmanager.<tm_id>.<job_id>.<task_id>.<subtask_index>
metrics.scope.operator: <host>.taskmanager.<tm_id>.<job_id>.<operator_id>.<subtask_index>
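
For comparison, these are the default scope formats as I understand them from the Flink metrics documentation. They use job, task, and operator names where the suggestion above uses IDs. Treat this as a sketch to check against the docs for your Flink version.

```yaml
# Flink default scope formats (names instead of IDs).
metrics.scope.jm: <host>.jobmanager
metrics.scope.jm.job: <host>.jobmanager.<job_name>
metrics.scope.tm: <host>.taskmanager.<tm_id>
metrics.scope.tm.job: <host>.taskmanager.<tm_id>.<job_name>
metrics.scope.task: <host>.taskmanager.<tm_id>.<job_name>.<task_name>.<subtask_index>
metrics.scope.operator: <host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>
```

The expanded scope becomes the prefix of each metric name. Job, task, and operator names are free-form text, whereas the IDs are generated identifiers, which is why swapping in IDs is a reasonable way to rule out a naming problem.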

(In reply to Rex Fenley's message of 1/26/2021 9:28 PM.)

Rex Fenley

Re: Seeing Rocks Native Metrics in Data Dog

Oddly, I'm seeing them now. I'm not sure what has changed. Fwiw, we have modified the scopes per https://docs.datadoghq.com/integrations/flink/#metric-collection but their modifications ids as tags. We do need to modify them according to that documentation - "Note: The system scopes must be remapped for your Flink metrics to be supported, otherwise they are submitted as custom metrics." Could we instead add host and ids as tags to our metrics?

Thanks for your help!
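
For context, the scope remapping on the Datadog integration page linked above looks roughly like the following. This is reproduced from memory as a sketch, so verify it against the current Datadog documentation before relying on it.

```yaml
# Scope remapping suggested for the Datadog Flink integration (to be verified).
metrics.scope.jm: flink.jobmanager
metrics.scope.jm.job: flink.jobmanager.job
metrics.scope.tm: flink.taskmanager
metrics.scope.tm.job: flink.taskmanager.job
metrics.scope.task: flink.task
metrics.scope.operator: flink.operator
```

This is consistent with the metric name quoted earlier in the thread, flink.taskmanager.Status.JVM.Memory.Heap.Used. With such fixed scopes, host and IDs no longer appear in the metric name itself, which is what prompts the question about tags.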

(In reply to Chesnay Schepler's message of Tue, Jan 26, 2021 at 2:49 PM.)

Chesnay Schepler

Re: Seeing Rocks Native Metrics in Data Dog

AFAIK all IDs (and in fact all variables except <host>) are exposed as tags. (The <host> is transmitted separately, and I would've thought Datadog automatically provides similar functionality for it.)

(In reply to Rex Fenley's message of 1/27/2021 2:11 AM.)