Seeing Rocks Native Metrics in Data Dog

Rex Fenley
Hi,

I need a deeper view into how RocksDB is performing, so I turned on Rocks Native Metrics. However, I don't see any of those metrics in DataDog (though other Flink metrics do arrive there). I only see the RocksDB metrics under the operator metrics in the Flink UI, and unfortunately I can't zoom in or out of those graphs or compare multiple operators at a time, which makes it really difficult to get an overview of how RocksDB is doing.

Is there any way to get the Rocks Native Metrics forwarded over to DataDog?

Thanks!

--

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com |  BLOG  |  FOLLOW US  |  LIKE US

Re: Seeing Rocks Native Metrics in Data Dog

Chesnay Schepler
Any metric that is shown in the Flink UI should also appear in DataDog.
If this is not the case, then something is going wrong within the reporter.

Is there anything suspicious in the Flink logs?

Can you give some examples of metrics that do show up in DataDog?

Re: Seeing Rocks Native Metrics in Data Dog

Rex Fenley
All taskmanager and jobmanager logs show up. Anything specific to an operator does not.
For example, flink.taskmanager.Status.JVM.Memory.Heap.Used shows up, but I can't see stats on an individual operator.


Thanks

Re: Seeing Rocks Native Metrics in Data Dog

Chesnay Schepler
It is good to know that something from the task executors arrives at Datadog.

Do you see any metrics for a specific job, like the numRestarts metric of the JobManager?

Are you using the default scope formats, or have you modified them?
Could you try these instead and report back? (I replaced all job/task/operator names with their IDs, in case some special character is messing with Datadog.)

metrics.scope.jm: <host>.jobmanager
metrics.scope.jm.job: <host>.jobmanager.<job_id>
metrics.scope.tm: <host>.taskmanager.<tm_id>
metrics.scope.tm.job: <host>.taskmanager.<tm_id>.<job_id>
metrics.scope.task: <host>.taskmanager.<tm_id>.<job_id>.<task_id>.<subtask_index>
metrics.scope.operator: <host>.taskmanager.<tm_id>.<job_id>.<operator_id>.<subtask_index>
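
As a rough illustration of what these scope formats do, the sketch below substitutes each `<variable>` placeholder with a value and appends the metric name. It is a simplified model and not Flink's implementation: the function name `expand_scope`, the delimiter handling, and all sample values are assumptions made for this example.

```python
import re


def expand_scope(scope_format, variables, metric_name, delimiter="."):
    """Replace <variable> placeholders and append the metric name.

    Hypothetical helper for illustration only; not part of Flink.
    """
    def substitute(match):
        key = match.group(1)
        # Unknown placeholders are left untouched so that mistakes stay visible.
        return str(variables.get(key, match.group(0)))

    scope = re.sub(r"<([A-Za-z_]+)>", substitute, scope_format)
    return scope + delimiter + metric_name


# Operator scope format taken from the message above; the values are made up.
operator_scope = "<host>.taskmanager.<tm_id>.<job_id>.<operator_id>.<subtask_index>"
sample = {
    "host": "host-1",
    "tm_id": "tm-42",
    "job_id": "job-abc",
    "operator_id": "op-123",
    "subtask_index": 0,
}

identifier = expand_scope(operator_scope, sample, "numRecordsIn")
print(identifier)
```

Because the formats above contain only IDs, the resulting identifier has no free-form job or operator names in it, which is the point of the suggestion.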


Re: Seeing Rocks Native Metrics in Data Dog

Rex Fenley
Oddly, I'm seeing them now. I'm not sure what has changed. Fwiw, we have modified the scopes per https://docs.datadoghq.com/integrations/flink/#metric-collection but their modifications do not include the IDs. We do need to modify them according to that documentation: "Note: The system scopes must be remapped for your Flink metrics to be supported, otherwise they are submitted as custom metrics." Could we instead add host and IDs as tags to our metrics?

Thanks for your help!

Re: Seeing Rocks Native Metrics in Data Dog

Chesnay Schepler
AFAIK all IDs (and in fact all variables except <host>) are exposed as tags. (The <host> is transmitted separately, and I would've thought Datadog automatically provides similar functionality for it.)
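
One way to picture the behaviour described above is the sketch below, which splits a set of scope variables into a separately transmitted host and a list of `key:value` tags. This is only a model of the description in this message, not the reporter's actual code; the function name `split_host_and_tags` and the sample values are hypothetical.

```python
def split_host_and_tags(variables):
    """Return (host, tags): host is kept apart, everything else becomes a tag.

    Hypothetical helper for illustration only; not part of Flink or Datadog.
    """
    host = variables.get("host")
    tags = [
        f"{name}:{value}"
        for name, value in sorted(variables.items())
        if name != "host"  # the host is sent separately, not as a tag
    ]
    return host, tags


# Made-up values for the variables used in the scope formats earlier in the thread.
sample = {
    "host": "host-1",
    "job_id": "job-abc",
    "operator_id": "op-123",
    "subtask_index": 0,
}

host, tags = split_host_and_tags(sample)
print(host)
print(tags)
```

Under this model, remapping the scopes to fixed prefixes would not lose the IDs, since they would still arrive as tags on each metric.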
