Query regarding flink metric types


V N, Suchithra (Nokia - IN/Bangalore)

Hi Community,

 

I need some information regarding the metric types mentioned in the Flink documentation.

https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html

 

The checkpoint metrics listed below are defined as type Gauge. As per my understanding, a gauge represents a value that can increase or decrease, whereas a counter represents a value that only increases. The metrics below keep increasing during the job run, so Counter seems to be the more appropriate metric type for them. Please share your input on this.

 

Metric | Description | Type
numberOfCompletedCheckpoints | The number of successfully completed checkpoints. | Gauge
numberOfFailedCheckpoints | The number of failed checkpoints. | Gauge
totalNumberOfCheckpoints | The number of total checkpoints (in progress, completed, failed). | Gauge

 

Also, the name of the "isBackPressured" metric suggests that it returns a boolean value (Yes/No). However, the Flink documentation says back pressure is measured as follows:

  • OK: 0 <= Ratio <= 0.10
  • LOW: 0.10 < Ratio <= 0.5
  • HIGH: 0.5 < Ratio <= 1

What exactly does this metric report?

 

isBackPressured | Whether the task is back-pressured. | Gauge

 

Thanks,

Suchithra

Re: Query regarding flink metric types

Roman Khachatryan
Hi Suchithra,

You are right, those metrics can only grow, at least until failover.
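To make the distinction concrete, here is a minimal sketch of the two metric types. All class names below (SimpleCounter, SimpleGauge, CheckpointStats) are hypothetical stand-ins written for illustration and are not Flink's actual classes; the reset method in particular is only an assumption used to show what a gauge does when its source value drops.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical, simplified stand-ins for the two metric types; not Flink's classes.
final class MetricTypesSketch {

    /** A counter owns its value; it only changes through explicit increments. */
    static final class SimpleCounter {
        private final AtomicLong count = new AtomicLong();

        void inc() {
            count.incrementAndGet();
        }

        long getCount() {
            return count.get();
        }
    }

    /** A gauge owns no state; it reads a value from some other object on demand. */
    static final class SimpleGauge<T> {
        private final Supplier<T> source;

        SimpleGauge(Supplier<T> source) {
            this.source = source;
        }

        T getValue() {
            return source.get();
        }
    }

    /** Hypothetical statistics object that tracks completed checkpoints. */
    static final class CheckpointStats {
        private long completed;

        void checkpointCompleted() {
            completed++;
        }

        long getCompleted() {
            return completed;
        }

        // Assumption for illustration only: statistics that start over.
        void reset() {
            completed = 0;
        }
    }

    public static void main(String[] args) {
        CheckpointStats stats = new CheckpointStats();
        SimpleGauge<Long> gauge = new SimpleGauge<>(stats::getCompleted);

        stats.checkpointCompleted();
        stats.checkpointCompleted();
        System.out.println("gauge reads: " + gauge.getValue());

        // The gauge simply reports whatever the source holds, even after a drop.
        stats.reset();
        System.out.println("gauge after reset: " + gauge.getValue());
    }
}
```

A gauge can therefore expose a value that happens to grow monotonically; the type only says that the value is read from elsewhere rather than incremented through the metric itself.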

isBackPressured is reported as a boolean at the subtask level. These samples are then aggregated, and a ratio of (times-back-pressured / number-of-samples) is reported to the JobManager.
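The aggregation can be sketched as follows. This is only an illustration of the arithmetic, not Flink's implementation: the class and method names are hypothetical, and the thresholds are the OK/LOW/HIGH ranges quoted from the documentation in the original question.

```java
// Hypothetical helper illustrating the ratio and its classification; not Flink code.
final class BackPressureSketch {

    enum Level { OK, LOW, HIGH }

    /** Ratio of back-pressured samples to all samples, in the range [0, 1]. */
    static double ratio(boolean[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("need at least one sample");
        }
        int backPressured = 0;
        for (boolean sample : samples) {
            if (sample) {
                backPressured++;
            }
        }
        return (double) backPressured / samples.length;
    }

    /** Maps a ratio to a level: OK up to 0.10, LOW up to 0.5, HIGH above that. */
    static Level classify(double ratio) {
        // Written this way so that NaN is rejected as well.
        if (!(ratio >= 0.0 && ratio <= 1.0)) {
            throw new IllegalArgumentException("ratio must be within [0, 1]");
        }
        if (ratio <= 0.10) {
            return Level.OK;
        }
        if (ratio <= 0.5) {
            return Level.LOW;
        }
        return Level.HIGH;
    }

    public static void main(String[] args) {
        // One back-pressured sample out of four boolean readings.
        boolean[] samples = {true, false, false, false};
        double r = ratio(samples);
        System.out.println("ratio = " + r + ", level = " + classify(r));
    }
}
```

So a single reading of isBackPressured is a yes/no answer, while the ratio thresholds apply to many such readings taken over time.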

Regards,
Roman


On Fri, Apr 9, 2021 at 12:44 PM V N, Suchithra (Nokia - IN/Bangalore) <[hidden email]> wrote:
