(DEPRECATED) Apache Flink User Mailing List archive.

End to End Latency Tracking in flink

Classic

List

Threaded

8 messages Options

Lu Niu

End to End Latency Tracking in flink

Hi,

I am looking for end to end latency monitoring of link job. Based on my study, I have two options:

1. flink provide a latency tracking feature. However, the documentation says it cannot show actual latency of business logic as it will bypass all operators. https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#latency-tracking Also, the feature can significantly impact the performance so I assume it's not for usage in production. What are users use the latency tracking for? Sounds like only back pressure could affect the latency.

2. I found another stackoverflow question on this. https://stackoverflow.com/questions/56578919/latency-monitoring-in-flink-application . The answer suggestion to expose (current processing - the event time) after source and before sink for end to end latency monitoring. Is this a good solution? If not, What’s the official solution for end to end latency tracking?

Thank you!

Best

Congxian Qiu

Re: End to End Latency Tracking in flink

As far as I know, the latency-tracking feature is for debugging usages, you can use it to debug, and disable it when running the job on production.

From my side, use $current_processing - $event_time is something ok, but keep the things in mind: the event time may not be the time ingested in Flink.

Best,

Congxian

Lu Niu <[hidden email]> 于2020年3月28日周六上午6:25写道：

Hi,

I am looking for end to end latency monitoring of link job. Based on my study, I have two options:

1. flink provide a latency tracking feature. However, the documentation says it cannot show actual latency of business logic as it will bypass all operators. https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#latency-tracking Also, the feature can significantly impact the performance so I assume it's not for usage in production. What are users use the latency tracking for? Sounds like only back pressure could affect the latency.

2. I found another stackoverflow question on this. https://stackoverflow.com/questions/56578919/latency-monitoring-in-flink-application . The answer suggestion to expose (current processing - the event time) after source and before sink for end to end latency monitoring. Is this a good solution? If not, What’s the official solution for end to end latency tracking?

Thank you!

Best
Lu

Zhijiang(wangzhijiang999)

Re: End to End Latency Tracking in flink

Hi Lu,

Besides Congxian's replies, you can also get some further explanations from "https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking".

Best,

Zhijiang

------------------------------------------------------------------
From:Congxian Qiu <[hidden email]>
Send Time:2020 Mar. 28 (Sat.) 11:49
To:Lu Niu <[hidden email]>
Cc:user <[hidden email]>
Subject:Re: End to End Latency Tracking in flink

Hi
As far as I know, the latency-tracking feature is for debugging usages, you can use it to debug, and disable it when running the job on production.
From my side, use $current_processing - $event_time is something ok, but keep the things in mind: the event time may not be the time ingested in Flink.

Best,
Congxian

Lu Niu <[hidden email]> 于2020年3月28日周六上午6:25写道：
Hi,

I am looking for end to end latency monitoring of link job. Based on my study, I have two options:

1. flink provide a latency tracking feature. However, the documentation says it cannot show actual latency of business logic as it will bypass all operators. https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#latency-tracking Also, the feature can significantly impact the performance so I assume it's not for usage in production. What are users use the latency tracking for? Sounds like only back pressure could affect the latency.

2. I found another stackoverflow question on this. https://stackoverflow.com/questions/56578919/latency-monitoring-in-flink-application . The answer suggestion to expose (current processing - the event time) after source and before sink for end to end latency monitoring. Is this a good solution? If not, What’s the official solution for end to end latency tracking?

Thank you!

Best
Lu

Lu Niu

Re: End to End Latency Tracking in flink

Thanks for reply, @Zhijiang, @Congxian!

@Congxian

$current_processing - $event_time works for event time. How about processing time? Is there a good way to measure the latency?

Best

On Sun, Mar 29, 2020 at 6:21 AM Zhijiang <[hidden email]> wrote:

Hi Lu,

Besides Congxian's replies, you can also get some further explanations from "https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking".

Best,
Zhijiang

------------------------------------------------------------------
From:Congxian Qiu <[hidden email]>
Send Time:2020 Mar. 28 (Sat.) 11:49
To:Lu Niu <[hidden email]>
Cc:user <[hidden email]>
Subject:Re: End to End Latency Tracking in flink

Hi
As far as I know, the latency-tracking feature is for debugging usages, you can use it to debug, and disable it when running the job on production.
From my side, use $current_processing - $event_time is something ok, but keep the things in mind: the event time may not be the time ingested in Flink.

Best,
Congxian

Lu Niu <[hidden email]> 于2020年3月28日周六上午6:25写道：
Hi,

I am looking for end to end latency monitoring of link job. Based on my study, I have two options:

1. flink provide a latency tracking feature. However, the documentation says it cannot show actual latency of business logic as it will bypass all operators. https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#latency-tracking Also, the feature can significantly impact the performance so I assume it's not for usage in production. What are users use the latency tracking for? Sounds like only back pressure could affect the latency.

2. I found another stackoverflow question on this. https://stackoverflow.com/questions/56578919/latency-monitoring-in-flink-application . The answer suggestion to expose (current processing - the event time) after source and before sink for end to end latency monitoring. Is this a good solution? If not, What’s the official solution for end to end latency tracking?

Thank you!

Best
Lu

Oscar Westra van Holthe - Kind

Re: End to End Latency Tracking in flink

On Mon, 30 Mar 2020 at 05:08, Lu Niu <[hidden email]> wrote:

$current_processing - $event_time works for event time. How about processing time? Is there a good way to measure the latency?

To measure latency you'll need some way to determine the time spent between the start and end of your pipeline.

To measure latency when using processing time, you'll need to partially use ingestion time. That is, you'll need to add the 'current' processing time as soon as messages are ingested.

With it, you can then use the $current_processing - $ingest_time solution that was already mentioned.

Kind regards,

Oscar

Oscar Westra van Holthe - Kind

张光辉

Re: End to End Latency Tracking in flink

Hi.

At flink source connector, you can send $source_current_time - $event_time metric.

In the meantime, at flink sink connector, you can send $sink_current_time - $event_time metric.

Then you use $sink_current_time - $event_time - ($source_current_time - $event_time) = $sink_current_time - $source_current_time as the latency of end to end。

Oscar Westra van Holthe - Kind <[hidden email]> 于2020年3月30日周一下午5:15写道：

On Mon, 30 Mar 2020 at 05:08, Lu Niu <[hidden email]> wrote:
$current_processing - $event_time works for event time. How about processing time? Is there a good way to measure the latency?

To measure latency you'll need some way to determine the time spent between the start and end of your pipeline.

To measure latency when using processing time, you'll need to partially use ingestion time. That is, you'll need to add the 'current' processing time as soon as messages are ingested.

With it, you can then use the $current_processing - $ingest_time solution that was already mentioned.

Kind regards,
Oscar

--
Oscar Westra van Holthe - Kind

zoudan

Re: End to End Latency Tracking in flink

Hi,

I think we may add latency metric for each operator, which can reflect consumption ability of each operator.

Best,
Dan Zou

在 2020年3月30日，18:19，Guanghui Zhang <[hidden email]> 写道：

Hi.
At flink source connector, you can send $source_current_time - $event_time metric.
In the meantime, at flink sink connector, you can send $sink_current_time - $event_time metric.
Then you use $sink_current_time - $event_time - ($source_current_time - $event_time) = $sink_current_time - $source_current_time as the latency of end to end。

Oscar Westra van Holthe - Kind <[hidden email]> 于2020年3月30日周一下午5:15写道：
On Mon, 30 Mar 2020 at 05:08, Lu Niu <[hidden email]> wrote:
$current_processing - $event_time works for event time. How about processing time? Is there a good way to measure the latency?

To measure latency you'll need some way to determine the time spent between the start and end of your pipeline.

To measure latency when using processing time, you'll need to partially use ingestion time. That is, you'll need to add the 'current' processing time as soon as messages are ingested.

With it, you can then use the $current_processing - $ingest_time solution that was already mentioned.

Kind regards,
Oscar

--
Oscar Westra van Holthe - Kind

Lu Niu

Re: End to End Latency Tracking in flink

An Operator like below will expose lag between current time and event time passing the operator. I add that after the source and before the sink, and calculate sink_delay - source_delay in grafana. would that be a generic solution to solve the problem?

```

public class EmitLagOperator<T> extends AbstractStreamOperator<T>
implements OneInputStreamOperator<T, T> {

private transient long delay;

public EmitLagOperator() {
chainingStrategy = ChainingStrategy.ALWAYS;
}

@Override
public void processElement(StreamRecord<T> element) throws Exception {
long now = getProcessingTimeService().getCurrentProcessingTime();
delay = now - element.getTimestamp();
output.collect(element);
}

@Override
public void open() throws Exception {
super.open();
getRuntimeContext()
.getMetricGroup()
.gauge("delay", new Gauge<Long>() {
@Override
public Long getValue() {
return delay;
}
});
}
}

```

On Wed, Apr 1, 2020 at 7:59 PM zoudan <[hidden email]> wrote:

Hi,
I think we may add latency metric for each operator, which can reflect consumption ability of each operator.

Best,
Dan Zou

在 2020年3月30日，18:19，Guanghui Zhang <[hidden email]> 写道：

Hi.
At flink source connector, you can send $source_current_time - $event_time metric.
In the meantime, at flink sink connector, you can send $sink_current_time - $event_time metric.
Then you use $sink_current_time - $event_time - ($source_current_time - $event_time) = $sink_current_time - $source_current_time as the latency of end to end。

Oscar Westra van Holthe - Kind <[hidden email]> 于2020年3月30日周一下午5:15写道：
On Mon, 30 Mar 2020 at 05:08, Lu Niu <[hidden email]> wrote:
$current_processing - $event_time works for event time. How about processing time? Is there a good way to measure the latency?

To measure latency you'll need some way to determine the time spent between the start and end of your pipeline.

To measure latency when using processing time, you'll need to partially use ingestion time. That is, you'll need to add the 'current' processing time as soon as messages are ingested.

With it, you can then use the $current_processing - $ingest_time solution that was already mentioned.

Kind regards,
Oscar

--
Oscar Westra van Holthe - Kind