Hi,
I am looking for end-to-end latency monitoring of a Flink job. Based on my study, I have two options:
1. Flink provides a latency tracking feature. However, the documentation says it cannot show the actual latency of the business logic, because the latency markers bypass all operators. https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#latency-tracking Also, the feature can significantly impact performance, so I assume it is not meant for production use. What do users use latency tracking for? It sounds like only back pressure could affect the measured latency.
2. I found another Stack Overflow question on this: https://stackoverflow.com/questions/56578919/latency-monitoring-in-flink-application . The answer suggests exposing (current processing time - event time) after the source and before the sink for end-to-end latency monitoring. Is this a good solution? If not, what is the official solution for end-to-end latency tracking?
Thank you! Best Lu |
Hi, As far as I know, the latency-tracking feature is meant for debugging: you can enable it to debug and disable it when running the job in production. From my side, using $current_processing - $event_time is OK, but keep one thing in mind: the event time may not be the time at which the record was ingested into Flink. Best, Congxian Lu Niu <[hidden email]> wrote on Sat, Mar 28, 2020 at 6:25 AM:
|
Hi Lu, In addition to Congxian's reply, you can find further explanation at "https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking". Best, Zhijiang
|
Thanks for the replies, @Zhijiang, @Congxian! @Congxian, $current_processing - $event_time works for event time. How about processing time? Is there a good way to measure the latency there? Best Lu On Sun, Mar 29, 2020 at 6:21 AM Zhijiang <[hidden email]> wrote:
|
On Mon, 30 Mar 2020 at 05:08, Lu Niu <[hidden email]> wrote:
To measure latency you'll need some way to determine the time spent between the start and end of your pipeline. To measure latency when using processing time, you'll need to partially use ingestion time. That is, you'll need to add the 'current' processing time as soon as messages are ingested. With it, you can then use the
$current_processing - $ingest_time solution that was already mentioned. Kind regards, Oscar Oscar Westra van Holthe - Kind |
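As a rough illustration of the ingestion-time idea above, here is a minimal plain-Java sketch. The `Stamped` wrapper and its method names are hypothetical and not part of Flink; in a real job the stamping would presumably happen in the first step after the source and the latency would be read just before the sink. The clock is passed in as a `LongSupplier` only so the arithmetic can be checked with fixed values.

```java
import java.util.function.LongSupplier;

/** Hypothetical wrapper pairing a payload with the processing time at which it was ingested. */
final class Stamped<T> {
    final T value;
    final long ingestTimeMillis;

    Stamped(T value, long ingestTimeMillis) {
        this.value = value;
        this.ingestTimeMillis = ingestTimeMillis;
    }

    /** Stamp a record as early as possible, e.g. right after the source. */
    static <T> Stamped<T> ingest(T value, LongSupplier clock) {
        return new Stamped<>(value, clock.getAsLong());
    }

    /** $current_processing - $ingest_time, evaluated at a later point such as the sink. */
    long latencyMillis(LongSupplier clock) {
        return clock.getAsLong() - ingestTimeMillis;
    }
}
```

With a real clock this would be used as `Stamped.ingest(record, System::currentTimeMillis)` and later `stamped.latencyMillis(System::currentTimeMillis)`. Note that the two readings may come from different machines, so clock skew between them ends up in the measured value.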
Hi. At the Flink source connector, you can emit a $source_current_time - $event_time metric. Likewise, at the Flink sink connector, you can emit a $sink_current_time - $event_time metric. Then you can use ($sink_current_time - $event_time) - ($source_current_time - $event_time) = $sink_current_time - $source_current_time as the end-to-end latency. Oscar Westra van Holthe - Kind <[hidden email]> wrote on Mon, Mar 30, 2020 at 5:15 PM:
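To make the subtraction in this reply concrete, here is a small plain-Java sketch with made-up numbers. The class and method names (`LagMath`, `lag`, `endToEnd`) are hypothetical helpers for illustration only and do not correspond to any Flink API.

```java
/** Hypothetical helper showing that the event time cancels out of the two lag metrics. */
final class LagMath {
    /** Lag at some point in the pipeline: wall-clock time there minus the record's event time. */
    static long lag(long currentTimeMillis, long eventTimeMillis) {
        return currentTimeMillis - eventTimeMillis;
    }

    /** End-to-end latency as the difference of the sink lag and the source lag. */
    static long endToEnd(long sinkLagMillis, long sourceLagMillis) {
        return sinkLagMillis - sourceLagMillis;
    }
}
```

For a record with event time 10000 that is seen at the source at 10400 and at the sink at 10550, the source lag is 400, the sink lag is 550, and `LagMath.endToEnd(550, 400)` gives 150, which equals 10550 - 10400. The identity holds exactly only when both lags refer to the same record.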
|
Hi, I think we could add a latency metric for each operator, which would reflect each operator's consumption capacity. Best,
Dan Zou
|
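An operator like the one below exposes the lag between the current time and the event time of records passing through it. I add it after the source and before the sink, and calculate sink_delay - source_delay in Grafana. Would that be a generic solution to the problem?
```
import org.apache.flink.metrics.Gauge;
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.ChainingStrategy;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

/** Pass-through operator that exposes (processing time - record timestamp) as a gauge. */
public class EmitLagOperator<T> extends AbstractStreamOperator<T>
        implements OneInputStreamOperator<T, T> {

    // volatile: written by the task thread, read by the metrics reporter thread
    private transient volatile long delay;

    public EmitLagOperator() {
        // allow this operator to be chained to its neighbours whenever possible
        chainingStrategy = ChainingStrategy.ALWAYS;
    }

    @Override
    public void processElement(StreamRecord<T> element) throws Exception {
        long now = getProcessingTimeService().getCurrentProcessingTime();
        // assumes records carry timestamps; the gauge holds only the latest element's lag
        delay = now - element.getTimestamp();
        output.collect(element);
    }

    @Override
    public void open() throws Exception {
        super.open();
        getRuntimeContext()
            .getMetricGroup()
            .gauge("delay", new Gauge<Long>() {
                @Override
                public Long getValue() {
                    return delay;
                }
            });
    }
}
```
On Wed, Apr 1, 2020 at 7:59 PM zoudan <[hidden email]> wrote: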
|