Re: metrics for Flink sinks
Posted by Chesnay Schepler
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/metrics-for-Flink-sinks-tp15200p15236.html
Hello,
1. Because no one has found time to fix it. Unlike the
remaining byte/record metrics, input metrics for sources and output
metrics for sinks have to be implemented separately for every
source and sink, each with its own semantics. The remaining
metrics, by contrast, are gathered at the boundaries between operators,
independent of the actual operator implementation. Furthermore,
this requires system metrics (i.e. metrics that Flink itself
creates) to be exposed to (and be mutable by!) user-defined
functions, which is something I generally wanted to
avoid, but it appears to be a big enough pain point to make an
exception here.
2. Because of the above, it is currently not possible to know how
many reads/writes were made without modifying the code.
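One possible form of such a modification is to wrap the sink's write
path in user code and count there. The sketch below is plain Java and
uses no Flink classes: RecordWriter and InstrumentedWriter are
hypothetical names, with RecordWriter standing in for the per-record
method of a sink (SinkFunction#invoke in Flink). In a real job the plain
long fields would more likely be user metrics registered in the open()
method of a RichSinkFunction via
getRuntimeContext().getMetricGroup().counter(...), and a wrapper around
a rich sink would also have to forward its lifecycle calls to the
wrapped sink; neither is shown here.

```java
// Hypothetical stand-in for a sink's per-record write method.
interface RecordWriter<T> {
    void write(T record) throws Exception;
}

// Wraps an existing writer and tracks successful writes, failed writes,
// and the duration of the most recent write attempt.
final class InstrumentedWriter<T> implements RecordWriter<T> {
    private final RecordWriter<T> delegate;
    private long successfulWrites;
    private long failedWrites;
    private long lastWriteNanos;

    InstrumentedWriter(RecordWriter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(T record) throws Exception {
        long start = System.nanoTime();
        try {
            delegate.write(record);
            // Only reached if the wrapped writer did not throw.
            successfulWrites++;
        } catch (Exception e) {
            failedWrites++;
            throw e; // keep the original failure behaviour
        } finally {
            // Recorded for both successful and failed attempts.
            lastWriteNanos = System.nanoTime() - start;
        }
    }

    long successfulWrites() { return successfulWrites; }
    long failedWrites() { return failedWrites; }
    long lastWriteNanos() { return lastWriteNanos; }
}
```

Note that this only counts calls that returned without throwing. For a
sink that buffers or writes asynchronously, a returned call does not
mean the record has reached the external system, so the counter would
have to be updated in the sink's own completion callback instead.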
3. Do you mean aggregated metrics? The web UI allows the
aggregation of record/byte metrics on the task level. Beyond that
we defer aggregation to actual time-series databases that
specialize in these things.
On 28.08.2017 19:08, Martin Eden wrote:
Hi all,
Just 3 quick questions, all related to Flink metrics,
especially around sinks:
1. In the Flink UI, sources always show 0 input records /
bytes and sinks always show 0 output records / bytes. Why is
that?
2. What is the best practice for instrumenting off-the-shelf
Flink sinks?
Currently the only metrics available are the number of records/bytes
in and out at the operator and task scope. For the task scope
there are extra buffer metrics. However, the output metrics are
always zero (see question 1). How can one know the actual
number of successful writes done by an off-the-shelf Flink
sink? Or the latency of the write operation?
3. Is it possible to configure Flink to get global job
metrics for all subtasks of an operator? Or are there any best
practices around that?
Thanks,
M