Hi:

I need to collect application metrics which are counts (per unit of time, e.g. per minute) for certain events. There are two ways of doing this:

1. Create separate streams (using split stream, etc.) in the application explicitly, then aggregate the counts in a window and save them. This mixes metrics collection with application logic and makes the application logic complex.
2. Use the Flink metrics framework (counter, gauge, etc.) to save metrics.

I have a very small test with 2 events, but when I run the application the counters are not getting saved (they show value 0) even though that part of the code is being executed. I do see the numRecordsIn counters being updated from the source operator. I've also tried incrementing the count by 10 (instead of 1) every time the function gets executed, but the counts still remain 0.

Here is a snippet of the code:

    dataStream.map(new RichMapFunction<String, String>() {
        protected Counter counter;

        // Register the counter once, when the operator is opened.
        @Override
        public void open(Configuration parameters) {
            counter = getRuntimeContext().getMetricGroup()
                    .addGroup("test", "split")
                    .counter("success");
        }

        // Count every record that passes through, then forward it unchanged.
        @Override
        public String map(String value) throws Exception {
            counter.inc();
            return value;
        }
    });

As I mentioned, I do get the success metric count but the value is always 0, even though the above map function was executed.

My questions are:

1. Are there any issues regarding counters being approximate?
2. If I want to collect accurate counts, is it recommended to use counters, or should I do it explicitly (which makes the code too complex)?
3. Do counters participate in Flink's failure/checkpointing/recovery?
4. Is there any better way of collecting application metric counts?

Thanks
Mans
|
1) None that I'm aware of.
2) You should use counters.
3) No, counters are not checkpointed, but you could store the value in state yourself.
4) None that I'm aware of that doesn't require modifications to the application logic.

How long does your job run for, and how do you access metrics?

On 27/06/2019 17:36, M Singh wrote:
|
Hi Chesnay:

Thanks for your response. My job runs for a few minutes, and I've tried setting the reporter interval to 1 second. I will try the counter on a longer-running job.

Thanks again.
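For anyone reproducing this setup, the reporter interval mentioned above is set in flink-conf.yaml. The fragment below is only a sketch under assumptions: the thread does not say which reporter is in use, so the SLF4J reporter and the reporter name "slf4j" are stand-ins, and the key names follow the Flink 1.8-era documentation (later releases may expect a different key for selecting the reporter). The reporter's jar also has to be on Flink's classpath.

```yaml
# flink-conf.yaml (sketch; the reporter choice and the name "slf4j" are assumptions)
metrics.reporters: slf4j
metrics.reporter.slf4j.class: org.apache.flink.metrics.slf4j.Slf4jReporter
# Report every second, matching the interval described in the message above
metrics.reporter.slf4j.interval: 1 SECONDS
```

With an interval this short, the value of the "success" counter should show up in the logs roughly once per second while the job is still running, which makes it easier to tell whether the counter is really stuck at 0 or was simply read after the job had finished.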
On Thursday, June 27, 2019, 11:46:17 AM EDT, Chesnay Schepler <[hidden email]> wrote:
|
So here's the thing: metrics are accurate so long as the job is running. Once the job terminates, metrics are cleaned up and not persisted anywhere, with the exception of a few metrics (like numRecordsIn).
Another thing that is always good to double-check is to enable DEBUG logging and re-run your test.

On 27/06/2019 22:41, M Singh wrote:
|