idleTimeMsPerSecond on Flink 1.9?

Lakshmi Gururaja Rao-2
Hi

I'm trying to understand the implementation of idleTimeMsPerSecond. Specifically, I'm trying to adapt this metric for use with Flink 1.9 (for a fork).

I tried an approach similar to this PR. Measuring the time to request a new buffer is easy to adapt, but the mailbox loop runs differently in 1.9 than in 1.12.1, and I end up with under-reported values (for example, a sink that receives no data always reports idleTimeMsPerSecond as 0, which doesn't seem right).

I see that the threading model in StreamTask has changed significantly since 1.9. Specifically, I think Flink 1.9 has no blocking mailbox loop like the later versions do (example), which is where the idle time is measured.

Maybe I'm missing something, but I guess I can't directly use the same approach to measure idle time in 1.9? If so, an alternative (more expensive) approach might be to measure it where the task thread processes records (somewhere in this block), but I'm not sure whether that would be the right or efficient thing to do.
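
As a rough illustration of that alternative, here is a minimal sketch of timing the blocking wait on the task thread and converting it to an idle-ms-per-second value. It is not Flink code: IdleTimeTracker, recordIdle, reportIdleMsPerSecond, and pollTimed are hypothetical names, and the BlockingQueue only stands in for whatever call the 1.9 task thread actually blocks in while waiting for input.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Flink): accumulates time spent blocked. */
final class IdleTimeTracker {
    private long idleNanos;         // idle time accumulated in the current window
    private long windowStartNanos;  // start of the current reporting window

    IdleTimeTracker(long nowNanos) {
        this.windowStartNanos = nowNanos;
    }

    /** Called by the task thread after each blocking wait. */
    synchronized void recordIdle(long startNanos, long endNanos) {
        idleNanos += Math.max(0L, endNanos - startNanos);
    }

    /** Called by a reporter: idle ms per second since the last call, then resets. */
    synchronized long reportIdleMsPerSecond(long nowNanos) {
        long windowNanos = nowNanos - windowStartNanos;
        long result = windowNanos <= 0
                ? 0L
                : Math.min(1000L, idleNanos * 1000L / windowNanos);
        idleNanos = 0L;
        windowStartNanos = nowNanos;
        return result;
    }

    /** Wraps a blocking poll so that the whole wait is counted as idle time. */
    static <T> T pollTimed(BlockingQueue<T> queue, IdleTimeTracker tracker, long timeoutMs)
            throws InterruptedException {
        long start = System.nanoTime();
        try {
            return queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            tracker.recordIdle(start, System.nanoTime());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> input = new LinkedBlockingQueue<>();  // never receives data
        IdleTimeTracker tracker = new IdleTimeTracker(System.nanoTime());
        for (int i = 0; i < 5; i++) {
            pollTimed(input, tracker, 100);  // each call blocks for about 100 ms
        }
        // A consumer that gets no data should report a value close to 1000, not 0.
        System.out.println(tracker.reportIdleMsPerSecond(System.nanoTime()));
    }
}
```

The point of the sketch is that the timer has to wrap the call in which the thread actually blocks; if only the buffer request is timed, a sink that receives no data never records any idle time. The cost is two System.nanoTime() calls per wait, which is why doing this per record may be expensive.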

Any suggestions on how to accurately measure task idle time in Flink 1.9?

--
Lakshmi
Re: idleTimeMsPerSecond on Flink 1.9?

Till Rohrmann
Hi Lakshmi,

As you said, the StreamTask code base has evolved quite a bit between Flink 1.9 and Flink 1.12, and with the mailbox model it now works quite differently. Moreover, the community no longer actively maintains versions < 1.11. Hence, I would recommend upgrading to one of the latest Flink versions if that is possible. That way you don't have to implement these features yourself.

Cheers,
Till