Confusing debug level log output with Flink 1.5

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Confusing debug level log output with Flink 1.5

Ken Krugler
Hi Till,


I’ve been trying out 1.5, and noticed similar output in my logs, e.g.

18/04/18 17:33:47 DEBUG slotpool.SlotPool:751 - Releasing slot with slot request id 2c9fba45bd28940c46502d2b003fc437.
org.apache.flink.util.FlinkException: Release multi task slot because all children have been released.
at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.releaseChild(SlotSharingManager.java:507)
28 or so more lines
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

It seems to be a result of this change in SlotPool.java:

107c8e04be8 flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/slotpool/SlotPool.java (Till Rohrmann  2018-02-22 14:10:29 +0100  741)               log.debug("Releasing slot with slot request id {}.", slotRequestId, cause);

While I understand the value of the call stack, it’s pretty confusing to see what looks like a bunch of stack traces showing up in the logs.

I can obviously edit the log4j.properties file to hide this, but that’s not so easy when running in EMR, for example.

And I think you’ll continue to get bug reports :)

Could this be turned into a trace-level msg?

Thanks,

— Ken

--------------------------------------------
+1 530-210-6378

Reply | Threaded
Open this post in threaded view
|

Re: Confusing debug level log output with Flink 1.5

Till Rohrmann
Hi Ken,

sorry for the long response time. 

You're right that the log output is too verbose and we should not log the full stack traces in the happy case. I've responded onto the PR thread [1] and suggested to only log the exception message for information purposes. I hope that this will solve the problem.


Cheers,
Till

On Thu, Apr 19, 2018 at 6:29 AM, Ken Krugler <[hidden email]> wrote:
Hi Till,


I’ve been trying out 1.5, and noticed similar output in my logs, e.g.

18/04/18 17:33:47 DEBUG slotpool.SlotPool:751 - Releasing slot with slot request id 2c9fba45bd28940c46502d2b003fc437.
org.apache.flink.util.FlinkException: Release multi task slot because all children have been released.
at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.releaseChild(SlotSharingManager.java:507)
28 or so more lines
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

It seems to be a result of this change in SlotPool.java:

107c8e04be8 flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/slotpool/SlotPool.java (Till Rohrmann  2018-02-22 14:10:29 +0100  741)               log.debug("Releasing slot with slot request id {}.", slotRequestId, cause);

While I understand the value of the call stack, it’s pretty confusing to see what looks like a bunch of stack traces showing up in the logs.

I can obviously edit the log4j.properties file to hide this, but that’s not so easy when running in EMR, for example.

And I think you’ll continue to get bug reports :)

Could this be turned into a trace-level msg?

Thanks,

— Ken

--------------------------------------------
+1 530-210-6378