Need help debugging back pressure job

Need help debugging back pressure job

Fritz Budiyanto
Hi All,

Any tips on debugging back pressure? I have a workload that gets stuck after running for a couple of hours.
I assume the cause of the back pressure is the block right after (downstream of) the one shown as back pressured. Is this right?

Any idea how to get a backtrace? (I’m using a standalone setup with a combined JM/TM and a parallelism of 1, and the suspected block is a ProcessFunction with event timers.)


Fritz


Re: Need help debugging back pressure job

Till Rohrmann
Hi Fritz,

You're right that back pressure propagates upstream toward the sources. The cause should therefore be the operator that follows the last operator showing back pressure.
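
As a rough illustration of that rule, here is a minimal Java sketch. It assumes a strictly linear pipeline whose operators are listed from source to sink, each with a flag saying whether it is shown as back pressured. `BackPressureLocator` and `suspectIndex` are made-up names for this example and are not part of Flink.

```java
import java.util.List;

/** Hypothetical helper (not a Flink API): applies the "operator after the last back-pressured one" rule. */
final class BackPressureLocator {

    /**
     * @param backPressured flags in source-to-sink order; true means the operator is shown as back pressured
     * @return index of the suspected bottleneck, or -1 if no operator is back pressured
     *         or the last back-pressured operator has no successor in the list
     */
    static int suspectIndex(List<Boolean> backPressured) {
        int last = -1;
        // Remember the position of the last operator that reports back pressure.
        for (int i = 0; i < backPressured.size(); i++) {
            if (backPressured.get(i)) {
                last = i;
            }
        }
        if (last == -1 || last + 1 >= backPressured.size()) {
            return -1;
        }
        // The operator directly downstream is the one that cannot keep up.
        return last + 1;
    }
}
```

With flags `[true, true, false, false]` for a source, a map, a ProcessFunction, and a sink, `suspectIndex` returns 2, which points at the ProcessFunction. Jobs with branching topologies would need the same check along each path.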

To debug it, you could take a look at the stack trace of the TM. Go to the machine on which the TM runs, find its process ID via jps, and then call jstack with that process ID.
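
If running jstack on the machine is inconvenient, a similar dump can be produced from inside the JVM. The sketch below uses only the standard `java.lang.management` API. `ThreadDumper` is a hypothetical helper name, and where you call it from (for example a debug log statement in your own code) is an assumption about your setup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: builds a jstack-like text dump of all live threads in the current JVM. */
final class ThreadDumper {

    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip monitor and synchronizer details, which not every JVM supports.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

Calling `ThreadDumper.dump()` a few times while the job is stuck and comparing the results should show which thread stays in the same frames, much like repeated jstack runs would.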

Alternatively, you can try to debug the cluster remotely [1].


Cheers,
Till

On Tue, May 23, 2017 at 7:14 AM, Fritz Budiyanto <[hidden email]> wrote:
Hi All,

Any tips on debugging back pressure? I have a workload that gets stuck after running for a couple of hours.
I assume the cause of the back pressure is the block right after (downstream of) the one shown as back pressured. Is this right?

Any idea how to get a backtrace? (I’m using a standalone setup with a combined JM/TM and a parallelism of 1, and the suspected block is a ProcessFunction with event timers.)


Fritz