some subtask taking too long

Fanbin Bu
Hi,

I'm running Flink 1.9 on EMR, using Flink SQL with the Blink planner, reading from and writing to JDBC input/output. My SQL is just a LISTAGG over a window covering the last 7 days. However, I notice that one or two subtasks take too long to finish. This thread describes a similar issue: http://mail-archives.apache.org/mod_mbox/flink-user/201901.mbox/%3CCAEv5b0yD+0WBXgAnfT0b=ZqLC8rPE9_izzE3g+9Vxw8oK9w2=A@...%3E

Any idea on how to debug this?

Thanks
Fanbin

Re: some subtask taking too long

Piotr Nowojski-3
Hey,

The thread you are referring to is about a DataStream API job and a long-checkpointing issue, whereas from your message it seems you are using the Table API (SQL) to process batch data. Or what exactly do you mean by:

>  i notice that there are one or two subtasks that take too long to finish

Aside from that, could this simply be data skew, where some subset of keys is used much more heavily than the others?
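
One rough way to check for skew is to count rows per grouping key on a sample of the input and compare the heaviest key against the average. The snippet below is only a sketch under assumptions: `key_skew_report` and the sample keys are hypothetical, and it presumes the key column has already been pulled out of the JDBC source into a Python list. It is not part of the job itself.

```python
from collections import Counter


def key_skew_report(keys, top_n=3):
    """Return (heaviest keys with counts, heaviest count / mean count per key)."""
    counts = Counter(keys)
    if not counts:
        return [], 0.0
    mean_rows_per_key = sum(counts.values()) / len(counts)
    heaviest = counts.most_common(top_n)
    return heaviest, heaviest[0][1] / mean_rows_per_key


# Hypothetical sample: one key dominates the input.
sample_keys = ["user_a"] * 90 + ["user_b"] * 5 + ["user_c"] * 5
heaviest, ratio = key_skew_report(sample_keys)
print(heaviest[0], round(ratio, 2))
```

A ratio far above 1 would suggest that the subtask owning the heaviest key receives much more data than the others, which would be consistent with one or two slow subtasks.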

Piotrek

On 31 Mar 2020, at 01:43, Fanbin Bu <[hidden email]> wrote:

Hi,

I'm running Flink 1.9 on EMR, using Flink SQL with the Blink planner, reading from and writing to JDBC input/output. My SQL is just a LISTAGG over a window covering the last 7 days. However, I notice that one or two subtasks take too long to finish. This thread describes a similar issue: http://mail-archives.apache.org/mod_mbox/flink-user/201901.mbox/%3CCAEv5b0yD+0WBXgAnfT0b=ZqLC8rPE9_izzE3g+9Vxw8oK9w2=A@...%3E

Any idea on how to debug this?

Thanks
Fanbin