I run Flink 1.6.2 on YARN. At some point, the job failed because of:

org.apache.flink.util.FlinkException: The assigned slot container_e708_1555051789618_2644286_01_000061_0 was removed

Then the job restarted. Some time later, the container container_e708_1555051789618_2644286_01_000061 had still not been released. The log of container_e708_1555051789618_2644286_01_000061 is as follows:

The log shows that two tasks were canceled before successful registration at the resource manager and one was canceled after registration. After five minutes, the container registered again. In the end, the container is alive but not used. Does anyone have any idea about this problem? Thank you.
Hi,
I will loop in Till here who might know
about this problem. In the meantime could you maybe tell us a bit
more about your setup/deployment (how is yarn configured and the
Flink job submitted?) and link to the full logs?
Thanks,
Timo
On 26.04.19 at 11:15, 刘建刚 wrote:
Hi,
have you tried whether the same problem also occurs with the latest
Flink version (1.8.0, 1.7.2 or 1.6.4)? If yes, then I would need to
take a look at the logs to better understand what's happening.
Cheers,
Till
On Fri, Apr 26, 2019 at 12:33 PM Timo Walther <[hidden email]> wrote:
Thank you, it is fixed in the new version.
-- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/