Hi.
We have seen the same behaviour on YARN. It turned out that the default setting for yarn.maximum-failed-containers was not optimal.
yarn.maximum-failed-containers: The maximum number of failed containers the ApplicationMaster accepts until it fails the YARN session. Default: the number of initially requested TaskManagers (-n).
So try to look up what this setting is configured to on your system.
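As a rough sketch, the limit could be raised in flink-conf.yaml like this. The value 100 is only an illustrative assumption, not a recommendation, so pick one that fits your cluster and check that your Flink version supports this key:

```yaml
# flink-conf.yaml
# Illustrative value only (assumption): allow more failed containers
# before the ApplicationMaster fails the YARN session.
yarn.maximum-failed-containers: 100
```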
The next step is to investigate why the TaskManager is being killed.
Best regards
Lasse Nedergaard
Hey,
Can you please provide a little more information about your setup, and maybe logs showing when the crash occurs?
Best Regards,
Dominik