(DEPRECATED) Apache Flink User Mailing List archive.

Jobs running on a yarn per-job cluster fail to restart when a task manager is lost


杨力

Hi,

I am running a streaming job without checkpointing enabled. A failure-rate restart strategy has been set with StreamExecutionEnvironment.setRestartStrategy.
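For reference, the setup looks roughly like the sketch below. The numeric values, the class name, the job name, and the stand-in pipeline are placeholders for illustration and are not taken from the real job; only the use of setRestartStrategy with a failure-rate strategy reflects what is described above.

```java
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical class name, used only for this sketch.
public class FailureRateRestartExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing is deliberately not enabled, matching the description above.

        // Placeholder values (assumptions, not the real job's settings):
        // tolerate at most 3 failures per 5-minute interval,
        // and wait 10 seconds between restart attempts.
        env.setRestartStrategy(RestartStrategies.failureRateRestart(
                3,
                Time.of(5, TimeUnit.MINUTES),
                Time.of(10, TimeUnit.SECONDS)));

        // Trivial stand-in pipeline so the sketch is a complete program.
        env.fromElements(1, 2, 3).print();

        env.execute("failure-rate-restart-example");
    }
}
```

Building this requires the Flink streaming Java dependency on the classpath.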

When a task manager is lost because of memory problems, the job manager tries to restart the job without launching a new task manager, and the restart fails with NoResourceAvailableException: Not enough slots available to run the job.

The job is running on Flink 1.4.2 and Hadoop 2.7.4.