Flink Mesos Outstanding Offers - trouble launching task managers

Posted by prashantnayak on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-Mesos-Outstanding-Offers-trouble-launching-task-managers-tp14227.html


Hi

We’re running Flink 1.3.1 on Mesos.

From time to time, the Flink app master seems to have trouble with Mesos offers. When this happens, it does not launch the requested number of task managers (mesos.initial-tasks), and in some cases it launches none at all. During such times we see a long list of “Outstanding Offers” in the Mesos UI, and the app master logs show the following:


2017-07-12 18:06:23.939 [flink-akka.actor.default-dispatcher-20] INFO  org.apache.flink.mesos.scheduler.LaunchCoordinator  - Processing 12 task(s) against 0 new offer(s) plus outstanding offers.
2017-07-12 18:06:23.939 [flink-akka.actor.default-dispatcher-20] INFO  org.apache.flink.mesos.scheduler.LaunchCoordinator  - Resources considered: (note: expired offers not deducted from below)
2017-07-12 18:06:23.939 [flink-akka.actor.default-dispatcher-20] INFO  org.apache.flink.mesos.scheduler.LaunchCoordinator  -   10.80.xx.6 has 0.0 MB, 0.0 cpus
2017-07-12 18:06:23.939 [flink-akka.actor.default-dispatcher-20] INFO  org.apache.flink.mesos.scheduler.LaunchCoordinator  -   10.80.xx.233 has 0.0 MB, 0.0 cpus
2017-07-12 18:06:23.939 [flink-akka.actor.default-dispatcher-20] INFO  org.apache.flink.mesos.scheduler.LaunchCoordinator  - Waiting for more offers; 12 task(s) are not yet launched.

The two Mesos agents above (10.80.xx.6, 10.80.xx.233) are listed in the Mesos UI as having offers outstanding to the Flink framework.
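
For anyone trying to line up the log with the Mesos UI, here is a minimal sketch that pulls out the agents the LaunchCoordinator reports with zero memory or zero CPUs. It assumes the log line format shown above; the function name exhausted_hosts and the regular expression are illustrative and not part of Flink or Mesos.

```python
import re

# Matches LaunchCoordinator resource lines such as
# "... LaunchCoordinator  -   10.80.xx.6 has 0.0 MB, 0.0 cpus".
# The pattern is an assumption based only on the log excerpt in this post.
RESOURCE_LINE = re.compile(
    r"LaunchCoordinator\s+-\s+(?P<host>\S+) has (?P<mem>[\d.]+) MB, (?P<cpus>[\d.]+) cpus"
)


def exhausted_hosts(log_lines):
    """Return hosts reported with zero memory or zero CPUs (hypothetical helper)."""
    hosts = []
    for line in log_lines:
        match = RESOURCE_LINE.search(line)
        if match is None:
            continue  # not a per-host resource line
        mem = float(match.group("mem"))
        cpus = float(match.group("cpus"))
        if mem == 0.0 or cpus == 0.0:
            hosts.append(match.group("host"))
    return hosts


PREFIX = (
    "2017-07-12 18:06:23.939 [flink-akka.actor.default-dispatcher-20] INFO  "
    "org.apache.flink.mesos.scheduler.LaunchCoordinator  - "
)

# Sample lines copied from the excerpt above.
sample = [
    PREFIX + "Processing 12 task(s) against 0 new offer(s) plus outstanding offers.",
    PREFIX + "  10.80.xx.6 has 0.0 MB, 0.0 cpus",
    PREFIX + "  10.80.xx.233 has 0.0 MB, 0.0 cpus",
    PREFIX + "Waiting for more offers; 12 task(s) are not yet launched.",
]

print(exhausted_hosts(sample))
```

The resulting host list can then be compared by hand against the agents shown under “Outstanding Offers” in the Mesos UI.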

We would appreciate any input on how to go about resolving this issue.

Thanks
Prashant