Hi All,

I'm trying to run a Flink Docker image from Marathon with the Mesos app master. It goes into a continuous loop and fails to launch the task manager. In the Mesos master UI I can see the job manager web UI, but with zero task managers.

I have checked pretty much every log I could find, starting with docker.log on the Ubuntu machine and the Mesos master/slave logs; they show nothing beyond the failed task. The only relevant output is the Flink log below. However, I am able to run the same Docker image if I run the jobmanager and the taskmanager each on their own in Marathon and let them connect via the jobmanager RPC port.

For the Mesos config, I'm using the details below from the yml:

mesos.master: ${MESOS_MASTER}
mesos.failover-timeout: 60
mesos.initial-tasks: ${INITIAL_TASK_MANAGERS}
mesos.resourcemanager.tasks.mem: ${RESOURCEMANAGER_TASKS_MEM:-4096}
mesos.resourcemanager.tasks.cpus: ${RESOURCEMANAGER_TASKS_CPU:-1}
mesos.resourcemanager.tasks.container.type: docker
mesos.resourcemanager.tasks.container.image.name: ${IMAGE_NAME}

---------------------------
07-30 02:05:48,351 WARN org.apache.flink.mesos.scheduler.TaskMonitor - Mesos task taskmanager-00002 failed unexpectedly.
2017-07-30 02:05:48,352 INFO org.apache.flink.mesos.runtime.clusterframework.MesosFlinkResourceManager - Mesos task taskmanager-00002 failed, with a TaskManager in launch or registration. State: TASK_FAILED Reason: REASON_COMMAND_EXECUTOR_FAILED (Container exited with status 127)
-----------------------------------------------------

Please let me know if anyone has any pointers for debugging this further.

Thank you
~/Das (Biswajit)
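A note on the log above: by shell convention, exit status 127 usually means "command not found", so one thing worth checking is whether the command Mesos tries to launch actually exists inside the Docker image. This is only a hint, not a confirmed diagnosis for this thread. Below is a minimal Python sketch, assuming log lines in the format shown above, that pulls the state, reason, and exit status out of such a line. The names `parse_task_failure`, `FAILURE_RE`, and `EXIT_HINTS` are hypothetical helpers for illustration and are not part of Flink or Mesos.

```python
import re

# Conventional POSIX shell exit statuses. These are generic hints only;
# they are not defined by Mesos or Flink.
EXIT_HINTS = {
    126: "command found but not executable",
    127: "command not found",
}

# Matches the tail of the MesosFlinkResourceManager failure line quoted above.
FAILURE_RE = re.compile(
    r"State: (?P<state>\w+) Reason: (?P<reason>\w+) "
    r"\(Container exited with status (?P<status>\d+)\)"
)


def parse_task_failure(line):
    """Return state, reason, exit status and a hint, or None if no match."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    status = int(match.group("status"))
    return {
        "state": match.group("state"),
        "reason": match.group("reason"),
        "status": status,
        "hint": EXIT_HINTS.get(status, "no conventional meaning known"),
    }


# Sample input copied from the end of the log line in the post.
line = (
    "Mesos task taskmanager-00002 failed, with a TaskManager in launch or "
    "registration. State: TASK_FAILED Reason: REASON_COMMAND_EXECUTOR_FAILED "
    "(Container exited with status 127)"
)
info = parse_task_failure(line)
print(info)
```

If the hint applies, the next step would be to compare the task's launch command (visible in the Mesos task sandbox) against the paths present in the image.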
Hi there,

This is something of a blocker for my Mesos deployment right now, so I would really appreciate any input or suggestions. I posted this to the group a few days ago, and since then I have been exchanging emails with Eron; thanks to Eron for all the tips.

Now I see this basic auth error. I'm a little confused about how the job manager launched fine while the task manager fails to authenticate. Also, the Mesos docs say authenticate is false by default, so it should not have taken that path. Do I have to disable it somewhere inside Flink? I don't see any config or property for it in the code.

---------- Forwarded message ----------
From: Eron Wright <[hidden email]>
Date: Wed, Aug 2, 2017 at 10:51 AM

From: Biswajit Das <[hidden email]>
Sent: Wednesday, August 2, 2017 10:19:45 AM
To: Eron Wright
Subject: Re: Flink -mesos-app master hang

Hi Eron,
Good morning. I'm really sorry for flooding you with questions. I'll post this one to the user group as well. I could narrow down the actual error thrown by Mesos; it seems the JM is somehow not able to authenticate. I'm a little confused whether it is a Docker private registry TLS error or something else. I have even started the slave with --docker_config; previously I mostly used docker.tar.gz with the container for private repo authentication.

017-08-02 03:32:54,163 WARN org.apache.flink.mesos.
2017-08-02 03:32:54,163 INFO org.apache.flink.mesos.
2017-08-02 03:32:54,163 INFO org.apache.flink.mesos.
2017-08-02 03:32:54,163 INFO org.apache.flink.mesos.
2017-08-02 03:32:54,164 ERROR org.apache.flink.mesos.
2017-08-02 03:32:54,164 INFO org.apache.flink.mesos.
2017-08-02 03:32:54,164 INFO org.apache.flink.mesos.
2017-08-02 03:32:54,171 INFO org.apache.flink.mesos.
root@ip-172-31-4-44:/etc/me

On Tue, Aug 1, 2017 at 1:53 PM, Eron Wright <[hidden email]> wrote:

Thank you
~/Das
Hi Biswajit,

are there any Mesos logs which might help us pinpoint the problem?

I've actually never run Flink on Mesos with Docker images, but it could be that Flink does not set things up properly for running Docker images. I'll try to run Flink based on Docker images over the weekend to see whether I can reproduce the problem.

Cheers,
Till

On Wed, Aug 2, 2017 at 8:48 PM, Biswajit Das <[hidden email]> wrote:
Hi Till,

On Fri, Aug 4, 2017 at 3:17 AM, Till Rohrmann <[hidden email]> wrote:

Thank you
~/Das