Command exited with status 1 in running Flink on marathon

Command exited with status 1 in running Flink on marathon

Mar_zieh
I want to run my Flink program on a Mesos cluster via Marathon. I created an
application in Marathon with this JSON file:
 
{
    "id": "flink",
    "cmd": "/home/flink-1.7.0/bin/mesos-appmaster.sh -Djobmanager.heap.mb=1024 -Djobmanager.rpc.port=6123 -Drest.port=8081 -Dmesos.resourcemanager.tasks.mem=1024 -Dtaskmanager.heap.mb=1024 -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2 -Dmesos.resourcemanager.tasks.cpus=1",
    "cpus": 1.0,
    "mem": 1024
}

The task failed with this error:

I0303 09:41:52.841243  2594 exec.cpp:162] Version: 1.7.0
I0303 09:41:52.851898  2593 exec.cpp:236] Executor registered on agent d9a98175-b93c-4600-a41b-fe91fae5486a-S0
I0303 09:41:52.854436  2594 executor.cpp:182] Received SUBSCRIBED event
I0303 09:41:52.855284  2594 executor.cpp:186] Subscribed executor on 172.28.10.136
I0303 09:41:52.855479  2594 executor.cpp:182] Received LAUNCH event
I0303 09:41:52.855932  2594 executor.cpp:679] Starting task ffff.933fdd2f-3d98-11e9-bbc4-0242a78449af
I0303 09:41:52.868172  2594 executor.cpp:499] Running '/home/mesos-1.7.0/build/src/mesos-containerizer launch <POSSIBLY-SENSITIVE-DATA>'
I0303 09:41:52.872699  2594 executor.cpp:693] Forked command at 2599
I0303 09:41:54.050284  2596 executor.cpp:994] Command exited with status 1 (pid: 2599)
I0303 09:41:55.052323  2598 process.cpp:926] Stopped the socket accept loop

I configured ZooKeeper, Mesos, Marathon and Flink, and they all run in
Docker. I ran a simple command such as "echo "hello" >> /home/output.txt"
without any problems.

I really do not know what is going on, and I am confused. Could anyone
please tell me what is wrong here?

Any help would be appreciated.

Many thanks.




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: Command exited with status 1 in running Flink on marathon

Piotr Nowojski-3
Hi,

With just this information it might be difficult to help.

Please look for additional logs (has Flink managed to log anything?) or any standard output/error. I would guess this is a relatively simple configuration mistake, such as missing read/write/execute permissions on a file or directory.
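
As a rough sketch of the permissions check suggested above, something like the following could be run inside the container that executes the task. The helper name check_executable is made up for illustration; the path is the one from the Marathon "cmd" in the original post, and whether permissions are actually the cause here is only a guess.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path exists and is executable
# by the user running this script.
check_executable() {
    if [ ! -e "$1" ]; then
        echo "missing: $1"
    elif [ ! -x "$1" ]; then
        echo "not executable: $1"
    else
        echo "ok: $1"
    fi
}

# Path taken from the Marathon "cmd" in the original post.
check_executable /home/flink-1.7.0/bin/mesos-appmaster.sh
```

If the script is reported as fine, running the same command by hand in that container, as the same user the Mesos agent uses, should print the real error message directly to the terminal.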

I guess you have seen/followed this?
https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/mesos.html

Piotrek

Re: Command exited with status 1 in running Flink on marathon

Piotr Nowojski-3
Hi,

Flink per se doesn't require Hadoop to work. However, keep in mind that the checkpoint mechanism needs some kind of distributed/remote file system. If one node writes a checkpoint/savepoint file, that file must still be accessible from the other nodes after a restart or crash.
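
To illustrate the point about shared storage, here is a minimal sketch that prints extra -D options in the same style as the Marathon "cmd" from the original post. state.backend and state.checkpoints.dir are standard Flink configuration keys; the helper name checkpoint_opts and the mount path are assumptions, and the directory must be reachable at that URI from every node.

```shell
#!/bin/sh
# Hypothetical helper: print -D options that point Flink checkpoints at a
# directory given as $1. The URI must be readable and writable on every node.
checkpoint_opts() {
    printf '%s %s\n' "-Dstate.backend=filesystem" "-Dstate.checkpoints.dir=$1"
}

# Assumed shared mount (for example NFS); replace with the storage you have.
checkpoint_opts "file:///mnt/shared/flink/checkpoints"
```

The printed options could be appended to the mesos-appmaster.sh command line, or the same two keys could be set in flink-conf.yaml instead.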

Piotrek

On 5 Mar 2019, at 10:01, marzieh ghasemi <[hidden email]> wrote:

Thank you for your reply.

Yes, I followed this link.

But I did not install Hadoop. Is that the problem? The HDFS setting was commented out, and I did not change it.

Re: Command exited with status 1 in running Flink on marathon

Mar_zieh
Ok, thanks. 
