"Not enough free slots available to run the job" for word count example

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

"Not enough free slots available to run the job" for word count example

Daniel Blazevski
Hello,

I am new to Flink, I setup a Flink cluster on 4 m4.large Amazon EC2 instances, and set the following in link-conf.yaml:

jobmanager.heap.mb: 4000
taskmanager.heap.mb: 5000
taskmanager.numberOfTaskSlots: 2
parallelism.default: 8

In the 8081 dashboard, it shows 4 for Task Manager and 5 for Processing Slots ( I’m not sure if “5” is OK here?).

I then tried to execute:
./bin/flink run ./examples/flink-java-examples-0.9.1-WordCount.jar
and got the following error message:
Error: java.lang.IllegalStateException: Could not schedule consumer vertex CHAIN Reduce (SUM(1), at main(WordCount.java:72) -> FlatMap (collect()) (7/8)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:482)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:472)
at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not enough free slots available to run the job. You can decrease the operator parallelism or increase the number of slots per TaskManager in the configuration. Task to schedule: < Attempt #0 (CHAIN Reduce (SUM(1), at main(WordCount.java:72) -> FlatMap (collect()) (7/8)) @ (unassigned) - [SCHEDULED] > with groupID < 6adebf08c73e7f3adb6ea20f8950d627 > in sharing group < SlotSharingGroup [02cac542946daf808c406c2b18e252e0, d883aa4274b6cef49ab57aaf3078147c, 6adebf08c73e7f3adb6ea20f8950d627] >. Resources available to scheduler: Number of instances=4, total number of slots=5, available slots=0
at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:251)
at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:126)
at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:271)
at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:430)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:478)
... 9 more


More details about my setup: I am running Ubuntu on the master node and 3 data nodes.  If it matters, I already had hadoop 2.7.1 running and downloaded and installed the latest version of Flink, which is technically for hadoop 2.7.0.

Thanks,
Dan
Reply | Threaded
Open this post in threaded view
|

Re: "Not enough free slots available to run the job" for word count example

Daniel Blazevski
Hello,

I am not sure if I can give updates to an email I send to the user list before getting any response, but here is a quick update:

I tried to run using one processor:
./bin/flink run -p 1 ./examples/flink-java-examples-0.9.1-WordCount.jar

and that worked.  It seems to be an issue with configuring to the other workers.

I further realized that the reason there were only 5 processing slots on the Dashboard was that I only changed flnk-conf.yaml on the master node, though I changed that for all workers as well, and the Dashboard now shows 8 processing slots.

I stopped and re-started the cluster, and the example runs (even w/o the -p 1 setting)

Best,
Dan



On Sun, Sep 13, 2015 at 10:40 AM, Daniel Blazevski <[hidden email]> wrote:
Hello,

I am new to Flink, I setup a Flink cluster on 4 m4.large Amazon EC2 instances, and set the following in link-conf.yaml:

jobmanager.heap.mb: 4000
taskmanager.heap.mb: 5000
taskmanager.numberOfTaskSlots: 2
parallelism.default: 8

In the 8081 dashboard, it shows 4 for Task Manager and 5 for Processing Slots ( I’m not sure if “5” is OK here?).

I then tried to execute:
./bin/flink run ./examples/flink-java-examples-0.9.1-WordCount.jar
and got the following error message:
Error: java.lang.IllegalStateException: Could not schedule consumer vertex CHAIN Reduce (SUM(1), at main(WordCount.java:72) -> FlatMap (collect()) (7/8)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:482)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:472)
at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not enough free slots available to run the job. You can decrease the operator parallelism or increase the number of slots per TaskManager in the configuration. Task to schedule: < Attempt #0 (CHAIN Reduce (SUM(1), at main(WordCount.java:72) -> FlatMap (collect()) (7/8)) @ (unassigned) - [SCHEDULED] > with groupID < 6adebf08c73e7f3adb6ea20f8950d627 > in sharing group < SlotSharingGroup [02cac542946daf808c406c2b18e252e0, d883aa4274b6cef49ab57aaf3078147c, 6adebf08c73e7f3adb6ea20f8950d627] >. Resources available to scheduler: Number of instances=4, total number of slots=5, available slots=0
at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:251)
at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:126)
at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:271)
at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:430)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:478)
... 9 more


More details about my setup: I am running Ubuntu on the master node and 3 data nodes.  If it matters, I already had hadoop 2.7.1 running and downloaded and installed the latest version of Flink, which is technically for hadoop 2.7.0.

Thanks,
Dan

Reply | Threaded
Open this post in threaded view
|

Re: "Not enough free slots available to run the job" for word count example

Sachin Goel

Hi Daniel
Your problem did get solved, I assume.

As for the -p flag, it determines the default parallelism of operators at runtime. If you end up specifying a value more than the slots available, that's an issue. Hope that helped.

Cheers
Sachin

On Sep 13, 2015 9:13 PM, "Daniel Blazevski" <[hidden email]> wrote:
Hello,

I am not sure if I can give updates to an email I send to the user list before getting any response, but here is a quick update:

I tried to run using one processor:
./bin/flink run -p 1 ./examples/flink-java-examples-0.9.1-WordCount.jar

and that worked.  It seems to be an issue with configuring to the other workers.

I further realized that the reason there were only 5 processing slots on the Dashboard was that I only changed flnk-conf.yaml on the master node, though I changed that for all workers as well, and the Dashboard now shows 8 processing slots.

I stopped and re-started the cluster, and the example runs (even w/o the -p 1 setting)

Best,
Dan



On Sun, Sep 13, 2015 at 10:40 AM, Daniel Blazevski <[hidden email]> wrote:
Hello,

I am new to Flink, I setup a Flink cluster on 4 m4.large Amazon EC2 instances, and set the following in link-conf.yaml:

jobmanager.heap.mb: 4000
taskmanager.heap.mb: 5000
taskmanager.numberOfTaskSlots: 2
parallelism.default: 8

In the 8081 dashboard, it shows 4 for Task Manager and 5 for Processing Slots ( I’m not sure if “5” is OK here?).

I then tried to execute:
./bin/flink run ./examples/flink-java-examples-0.9.1-WordCount.jar
and got the following error message:
Error: java.lang.IllegalStateException: Could not schedule consumer vertex CHAIN Reduce (SUM(1), at main(WordCount.java:72) -> FlatMap (collect()) (7/8)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:482)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:472)
at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not enough free slots available to run the job. You can decrease the operator parallelism or increase the number of slots per TaskManager in the configuration. Task to schedule: < Attempt #0 (CHAIN Reduce (SUM(1), at main(WordCount.java:72) -> FlatMap (collect()) (7/8)) @ (unassigned) - [SCHEDULED] > with groupID < 6adebf08c73e7f3adb6ea20f8950d627 > in sharing group < SlotSharingGroup [02cac542946daf808c406c2b18e252e0, d883aa4274b6cef49ab57aaf3078147c, 6adebf08c73e7f3adb6ea20f8950d627] >. Resources available to scheduler: Number of instances=4, total number of slots=5, available slots=0
at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:251)
at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:126)
at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:271)
at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:430)
at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:478)
... 9 more


More details about my setup: I am running Ubuntu on the master node and 3 data nodes.  If it matters, I already had hadoop 2.7.1 running and downloaded and installed the latest version of Flink, which is technically for hadoop 2.7.0.

Thanks,
Dan