NoResourceAvailableException

Alexander Semeshchenko

I installed Apache Flink 1.11.1 (download & tar zxf) and ran:

./bin/flink run examples/streaming/WordCount.jar

After about 5 minutes of trying to submit the job, it shows this message:

Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441)
    ... 45 more
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)

This is with Flink's default configuration.

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    1

free -g

        total   used   free   shared   buff/cache   available
Mem:       62      1     23        3           37          57
Swap:       7      0      7

Is there any advice about what is happening?


Re: NoResourceAvailableException

r_khachatryan
I assume that before submitting a job you started a cluster with default settings with ./bin/start-cluster.sh.

Did you submit any other jobs?
Can you share the logs from log folder?

Regards,
Roman




Re: NoResourceAvailableException

r_khachatryan
Hi Alex,

This message isn't actually a problem: Netty can't find the native transports and falls back to the NIO-based one.
Does increasing taskmanager.numberOfTaskSlots in flink-conf.yaml help?
Can you share the full logs in DEBUG mode?

Regards,
Roman


On Mon, Oct 19, 2020 at 6:14 PM Alexander Semeshchenko <[hidden email]> wrote:
Thank you for your response.

The TaskManager has 1 slot and that slot is free, but the WordCount job never changes its status from "Created".
After about 5 minutes the job is canceled.
I attached a screenshot of the TaskManager.

Best Regards
Alexander

On Wed, Oct 14, 2020 at 6:13 PM Khachatryan Roman <[hidden email]> wrote:
Hi,
Thanks for sharing the details and sorry for the late reply.
You can check the number of free slots in the task manager in the web UI (http://localhost:8081/#/task-manager by default).
Before running the program, there should be 1 TM with 1 slot available which should be free (with default settings).

If there are other jobs, you can increase slots per TM by increasing taskmanager.numberOfTaskSlots in flink-conf.yaml [1].
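
The same slot check can also be done from a terminal instead of the web UI. Below is a minimal sketch, assuming the REST endpoint is on the default localhost:8081 and that the cluster's /overview response carries a "slots-available" field; the helper name free_slots is only illustrative and is not part of Flink.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the JSON of Flink's /overview REST endpoint on stdin
# and print the value of its "slots-available" field (prints nothing if absent).
free_slots() {
  grep -Eo '"slots-available"[[:space:]]*:[[:space:]]*[0-9]+' | grep -Eo '[0-9]+$'
}
```

Against a running cluster this would be used as: curl -s http://localhost:8081/overview | free_slots. If it prints 0 while no job is running, or prints nothing at all, the TaskManager is probably not registered with the JobManager, and raising taskmanager.numberOfTaskSlots alone is unlikely to help.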


Regards,
Roman


On Wed, Oct 14, 2020 at 6:56 PM Alexander Semeshchenko <[hidden email]> wrote:
Hi, is there any news about my issue from the "Flink - NoResourceAvailableException" post (WordCount job on a fresh install)?
Best

On Fri, Oct 9, 2020 at 10:19 AM Alexander Semeshchenko <[hidden email]> wrote:
Yes, I took the following actions:
-   download Flink
-   ./bin/start-cluster.sh
-   ./bin/flink run ./examples/streaming/WordCount.jar
------------------------------------------------
Then I tried increasing the ulimit and VM memory values...
I put the log messages below.

It's strange, as I could run the same job on my MacBook (8 CPUs, 16 GB RAM) and on a k8s cluster (4 CPUs, 8 GB RAM).

Thanks





Re: NoResourceAvailableException

r_khachatryan
Hi Alexander,

Thanks for sharing.

I see a lot of exceptions in the logs, particularly:
Caused by: java.net.BindException: Could not start actor system on any port in port range 6123

This means that there is probably more than one instance running, which is likely the root cause.
So it makes sense to make sure that the previous attempts were cleaned up before starting the cluster again.
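
One way to check for leftovers is to look at the JVMs listed by jps on each host. This is a rough sketch, assuming the standalone daemons show up under the main class names StandaloneSessionClusterEntrypoint (JobManager) and TaskManagerRunner (TaskManager); the helper name count_flink_daemons is only illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `jps` output on stdin and print how many lines
# name a Flink standalone daemon (JobManager or TaskManager).
count_flink_daemons() {
  grep -Ec 'StandaloneSessionClusterEntrypoint|TaskManagerRunner'
}
```

After ./bin/stop-cluster.sh, running jps | count_flink_daemons on the master and on every worker should print 0. If it does not, the remaining processes are still holding their ports and need to be stopped before ./bin/start-cluster.sh is run again.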

Regards,
Roman


On Tue, Oct 20, 2020 at 12:08 AM Alexander Semeshchenko <[hidden email]> wrote:
Hi Roman, 
I set up a cluster: 1 master, 2 workers. All have 8 CPUs and 32 GB RAM, running Red Hat Enterprise Linux Server release 7.9 (Maipo).
vsmart-f01 - master
vsmart-f02 - worker
vsmart-f03 - worker
taskmanager.numberOfTaskSlots for each node is: 8

Then:
[flink@vsmart-f01 flink-1.11.1]$ ./bin/start-cluster.sh 
Starting cluster.
[INFO] 1 instance(s) of standalonesession are already running on vsmart-f01.
Starting standalonesession daemon on host vsmart-f01.
[hidden email]'s password: 
[INFO] 1 instance(s) of taskexecutor are already running on vsmart-f02.
Starting taskexecutor daemon on host vsmart-f02.
[hidden email]'s password: 
Starting taskexecutor daemon on host vsmart-f03.

The cluster started up, and I ran WordCount from the master:
./bin/flink run -c org.apache.flink.examples.java.wordcount.WordCount  ./examples/batch/WordCount.jar  --output file:/tmp/wordcount_out

After 5 minutes the job was canceled.
The screenshot shows that no TaskManager was ever assigned to the job's operators.
I've put the 3 logs (one from each node) here.

Thanks and Best Regards.
Alex
 
