The TaskManager failed to determine its own network address

Arvid Heise
Hi Flinker,

I receive this error message quite often when starting the cluster. Is there any way to increase the likelihood of a successful detection?

Here is the log from one of the failing nodes; the failure can occur on any node:

19:38:57,317 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Execution mode: CLUSTER
19:38:57,317 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Reading location of job manager from configuration
19:38:57,318 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Connecting to JobManager at: msvr-01/192.168.10.21:6123
19:38:57,384 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/192.168.11.225': connect timed out
19:38:57,385 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/192.168.10.225': Connection refused
19:38:57,386 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/fe80:0:0:0:21b:21ff:feae:a38d%5': Network is unreachable
19:38:57,436 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/192.168.11.225': connect timed out
19:38:57,436 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/fe80:0:0:0:21b:21ff:feae:a38c%3': Network is unreachable
19:38:57,437 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/192.168.10.225': Connection refused
19:38:57,438 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/fe80:0:0:0:67d:7bff:fe8b:37ce%2': Network is unreachable
19:38:57,488 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/10.6.27.45': connect timed out
19:38:57,489 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/0:0:0:0:0:0:0:1%1': Network is unreachable
19:38:57,490 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/127.0.0.2': Invalid argument
19:38:57,490 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/127.0.0.1': Invalid argument
19:38:57,491 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/fe80:0:0:0:21b:21ff:feae:a38d%5': Network is unreachable
19:38:58,493 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/192.168.11.225': connect timed out
19:38:58,494 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/fe80:0:0:0:21b:21ff:feae:a38c%3': Network is unreachable
19:38:58,495 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/192.168.10.225': Connection refused
19:38:58,495 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/fe80:0:0:0:67d:7bff:fe8b:37ce%2': Network is unreachable
19:38:59,496 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/10.6.27.45': connect timed out
19:38:59,497 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/0:0:0:0:0:0:0:1%1': Network is unreachable
19:38:59,497 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/127.0.0.2': Invalid argument
19:38:59,498 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Failed to determine own IP address from '/127.0.0.1': Invalid argument
19:38:59,499 FATAL org.apache.flink.runtime.taskmanager.TaskManager              - Taskmanager startup failed: The TaskManager failed to determine its own network address.
java.lang.RuntimeException: The TaskManager failed to determine its own network address.
        at org.apache.flink.runtime.taskmanager.TaskManager.<init>(TaskManager.java:230)
        at org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.java:514)
Caused by: java.lang.RuntimeException: The TaskManager failed to detect its own IP address
        at org.apache.flink.runtime.taskmanager.TaskManager.getTaskManagerAddress(TaskManager.java:652)
        at org.apache.flink.runtime.taskmanager.TaskManager.<init>(TaskManager.java:227)
        ... 1 more

Re: The TaskManager failed to determine its own network address

Stephan Ewen
Hi!

It is a confusing error message. The problem is usually that the JobManager is unreachable from the TaskManager.

Can you verify that?
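One way to check this from the affected node is a plain TCP connect to the JobManager address shown in the log (msvr-01, port 6123). The following is only a minimal sketch: the class name, method name, and the 2000 ms timeout are made up for illustration, and this is not Flink code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class JobManagerReachability {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            System.err.println("Cannot reach " + host + ":" + port + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the log above; pass host and port to override.
        String host = args.length > 0 ? args[0] : "msvr-01";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6123;
        // The 2000 ms timeout is an arbitrary choice for this sketch.
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

Running it on a TaskManager node right after a failed startup should show whether the JobManager port was reachable at that moment.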

There is an issue open for that, but the fix has not been merged yet.

Stephan

Re: The TaskManager failed to determine its own network address

Arvid Heise
Then it must be some kind of temporary network issue - restarting the cluster often helps, but sometimes causes the same error message on another slave.

I think I also found the source fragment. Would it make sense to increase the retry count?


2014-08-12 20:26 GMT+02:00 Stephan Ewen <[hidden email]>:
Hi!

It is a confusing error message. The problem is usually that the JobManager is unreachable from the TaskManager.

Can you verify that?

There is an issue open for that, but the fix has not been merged yet.

Stephan


Re: The TaskManager failed to determine its own network address

Stephan Ewen
Yes, it might help, but it would also increase the time the system spends trying each address...
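To make the trade-off concrete, here is a rough sketch of a retry loop of the kind the log suggests: bind to each local address in turn and try to connect to the JobManager from it. This is an assumption about the general approach, not the actual TaskManager code; the class, the method names, and the retry and timeout parameters are all hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.Socket;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

public class AddressDetectionSketch {

    /** Upper bound on the time spent if every single attempt runs into its timeout. */
    static long worstCaseMillis(int retries, int addressCount, int timeoutMillis) {
        return (long) retries * addressCount * timeoutMillis;
    }

    /** Collects the addresses of all local network interfaces. */
    static List<InetAddress> localAddresses() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return result; // no interfaces reported on this platform
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            result.addAll(Collections.list(nic.getInetAddresses()));
        }
        return result;
    }

    /**
     * Returns the first local address from which the target could be reached,
     * or null if all retries over all addresses failed.
     */
    static InetAddress detect(InetSocketAddress target, int retries, int timeoutMillis)
            throws SocketException {
        List<InetAddress> candidates = localAddresses();
        for (int attempt = 0; attempt < retries; attempt++) {
            for (InetAddress local : candidates) {
                try (Socket socket = new Socket()) {
                    socket.bind(new InetSocketAddress(local, 0)); // use this local address
                    socket.connect(target, timeoutMillis);
                    return local;
                } catch (IOException e) {
                    // Refused, timed out, or unroutable: move on to the next address.
                }
            }
        }
        return null;
    }
}
```

With the nine distinct addresses in the log above and an assumed timeout of 1000 ms per attempt, `worstCaseMillis(3, 9, 1000)` gives 27000 ms; doubling the retry count doubles that bound. In practice, refused connections and unreachable networks fail quickly, so only the attempts that actually time out contribute the full timeout.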