Hi all,
I am trying to run a Flink job on YARN via "$/bin/flink run -m yarn-cluster -yn 2 ..." with two nodes. But only one JobManager seems to be connected. Flink hangs at this stage (the lookup message repeats every second):

2017-01-11 15:12:13,653 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
2017-01-11 15:12:13,678 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (1/2)
TaskManager status (1/2)
2017-01-11 15:12:13,929 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
2017-01-11 15:12:14,197 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
2017-01-11 15:12:14,451 DEBUG org.apache.hadoop.ipc.Client - IPC Client (20529812) connection to ____/10.68.17.206:8032 from user sending #104
2017-01-11 15:12:14,452 DEBUG org.apache.hadoop.ipc.Client - IPC Client (20529812) connection to ___:8032 from user got value #104
2017-01-11 15:12:14,452 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: getApplicationReport took 1ms
2017-01-11 15:12:14,462 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
2017-01-11 15:12:14,745 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
2017-01-11 15:12:15,014 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
2017-01-11 15:12:15,276 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
2017-01-11 15:12:15,322 DEBUG org.apache.hadoop.ipc.Client - IPC Client (20529812) connection to ___:8020 from user: closed
...

Any suggestions as to what could cause this? Standard MapReduce jobs work without any problem on YARN.

Best regards,
Malte
Hi Malte,

could it be that you are trying to request more resources from your YARN cluster than are currently available? It depends a little on your other settings. If the resources are available, then this indicates faulty behaviour. In that case, it would be great if you could share the aggregated YARN logs for the Flink application with us (available after terminating the YARN application). This would help with further debugging of the problem.

Cheers,

On Thu, Jan 12, 2017 at 4:13 PM, Malte Schwarzer <[hidden email]> wrote:
Hi all,
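
To make the "more resources than available" question concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of Flink or YARN, and all numbers are hypothetical. It assumes that the session needs one JobManager container plus one container per TaskManager, that YARN rounds each request up to a multiple of its minimum allocation (the usual behaviour with yarn.scheduler.minimum-allocation-mb), and that every container has to fit on a single node. It ignores vcores and queue limits, and the greedy placement is only an approximation of what the scheduler does.

```python
def round_up(mb, step_mb):
    """Round a memory request up to the next multiple of step_mb."""
    return -(-mb // step_mb) * step_mb


def containers_fit(task_managers, tm_mb, jm_mb, min_alloc_mb, node_free_mb):
    """Greedy (first-fit decreasing) check whether all containers can be placed.

    task_managers: number of TaskManager containers requested (e.g. 2 for -yn 2)
    tm_mb, jm_mb:  memory per TaskManager / JobManager container in MB (assumed values)
    min_alloc_mb:  assumed YARN minimum allocation in MB
    node_free_mb:  free memory per NodeManager in MB (assumed values)
    """
    # One JobManager container plus one container per TaskManager.
    requests = [round_up(jm_mb, min_alloc_mb)]
    requests += [round_up(tm_mb, min_alloc_mb)] * task_managers

    free = list(node_free_mb)
    # Place the largest containers first; each one must fit on a single node.
    for request in sorted(requests, reverse=True):
        for i, available in enumerate(free):
            if available >= request:
                free[i] = available - request
                break
        else:
            return False  # no node has enough free memory for this container
    return True


if __name__ == "__main__":
    # Hypothetical cluster: two nodes with 4096 MB free each.
    print(containers_fit(2, 4096, 1024, 1024, [4096, 4096]))
    print(containers_fit(2, 3072, 1024, 1024, [4096, 4096]))
```

With these made-up numbers, the first call reports that the containers do not all fit, because the JobManager container needs room next to the two TaskManager containers, while the second call with smaller TaskManagers fits. If a check like this passes with your real values (container sizes are typically set via options such as -yjm and -ytm) and the client still waits at 1/2, the aggregated logs are the next thing to look at.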