lost connection


lost connection

Radu Tudoran

Hi,

 

I am trying to submit a jar from the console (flink run my.jar), but I get an error saying that communication with the JobManager failed: Lost connection to the JobManager.

Can you give me some hints or recommendations on how to approach this issue?

 

Thanks

 

Dr. Radu Tudoran

Research Engineer - Big Data Expert

IT R&D Division

 

HUAWEI TECHNOLOGIES Duesseldorf GmbH

European Research Center

Riesstrasse 25, 80992 München

 

E-mail: [hidden email]

Mobile: +49 15209084330

Telephone: +49 891588344173

 

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Hansaallee 205, 40549 Düsseldorf, Germany,
www.huawei.com
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN

This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!

 


Re: lost connection

Chesnay Schepler
Hello,

The first step is always to check the logs under /log. The JobManager log in particular may contain clues as to why no connection could be established.
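One way to skim a long log for such clues is sketched below. It is only a rough helper, assuming each log line follows the "HH:MM:SS,mmm LEVEL logger - message" layout; the function names and the path passed to scan_log are hypothetical and must be adapted to the actual installation.

```python
import re

# Assumed line layout: "HH:MM:SS,mmm LEVEL logger - message".
LOG_LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2},\d{3})\s+(WARN|ERROR)\s+(\S+)\s+-\s+(.*)$"
)


def find_problems(lines):
    """Return (time, level, logger, message) for every WARN or ERROR line."""
    problems = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match:
            problems.append(match.groups())
    return problems


def scan_log(path):
    """Read one log file (path is supplied by the caller) and filter it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_problems(handle)
```

Stack trace lines do not match the pattern, so the output is only a starting point for where to read in the full log.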

Regards,
Chesnay

On 21.04.2016 15:44, Radu Tudoran wrote:

Hi,

 

I am trying to submit a jar from the console (flink run my.jar), but I get an error saying that communication with the JobManager failed: Lost connection to the JobManager.

Can you give me some hints or recommendations on how to approach this issue?

 

Thanks

 


 



RE: lost connection

Radu Tudoran

- Could not submit job Operator2 execution (170aef70d31f3fee62f8a483930be213), because there is no connection to a JobManager.

15:59:48,456 WARN  Remoting                                                      - Tried to associate with unreachable remote address [akka.tcp://flink@10.204.62.71:6123]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.204.62.71:6123

16:01:28,409 ERROR org.apache.flink.client.CliFrontend                           - Error while running the command.

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Lost connection to the JobManager.

 

I do not understand what could be the root cause of this… the IPs look OK and there is no firewall blocking things…

 


 

From: Chesnay Schepler [mailto:[hidden email]]
Sent: Thursday, April 21, 2016 3:58 PM
To: [hidden email]
Subject: Re: lost connection

 

Hello,

The first step is always to check the logs under /log. The JobManager log in particular may contain clues as to why no connection could be established.

Regards,
Chesnay


 

 


Re: lost connection

Chesnay Schepler
That is an excerpt from the client log; can you check the JobManager log? The JobManager could have crashed, and if so the cause is hopefully in there.

Did this issue occur suddenly, that is, have you run a job successfully on this system before? (That would rule out network configuration issues.)
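To help tell a stopped JobManager apart from a network problem, a quick TCP probe of the address from the client log can be useful. The following is a minimal sketch using only the Python standard library; the function name probe and the 3-second timeout are illustrative choices, not anything Flink provides.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        # create_connection performs the TCP handshake and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"


# Example call, using the address from the client log:
# probe("10.204.62.71", 6123)
```

A "refused" result usually means the host is reachable but no process is listening on that port, which would fit a JobManager that is not running. A "timeout" points more toward routing or packet filtering.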

Regards,
Chesnay

On 21.04.2016 16:09, Radu Tudoran wrote:

- Could not submit job Operator2 execution (170aef70d31f3fee62f8a483930be213), because there is no connection to a JobManager.

15:59:48,456 WARN  Remoting                                                      - Tried to associate with unreachable remote address [akka.tcp://flink@10.204.62.71:6123]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.204.62.71:6123

16:01:28,409 ERROR org.apache.flink.client.CliFrontend                           - Error while running the command.

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Lost connection to the JobManager.

 

I do not understand what could be the root cause of this… the IPs look OK and there is no firewall blocking things…

 


 

 



RE: lost connection

Radu Tudoran

Yes, it suddenly occurred on a setup that used to work. I am restarting the deployment to see if this solves the problem.

 


 

From: Chesnay Schepler [mailto:[hidden email]]
Sent: Thursday, April 21, 2016 4:26 PM
To: [hidden email]
Subject: Re: lost connection

 

That is an excerpt from the client log; can you check the JobManager log? The JobManager could have crashed, and if so the cause is hopefully in there.

Did this issue occur suddenly, that is, have you run a job successfully on this system before? (That would rule out network configuration issues.)

Regards,
Chesnay
