(DEPRECATED) Apache Flink User Mailing List archive.

Not terminating process on a cluster


Hilmi Yildirim

Not terminating process on a cluster

Hi,
I built a batch job that reads from HBase, processes the data, and
writes the result to a text file. When I run the job locally, it
works fine. When I run it on a cluster, it seems to work but it never
terminates. The logs contain the following messages:

09:04:39,194 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=10, retries=35, retryTime=109803ms, msg=row '5797669374912039332' on table 'table' at null
09:04:59,350 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=11, retries=35, retryTime=129959ms, msg=row '5797669374912039332' on table 'table' at null
09:05:19,361 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=12, retries=35, retryTime=149970ms, msg=row '5797669374912039332' on table 'table' at null
09:05:39,392 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=13, retries=35, retryTime=170001ms, msg=row '5797669374912039332' on table 'table' at null
09:05:59,465 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=14, retries=35, retryTime=190074ms, msg=row '5797669374912039332' on table 'table' at null
09:06:19,554 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=15, retries=35, retryTime=210163ms, msg=row '5797669374912039332' on table 'table' at null

Does anyone know the reason for that?

Best Regards,

--
--
Hilmi Yildirim
Software Developer R&D

http://www.neofonie.de

Visit the Neo Tech Blog for users:
http://blog.neofonie.de/

Follow us:
https://plus.google.com/+neofonie
http://www.linkedin.com/company/neofonie-gmbh
https://www.xing.com/companies/neofoniegmbh

Neofonie GmbH | Robert-Koch-Platz 4 | 10115 Berlin
Commercial Register Berlin-Charlottenburg: HRB 67460
Managing Director: Thomas Kitlitschko

Stephan Ewen

Re: Not terminating process on a cluster

This looks like an HBase-specific thing.

At what point does this log message appear? After the data source task has finished? During processing?

In reply to Hilmi Yildirim's message of Wed, May 27, 2015 at 9:11 AM.

Hilmi Yildirim

Re: Not terminating process on a cluster

It happens during the reading process.

In reply to Stephan Ewen's message of 27.05.2015, 10:12.

Hilmi Yildirim

Fwd: Re: Not terminating process on a cluster

In my job, I modified the TableInputFormat so that it reads only the first 100 records of the HBase table. The errors occurred with this modification. I have now imported the first 100 entries of the HBase table into another table and configured the job to read that whole table. As a result, it works now.

-------- Forwarded Message --------

Subject: Re: Not terminating process on a cluster
Date: Wed, 27 May 2015 10:40:13 +0200
From: Hilmi Yildirim [hidden email]
To: [hidden email]

It happens during the reading process.


Stephan Ewen

Re: Re: Not terminating process on a cluster

Okay. Could there have been a bug in that logic, such as trying to address a non-existent row?

In reply to Hilmi Yildirim's message of Wed, May 27, 2015 at 2:25 PM.

Hilmi Yildirim

Re: Not terminating process on a cluster

I'm not sure. I have overridden the reachedEnd() method of the TableInputFormat so that it returns true once the nextRecord method has been called 100 times.
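
The counting logic described above can be sketched roughly as follows. This is only a minimal, self-contained illustration and not the actual job code: RecordSource, IteratorSource, LimitedSource and readAll are hypothetical stand-ins, the table scan is replaced by an in-memory iterator, and nextRecord() takes no arguments here, whereas Flink's InputFormat.nextRecord takes a reuse object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LimitedSourceSketch {

    /** Hypothetical stand-in for the two input format methods named in the thread. */
    interface RecordSource<T> {
        boolean reachedEnd();
        T nextRecord();
    }

    /** Stand-in for the underlying table scan, backed by an in-memory iterator. */
    static class IteratorSource<T> implements RecordSource<T> {
        private final Iterator<T> rows;

        IteratorSource(Iterator<T> rows) {
            this.rows = rows;
        }

        @Override
        public boolean reachedEnd() {
            return !rows.hasNext();
        }

        @Override
        public T nextRecord() {
            return rows.next();
        }
    }

    /** Reports end of input after `limit` records, or earlier if the source runs out. */
    static class LimitedSource<T> implements RecordSource<T> {
        private final RecordSource<T> inner;
        private final int limit;
        private int emitted = 0;

        LimitedSource(RecordSource<T> inner, int limit) {
            this.inner = inner;
            this.limit = limit;
        }

        @Override
        public boolean reachedEnd() {
            // Check the counter first so the inner source is not asked for
            // another row once the limit has been reached.
            return emitted >= limit || inner.reachedEnd();
        }

        @Override
        public T nextRecord() {
            T record = inner.nextRecord();
            emitted++;
            return record;
        }
    }

    /** Drains a source the way a reader loop would: poll reachedEnd(), then fetch. */
    static <T> List<T> readAll(RecordSource<T> source) {
        List<T> out = new ArrayList<>();
        while (!source.reachedEnd()) {
            out.add(source.nextRecord());
        }
        return out;
    }

    public static void main(String[] args) {
        RecordSource<String> rows =
                new IteratorSource<>(Arrays.asList("r1", "r2", "r3", "r4").iterator());
        System.out.println(readAll(new LimitedSource<>(rows, 2))); // prints [r1, r2]
    }
}
```

In this sketch the counter lives in the reader object, so with several parallel readers the limit would presumably apply to each reader separately rather than to the table as a whole.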

In reply to Stephan Ewen's message of 27.05.2015, 15:57.