Not terminating process on a cluster

Not terminating process on a cluster

Hilmi Yildirim
Hi,
I built a batch process that reads from HBase, processes the data, and
writes the result to a text file. When I run the process locally, it
works fine. When I run it on a cluster, it seems to work but does not
terminate. The logs contain the following messages:

09:04:39,194 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=10, retries=35, retryTime=109803ms, msg=row '5797669374912039332' on table 'table' at null
09:04:59,350 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=11, retries=35, retryTime=129959ms, msg=row '5797669374912039332' on table 'table' at null
09:05:19,361 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=12, retries=35, retryTime=149970ms, msg=row '5797669374912039332' on table 'table' at null
09:05:39,392 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=13, retries=35, retryTime=170001ms, msg=row '5797669374912039332' on table 'table' at null
09:05:59,465 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=14, retries=35, retryTime=190074ms, msg=row '5797669374912039332' on table 'table' at null
09:06:19,554 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=15, retries=35, retryTime=210163ms, msg=row '5797669374912039332' on table 'table' at null


Does anyone know the reason for that?

Best Regards,

--
--
Hilmi Yildirim
Software Developer R&D


http://www.neofonie.de

Visit the Neo Tech Blog for users:
http://blog.neofonie.de/

Follow us:
https://plus.google.com/+neofonie
http://www.linkedin.com/company/neofonie-gmbh
https://www.xing.com/companies/neofoniegmbh

Neofonie GmbH | Robert-Koch-Platz 4 | 10115 Berlin
Commercial Register Berlin-Charlottenburg: HRB 67460
Managing Director: Thomas Kitlitschko


Re: Not terminating process on a cluster

Stephan Ewen
This looks like an HBase-specific thing.

At what point does this log come? After the data source task finished? During processing?


Re: Not terminating process on a cluster

Hilmi Yildirim
It happens during the reading process.


Fwd: Re: Not terminating process on a cluster

Hilmi Yildirim
In my job I modified the TableInputFormat so that it reads only the first 100 records of the HBase table. The errors occurred with this modification. I have now imported the first 100 entries of the HBase table into another table and configured the job to read that whole table. With that setup, it works.


Re: Re: Not terminating process on a cluster

Stephan Ewen
Okay. Could there have been a bug in that logic, such as trying to address a non-existent row?


Re: Not terminating process on a cluster

Hilmi Yildirim
I'm not sure. I have overridden the reachedEnd() method of the TableInputFormat so that it returns true once the nextRecord method has been called 100 times.
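
A minimal sketch of that kind of record limit, for illustration only. It does not use Flink's or HBase's classes: RecordSource, ListSource, LimitedSource and LimitedSourceDemo are hypothetical stand-ins that only borrow the method names reachedEnd and nextRecord from the description above. The sketch combines the counter with the underlying source's own end check and always closes the source afterwards. Whether a missing end check or an unclosed scanner caused the retries in this thread is not established here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an input format; this is NOT Flink's actual API.
interface RecordSource<T> {
    boolean reachedEnd();
    T nextRecord();
    void close();
}

// In-memory source used here in place of an HBase table scan (assumption).
final class ListSource<T> implements RecordSource<T> {
    private final Iterator<T> it;
    boolean closed = false;

    ListSource(List<T> rows) { this.it = rows.iterator(); }

    @Override public boolean reachedEnd() { return !it.hasNext(); }
    @Override public T nextRecord() { return it.next(); }
    @Override public void close() { closed = true; }
}

// Reports the end after `limit` records or when the wrapped source is exhausted.
final class LimitedSource<T> implements RecordSource<T> {
    private final RecordSource<T> delegate;
    private final int limit;
    private int emitted = 0;

    LimitedSource(RecordSource<T> delegate, int limit) {
        this.delegate = delegate;
        this.limit = limit;
    }

    @Override public boolean reachedEnd() {
        // Check the counter AND the delegate, so no row past the end is requested.
        return emitted >= limit || delegate.reachedEnd();
    }

    @Override public T nextRecord() {
        if (reachedEnd()) {
            throw new NoSuchElementException("limit reached or source exhausted");
        }
        emitted++;
        return delegate.nextRecord();
    }

    @Override public void close() { delegate.close(); }
}

final class LimitedSourceDemo {
    // Drains a source and closes it even if reading fails.
    static <T> List<T> readAll(RecordSource<T> source) {
        List<T> out = new ArrayList<>();
        try {
            while (!source.reachedEnd()) {
                out.add(source.nextRecord());
            }
        } finally {
            source.close();
        }
        return out;
    }

    public static void main(String[] args) {
        RecordSource<Integer> rows = new LimitedSource<Integer>(
                new ListSource<Integer>(Arrays.asList(1, 2, 3, 4, 5)), 3);
        System.out.println(readAll(rows)); // prints [1, 2, 3]
    }
}
```

With a limit of 100, the same wrapper would stop after 100 records or earlier if the table holds fewer rows.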
