thanks a lot for the support. Stack trace appended.Any idea of where the issue could be?I log the execution time of every single map, and everything finishes within milliseconds. Even then the exception happens. (as I catch it, print, and throw it again).I think it may be something about GC.My problem is that it may fail or it may not. Yesterday I could complete the whole scan without problems, the the job failed over another error. Today, the same code failed after 3.5h, a little before completion of the first phase.The caching block is set to 100, and the scan timeout is 900.000 milliseconds (15min). The job would run normally in around 0.5 seconds on a 100 entries... therefore I must be hitting something deep. Something related on how Hadoop and Hbase work together.The problem happens randomly, and I don't think it is related to the business logic, or the scan configurations for what matters.The code runs like a charm on a dataset of a million with few duplicates (3 minutes), but hits the scanner timeout over a dataset of 9.2M.hi all,I am facing an odd issue while running a quite complex duplicates detection process.
saluti,
Stefano
Error: org.apache.hadoop.hbase.client.ScannerTimeoutException: 2387347ms passed since the last invocation, timeout is currently set to 900000
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:352)
at org.apache.flink.addons.hbase.TableInputFormat.nextRecord(TableInputFormat.java:106)
at org.apache.flink.addons.hbase.TableInputFormat.nextRecord(TableInputFormat.java:48)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:195)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:246)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 291, already closed?
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3043)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:283)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:198)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:57)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:336)
... 5 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException): org.apache.hadoop.hbase.UnknownScannerException: Name: 291, already closed?
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3043)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1458)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1662)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1720)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29900)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:168)
Free forum by Nabble | Edit this page |