Hi,

I am trying to get the HBaseReadExample to run. I have filled a table with the HBaseWriteExample and purposely split it over 3 regions. Now when I try to read from it, the first split seems to be scanned fine (170 rows), but after that the ZooKeeper and RPC connections are suddenly closed. Does anyone have an idea why this is happening?

Best regards,
Lydia

22:28:10,178 DEBUG org.apache.flink.runtime.operators.DataSourceTask - Opening input split Locatable Split (2) at [grips5:60020]: DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.HBaseReadExample$1)) (1/1)
22:28:10,178 INFO org.apache.flink.addons.hbase.TableInputFormat - opening split [2|[grips5:60020]|aaaaaaaa|-]
22:28:10,189 DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x24ff6a96ecd000a, packet:: clientPath:null serverPath:null finished:false header:: 3,4 replyHeader:: 3,51539607639,0 request:: '/hbase/meta-region-server,F response:: #ffffffff0001a726567696f6e7365727665723a363030$
22:28:10,202 DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x24ff6a96ecd000a, packet:: clientPath:null serverPath:null finished:false header:: 4,4 replyHeader:: 4,51539607639,0 request:: '/hbase/meta-region-server,F response:: #ffffffff0001a726567696f6e7365727665723a363030$
22:28:10,211 DEBUG LocalActorRefProvider(akka://flink) - resolve of path sequence [/temp/$b] failed
22:28:10,233 DEBUG org.apache.hadoop.hbase.util.ByteStringer - Failed to classload HBaseZeroCopyByteString: java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString
22:28:10,358 DEBUG org.apache.hadoop.ipc.RpcClient - Use SIMPLE authentication for service ClientService, sasl=false
22:28:10,370 DEBUG org.apache.hadoop.ipc.RpcClient - Connecting to grips1/130.73.20.14:60020
22:28:10,380 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips1/130.73.20.14:60020 from hduser: starting, connections 1
22:28:10,394 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips1/130.73.20.14:60020 from hduser: got response header call_id: 0, totalSize: 469 bytes
22:28:10,397 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips1/130.73.20.14:60020 from hduser: wrote request header call_id: 0 method_name: "Get" request_param: true
22:28:10,413 DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x24ff6a96ecd000a, packet:: clientPath:null serverPath:null finished:false header:: 5,4 replyHeader:: 5,51539607639,0 request:: '/hbase/meta-region-server,F response:: #ffffffff0001a726567696f6e7365727665723a363030$
22:28:10,424 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips1/130.73.20.14:60020 from hduser: wrote request header call_id: 1 method_name: "Scan" request_param: true priority: 100
22:28:10,426 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips1/130.73.20.14:60020 from hduser: got response header call_id: 1 cell_block_meta { length: 480 }, totalSize: 497 bytes
22:28:10,432 DEBUG org.apache.hadoop.hbase.client.ClientSmallScanner - Finished with small scan at {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
22:28:10,434 DEBUG org.apache.hadoop.ipc.RpcClient - Use SIMPLE authentication for service ClientService, sasl=false
22:28:10,434 DEBUG org.apache.hadoop.ipc.RpcClient - Connecting to grips5/130.73.20.16:60020
22:28:10,435 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: wrote request header call_id: 2 method_name: "Scan" request_param: true
22:28:10,436 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: starting, connections 2
22:28:10,437 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: got response header call_id: 2, totalSize: 12 bytes
22:28:10,438 DEBUG org.apache.flink.runtime.operators.DataSourceTask - Starting to read input from split Locatable Split (2) at [grips5:60020]: DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.HBaseReadExample$1)) (1/1)
22:28:10,438 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: wrote request header call_id: 3 method_name: "Scan" request_param: true
22:28:10,457 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: got response header call_id: 3 cell_block_meta { length: 4679 }, totalSize: 4899 bytes
22:28:10,476 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: wrote request header call_id: 4 method_name: "Scan" request_param: true
22:28:10,480 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: got response header call_id: 4 cell_block_meta { length: 3306 }, totalSize: 3466 bytes
22:28:10,482 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: got response header call_id: 5, totalSize: 8 bytes
22:28:10,482 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: wrote request header call_id: 5 method_name: "Scan" request_param: true
22:28:10,487 DEBUG org.apache.flink.runtime.operators.DataSourceTask - Closing input split Locatable Split (2) at [grips5:60020]: DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.HBaseReadExample$1)) (1/1)
22:28:10,489 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Closing zookeeper sessionid=0x24ff6a96ecd000a
22:28:10,489 DEBUG org.apache.zookeeper.ZooKeeper - Closing session: 0x24ff6a96ecd000a
22:28:10,489 DEBUG org.apache.zookeeper.ClientCnxn - Closing client for session: 0x24ff6a96ecd000a
22:28:10,499 DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x24ff6a96ecd000a, packet:: clientPath:null serverPath:null finished:false header:: 6,-11 replyHeader:: 6,51539607640,0 request:: null response:: null
22:28:10,499 DEBUG org.apache.zookeeper.ClientCnxn - Disconnecting client for session: 0x24ff6a96ecd000a
22:28:10,499 INFO org.apache.zookeeper.ClientCnxn - EventThread shut down
22:28:10,499 INFO org.apache.zookeeper.ZooKeeper - Session: 0x24ff6a96ecd000a closed
22:28:10,499 DEBUG org.apache.hadoop.ipc.RpcClient - Stopping rpc client
22:28:10,501 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips1/130.73.20.14:60020 from hduser: closed
22:28:10,502 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips1/130.73.20.14:60020 from hduser: stopped, connections 0
22:28:10,502 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: closed
22:28:10,502 DEBUG org.apache.hadoop.ipc.RpcClient - IPC Client (2145423150) connection to grips5/130.73.20.16:60020 from hduser: stopped, connections 0
22:28:10,502 INFO org.apache.flink.addons.hbase.TableInputFormat - Closing split (scanned 170 rows)
22:28:10,508 DEBUG org.apache.flink.runtime.operators.DataSourceTask - Opening input split Locatable Split (1) at [grips4:60020]: DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.HBaseReadExample$1)) (1/1)
22:28:10,509 INFO org.apache.flink.addons.hbase.TableInputFormat - opening split [1|[grips4:60020]|55555555|aaaaaaaa]
22:28:11,380 DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Received message SendHeartbeat at akka://flink/user/taskmanager from Actor[akka://flink/deadLetters].
22:28:11,380 DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Sending heartbeat to JobManager
It might be that this is causing the problem: https://issues.apache.org/jira/browse/HBASE-10304
In your log I see the same exception. Does anyone have an idea what we could do about this?

On Tue, 22 Sep 2015 at 22:40 Lydia Ickler <[hidden email]> wrote:
In the issue, it states that it should be sufficient to append the hbase-protocol.jar file to the Hadoop classpath. Flink respects the Hadoop classpath and will append it to its own classpath upon launching a cluster. To do that, you need to modify the classpath with one of the commands below. Note that this has to be performed on all cluster nodes.

    export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:/path/to/hbase-protocol.jar"
    export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:$(hbase mapredcp)"
    export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:$(hbase classpath)"

Alternatively, you can build a fat jar from your project with the missing dependency. Flink will then automatically distribute the jar file upon job submission. Just add this Maven dependency to your fat-jar pom:

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-protocol</artifactId>
      <version>1.1.2</version>
    </dependency>

Let me know if either of the two approaches works for you. After all, this is a workaround made necessary by an HBase optimization.

Cheers,
Max

On Wed, Sep 23, 2015 at 11:16 AM, Aljoscha Krettek <[hidden email]> wrote:
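(For reference, a minimal sketch of the first variant, including a sanity check that the jar really landed on the classpath. The jar path is a placeholder from the command above; substitute the actual location on your nodes, e.g. the output of `hbase mapredcp`.)

```shell
# Placeholder path -- adjust to the real hbase-protocol.jar location on each node.
HBASE_PROTOCOL_JAR="/path/to/hbase-protocol.jar"
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HBASE_PROTOCOL_JAR}"

# Sanity check: confirm the jar is now one of the colon-separated classpath entries.
case ":${HADOOP_CLASSPATH}:" in
  *":${HBASE_PROTOCOL_JAR}:"*) echo "hbase-protocol.jar is on HADOOP_CLASSPATH" ;;
  *)                           echo "hbase-protocol.jar is MISSING" ;;
esac
```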
Hi,

I tried that, but unfortunately it still gets stuck at the second split. Could it be that I have set something wrong in my configuration? In Hadoop? Or Flink? The strange thing is that the HBaseWriteExample works great!

Best regards,
Lydia
In reply to this post by Maximilian Michels
I am really trying to get HBase to work... Is there maybe a tutorial for all the config files?

Best regards,
Lydia
I'm really sorry that you are facing this issue. I saw your message on the HBase user mailing list [1]. Maybe you can follow up with Ted there so that he can help you. There are only a few Flink users on this mailing list using it with HBase. I actually think the problem is more on the HBase side than on the Flink side, so the HBase list can probably help you better. (I'm not saying we should stop helping you here, I just think the chances on the HBase list are much higher ;) )

On Thu, Sep 24, 2015 at 12:49 PM, Lydia Ickler <[hidden email]> wrote:
I'm actually the last developer that touched the HBase connector, but I never faced those problems with the version specified in the extension pom.

From what I can tell from your logs, there seems to be a classpath problem (Failed to classload HBaseZeroCopyByteString: java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString). If I were you, I'd check whether there are any jar conflicts resolved badly by Maven and whether all the required jars are on the classpath. The only unusual thing that extension does is manage client timeouts that would otherwise kill HBase's scanner resources: Flink jobs fetch data lazily, and HBase can close the client connection if two consecutive calls to scanner.next() take too long.

Best,
Flavio

On Thu, Sep 24, 2015 at 2:07 PM, Robert Metzger <[hidden email]> wrote:
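(One way to check the classpath-conflict hypothesis is to print which jar each of the two protobuf classes from the error message is actually loaded from. The sketch below is illustrative, not part of the connector; run it with the same classpath as your Flink job. If the two classes resolve to jars with incompatible protobuf copies, the IllegalAccessError follows.)

```java
// ClasspathCheck.java -- diagnostic sketch: report the code source of a class.
public class ClasspathCheck {

    // Returns the jar/directory a class was loaded from, or a short note if
    // it cannot be loaded (ClassNotFoundException, IllegalAccessError, ...).
    static String sourceOf(String className) {
        try {
            Class<?> c = Class.forName(className);
            java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
            return (src == null || src.getLocation() == null)
                    ? "bootstrap/unknown"
                    : src.getLocation().toString();
        } catch (Throwable t) {
            return "not loadable: " + t;
        }
    }

    public static void main(String[] args) {
        // The two classes from the error above; HBaseZeroCopyByteString must end
        // up in the same com.google.protobuf package as its superclass.
        System.out.println(sourceOf("com.google.protobuf.LiteralByteString"));
        System.out.println(sourceOf("com.google.protobuf.HBaseZeroCopyByteString"));
    }
}
```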