HBase write problem
Hi all. I have a problem writing to HBase. I am using a slightly modified version of this example class as a proof of concept: https://github.com/apache/flink/blob/master/flink-batch-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteExample.java
All the HBase-specific parts are exactly the same as in HBaseWriteExample.

The problem I see is that the job never completes (it has been running for more than an hour now), and there are only 13 key/value pairs to be written to HBase :-)

I have verified that the map/reduce logic works: if I replace the HBase connection with a plain write to a text file, the job runs fine. I have also verified that I can insert data into HBase from a similar Hadoop MapReduce job.

Here is the part of the code where I guess the problem is:

    @Override
    public Tuple2<Text, Mutation> map(Tuple2<String, Integer> t) throws Exception {
        LOG.info("Tuple2 map() called");
        reuse.f0 = new Text(t.f0);
        Put put = new Put(t.f0.getBytes());
        put.add(MasterConstants.CF_SOME, MasterConstants.COUNT, Bytes.toBytes(t.f1));
        reuse.f1 = put;
        return reuse;
    }
    }).output(new HadoopOutputFormat<Text, Mutation>(new TableOutputFormat<Text>(), job));

    env.execute("Flink HBase Event Count Hello World Test");

This code should match the code in HBaseWriteExample.java. The "Tuple2 map() called" log line appears exactly the 13 times I expect, and the last log line I see is this:

2016-05-10 21:48:42,715 INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for event_type_count

Any suggestions as to what the problem could be?

Thanks,
Palle
Do you have the hbase-site.xml available in the classpath?
Thanks for the response, but I don't think the problem is the classpath; hbase-site.xml should be on it. This is what the classpath looks like (the HBase conf is added at the end):
2016-05-11 09:16:45,831 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=C:\systems\packages\flink-1.0.2\lib\flink-dist_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\flink-python_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\guava-11.0.2.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-client-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-examples-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-external-blockcache-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop2-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-prefix-tree-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-procedure-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-protocol-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-resource-bundle-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-rest-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-shell-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-thrift-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\log4j-1.2.17.jar;C:\systems\packages\flink-1.0.2\lib\slf4j-log4j12-1.7.7.jar;C:\systems\master_flink\bin;C:\systems\packages\flink-1.0.2\lib;;C:\systems\packages\hbase-1.2.1\lib;C:\systems\hbase\conf;C:\systems\hbase\conf\hbase-site.xml;
2016-05-11 09:16:45,831 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:\systems\master_flink\bin;C:\systems\packages\appsync-1.0.6\bin;C:\systems\packages\flink-1.0.2\bin;C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Microsoft SQL Server\120\DTS\Binn\;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\ManagementStudio\;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\;C:\systems\packages\hadoop-2.7.2\bin;C:\systems\packages\jdk-1.8.0_74_x64\bin;C:\systems\packages\apache-maven-3.3.9\bin;C:\systems\packages\protoc-2.5.0-win32\;C:\systems\packages\cygwin64\bin\;C:\systems\packages\cmake-3.5.2-win32-x86\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\systems\packages\perl-5.6.0-win\bin;C:\systems\hbase\conf;.
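For reference, a minimal client-side hbase-site.xml usually only needs to point at the ZooKeeper ensemble; everything else can fall back to defaults. The host names below are placeholders, not values from this thread:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Placeholder hosts: replace with your actual ZooKeeper quorum -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk-host1,zk-host2,zk-host3</value>
  </property>
  <!-- Default ZooKeeper client port; only needed if yours differs -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```

If the client cannot reach ZooKeeper (wrong quorum, or the file is missing at runtime), HBase writes typically hang retrying rather than failing fast, which matches the "job never completes" symptom.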
Do you run the job from your IDE or from the cluster?
I run the job from the cluster. I run it through the web UI.
The jar file submitted does not contain the hbase-site.xml file.
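One way to get hbase-site.xml into the submitted jar (assuming a Maven build, which the thread does not state) is to declare the cluster's conf directory as an extra resource directory, so the file ends up at the root of the jar where HBaseConfiguration picks it up from the classpath. The directory path below is taken from the classpath shown earlier in this thread; adjust as needed:

```xml
<!-- Hypothetical pom.xml fragment: copy hbase-site.xml into the jar root -->
<build>
  <resources>
    <!-- Keep the default resources directory -->
    <resource>
      <directory>src/main/resources</directory>
    </resource>
    <!-- Additionally pull in the cluster's HBase client config -->
    <resource>
      <directory>C:/systems/hbase/conf</directory>
      <includes>
        <include>hbase-site.xml</include>
      </includes>
    </resource>
  </resources>
</build>
```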
And which version of HBase and Hadoop are you running?
Did you try to put the hbase-site.xml in the jar? Moreover, I don't know how reliable the web client UI is at the moment; my experience is that the command-line client is much more reliable. You just need to run something like this from the Flink directory:

    bin/flink run -c xxx.yyy.MyMainClass /path/to/shadedJar.jar
Hadoop 2.7.2
HBase 1.2.1

I have this working from a Hadoop job, just not from Flink. I will look into your suggestions, but would I be better off choosing another DB for storage? I can see that Cassandra gets some attention on this mailing list. I need to store approximately 2 billion key/value pairs of about 100 bytes each.
I can't help you with the choice of DB storage; as always, the answer is "it depends" on a lot of factors :) What I can tell you is that the problem could be that Flink supports HBase 0.98, so it could be worth updating the Flink connectors to a more recent version (which should hopefully be backward compatible), or maybe creating two separate HBase connectors (one for HBase 0.9x and one for 1.x). Let me know about your attempts :)
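Since the eventual fix later in this thread was to match the HBase client version to what the Flink connector was built against, one way to pin that in a Maven build would be along these lines. The coordinates and version numbers are illustrative, not taken from this thread; check them against the actual flink-hbase POM for your Flink release:

```xml
<!-- Illustrative dependency sketch: align hbase-client with the connector -->
<dependencies>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-hbase_2.11</artifactId>
    <version>1.0.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <!-- A 0.98-line version, matching what flink-hbase was built against -->
    <version>0.98.11-hadoop2</version>
  </dependency>
</dependencies>
```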
Just to narrow down the problem: the insertion into HBase actually works, but the job does not finish after that? And the same job (same source of data) finishes when it writes to a file or prints? If that is the case, can you check via the web dashboard what status each task is in? Are all tasks still in "running"?
Hi guys.
Thanks for helping out. We downgraded to HBase 0.98 and resolved some classpath issues, and then it worked.

/Palle
Great :)