HBase write problem

HBase write problem

palle

Hi all.

I have a problem writing to HBase.

As a proof of concept, I am using a slightly modified version of this example class:
https://github.com/apache/flink/blob/master/flink-batch-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteExample.java

All the HBase-specific code is exactly the same as in the HBaseWriteExample.

The problem is that the job never completes (it has been running for more than an hour now), even though only 13 key/value pairs are to be written to HBase :-)
I have verified that the map/reduce logic works if I replace the HBase output with a plain write to a text file. I have also verified that I can insert data into HBase from a similar Hadoop MapReduce job.

Here is the part of the code where I guess the problem is:

      @Override
      public Tuple2<Text, Mutation> map(Tuple2<String, Integer> t) throws Exception {
        LOG.info("Tuple2 map() called");
        reuse.f0 = new Text(t.f0);
        Put put = new Put(t.f0.getBytes());
        put.add(MasterConstants.CF_SOME, MasterConstants.COUNT, Bytes.toBytes(t.f1));
        reuse.f1 = put;
        return reuse;
      }
    }).output(new HadoopOutputFormat<Text, Mutation>(new TableOutputFormat<Text>(), job));

    env.execute("Flink HBase Event Count Hello World Test");
 
As far as I can tell, this code matches the code in HBaseWriteExample.java.
 
I see the "Tuple2 map() called" log line exactly the 13 times I expect, and the last log line I see is this:
2016-05-10 21:48:42,715 INFO  org.apache.hadoop.hbase.mapreduce.TableOutputFormat           - Created table instance for event_type_count

Any suggestions to what the problem could be?

Thanks,
Palle
Re: HBase write problem

Flavio Pompermaier

Do you have hbase-site.xml available on the classpath?
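One quick way to check is to ask the classloader directly. This is a generic Java sketch, not Flink-specific, and the class name is illustrative:

```java
// Hypothetical check: ask the classloader whether hbase-site.xml is visible.
// If it is not, HBaseConfiguration falls back to its defaults (ZooKeeper on
// localhost), and the output format can hang without an obvious error.
public class ClasspathCheck {
    public static void main(String[] args) {
        java.net.URL res = ClasspathCheck.class.getClassLoader()
                .getResource("hbase-site.xml");
        System.out.println(res == null
                ? "hbase-site.xml NOT found on classpath"
                : "hbase-site.xml found at: " + res);
    }
}
```

Running this inside the job (or as a standalone check with the same classpath) tells you which configuration the HBase client will actually see.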

Re: HBase write problem

palle
Thanks for the response, but I don't think the classpath is the problem - hbase-site.xml should already be on it. This is what it looks like (the HBase conf is added at the end):

2016-05-11 09:16:45,831 INFO  org.apache.zookeeper.ZooKeeper                                - Client environment:java.class.path=C:\systems\packages\flink-1.0.2\lib\flink-dist_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\flink-python_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\guava-11.0.2.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-client-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-examples-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-external-blockcache-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop2-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-prefix-tree-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-procedure-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-protocol-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-resource-bundle-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-rest-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-shell-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-thrift-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\log4j-1.2.17.jar;C:\systems\packages\flink-1.0.2\lib\slf4j-log4j12-1.7.7.jar;C:\systems\master_flink\bin;C:\systems\packages\flink-1.0.2\lib;;C:\systems\packages\hbase-1.2.1\lib;C:\systems\hbase\conf;C:\systems\hbase\conf\hbase-site.xml;

2016-05-11 09:16:45,831 INFO  org.apache.zookeeper.ZooKeeper                                - Client environment:java.library.path=C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:\systems\master_flink\bin;C:\systems\packages\appsync-1.0.6\bin;C:\systems\packages\flink-1.0.2\bin;C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Microsoft SQL Server\120\DTS\Binn\;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\ManagementStudio\;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\;C:\systems\packages\hadoop-2.7.2\bin;C:\systems\packages\jdk-1.8.0_74_x64\bin;C:\systems\packages\apache-maven-3.3.9\bin;C:\systems\packages\protoc-2.5.0-win32\;C:\systems\packages\cygwin64\bin\;C:\systems\packages\cmake-3.5.2-win32-x86\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\systems\packages\perl-5.6.0-win\bin;C:\systems\hbase\conf;.

Re: HBase write problem

Flavio Pompermaier
Do you run the job from your IDE or from the cluster?

Re: HBase write problem

palle
I run the job on the cluster; I submit it through the web UI.
The jar file I submit does not contain hbase-site.xml.

Re: HBase write problem

Flavio Pompermaier
Which versions of HBase and Hadoop are you running?
Did you try putting hbase-site.xml in the jar?
Also, I don't know how reliable the web UI is at the moment; in my experience the command-line client is much more reliable.

You just need to run something like this from the Flink directory:

   bin/flink run -c xxx.yyy.MyMainClass /path/to/shadedJar.jar

Re: HBase write problem

palle
Hadoop 2.7.2
HBase 1.2.1

I have this working from a Hadoop job, just not from Flink.

I will look into your suggestions, but would I be better off choosing another DB for storage? I can see that Cassandra gets some attention on this mailing list. I need to store roughly 2 billion key/value pairs of about 100 bytes each.

----- Original meddelelse -----
Fra: Flavio Pompermaier <[hidden email]>
Til: user <[hidden email]>
Dato: Ons, 11. maj 2016 16:29
Emne: Re: HBase write problem

And which version of HBase and Hadoop are you running? 
Did you try to put the hbase-site.xml in the jar?
Moreover, I don't know how much reliable is at the moment the web client UI..my experience is that the command line client is much more reliable.
You just need to run from the flink dir something like:
   bin/flink  run -c  xxx.yyy.MyMainClass /path/to/shadedJar.jar

On Wed, May 11, 2016 at 4:19 PM, Palle <[hidden email]> wrote:
I run the job from the cluster. I run it through the web UI.
The jar file submitted does not contain the hbase-site.xml file.

----- Original meddelelse -----
Fra: Flavio Pompermaier <[hidden email]>
Til: user <[hidden email]>
Dato: Ons, 11. maj 2016 09:36

Emne: Re: HBase write problem

Do you run the job from your IDE or from the cluster?

On Wed, May 11, 2016 at 9:22 AM, Palle <[hidden email]> wrote:
Thanks for the response, but I don't think the problem is the classpath - hbase-site.xml should be added. This is what it looks like (hbase conf is added at the end):

2016-05-11 09:16:45,831 INFO  org.apache.zookeeper.ZooKeeper                                - Client environment:java.class.path=C:\systems\packages\flink-1.0.2\lib\flink-dist_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\flink-python_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\guava-11.0.2.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-client-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-examples-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-external-blockcache-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop2-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-prefix-tree-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-procedure-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-protocol-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-resource-bundle-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-rest-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-shell-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-thrift-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\log4j-1.2.17.jar;C:\systems\packages\flink-1.0.2\lib\slf4j-log4j12-1.7.7.jar;C:\systems\master_flink\bin;C:\systems\packages\flink-1.0.2\lib;;C:\systems\packages\hbase-1.2.1\lib;C:\systems\hbase\conf;C:\systems\hbase\conf\hbase-site.xml;

2016-05-11 09:16:45,831 INFO  org.apache.zookeeper.ZooKeeper                                - Client environment:java.library.path=C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:\systems\master_flink\bin;C:\systems\packages\appsync-1.0.6\bin;C:\systems\packages\flink-1.0.2\bin;C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Microsoft SQL Server\120\DTS\Binn\;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\ManagementStudio\;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\;C:\systems\packages\hadoop-2.7.2\bin;C:\systems\packages\jdk-1.8.0_74_x64\bin;C:\systems\packages\apache-maven-3.3.9\bin;C:\systems\packages\protoc-2.5.0-win32\;C:\systems\packages\cygwin64\bin\;C:\systems\packages\cmake-3.5.2-win32-x86\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\systems\packages\perl-5.6.0-win\bin;C:\systems\hbase\conf;.

----- Original meddelelse -----
Fra: Flavio Pompermaier <[hidden email]>
Til: user <[hidden email]>
Dato: Ons, 11. maj 2016 00:05
Emne: Re: HBase write problem


Do you have the hbase-site.xml available in the classpath?


Re: HBase write problem

Flavio Pompermaier
I can't help you with the choice of the DB storage; as always, the answer is "it depends" on a lot of factors :)

From what I can tell, the problem could be that Flink supports HBase 0.98, so it could be worth updating the Flink connectors to a more recent version (which should hopefully be backward compatible..) or maybe creating two separate HBase connectors (one for hbase-0.9x and one for hbase-1.x). Let me know about your attempts :)
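One way to follow this suggestion in a Maven build is to pin the connector and client versions explicitly. A hypothetical `pom.xml` fragment; the coordinates follow the Flink 1.0.2 / HBase 0.98 line discussed in the thread, but the exact version numbers are assumptions, not something the thread confirms:

```xml
<!-- Sketch: align the HBase client with the version the Flink connector was built against. -->
<dependencies>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-hbase_2.11</artifactId>
    <version>1.0.2</version>
  </dependency>
  <dependency>
    <!-- Hypothetical override: force a 0.98.x client instead of 1.2.1 -->
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>0.98.11-hadoop2</version>
  </dependency>
</dependencies>
```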

On Wed, May 11, 2016 at 4:47 PM, Palle <[hidden email]> wrote:
Hadoop 2.7.2
HBase 1.2.1

I have this working from a Hadoop job, just not from Flink.

I will look into your suggestions, but would I be better off choosing another DB for storage? I can see that Cassandra gets some attention on this mailing list. I need to store approximately 2 billion key/value pairs of about 100 bytes each.
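For scale, the dataset described above works out to roughly 200 GB of raw payload before HBase per-cell overhead and replication. A back-of-the-envelope sketch (the figures come from the message; the overhead caveat is my addition):

```java
// Rough sizing for ~2 billion key/value pairs of ~100 bytes each.
public class SizingEstimate {
    public static void main(String[] args) {
        long pairs = 2_000_000_000L;   // ~2 billion key/value pairs
        long bytesPerPair = 100L;      // ~100 bytes per pair
        long totalBytes = pairs * bytesPerPair;
        // Note: HBase stores family/qualifier/timestamp per cell and replicates
        // via HDFS, so the on-disk footprint will be several times larger.
        System.out.println(totalBytes + " bytes = "
                + totalBytes / (1024L * 1024 * 1024) + " GiB raw");
    }
}
```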

----- Original message -----
From: Flavio Pompermaier <[hidden email]>
To: user <[hidden email]>
Date: Wed, 11 May 2016 16:29

Subject: Re: HBase write problem

And which version of HBase and Hadoop are you running?
Did you try putting the hbase-site.xml in the jar?
Moreover, I don't know how reliable the web client UI is at the moment; in my experience the command-line client is much more reliable.
You just need to run something like this from the Flink directory:
   bin/flink run -c xxx.yyy.MyMainClass /path/to/shadedJar.jar

On Wed, May 11, 2016 at 4:19 PM, Palle <[hidden email]> wrote:
I run the job on the cluster, through the web UI.
The jar file submitted does not contain the hbase-site.xml file.

----- Original message -----
From: Flavio Pompermaier <[hidden email]>
To: user <[hidden email]>
Date: Wed, 11 May 2016 09:36

Subject: Re: HBase write problem

Do you run the job from your IDE or from the cluster?

On Wed, May 11, 2016 at 9:22 AM, Palle <[hidden email]> wrote:
Thanks for the response, but I don't think the problem is the classpath; hbase-site.xml should be on it. This is what it looks like (the HBase conf is appended at the end):

2016-05-11 09:16:45,831 INFO  org.apache.zookeeper.ZooKeeper                                - Client environment:java.class.path=C:\systems\packages\flink-1.0.2\lib\flink-dist_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\flink-python_2.11-1.0.2.jar;C:\systems\packages\flink-1.0.2\lib\guava-11.0.2.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-annotations-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-client-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-common-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-examples-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-external-blockcache-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-hadoop2-compat-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-it-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-prefix-tree-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-procedure-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-protocol-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-resource-bundle-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-rest-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1-tests.jar;C:\systems\packages\flink-1.0.2\lib\hbase-server-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-shell-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\hbase-thrift-1.2.1.jar;C:\systems\packages\flink-1.0.2\lib\log4j-1.2.17.jar;C:\systems\packages\flink-1.0.2\lib\slf4j-log4j12-1.7.7.jar;C:\systems\master_flink\bin;C:\systems\packages\flink-1.0.2\lib;;C:\systems\packages\hbase-1.2.1\lib;C:\systems\hbase\conf;C:\systems\hbase\conf\hbase-site.xml;

2016-05-11 09:16:45,831 INFO  org.apache.zookeeper.ZooKeeper                                - Client environment:java.library.path=C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:\systems\master_flink\bin;C:\systems\packages\appsync-1.0.6\bin;C:\systems\packages\flink-1.0.2\bin;C:\systems\packages\jre-1.8.0_74_x64\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Microsoft SQL Server\120\DTS\Binn\;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\ManagementStudio\;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\;C:\systems\packages\hadoop-2.7.2\bin;C:\systems\packages\jdk-1.8.0_74_x64\bin;C:\systems\packages\apache-maven-3.3.9\bin;C:\systems\packages\protoc-2.5.0-win32\;C:\systems\packages\cygwin64\bin\;C:\systems\packages\cmake-3.5.2-win32-x86\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\systems\packages\perl-5.6.0-win\bin;C:\systems\hbase\conf;.



Re: HBase write problem

Stephan Ewen
Just to narrow down the problem:

The insertion into HBase actually works, but the job does not finish after that?
And the same job (same source of data) that writes to a file, or prints, finishes?

If that is the case, can you check what status each task is in, via the web dashboard? Are all tasks still in "running"?





Re: HBase write problem

palle
Hi guys.

Thanks for helping out.

We downgraded to HBase 0.98, resolved some classpath issues, and then it worked.

/Palle


Re: HBase write problem

Flavio Pompermaier
Great :)
