Hi,
I want to write a stream continuously into HBase. For example, I have 1 source and 4 workers, and I want each worker to write into HBase autonomously. Is there a proper way to do it?
Best Regards,
--
Hilmi Yildirim
Software Developer R&D
T: +49 30 24627-281
[hidden email]
http://www.neofonie.de
Visit the Neo Tech Blog for users: http://blog.neofonie.de/
Follow us: https://plus.google.com/+neofonie http://www.linkedin.com/company/neofonie-gmbh https://www.xing.com/companies/neofoniegmbh
Neofonie GmbH | Robert-Koch-Platz 4 | 10115 Berlin
Commercial register Berlin-Charlottenburg: HRB 67460
Managing Director: Thomas Kitlitschko
|
I've added an example of HBase writing at flink-staging/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteExample.java.
Otherwise you can look at these 2 URLs:
Best,
Flavio
On Wed, May 20, 2015 at 10:16 AM, Hilmi Yildirim <[hidden email]> wrote:
|
Thank you Flavio,
these are examples for batch processing. But I want to write a continuous stream into HBase within a StreamExecutionEnvironment instead of an ExecutionEnvironment.
Best Regards,
Hilmi
On 20.05.2015 at 10:42, Flavio Pompermaier wrote:
|
Hi,
I agree with Hilmi, Flavio's examples are for batch. I'm not aware of a StreamingHBaseSink for Flink yet. I've filed a JIRA for the feature request: https://issues.apache.org/jira/browse/FLINK-2055
Are you interested in implementing this?
On Wed, May 20, 2015 at 10:50 AM, Hilmi Yildirim <[hidden email]> wrote:
|
Maybe we can also use the batch HBase OutputFormat. In the DataStream API there is a private method:
private DataStreamSink<OUT> writeToFile(OutputFormat<OUT> format, long millis) {
which seems to allow batch output formats. The naming of the method seems weird because it's called "toFile" but it's expecting an OutputFormat instead of a FileOutputFormat. Maybe it's worth trying to see how far we can get with this method.
On Wed, May 20, 2015 at 11:00 AM, Robert Metzger <[hidden email]> wrote:
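To illustrate the idea of driving a batch-style OutputFormat from a stream, here is a minimal, self-contained sketch. It does not use Flink itself: SimpleOutputFormat is a simplified stand-in for the OutputFormat lifecycle (the real Flink methods also declare IOException), BufferingSink and CollectingFormat are hypothetical names, and treating the millis argument as a flush interval is an assumption, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class OutputFormatSinkSketch {

    /** Simplified stand-in for an OutputFormat lifecycle (not the real Flink interface). */
    interface SimpleOutputFormat<T> {
        void open(int taskNumber, int numTasks);
        void writeRecord(T record);
        void close();
    }

    /** Hypothetical sink: buffers records and hands them to the format once flushMillis has elapsed. */
    static class BufferingSink<T> {
        private final SimpleOutputFormat<T> format;
        private final long flushMillis;
        private final LongSupplier clock; // injected so the sketch is deterministic
        private final List<T> buffer = new ArrayList<>();
        private long lastFlush;

        BufferingSink(SimpleOutputFormat<T> format, long flushMillis, LongSupplier clock) {
            this.format = format;
            this.flushMillis = flushMillis;
            this.clock = clock;
        }

        void open(int taskNumber, int numTasks) {
            format.open(taskNumber, numTasks);
            lastFlush = clock.getAsLong();
        }

        /** Called once per stream element. */
        void invoke(T record) {
            buffer.add(record);
            if (clock.getAsLong() - lastFlush >= flushMillis) {
                flush();
            }
        }

        /** Writes whatever is still buffered, then closes the format. */
        void close() {
            flush();
            format.close();
        }

        private void flush() {
            for (T record : buffer) {
                format.writeRecord(record);
            }
            buffer.clear();
            lastFlush = clock.getAsLong();
        }
    }

    /** Collects records in a list so the sketch runs without any external system. */
    static class CollectingFormat implements SimpleOutputFormat<String> {
        final List<String> written = new ArrayList<>();

        @Override public void open(int taskNumber, int numTasks) { }
        @Override public void writeRecord(String record) { written.add(record); }
        @Override public void close() { }
    }

    public static void main(String[] args) {
        long[] now = {0L}; // fake clock in milliseconds
        CollectingFormat format = new CollectingFormat();
        BufferingSink<String> sink = new BufferingSink<>(format, 100L, () -> now[0]);

        sink.open(0, 1);
        sink.invoke("a");  // buffered, interval not yet elapsed
        now[0] = 150L;
        sink.invoke("b");  // interval elapsed: "a" and "b" are written
        sink.invoke("c");  // buffered again
        sink.close();      // remaining "c" is written
        System.out.println(format.written); // prints [a, b, c]
    }
}
```

The point of the sketch is only that nothing in the open/writeRecord/close lifecycle is file-specific, which is why an OutputFormat written for batch jobs could plausibly be reused as a streaming sink.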
|
There is this pending pull request which is addressing exactly the issues I've mentioned (wrong naming, private method): https://github.com/apache/flink/pull/521
I'll see what's blocking the PR ...
On Wed, May 20, 2015 at 11:11 AM, Robert Metzger <[hidden email]> wrote:
|
I'm merging the pull request. It was blocked by the streaming operator rework, so it has been free to go since yesterday. I do agree that it needs some additional love before it can be on the master, but I am positive that it should be there this week.
On May 20, 2015 11:16 AM, "Robert Metzger" <[hidden email]> wrote:
|
Hi,
I've changed "writeToFile" to public and then implemented an OutputFormat to write the stream into HBase. This is working very well. I will open a pull request later. Maybe the method name "writeToFile" should be changed to, for example, "write". Alternatively, I can create a method writeToHBase in the DataStream class.
Best Regards,
Hilmi
On 20.05.2015 at 11:15, Robert Metzger wrote:
|
Hi,
great to hear that it is working. If the PR is going to be only about adding the "write()" method, you probably don't need to open the PR. https://github.com/apache/flink/pull/521 is going to add a method called:
public <OUT> DataStreamSource<OUT> createInput(InputFormat<OUT, ?> inputFormat, TypeInformation<OUT> typeInfo)
The issue with that pull request is probably only that we have to wait for another week until it's merged.
On Wed, May 20, 2015 at 2:23 PM, Hilmi Yildirim <[hidden email]> wrote:
|
createInput creates a stream out of a file and can be used for HBase, correct?
But I do not want to read from HBase. I only want to write to HBase. For that I implemented an HBaseOutputFormat which I pass to the writeToFile method of the DataStream. Then, the results of the stream processing are written into HBase.
On 20.05.2015 at 14:26, Robert Metzger wrote:
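For readers wondering what "each worker writes autonomously" could look like, here is a small sketch under stated assumptions. It is not the HBaseOutputFormat from this thread, whose code was not posted. FakeTable is an in-memory stand-in for an HBase table so the example runs without a cluster, and HBaseOutputFormatSketch and run are hypothetical names. In a real implementation, open() would presumably create the HBase client connection and writeRecord() would issue a put for each record.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelHBaseWriteSketch {

    /** In-memory stand-in for an HBase table (row key -> value); not the HBase client API. */
    static final class FakeTable {
        private final Map<String, String> rows = new ConcurrentHashMap<>();

        void put(String rowKey, String value) { rows.put(rowKey, value); }
        String get(String rowKey) { return rows.get(rowKey); }
        int size() { return rows.size(); }
    }

    /** Hypothetical output format; every parallel worker owns one instance. */
    static final class HBaseOutputFormatSketch {
        private final FakeTable cluster;
        private FakeTable table; // per-instance handle, set in open()
        private int taskNumber;

        HBaseOutputFormatSketch(FakeTable cluster) {
            this.cluster = cluster;
        }

        /** Assumption: a real format would open its own HBase connection here. */
        void open(int taskNumber, int numTasks) {
            this.taskNumber = taskNumber;
            this.table = cluster;
        }

        /** Assumption: a real format would build and send a put for the record here. */
        void writeRecord(String rowKey) {
            table.put(rowKey, "written-by-worker-" + taskNumber);
        }

        void close() {
            table = null;
        }
    }

    /** Runs numWorkers independent writers, each with its own format instance. */
    static FakeTable run(int numWorkers, int recordsPerWorker) {
        FakeTable store = new FakeTable();
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);
        for (int w = 0; w < numWorkers; w++) {
            final int task = w;
            pool.execute(() -> {
                HBaseOutputFormatSketch format = new HBaseOutputFormatSketch(store);
                format.open(task, numWorkers);
                for (int i = 0; i < recordsPerWorker; i++) {
                    format.writeRecord("row-" + task + "-" + i);
                }
                format.close();
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return store;
    }

    public static void main(String[] args) {
        FakeTable store = run(4, 25); // 1 source fanned out to 4 workers, as in the question
        System.out.println(store.size()); // prints 100
    }
}
```

Because every worker creates, opens, and closes its own format instance, no coordination between workers is needed beyond what the store itself provides.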
|
Hi,
sorry. I was doing too many things at the same time. I confused inputs and outputs ;)
Please open a pull request for the changed method name...
On Wed, May 20, 2015 at 2:44 PM, Hilmi Yildirim <[hidden email]> wrote:
|