Re: Reading from HBase problem
Posted by Fabian Hueske-2
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Reading-from-HBase-problem-tp1545p1553.html
Hi Hilmi,
I see two possible reasons:
1) The data source / InputFormat is not working properly, so not all HBase records are read or forwarded, or
2) The aggregation / count is buggy.
Robert's suggestion will use an alternative mechanism to do the count. In fact, you can count with groupBy(0).sum() and accumulators at the same time.
If both counts are the same, this indicates that the aggregation is correct and hints that the HBase format is faulty. If they differ, the aggregation is the more likely culprit.
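To make the double count concrete, here is a minimal sketch, assuming the Flink 1.x DataSet API. The HBase source is replaced by a small in-memory data set, and the class name, the accumulator name "records-seen" and the dummy key "a" are made up for illustration; your job will look different.

```java
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

import java.util.List;

public class CountCrossCheck {

    /** Maps every record to ("a", 1L) and increments an accumulator as a side effect. */
    public static class CountingMapper extends RichMapFunction<String, Tuple2<String, Long>> {
        private final LongCounter seen = new LongCounter();

        @Override
        public void open(Configuration parameters) {
            // Hypothetical accumulator name; any unique string works.
            getRuntimeContext().addAccumulator("records-seen", seen);
        }

        @Override
        public Tuple2<String, Long> map(String record) {
            seen.add(1L);
            return new Tuple2<>("a", 1L);
        }
    }

    /** Returns { count via accumulator, count via groupBy(0).sum(1) }. */
    public static long[] run() throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for the HBase input; replace with the real data source.
        DataSet<String> records = env.fromElements("row1", "row2", "row3");

        // collect() triggers execution of the job.
        List<Tuple2<String, Long>> aggregated = records
                .map(new CountingMapper())
                .groupBy(0)
                .sum(1)
                .collect();

        // The accumulator value is part of the result of the job that just ran.
        JobExecutionResult result = env.getLastJobExecutionResult();
        Long viaAccumulator = result.getAccumulatorResult("records-seen");
        long viaAggregation = aggregated.get(0).f1;

        return new long[] {viaAccumulator, viaAggregation};
    }
}
```

Both numbers come from the same job run, so comparing them against the row count you expect in HBase should tell you which side is off.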
In any case, it would be very good to know your findings. Please keep us updated.
One more hint: if you want to do a full aggregate, you don't have to use a "dummy" key like "a". Instead, you can work with Tuple1<Long> and call sum(0) directly, without the groupBy().
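A rough sketch of the key-less variant, again assuming the Flink 1.x DataSet API and an in-memory stand-in for the HBase source (class and method names are hypothetical):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple1;

import java.util.List;

public class FullAggregate {

    /** Counts all records by summing a Tuple1<Long> field, without any grouping. */
    public static long countAll() throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for the HBase input; replace with the real data source.
        DataSet<String> records = env.fromElements("row1", "row2", "row3");

        List<Tuple1<Long>> total = records
                // Emit a single-field tuple holding 1 for every record.
                .map(new MapFunction<String, Tuple1<Long>>() {
                    @Override
                    public Tuple1<Long> map(String record) {
                        return new Tuple1<>(1L);
                    }
                })
                // Full aggregate over the whole data set: no groupBy() needed.
                .sum(0)
                .collect();

        return total.get(0).f0;
    }
}
```

The result is a data set with a single Tuple1 whose field f0 holds the total.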
Best, Fabian