|
Hi all,
I am looking for approx_count and freq_item functionality in Flink, and I'm not sure which route to take. So far I have found two viable options:
- Wait for STREAMLINE to release the code for HLL_DISTINCT_COUNT [1]
- Use the Yahoo DataSketches library [2], following the example of Tobias Lindener [3][4] (and perhaps release a better, reusable third-party library for Flink)
What would you advise? Is there any other ongoing effort on approximate statistics?
Best, Flavio
|