Flink streaming sql是否支持两层group by聚合
Posted by
dixingxing85@163.com on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-streaming-sql-group-by-tp34412.html
Hi all:
我们有个streaming sql得到的结果不正确,现象是sink得到的数据一会大一会小,我们想确认下,这是否是个bug, 或者flink还不支持这种sql。
具体场景是:先group by A, B两个维度计算UV,然后再group by A 把维度B的UV sum起来,对应的SQL如下:(A -> dt, B -> pvareaid)
SELECT dt, SUM(a.uv) AS uv
FROM (
SELECT dt, pvareaid, COUNT(DISTINCT cuid) AS uv
FROM streaming_log_event
WHERE action IN ('action1')
AND pvareaid NOT IN ('pv1', 'pv2')
AND pvareaid IS NOT NULL
GROUP BY dt, pvareaid
) a
GROUP BY dt;
sink接收到的数据对应日志为:
我们使用的是1.7.2, 测试作业的并行度为1。