Does Flink streaming SQL support two-level GROUP BY aggregation?

Posted by dixingxing85@163.com on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-streaming-sql-group-by-tp34412.html


Hi all:

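We have a streaming SQL job that produces incorrect results: the values received by the sink go up and down over time instead of only increasing. We would like to confirm whether this is a bug, or whether Flink does not yet support this kind of SQL.
The scenario is as follows: first GROUP BY two dimensions A and B to compute UV, then GROUP BY A alone to sum the UV across dimension B. The corresponding SQL is shown below (A -> dt, B -> pvareaid):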
SELECT dt, SUM(a.uv) AS uv
FROM (
    SELECT dt, pvareaid, COUNT(DISTINCT cuid) AS uv
    FROM streaming_log_event
    WHERE action IN ('action1')
      AND pvareaid NOT IN ('pv1', 'pv2')
      AND pvareaid IS NOT NULL
    GROUP BY dt, pvareaid
) a
GROUP BY dt;
The log of the data received by the sink is as follows:

We are using Flink 1.7.2, and the test job runs with a parallelism of 1.
The corresponding issue is: https://issues.apache.org/jira/browse/FLINK-17228
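
To make the symptom concrete, here is a minimal Python sketch (plain Python, not Flink code) of one way such fluctuations could arise. It rests on an assumption we have not confirmed: that whenever the inner COUNT(DISTINCT cuid) for a (dt, pvareaid) group changes, the outer SUM first receives a retraction of the old count and then the new count, and emits a result after each of the two messages. The function name `simulate` and the sample events are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def simulate(events):
    """Replay (dt, pvareaid, cuid) events through a toy two-level aggregation.

    Assumption (not verified against Flink internals): every change of an
    inner UV count reaches the outer SUM as "retract old" followed by
    "add new", and the outer SUM emits to the sink after each message.
    Returns the list of (dt, uv) rows the sink would see, in order.
    """
    distinct = defaultdict(set)  # (dt, pvareaid) -> set of cuids seen so far
    total = defaultdict(int)     # dt -> current value of SUM(uv)
    sink = []

    for dt, pvareaid, cuid in events:
        seen = distinct[(dt, pvareaid)]
        old_uv = len(seen)
        seen.add(cuid)
        new_uv = len(seen)
        if new_uv == old_uv:
            continue  # duplicate cuid: inner count unchanged, nothing emitted

        if old_uv > 0:
            # Retraction of the previous inner row lowers the outer sum first.
            total[dt] -= old_uv
            sink.append((dt, total[dt]))

        # The updated inner row then raises the outer sum again.
        total[dt] += new_uv
        sink.append((dt, total[dt]))

    return sink


# Hypothetical sample input, not taken from our real log.
sample_events = [
    ("d1", "a", "u1"),
    ("d1", "b", "u2"),
    ("d1", "a", "u3"),
    ("d1", "a", "u1"),
]

for row in simulate(sample_events):
    print(row)
```

With this sample input the sketch prints uv values 1, 2, 1, 3 for dt `d1`: the value briefly drops from 2 to 1 before reaching 3, which resembles the up-and-down pattern we observe. If this assumption holds, the final value would be correct and only the intermediate values would be misleading.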