Aggregation on multiple Key combinations and multiple Windows

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Aggregation on multiple Key combinations and multiple Windows

Basanth Gowda
Hello,
Posted this yesterday, but not sure if it went through or not.

I am fairly new to Flink. I have a use case which needs aggregation on different combination of keys and windowing for different intervals. I searched through but couldn't find anything that could help.


Came across this model on a presentation for Apex . This sums up what we are trying to achieve. What is the best way to do this in Flink



{"keys":[{"name":"campaignId","type":"integer"},
 {"name":"adId","type":"integer"},
 {"name":"creativeId","type":"integer"},
 {"name":"publisherId","type":"integer"},
 {"name":"adOrderId","type":"integer"}],
 "timeBuckets":["1h","1d"],
 "values":
[{"name":"impressions","type":"integer","aggregators":["SUM"]}
,
 {"name":"clicks","type":"integer","aggregators":["SUM"]},
 {"name":"revenue","type":"integer"}],
 "dimensions":
 [{"combination":["campaignId","adId"]},
 {"combination":["creativeId","campaignId"]},
 {"combination":["campaignId"]},
 {"combination":["publisherId","adOrderId","campaignId"],
"additionalValues":["revenue:SUM"]}]
}


I have been able to do this by the following and repeating this for every key + window combination. So in the above case there would be 8 blocks like below. (4 combinations and 2 window period for each combination)

modelDataStream.keyBy("campaiginId","addId")
.timeWindow(Time.minutes(1))
.trigger(CountTrigger.of(2))
.reduce(
..)