Can an Aggregate the key from a WindowedStream.aggregate()

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Can an Aggregate the key from a WindowedStream.aggregate()

Stephen Connolly
If I write my aggregation logic as a WindowFunction then I get access to the key as the first parameter in WindowFunction.apply(...) however the Javadocs for calling WindowedStream.apply(WindowFunction) state:

> Note that this function requires that all data in the windows is buffered until the window
> is evaluated, as the function provides no means of incremental aggregation.

Which sounds bad.

It seems the recommended alternative is to use one of the WindowFunction.aggregate(AggregateFunction) however I cannot see how to get access to the key...

Is my only solution to transform my data into a Tuple if I need access to the key post aggregation?

Thanks in advance

-stephenc
Reply | Threaded
Open this post in threaded view
|

Re: Can an Aggregate the key from a WindowedStream.aggregate()

Chesnay Schepler
There are also versions of WindowedStream#aggregate that accept an
additional WindowFunction/ProcessWindowFunction, which do have access to
the key via apply()/process() respectively. These functions are called
post aggregation.

On 08.02.2019 18:24, [hidden email] wrote:

> If I write my aggregation logic as a WindowFunction then I get access to the key as the first parameter in WindowFunction.apply(...) however the Javadocs for calling WindowedStream.apply(WindowFunction) state:
>
>> Note that this function requires that all data in the windows is buffered until the window
>> is evaluated, as the function provides no means of incremental aggregation.
> Which sounds bad.
>
> It seems the recommended alternative is to use one of the WindowFunction.aggregate(AggregateFunction) however I cannot see how to get access to the key...
>
> Is my only solution to transform my data into a Tuple if I need access to the key post aggregation?
>
> Thanks in advance
>
> -stephenc
>

Reply | Threaded
Open this post in threaded view
|

Re: Can an Aggregate the key from a WindowedStream.aggregate()

Stephen Connolly


On Sun, 10 Feb 2019 at 09:48, Chesnay Schepler <[hidden email]> wrote:
There are also versions of WindowedStream#aggregate that accept an
additional WindowFunction/ProcessWindowFunction, which do have access to
the key via apply()/process() respectively. These functions are called
post aggregation.

Cool I'll chase those down
 

On 08.02.2019 18:24, [hidden email] wrote:
> If I write my aggregation logic as a WindowFunction then I get access to the key as the first parameter in WindowFunction.apply(...) however the Javadocs for calling WindowedStream.apply(WindowFunction) state:
>
>> Note that this function requires that all data in the windows is buffered until the window
>> is evaluated, as the function provides no means of incremental aggregation.
> Which sounds bad.
>
> It seems the recommended alternative is to use one of the WindowFunction.aggregate(AggregateFunction) however I cannot see how to get access to the key...
>
> Is my only solution to transform my data into a Tuple if I need access to the key post aggregation?
>
> Thanks in advance
>
> -stephenc
>

Reply | Threaded
Open this post in threaded view
|

Re: Can an Aggregate the key from a WindowedStream.aggregate()

Rong Rong
Hi Stephen,

Chesney was right, you will have to use a more complex version of the window processing function. 
Perhaps your goal can be achieve by this specific function with incremental aggregation [1]. If not you can always use the regular process window function [2].
Both of these methods have access to the KEY information you required.

Thanks,
Rong




On Sun, Feb 10, 2019 at 11:29 AM Stephen Connolly <[hidden email]> wrote:


On Sun, 10 Feb 2019 at 09:48, Chesnay Schepler <[hidden email]> wrote:
There are also versions of WindowedStream#aggregate that accept an
additional WindowFunction/ProcessWindowFunction, which do have access to
the key via apply()/process() respectively. These functions are called
post aggregation.

Cool I'll chase those down
 

On 08.02.2019 18:24, [hidden email] wrote:
> If I write my aggregation logic as a WindowFunction then I get access to the key as the first parameter in WindowFunction.apply(...) however the Javadocs for calling WindowedStream.apply(WindowFunction) state:
>
>> Note that this function requires that all data in the windows is buffered until the window
>> is evaluated, as the function provides no means of incremental aggregation.
> Which sounds bad.
>
> It seems the recommended alternative is to use one of the WindowFunction.aggregate(AggregateFunction) however I cannot see how to get access to the key...
>
> Is my only solution to transform my data into a Tuple if I need access to the key post aggregation?
>
> Thanks in advance
>
> -stephenc
>