two phase aggregation

Fanbin Bu
Hi,

Does over window aggregation support two-phase mode?

SELECT
  user_id
  , event_time
  , listagg(event_type, '*') OVER w AS names
FROM table
WINDOW w AS
  ( PARTITION BY user_id
    ORDER BY event_time
    ROWS BETWEEN 256 PRECEDING AND CURRENT ROW
  )

Re: two phase aggregation

Jark Wu-3
Hi Fanbin,

Currently, over window aggregation doesn't support two-phase optimization.

Best,
Jark

On Tue, 23 Jun 2020 at 12:14, Fanbin Bu <[hidden email]> wrote:
Hi,

Does over window aggregation support two-phase mode?

Re: two phase aggregation

Fanbin Bu
Jark,
Thanks for the reply. Do you know whether this is on the roadmap, or what the plan is?

On Mon, Jun 22, 2020 at 9:36 PM Jark Wu <[hidden email]> wrote:
Hi Fanbin,

Currently, over window aggregation doesn't support two-phase optimization.

Best,
Jark

Re: two phase aggregation

Jark Wu-3
AFAIK, this is not on the roadmap.

The problem is that two-phase aggregation would bring little improvement for over window aggregates.
If we supported it, the local over operator would not reduce any data: an over aggregate produces one output row per input row,
so the local operator would have to emit as many records as it received and could not reduce the pressure on the global operator.
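
To make the record-count argument concrete, here is a rough sketch in plain Python. It is not Flink code. The function names, the sample batch, and the use of a count for the group aggregate are assumptions made for illustration only. It compares how many records a local pre-aggregation step would emit for a GROUP BY aggregate and for a hypothetical local over operator.

```python
from collections import defaultdict, deque

def local_group_agg(batch):
    """Local phase of a two-phase GROUP BY count: one partial result per key."""
    partial = defaultdict(int)
    for user_id, _event_type in batch:
        partial[user_id] += 1
    return list(partial.items())

def local_over_agg(batch, preceding=256):
    """Hypothetical local phase of an OVER window: one output row per input row."""
    # ROWS BETWEEN 256 PRECEDING AND CURRENT ROW covers at most 257 rows per key.
    windows = defaultdict(lambda: deque(maxlen=preceding + 1))
    out = []
    for user_id, event_type in batch:
        windows[user_id].append(event_type)
        out.append((user_id, "*".join(windows[user_id])))
    return out

# Illustrative input only: (user_id, event_type) pairs already in event-time order.
batch = [("u1", "click"), ("u1", "view"), ("u2", "click"), ("u1", "buy")]

group_out = local_group_agg(batch)  # shrinks to one record per distinct user_id
over_out = local_over_agg(batch)    # stays at one record per input record

print(len(batch), len(group_out), len(over_out))
```

In this sketch the group aggregate forwards one record per distinct key, while the over aggregate forwards exactly as many records as it received, so a downstream global operator would see no reduction in load.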

Best,
Jark

On Tue, 23 Jun 2020 at 13:09, Fanbin Bu <[hidden email]> wrote:
Jark,
Thanks for the reply. Do you know whether this is on the roadmap, or what the plan is?
