Hi, Our job was experiencing high write amplification on aggregates so we decided to give mini-batch a go. There's a few things I've noticed that are different from our previous job and I would like some clarification. 1) Our operators now say they have Watermarks. We never explicitly added watermarks, and our state is essentially unbounded across all time since it consumes from Debezium and reshapes our database data into another store. Why does it say we have Watermarks then? 2) In our sources I see MiniBatchAssigner(interval=[1000ms], mode=[ProcTime], what does that do? 3) I don't really see anything else different yet in the shape of our plan even though we've turned on configuration.setString( "table.optimizer.agg-phase-strategy", "TWO_PHASE" ) Thanks! -- Rex Fenley | Software Engineer - Mobile and Backend Remind.com | BLOG | FOLLOW US | LIKE US |
Hello, Does anyone have any more information here? Thanks! On Wed, Jan 20, 2021 at 9:13 PM Rex Fenley <[hidden email]> wrote:
-- Rex Fenley | Software Engineer - Mobile and Backend Remind.com | BLOG | FOLLOW US | LIKE US |
I am pulling Jark and Godfrey who are more familiar with the planner internals. Best, Dawid On 22/01/2021 20:11, Rex Fenley wrote:
signature.asc (849 bytes) Download Attachment |
Hi Rex, Could you share your query here? It would be helpful to identify the root cause if we have the query. 1) watermark The framework automatically adds a node (the MiniBatchAssigner) to generate watermark events as the mini-batch id to broadcast and trigger mini-batch in the pipeline. 2) MiniBatchAssigner(interval=[1000ms], mode=[ProcTime] It generates a new mini-batch id in an interval of 1000ms in system time. The mini-batch id is represented by the watermark event. 3) TWO_PHASE optimization If users want to have TWO_PHASE optimization, it requires the aggregate functions all support the merge() method and the mini-batch is enabled. Best, Jark On Tue, 26 Jan 2021 at 19:01, Dawid Wysakowicz <[hidden email]> wrote:
|
Thanks, that all makes sense! On Wed, Jan 27, 2021 at 7:00 PM Jark Wu <[hidden email]> wrote:
-- Rex Fenley | Software Engineer - Mobile and Backend Remind.com | BLOG | FOLLOW US | LIKE US |
Free forum by Nabble | Edit this page |