Hi! I was curious if there are docs on how to optimize Flink joins. I looked around and on the Flink docs and didn't see much. I see a little on the Configuration page.
E.g. one of my jobs has an interval join. Does left vs right matter for interval join? |
Hi Dan,
the order of all joins depends on the order in the SQL query by default. You can also check the following example (not interval joins though) and swap e.g. b and c: env.createTemporaryView("a", env.fromValues(1, 2, 3)); env.createTemporaryView("b", env.fromValues(4, 5, 6)); env.createTemporaryView("c", env.fromValues(7, 8, 9)); System.out.println(env.sqlQuery("SELECT * FROM c, b, a").explain()); So you can reorder the tables in the query if that improves performance. For interval joins, we currently don't provide additional algorithms or options. Regards, Timo On 11.02.21 05:04, Dan Hill wrote: > Hi! I was curious if there are docs on how to optimize Flink joins. I > looked around and on the Flink docs and didn't see much. I see a little > on the Configuration page. > > E.g. one of my jobs has an interval join. Does left vs right matter for > interval join? |
Hi Timo! I'm moving away from SQL to DataStream. On Thu, Feb 11, 2021 at 9:11 AM Timo Walther <[hidden email]> wrote: Hi Dan, |
Hi Dan,
thanks for letting us know. Could you give us some feedback what is missing in SQL for this use case? Are you looking for some broadcast joining or which kind of algorithm would help you? Regards, Timo On 11.02.21 20:32, Dan Hill wrote: > Hi Timo! I'm moving away from SQL to DataStream. > > On Thu, Feb 11, 2021 at 9:11 AM Timo Walther <[hidden email] > <mailto:[hidden email]>> wrote: > > Hi Dan, > > the order of all joins depends on the order in the SQL query by default. > > You can also check the following example (not interval joins though) > and > swap e.g. b and c: > > env.createTemporaryView("a", env.fromValues(1, 2, 3)); > env.createTemporaryView("b", env.fromValues(4, 5, 6)); > env.createTemporaryView("c", env.fromValues(7, 8, 9)); > > System.out.println(env.sqlQuery("SELECT * FROM c, b, a").explain()); > > So you can reorder the tables in the query if that improves > performance. > For interval joins, we currently don't provide additional algorithms or > options. > > Regards, > Timo > > On 11.02.21 05:04, Dan Hill wrote: > > Hi! I was curious if there are docs on how to optimize Flink > joins. I > > looked around and on the Flink docs and didn't see much. I see a > little > > on the Configuration page. > > > > E.g. one of my jobs has an interval join. Does left vs right > matter for > > interval join? > |
Flink SQL is missing a reliable, commonly-used way of evolving a job (e.g. savepointing and checkpointing). The other concepts I heard about are not shared publicly enough to rely on (e.g. roll off). I wasn't able to find anything useful on this. On Fri, Feb 12, 2021, 02:05 Timo Walther <[hidden email]> wrote: Hi Dan, |
Free forum by Nabble | Edit this page |