Flink SQL - can I have multiple outputs per job?

Dan
I have a few results that I want to produce.
- A join B
- A join B join C
- A join B join C join D
- A join B join C join D join E

When I use the DataSet API directly, I can execute all of these in the same job to reduce redundancy.  When I use the SQL interface, it looks like separate jobs are created for each of these (duplicating join calculations).

Is there a way to merge these joins?

Re: Flink SQL - can I have multiple outputs per job?

Dan
I figured it out: use a StatementSet, created with TableEnvironment.createStatementSet().

Semi-related: the query optimizer can break the reuse, depending on which tables the join IDs come from.
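
For anyone finding this thread later, here is a minimal sketch of the StatementSet approach, not the exact code from my job. It assumes a Flink version with the datagen and blackhole connectors (1.11 or later) and uses TableResult.await(), which I believe only exists in newer releases. The class name, the table names (table_a, table_b, table_c, sink_ab, sink_abc), the column names, and the helper methods are all made up for illustration. Both INSERT statements are added to one StatementSet, so they are planned and submitted together as a single job.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;

public class MultiSinkJob {

    // Hypothetical bounded test source: ids 1..100 plus one random string column.
    static String sourceDdl(String table, String valueColumn) {
        return "CREATE TABLE " + table + " (id BIGINT, " + valueColumn + " STRING) WITH ("
                + "'connector' = 'datagen', "
                + "'fields.id.kind' = 'sequence', "
                + "'fields.id.start' = '1', "
                + "'fields.id.end' = '100')";
    }

    // Hypothetical sink that discards every row it receives.
    static String sinkDdl(String table, String columns) {
        return "CREATE TABLE " + table + " (" + columns + ") WITH ('connector' = 'blackhole')";
    }

    // Two outputs that share the table_a JOIN table_b part.
    static List<String> insertStatements() {
        return Arrays.asList(
                "INSERT INTO sink_ab "
                        + "SELECT table_a.id, table_a.a_val, table_b.b_val "
                        + "FROM table_a JOIN table_b ON table_a.id = table_b.id",
                "INSERT INTO sink_abc "
                        + "SELECT table_a.id, table_a.a_val, table_b.b_val, table_c.c_val "
                        + "FROM table_a JOIN table_b ON table_a.id = table_b.id "
                        + "JOIN table_c ON table_a.id = table_c.id");
    }

    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        tEnv.executeSql(sourceDdl("table_a", "a_val"));
        tEnv.executeSql(sourceDdl("table_b", "b_val"));
        tEnv.executeSql(sourceDdl("table_c", "c_val"));
        tEnv.executeSql(sinkDdl("sink_ab", "id BIGINT, a_val STRING, b_val STRING"));
        tEnv.executeSql(sinkDdl("sink_abc", "id BIGINT, a_val STRING, b_val STRING, c_val STRING"));

        // Collect all INSERTs in one StatementSet instead of running each separately.
        StatementSet set = tEnv.createStatementSet();
        for (String sql : insertStatements()) {
            set.addInsertSql(sql);
        }

        // Print the combined plan to check whether the shared join is actually reused.
        System.out.println(set.explain());

        // Submit everything as a single job and wait for the bounded sources to finish.
        set.execute().await();
    }
}
```

The explain() output is the place to check the optimizer caveat above: if the table_a/table_b join shows up once and feeds both sinks, it is being reused; if it shows up twice, it is not.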

On Fri, Sep 18, 2020 at 9:40 PM Dan Hill <[hidden email]> wrote:
I have a few results that I want to produce.
- A join B
- A join B join C
- A join B join C join D
- A join B join C join D join E

When I use the DataSet API directly, I can execute all of these in the same job to reduce redundancy.  When I use the SQL interface, it looks like separate jobs are created for each of these (duplicating join calculations).

Is there a way to merge these joins?

Re: Flink SQL - can I have multiple outputs per job?

Jark Wu-3
You got it :)

On Sun, 20 Sep 2020 at 12:59, Dan Hill <[hidden email]> wrote:
I figured it out: use a StatementSet, created with TableEnvironment.createStatementSet().

Semi-related: the query optimizer can break the reuse, depending on which tables the join IDs come from.