Flink SQL, temporal joins and backfilling data

Dan
Hi!

I have a Flink SQL job that does a few temporal joins and has been running for over a month on regular data.  No issues.  Ran well.

I'm trying to re-run the Flink SQL job on the same data set but it's failing to checkpoint and very slow to make progress.  I've modified some of the checkpoint settings.

What else do I have to modify?

My data size is really small so I'm guessing it's still keeping state for data outside the temporal join time windows.  Do I have to set Idle State Retention Time to forget older data?

- Dan

Re: Flink SQL, temporal joins and backfilling data

Timo Walther
Hi Dan,

are you sure that your watermarks are still correct during reprocessing?
As far as I know, idle state retention is not used for temporal joins.
The watermark indicates when state can be removed in this case.

Maybe you can give us some more details about which kind of temporal
join you are using (event-time or processing-time?) and checkpoint settings?

Regards,
Timo
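
To make the point about watermarks concrete, here is a minimal sketch of how an event-time temporal join might hold and release state. It is a toy model in plain Python, not Flink's operator code: the class `ToyTemporalJoin`, its methods, and the "EUR" rate data are all hypothetical names made up for illustration. The assumption it encodes is the one described above: rows from the left side wait until the watermark passes their timestamp, and old versions of the right side are dropped only once the watermark has moved past them.

```python
from bisect import bisect_right
from collections import defaultdict


class ToyTemporalJoin:
    """Toy model of an event-time temporal join (hypothetical, not Flink code)."""

    def __init__(self):
        # Left rows waiting for the watermark: (event_time, key, payload).
        self.pending_left = []
        # Versioned right side: key -> list of (version_time, value), sorted by time.
        self.versions = defaultdict(list)

    def on_left(self, event_time, key, payload):
        self.pending_left.append((event_time, key, payload))

    def on_right(self, version_time, key, value):
        self.versions[key].append((version_time, value))
        self.versions[key].sort(key=lambda v: v[0])

    def on_watermark(self, watermark):
        """Join every left row at or below the watermark, then prune state."""
        ready = sorted(
            (r for r in self.pending_left if r[0] <= watermark),
            key=lambda r: r[0],
        )
        self.pending_left = [r for r in self.pending_left if r[0] > watermark]

        joined = []
        for event_time, key, payload in ready:
            versions = self.versions.get(key, [])
            # Latest version whose time is <= the left row's event time.
            idx = bisect_right([t for t, _ in versions], event_time) - 1
            if idx >= 0:
                joined.append((event_time, key, payload, versions[idx][1]))

        # Keep the latest version at or below the watermark plus all newer ones;
        # anything older can no longer match a future (non-late) left row.
        for versions in self.versions.values():
            idx = bisect_right([t for t, _ in versions], watermark) - 1
            if idx > 0:
                del versions[:idx]
        return joined

    def state_size(self):
        return len(self.pending_left) + sum(len(v) for v in self.versions.values())


join = ToyTemporalJoin()
for t, rate in [(1, 1.10), (5, 1.12), (9, 1.15)]:
    join.on_right(t, "EUR", rate)
for t in [2, 6, 10]:
    join.on_left(t, "EUR", f"order@{t}")

size_before = join.state_size()   # nothing has been released yet
emitted = join.on_watermark(7)    # releases the rows at t=2 and t=6
size_after = join.state_size()    # the row at t=10 and the two newest versions remain
```

In this model nothing is ever released by wall-clock time. If `on_watermark` is never called with a larger value, `state_size()` only grows, no matter how small the input is.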

On 30.12.20 08:30, Dan Hill wrote:

> Hi!
>
> I have a Flink SQL job that does a few temporal joins and has been
> running for over a month on regular data.  No issues.  Ran well.
>
> I'm trying to re-run the Flink SQL job on the same data set but it's
> failing to checkpoint and very slow to make progress.  I've modified
> some of the checkpoint settings.
>
> What else do I have to modify?
>
> My data size is really small so I'm guessing it's still keeping state
> for data outside the temporal join time windows.  Do I have to set Idle
> State Retention Time to forget older data?
>
> - Dan

Re: Flink SQL, temporal joins and backfilling data

Dan
Hi Timo. Sorry for the delay. I'll reply to this thread the next time I hit this. I haven't restarted my job in 12 days. I'll check the watermarks the next time I restart.
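
One thing worth looking at when checking the watermarks: an operator with two inputs advances its own watermark to the minimum of the watermarks of its inputs. During a backfill the inputs may be read at very different speeds, so one side can race ahead while the other holds the watermark back. The sketch below is a toy model in plain Python, not Flink code; `TwoInputWatermark` and the day-numbered rows are hypothetical, and whether this is what happened in the job discussed here is not known.

```python
class TwoInputWatermark:
    """Toy model: a two-input operator's watermark is the minimum of its inputs."""

    def __init__(self):
        self.inputs = [float("-inf"), float("-inf")]

    def advance(self, input_index, watermark):
        # Watermarks never move backwards on a given input.
        self.inputs[input_index] = max(self.inputs[input_index], watermark)
        return self.current()

    def current(self):
        return min(self.inputs)


wm = TwoInputWatermark()
left_rows = list(range(1, 31))   # one left row per "day", days 1..30

wm.advance(1, 2)                 # the right input has only reached day 2
for day in left_rows:            # the left input races through the whole month
    wm.advance(0, day)

stalled = wm.current()
# Left rows above the operator watermark cannot be joined yet and stay in state.
buffered_while_stalled = sum(1 for day in left_rows if day > stalled)

wm.advance(1, 30)                # the right input catches up
caught_up = wm.current()
buffered_after = sum(1 for day in left_rows if day > caught_up)
```

Here the operator watermark stays at day 2 until the slower input catches up, so 28 of the 30 left rows would sit in state in the meantime. Once both inputs reach day 30, nothing is left buffered.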

On Tue, Jan 5, 2021 at 4:47 AM Timo Walther <[hidden email]> wrote:
> Hi Dan,
>
> are you sure that your watermarks are still correct during reprocessing?
> As far as I know, idle state retention is not used for temporal joins.
> The watermark indicates when state can be removed in this case.
>
> Maybe you can give us some more details about which kind of temporal
> join you are using (event-time or processing-time?) and checkpoint settings?
>
> Regards,
> Timo
