Flink reads data from JDBC table only on startup


Taras Moisiuk
Hi everyone!
I'm using Flink 1.12.0 with the SQL API.

I'm developing a streaming job that joins two tables and inserts the result into PostgreSQL.
There are two tables in the join:
1. A dynamic table based on a Kafka topic
2. A small lookup JDBC table

From what I can see, the Flink job reads data from the JDBC table only on startup and
then marks the task as FINISHED.
Does this mean that Flink misses all later updates to this table and that the join
reflects only the table state at startup?

My other question is how to enable checkpointing for this job. I know
that checkpointing for jobs with finished tasks is not supported at the moment, but
maybe I can keep such tasks in the RUNNING state?

Thank you!




Re: Flink reads data from JDBC table only on startup

Danny Chan-2
Hi Taras ~

There is a lookup cache for the temporal join, but it is disabled by default; see [1]. That means that, by default, Flink SQL looks up the external database for each record from the left-hand side of the join.

Did you use the temporal table join syntax or the normal stream-stream join syntax? The temporal table join uses the SYSTEM_TIME AS OF keywords; see [2].
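
For illustration, here is a minimal sketch of what the temporal (lookup) join syntax with SYSTEM_TIME AS OF could look like for the tables in this thread. The table names come from the thread; the columns `payload` and `name`, the processing-time column `proc_time`, the topic name, and all connector settings are assumptions made up for the example, not taken from the actual job.

```sql
-- Kafka-backed table. The lookup join needs a processing-time attribute
-- on the left-hand side; proc_time is a hypothetical name for it.
CREATE TABLE dynamic_kafka_table (
  id BIGINT,
  payload STRING,                 -- hypothetical column
  proc_time AS PROCTIME()         -- processing-time attribute
) WITH (
  'connector' = 'kafka',
  'topic' = 'events',                                  -- hypothetical
  'properties.bootstrap.servers' = 'localhost:9092',   -- hypothetical
  'properties.group.id' = 'example-group',             -- hypothetical
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);

-- Temporal table join: each Kafka record is joined against the JDBC table
-- as of the record's processing time.
SELECT k.id, k.payload, j.name
FROM dynamic_kafka_table AS k
JOIN jdbc_table FOR SYSTEM_TIME AS OF k.proc_time AS j
  ON k.id = j.k_id;
```

Compared with a regular join, the only change in the query is the FOR SYSTEM_TIME AS OF clause on the JDBC table.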


(In reply to Taras Moisiuk <[hidden email]>, message of Sunday, December 27, 2020, 3:13 AM.)

Re: Flink reads data from JDBC table only on startup

Taras Moisiuk
Hi Danny,

I use a regular join, and it looks like this:

SELECT
...
FROM dynamic_kafka_table k
JOIN jdbc_table j ON k.id = j.k_id


Should I set some additional conditions for this join?



Re: Flink reads data from JDBC table only on startup

Danny Chan-2
For your case, you should use the temporal table join syntax and set a refresh TTL for the lookup cache of the right-hand-side (JDBC) table.
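
As a rough sketch of the cache part, the JDBC table definition could carry the lookup cache options of the Flink JDBC connector. The column `name`, the URL, the physical table name, the credentials, and the cache sizes below are placeholders assumed for the example; as far as I know, both `lookup.cache.max-rows` and `lookup.cache.ttl` need to be set for the cache to be enabled.

```sql
-- Lookup (dimension) table backed by PostgreSQL.
CREATE TABLE jdbc_table (
  k_id BIGINT,
  name STRING                     -- hypothetical column
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:postgresql://localhost:5432/mydb',   -- hypothetical
  'table-name' = 'lookup_table',                     -- hypothetical
  'username' = 'flink_user',                         -- hypothetical
  'password' = 'secret',                             -- hypothetical
  -- Cached rows expire after the TTL, so updates in the database
  -- become visible to the join at most about one TTL later.
  'lookup.cache.max-rows' = '1000',
  'lookup.cache.ttl' = '10min'
);
```

A shorter TTL makes updates visible sooner at the cost of more queries against the database; leaving the cache off means one lookup per incoming record.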

(In reply to Taras Moisiuk <[hidden email]>, message of Monday, December 28, 2020, 7:21 PM.)