(DEPRECATED) Apache Flink User Mailing List archive.

Flink reads data from JDBC table only on startup


Taras Moisiuk

Flink reads data from JDBC table only on startup

Hi everyone!
I'm using Flink 1.12.0 with SQL API.

I'm developing a streaming job with a join and an insertion into PostgreSQL.
There are two tables in the join:
1. Dynamic table based on kafka topic
2. Small lookup JDBC table

From what I can see, the Flink job reads data from the JDBC table only on
startup and then marks the task as FINISHED.
Does this mean that Flink misses all later updates to this table and that the
join reflects only the table's state at startup?

And the other question is, how to enable checkpointing for this job? I know
that checkpointing for jobs with finished tasks is not supported now, but
maybe I can keep such tasks in RUNNING state?

Thank you!

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Danny Chan-2

Re: Flink reads data from JDBC table only on startup

Hi Taras ~

There is a lookup cache for temporal joins, but it is disabled by default; see [1]. That means that, by default, Flink SQL looks up the external database for each record from the left-hand side (LHS) of the join.

Did you use the temporal table join syntax or the normal stream-stream join syntax? The temporal table join uses the SYSTEM_TIME AS OF keywords; see [2].

[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/jdbc.html#lookup-cache

[2] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#joins
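
To illustrate the syntax difference, here is a minimal sketch of a lookup (temporal table) join. The table names are taken from the query posted later in this thread; the columns `payload` and `label`, the processing-time column `proc_time`, and every connector value are assumptions made for illustration only. The Kafka table needs a processing-time attribute, which the FOR SYSTEM_TIME AS OF clause then references.

```sql
-- Kafka-backed table (column names and connector values are assumed).
CREATE TABLE dynamic_kafka_table (
  id BIGINT,
  payload STRING,
  proc_time AS PROCTIME()  -- processing-time attribute used by the lookup join
) WITH (
  'connector' = 'kafka',
  'topic' = 'events',                                 -- hypothetical topic
  'properties.bootstrap.servers' = 'localhost:9092',  -- hypothetical broker
  'properties.group.id' = 'lookup-join-demo',         -- hypothetical group id
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);

-- Lookup join: each Kafka record triggers a lookup against jdbc_table,
-- evaluated as of the record's processing time.
SELECT k.id, k.payload, j.label
FROM dynamic_kafka_table AS k
JOIN jdbc_table FOR SYSTEM_TIME AS OF k.proc_time AS j
  ON k.id = j.k_id;
```

With a plain JOIN and no FOR SYSTEM_TIME AS OF clause, the JDBC table is instead read once as a bounded scan source, which would be consistent with the FINISHED task described in the original question.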

In reply to Taras Moisiuk <[hidden email]>, Sunday, December 27, 2020, 3:13 AM.

Taras Moisiuk

Re: Flink reads data from JDBC table only on startup

Hi Danny,

I use a regular join, and it looks like this:

SELECT ...
FROM dynamic_kafka_table k
JOIN jdbc_table j ON k.id = j.k_id

Should I set some additional conditions for this join?


Danny Chan-2

Re: Flink reads data from JDBC table only on startup

For your case, you should use the temporal table join syntax and set a refresh TTL for the lookup cache on the right-hand side (RHS) of the join.
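
As a rough sketch of the cache part of this advice, the JDBC table could be declared with the connector's lookup cache options. The column names, URL, database table name, and cache values below are assumptions for illustration, not settings from the original job; `lookup.cache.max-rows` and `lookup.cache.ttl` are the JDBC connector options that enable the cache, and both need to be set.

```sql
-- JDBC lookup table with caching enabled (all concrete values are assumed).
CREATE TABLE jdbc_table (
  k_id BIGINT,
  label STRING
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:postgresql://localhost:5432/mydb',  -- hypothetical database URL
  'table-name' = 'lookup_table',                    -- hypothetical table name
  'lookup.cache.max-rows' = '1000',  -- upper bound on cached rows
  'lookup.cache.ttl' = '10min'       -- cached rows expire after this duration
);
```

With these options, a row changed in PostgreSQL may keep being served from the cache until its TTL expires, so the TTL is a trade-off between freshness and database load. Leaving both options unset keeps the cache disabled, and every record from the Kafka side then queries the database directly.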

In reply to Taras Moisiuk <[hidden email]>, Monday, December 28, 2020, 7:21 PM.