(DEPRECATED) Apache Flink User Mailing List archive.

Re: Updating multiple database tables

Jason Sommer

Re: Updating multiple database tables

Hi Dylan,

I have a similar use case of saving updates to multiple RDBMS tables. While I'm leaning towards using multiple JDBCOutputFormats to solve the issue, I'm curious about which approach you ended up using.

Thanks,

Jason

On 2018/11/28 21:09:08, Dylan Adams <[hidden email]> wrote:

> Hello,
>
> I was hoping to get input from the Flink community about patterns for
> handling multiple dependent RDBMS updates. A textbook example would be
> order & order_line_item tables. I've encountered a few approaches to
> this problem, and I'm curious to see if there are others, and the
> benefits & drawbacks of those solutions.
>
> Multiple JDBCOutputFormats. This is possible if you use an
> application-generated primary key, such as a UUID. Drawback is that
> it's only eventually consistent.
>
> JDBC-emitting function and JDBCOutputFormat. When the primary key is
> generated by the database, the program uses a JDBC-emitting
> MapFunction to persist records for one table and retrieve its PK. The
> other table is persisted using JDBCOutputFormat. Only eventually
> consistent.
>
> JDBCOutputFormat and database-specific features. Most broadly
> supported would be stored procedures, but you could use other
> mechanisms like CTEs. Atomic; requires non-portable database
> implementations.
>
> Custom OutputFormat. Full control, allows for atomic updates at the
> cost of maintaining custom OutputFormats for each combination of
> updated tables.
>
> Has anyone seen any other approaches to this challenge?
>
> Regards,
> Dylan
>
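
For readers following the first approach in the quoted list (multiple JDBCOutputFormats with application-generated keys), here is a minimal sketch of the key-generation side only. It is not code from this thread: the class name `UuidKeyedRows`, the column order, and the order/line-item row shapes are assumptions made for illustration. The idea is that the parent key is created in the application, so the child rows can carry it before either table has been written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical helper: builds parameter rows for two separate sinks,
// one per table, linked by an application-generated UUID.
final class UuidKeyedRows {

    // Row for the parent table: (id, customer). Column layout is assumed.
    static Object[] orderRow(UUID orderId, String customer) {
        return new Object[] {orderId.toString(), customer};
    }

    // Rows for the child table: (id, order_id, sku). Each line item gets its
    // own UUID and copies the parent's UUID as its foreign key.
    static List<Object[]> lineItemRows(UUID orderId, List<String> skus) {
        List<Object[]> rows = new ArrayList<>();
        for (String sku : skus) {
            rows.add(new Object[] {UUID.randomUUID().toString(), orderId.toString(), sku});
        }
        return rows;
    }

    public static void main(String[] args) {
        UUID orderId = UUID.randomUUID();
        System.out.println(Arrays.toString(orderRow(orderId, "alice")));
        for (Object[] row : lineItemRows(orderId, List.of("sku-1", "sku-2"))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each set of rows would then go to its own output format. Because the two writes are independent, a reader of the database may briefly see line items without their order (or the reverse), which is the eventual-consistency drawback mentioned in the quoted message.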

Dylan Adams

Re: Updating multiple database tables

Jason,

I ended up using PostgreSQL’s writable CTEs. The target tables had database-generated SERIAL primary keys, so it seemed like the easiest way to keep the changes atomic.
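
For anyone finding this later, a rough sketch of what a writable-CTE insert can look like is below. This is not the statement from my job: the table names (`orders`, `order_line_item`), the columns, and the class name `WritableCteInsert` are hypothetical, and the plain-JDBC `write` method is only one assumed way to bind the parameters. The point is that the parent insert, the `RETURNING id`, and the child inserts all happen in a single SQL statement, so they succeed or fail together.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper that builds one PostgreSQL statement inserting an
// order and its line items together via a data-modifying (writable) CTE.
final class WritableCteInsert {

    // Builds the SQL for one order with the given number of line items.
    // Placeholders: 1 for the customer, then (sku, quantity) per line item.
    static String sql(int lineItemCount) {
        if (lineItemCount < 1) {
            throw new IllegalArgumentException("need at least one line item");
        }
        StringBuilder values = new StringBuilder();
        for (int i = 0; i < lineItemCount; i++) {
            if (i > 0) {
                values.append(", ");
            }
            // Explicit casts so the VALUES list has well-defined column types.
            values.append("(CAST(? AS text), CAST(? AS integer))");
        }
        return "WITH new_order AS ("
            + " INSERT INTO orders (customer) VALUES (?) RETURNING id"
            + ") "
            + "INSERT INTO order_line_item (order_id, sku, quantity) "
            + "SELECT new_order.id, item.sku, item.quantity "
            + "FROM new_order CROSS JOIN (VALUES " + values + ") AS item (sku, quantity)";
    }

    // Binds parameters in placeholder order and runs the single statement.
    // Assumes skus and quantities have the same length.
    static void write(Connection conn, String customer,
                      List<String> skus, List<Integer> quantities) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql(skus.size()))) {
            int p = 1;
            ps.setString(p++, customer);
            for (int i = 0; i < skus.size(); i++) {
                ps.setString(p++, skus.get(i));
                ps.setInt(p++, quantities.get(i));
            }
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Prints the generated statement for an order with two line items.
        System.out.println(sql(2));
    }
}
```

A query string of this shape could presumably be handed to a JDBCOutputFormat as its query, with each record supplying the parameters in the same order, but treat that wiring as an assumption to verify against your Flink version.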

Regards,

Dylan

On Thu, Jan 30, 2020 at 14:15 Jason Sommer <[hidden email]> wrote:

Hi Dylan,

I have a similar use case of saving updates to multiple RDBMS tables. While I'm leaning towards using multiple JDBCOutputFormats to solve the issue, I'm curious about which approach you ended up using.

Thanks,
Jason
