Flink JDBC: Disable auto-commit mode


Flink JDBC: Disable auto-commit mode

Papadopoulos, Konstantinos

Hi all,

We are facing an issue when integrating PostgreSQL with Flink JDBC. When a connection to a PostgreSQL database is established, it is in auto-commit mode: each SQL statement is treated as its own transaction and is committed automatically. With auto-commit enabled, the PostgreSQL driver materializes the entire result set in memory, which leads to unexpected behavior (e.g., out-of-memory errors) for large result sets. To avoid this, auto-commit must be disabled; in a plain Java application we would simply call the setAutoCommit() method of the Connection object.

So, my question is: how can we achieve this using Flink's JDBCInputFormat?

Thanks in advance,

Konstantinos
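For context, the pattern the PostgreSQL JDBC driver documentation describes for streaming a large result set in a plain Java program combines both settings. A minimal sketch (connection details, query, and table name are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "secret");

        // Both settings are required: with auto-commit enabled, the
        // PostgreSQL driver fetches the whole result set at once,
        // regardless of the fetch size.
        conn.setAutoCommit(false);

        try (Statement st = conn.createStatement()) {
            st.setFetchSize(1000); // rows per round-trip to the server
            try (ResultSet rs = st.executeQuery("SELECT id FROM large_table")) {
                while (rs.next()) {
                    process(rs.getLong(1));
                }
            }
        }
        conn.commit();
        conn.close();
    }

    private static void process(long id) { /* placeholder for real work */ }
}
```

This requires a running PostgreSQL instance and the PostgreSQL JDBC driver on the classpath, so it is a sketch of the pattern rather than a self-contained program.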


Re: Flink JDBC: Disable auto-commit mode

Rong Rong
Hi Konstantinos,

It seems that setting auto-commit is not directly possible with the current JDBCInputFormatBuilder.
However, there is a way to specify the fetch size [1] for your DB round-trips; doesn't that resolve your issue?

Similarly, JDBCOutputFormat uses a batching mode to stash rows before flushing them to the DB.
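A sketch of this suggestion with the 1.8-era builder; `setFetchSize` is the call referenced as [1], and the connection details and schema are placeholders:

```java
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.typeutils.RowTypeInfo;

public class FetchSizeSketch {
    public static JDBCInputFormat build() {
        // Row schema matching the SELECT below (placeholder columns).
        RowTypeInfo rowTypeInfo = new RowTypeInfo(
                BasicTypeInfo.LONG_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);

        return JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("org.postgresql.Driver")
                .setDBUrl("jdbc:postgresql://localhost:5432/mydb") // placeholder
                .setUsername("user")
                .setPassword("secret")
                .setQuery("SELECT id, name FROM large_table")
                .setFetchSize(1000) // hint: rows per round-trip
                .setRowTypeInfo(rowTypeInfo)
                .finish();
    }
}
```

As the rest of the thread establishes, on PostgreSQL this hint alone is not sufficient, because the connection opened by the input format remains in auto-commit mode.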

--
Rong




RE: Flink JDBC: Disable auto-commit mode

Papadopoulos, Konstantinos

Hi Rong,

We have already tried setting the fetch size, without success. According to the PostgreSQL documentation, both parameters must be set (i.e., auto-commit disabled and a fetch size specified) before the driver will stream results instead of loading them all at once.

Thanks,

Konstantinos



Re: Flink JDBC: Disable auto-commit mode

Fabian Hueske-2
Hi Konstantinos,

This sounds like a useful extension to me.
Would you like to create a Jira issue and contribute the improvement?

In the meantime, you can just fork the code of JDBCInputFormat and adjust it to your needs.

Best, Fabian
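In a fork, the change Fabian suggests could amount to one added line in openInputFormat(). The sketch below assumes the structure and field names (`dbConn`, `queryTemplate`, etc.) of the 1.8-era JDBCInputFormat; the actual source may differ:

```java
// Sketch of a forked JDBCInputFormat#openInputFormat().
// Field names are assumptions about the 1.8-era source.
@Override
public void openInputFormat() {
    try {
        Class.forName(drivername);
        dbConn = DriverManager.getConnection(dbURL, username, password);
        // The added line: disable auto-commit so the PostgreSQL driver
        // uses cursor-based fetching and honors the fetch size.
        dbConn.setAutoCommit(false);
        statement = dbConn.prepareStatement(queryTemplate, resultSetType, resultSetConcurrency);
        if (fetchSize > 0) {
            statement.setFetchSize(fetchSize);
        }
    } catch (SQLException se) {
        throw new IllegalArgumentException("open() failed: " + se.getMessage(), se);
    } catch (ClassNotFoundException cnfe) {
        throw new IllegalArgumentException("JDBC driver class not found: " + cnfe.getMessage(), cnfe);
    }
}
```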



RE: Flink JDBC: Disable auto-commit mode

Papadopoulos, Konstantinos

Hi Fabian,

Glad to hear you agree with such an improvement. Of course, I can handle it.

Best,

Konstantinos



RE: Flink JDBC: Disable auto-commit mode

Papadopoulos, Konstantinos

Hi Fabian,

I opened the following issue to track the proposed improvement:

https://issues.apache.org/jira/browse/FLINK-12198

Best,

Konstantinos



Re: Flink JDBC: Disable auto-commit mode

Fabian Hueske-2
Great, thank you!



Re: Flink JDBC: Disable auto-commit mode

Rong Rong
+1. Thanks, Konstantinos, for opening the ticket.
This would definitely be a useful feature.

--
Rong
