Re: JDBC sink in flink

har777
Hi,

Are there any examples of implementing a jdbc sink in flink using a connection pool ?

Thanks

On Tue, Jul 5, 2016 at 2:00 PM, Harikrishnan S <[hidden email]> wrote:
Hi,

Are there any examples of implementing a jdbc sink in flink using a connection pool ?

Thanks

On Tue, Jul 5, 2016 at 1:57 PM, Harikrishnan S <[hidden email]> wrote:
Hi,

Are there any examples of implementing a jdbc sink in flink using a connection pool ?

Thanks


Re: JDBC sink in flink

Flavio Pompermaier

why do you need a connection pool?

On 5 Jul 2016 11:41, "Harikrishnan S" <[hidden email]> wrote:
Hi,

Are there any examples of implementing a jdbc sink in flink using a connection pool ?

Thanks

Re: JDBC sink in flink

Stefano Bortoli
The connection will be managed by the splitManager, so there is no need to use a pool. However, if you had to, you should probably look into the establishConnection() method of the JDBCInputFormat.



2016-07-05 10:52 GMT+02:00 Flavio Pompermaier <[hidden email]>:

why do you need a connection pool?

Re: JDBC sink in flink

har777
Oh. So you mean that if I write a custom sink for a DB, I just need to create one connection in the open() method, and then the invoke() method will reuse it? Basically, I need to do 35k-50k+ upserts in Postgres. Can I reuse JDBCOutputFormat for this purpose? I couldn't find proper documentation describing how sinks work in Flink.



On Tue, Jul 5, 2016 at 2:41 PM, Stefano Bortoli <[hidden email]> wrote:
The connection will be managed by the splitManager, so there is no need to use a pool. However, if you had to, you should probably look into the establishConnection() method of the JDBCInputFormat.



Re: JDBC sink in flink

har777
In reply to this post by Flavio Pompermaier
The basic idea was that I would create a pool of connections in the open() method in a custom sink and each invoke() method gets one connection from the pool and does the upserts needed. I might have misunderstood how sinks work in flink though.

On Tue, Jul 5, 2016 at 2:22 PM, Flavio Pompermaier <[hidden email]> wrote:

why do you need a connection pool?

Re: JDBC sink in flink

Chesnay Schepler
Hello,

An instance of the JDBCOutputFormat will use a single connection to send all values.

Essentially:
- open(...) is called at the very start to create the connection
- then all invoke/writeRecord calls are executed (using the same connection)
- then close() is called to clean up.

The total number of connections made to the database depends on the parallelism of the sink, as every parallel instance creates its own connection.
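
To make the lifecycle above concrete, here is a minimal plain-Java sketch. It does not use Flink or a real JDBC driver: FakeDatabase, FakeConnection, SingleConnectionSink and run are hypothetical names invented for illustration, and the parallel instances are simulated one after another rather than on separate threads. It only shows the open / writeRecord / close order and that the number of connections follows the parallelism, not the number of records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SinkLifecycleSketch {

    // Hypothetical in-memory stand-in for the database; not a real JDBC driver.
    static class FakeDatabase {
        int connectionsOpened = 0;
        final List<String> rows = new ArrayList<>();

        FakeConnection connect() {
            connectionsOpened++;
            return new FakeConnection(this);
        }
    }

    // Hypothetical connection that appends rows to the fake database.
    static class FakeConnection {
        private final FakeDatabase db;
        private boolean closed = false;

        FakeConnection(FakeDatabase db) { this.db = db; }

        void write(String row) {
            if (closed) throw new IllegalStateException("connection already closed");
            db.rows.add(row);
        }

        void close() { closed = true; }
    }

    // Hypothetical sink instance following the open -> writeRecord -> close order.
    static class SingleConnectionSink {
        private final FakeDatabase db;
        private FakeConnection connection;

        SingleConnectionSink(FakeDatabase db) { this.db = db; }

        void open() { connection = db.connect(); }               // once, at the start
        void writeRecord(String row) { connection.write(row); }  // reuses the same connection
        void close() { connection.close(); }                     // once, at the end
    }

    // Simulates `parallelism` sink instances (sequentially, for simplicity) and
    // distributes the records across them round-robin.
    static FakeDatabase run(int parallelism, List<String> records) {
        FakeDatabase db = new FakeDatabase();
        List<SingleConnectionSink> sinks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            SingleConnectionSink sink = new SingleConnectionSink(db);
            sink.open();
            sinks.add(sink);
        }
        for (int i = 0; i < records.size(); i++) {
            sinks.get(i % parallelism).writeRecord(records.get(i));
        }
        for (SingleConnectionSink sink : sinks) {
            sink.close();
        }
        return db;
    }

    public static void main(String[] args) {
        FakeDatabase db = run(4, Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h"));
        System.out.println("rows written: " + db.rows.size());
        System.out.println("connections opened: " + db.connectionsOpened);
    }
}
```

With a parallelism of 4, the sketch opens four connections no matter how many records are written, which is why a pool inside each instance adds little here.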

Regards,
Chesnay

On 05.07.2016 12:04, Harikrishnan S wrote:
The basic idea was that I would create a pool of connections in the open() method in a custom sink and each invoke() method gets one connection from the pool and does the upserts needed. I might have misunderstood how sinks work in flink though.

Re: JDBC sink in flink

har777
Ah, that makes sense. Also, what's the difference between a RichOutputFormat and a RichSinkFunction? Can I use JDBCOutputFormat as a sink in a stream?

On Tue, Jul 5, 2016 at 3:53 PM, Chesnay Schepler <[hidden email]> wrote:
Hello,

An instance of the JDBCOutputFormat will use a single connection to send all values.

Essentially:
- open(...) is called at the very start to create the connection
- then all invoke/writeRecord calls are executed (using the same connection)
- then close() is called to clean up.

The total number of connections made to the database depends on the parallelism of the sink, as every parallel instance creates its own connection.

Regards,
Chesnay


Re: JDBC sink in flink

Chesnay Schepler
They serve a similar purpose.

OutputFormats originate from the Batch API, whereas SinkFunctions are a Streaming API concept.

You can, however, use OutputFormats in the Streaming API via DataStream#writeUsingOutputFormat.
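
One way to picture why an OutputFormat can stand in for a sink is a thin adapter that forwards each sink call to the wrapped format. The sketch below is plain Java and only illustrates that idea: SimpleOutputFormat, SimpleSink, OutputFormatSink and RecordingFormat are hypothetical, simplified names and are not Flink's actual interfaces or its implementation of writeUsingOutputFormat.

```java
import java.util.ArrayList;
import java.util.List;

class OutputFormatAdapterSketch {

    // Hypothetical, simplified batch-style interface (not Flink's OutputFormat).
    interface SimpleOutputFormat<T> {
        void open();
        void writeRecord(T record);
        void close();
    }

    // Hypothetical, simplified streaming-style interface (not Flink's SinkFunction).
    interface SimpleSink<T> {
        void open();
        void invoke(T value);
        void close();
    }

    // Adapter: every sink call is forwarded to the wrapped output format.
    static class OutputFormatSink<T> implements SimpleSink<T> {
        private final SimpleOutputFormat<T> format;

        OutputFormatSink(SimpleOutputFormat<T> format) { this.format = format; }

        @Override public void open() { format.open(); }
        @Override public void invoke(T value) { format.writeRecord(value); }
        @Override public void close() { format.close(); }
    }

    // Toy output format that records the calls it receives, in order.
    static class RecordingFormat implements SimpleOutputFormat<String> {
        final List<String> calls = new ArrayList<>();

        @Override public void open() { calls.add("open"); }
        @Override public void writeRecord(String record) { calls.add("write:" + record); }
        @Override public void close() { calls.add("close"); }
    }

    // Drives the adapter the way a streaming runtime might: open, invoke per element, close.
    static List<String> demo() {
        RecordingFormat format = new RecordingFormat();
        SimpleSink<String> sink = new OutputFormatSink<>(format);
        sink.open();
        sink.invoke("x");
        sink.invoke("y");
        sink.close();
        return format.calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this picture, the format's writeRecord plays the role that invoke plays for a sink, which matches the "invoke/writeRecord" pairing mentioned earlier in the thread.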

Regards,
Chesnay

On 05.07.2016 12:51, Harikrishnan S wrote:
Ah, that makes sense. Also, what's the difference between a RichOutputFormat and a RichSinkFunction? Can I use JDBCOutputFormat as a sink in a stream?

Re: JDBC sink in flink

har777
Awesome! Thanks a lot! I should probably write a blog post somewhere explaining this.

On Tue, Jul 5, 2016 at 4:28 PM, Chesnay Schepler <[hidden email]> wrote:
They serve a similar purpose.

OutputFormats originate from the Batch API, whereas SinkFunctions are a Streaming API concept.

You can, however, use OutputFormats in the Streaming API via DataStream#writeUsingOutputFormat.

Regards,
Chesnay


Re: JDBC sink in flink

Stefano Bortoli
As Chesnay said, it is not necessary to use a pool, as the connection is reused across splits. However, if you had to customize it for some reason, you can do so starting from the JDBC input and output formats.

cheers!

2016-07-05 13:27 GMT+02:00 Harikrishnan S <[hidden email]>:
Awesome! Thanks a lot! I should probably write a blog post somewhere explaining this.
