Can JDBCSinkFunction support exactly once?

Jocean shi
Hi,
Can JDBCSinkFunction support exactly once? JDBCSinkFunction doesn't implement CheckpointListener. Does that mean it doesn't support exactly once?

cheers

Jocean
Re: Can JDBCSinkFunction support exactly once?

Dominik Wosiński
Hey, 

As far as I know, the function needs to extend TwoPhaseCommitSinkFunction rather than just implement CheckpointListener. JDBCSinkFunction does not implement two-phase commit, so it currently does not support exactly once.

Best Regards,
Dom.

On Wed, 21 Nov 2018 at 11:07, Jocean shi <[hidden email]> wrote:
Hi,
Can JDBCSinkFunction support exactly once? JDBCSinkFunction doesn't implement CheckpointListener. Does that mean it doesn't support exactly once?

cheers

Jocean
Re: Can JDBCSinkFunction support exactly once?

Fabian Hueske-2
Hi,

JDBCSinkFunction is a simple wrapper around the JDBCOutputFormat (the DataSet / Batch API output interface).
Dominik is right that JDBCSinkFunction does not support exactly-once output.

It is not strictly required that an exactly-once sink extends TwoPhaseCommitSinkFunction.
TwoPhaseCommitSinkFunction is a convenience base class that internally uses the CheckpointListener interface. It makes exactly-once sinks easier to implement, but it is not the only way to do so.
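
As a rough illustration of how a checkpoint-driven two-phase commit fits together, here is a stand-alone Java model. It does not use any Flink classes: the class CheckpointedTwoPhaseSink and its fields are hypothetical, and the method names are only borrowed from the checkpoint callbacks to make the correspondence visible. Treat it as a sketch of the idea, not as how Flink implements it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Stand-alone model of a checkpoint-driven two-phase commit sink.
// Hypothetical names throughout; no Flink dependency.
class CheckpointedTwoPhaseSink {
    // Records written since the last checkpoint (the currently open transaction).
    private List<String> open = new ArrayList<>();
    // Pre-committed transactions, keyed by the checkpoint that closed them.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    // Stand-in for the external system: only committed records are visible here.
    final List<String> committed = new ArrayList<>();

    // Called for every record; nothing becomes visible yet.
    void invoke(String record) {
        open.add(record);
    }

    // Phase 1: a checkpoint is taken. The open transaction is pre-committed
    // and a fresh one is started for the records that follow.
    void snapshotState(long checkpointId) {
        pending.put(checkpointId, open);
        open = new ArrayList<>();
    }

    // Phase 2: the checkpoint is confirmed. Commit every pending
    // transaction up to and including that checkpoint.
    void notifyCheckpointComplete(long checkpointId) {
        Map<Long, List<String>> ready = pending.headMap(checkpointId, true);
        for (List<String> txn : ready.values()) {
            committed.addAll(txn);
        }
        ready.clear(); // headMap is a view, so this also removes them from 'pending'
    }

    // Simulates a failure before the next checkpoint is confirmed:
    // unconfirmed work is rolled back and would be replayed by the source.
    void rollbackUnconfirmed() {
        pending.clear();
        open = new ArrayList<>();
    }
}
```

In this model, writing "a" and "b", taking checkpoint 1 and then writing "c" leaves nothing visible until notifyCheckpointComplete(1) is called, at which point only "a" and "b" are committed. The simplification to keep in mind is the rollback: a real sink also has to deal with transactions that were pre-committed for a checkpoint that did complete but whose notification was lost.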

Implementing an exactly-once JDBC sink by extending TwoPhaseCommitSinkFunction would require being able to recover (and commit) an open transaction after a task has been restarted.
I am not sure whether the JDBC interface supports this.
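
One place to look is the XA part of the Java platform: the javax.transaction.xa.XAResource interface has a recover method that lists transaction branches that were prepared but not yet completed, and JDBC drivers that provide a javax.sql.XADataSource expose such a resource. Whether a particular driver and database support this reliably is an assumption that would need checking. The helper below, with the hypothetical name commitPrepared, is only a sketch of the recovery step.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Sketch only: commits transaction branches that were prepared before a restart.
class XaRecoverySketch {
    // Returns the number of recovered branches that were committed.
    static int commitPrepared(XAResource resource) throws XAException {
        // Ask the resource manager for all prepared (in-doubt) branches in a single scan.
        Xid[] prepared = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        int count = 0;
        for (Xid xid : prepared) {
            // onePhase = false: this is the second phase of a two-phase commit.
            // Assumption: every returned branch belongs to this sink and to a
            // completed checkpoint; a real sink would have to filter by its own Xids.
            resource.commit(xid, false);
            count++;
        }
        return count;
    }
}
```

The filtering mentioned in the comment matters in practice, because recover reports the in-doubt branches known to the resource manager, not only those created by one particular task.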

Best, Fabian

On Wed, 21 Nov 2018 at 11:27, Dominik Wosiński <[hidden email]> wrote:
Hey, 

As far as I know, the function needs to extend TwoPhaseCommitSinkFunction rather than just implement CheckpointListener. JDBCSinkFunction does not implement two-phase commit, so it currently does not support exactly once.

Best Regards,
Dom.

Re: Can JDBCSinkFunction support exactly once?

Jocean shi
Hi,
Thanks for the help.
I have found that many sinks don't support exactly once in streaming. I need exactly-once HBase and MySQL sinks for my work.
I will try to contribute to the HBase sink first.
 
Best Regards,
Jocean

On Wed, 21 Nov 2018 at 6:34 PM, Fabian Hueske <[hidden email]> wrote:
Hi,

JDBCSinkFunction is a simple wrapper around the JDBCOutputFormat (the DataSet / Batch API output interface).
Dominik is right that JDBCSinkFunction does not support exactly-once output.

It is not strictly required that an exactly-once sink extends TwoPhaseCommitSinkFunction.
TwoPhaseCommitSinkFunction is a convenience base class that internally uses the CheckpointListener interface. It makes exactly-once sinks easier to implement, but it is not the only way to do so.

Implementing an exactly-once JDBC sink by extending TwoPhaseCommitSinkFunction would require being able to recover (and commit) an open transaction after a task has been restarted.
I am not sure whether the JDBC interface supports this.

Best, Fabian
