Bucketing/Rolling Sink: How to overwrite method "openNewPartFile" - to append a new timestamp to part file path every time a new part file is being created


Raja.Aravapalli

 

Hi,

 

I want to override the method “openNewPartFile” in the BucketingSink class so that the part file name includes a timestamp whenever the sink rolls over to a new part file.

 

Can someone share some thoughts on how I can do this?

 

Thanks a ton, in advance.

 

 

Regards,

Raja.


Re: Bucketing/Rolling Sink: How to overwrite method "openNewPartFile" - to append a new timestamp to part file path every time a new part file is being created

Kostas Kloudas
Hi Raja,

Since you know about the method, I assume you have looked at the source code of the sink.
I think that including the timestamp of the element in the file path is not as easy as overriding openNewPartFile.

The reason is that the filenames serve as identities for the associated state of the bucket, and the sink
checks for exact equality of the filename, rather than “contains()”, when looking up part filenames to transition
them from pending to finished state.
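
To illustrate why exact matching matters, here is a toy sketch in plain Java. It is not code from the sink: the class, the method names, and the file names are all made up for the example. It only shows that a name with an extra timestamp suffix no longer equals the name that was recorded, even though a substring check would still find it.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not the sink's actual bookkeeping.
public class ExactNameMatch {

    // Exact-equality lookup, the kind of check described above.
    static boolean matchesExactly(List<String> trackedNames, String name) {
        for (String tracked : trackedNames) {
            if (tracked.equals(name)) {
                return true;
            }
        }
        return false;
    }

    // Substring lookup, shown only for contrast.
    static boolean matchesBySubstring(List<String> trackedNames, String name) {
        for (String tracked : trackedNames) {
            if (name.contains(tracked)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Name recorded in state (hypothetical).
        List<String> tracked = Arrays.asList("part-0-0");
        // Name on disk if a timestamp were appended when opening the file.
        String renamed = "part-0-0-1507330920000";

        System.out.println(matchesExactly(tracked, renamed));     // false
        System.out.println(matchesBySubstring(tracked, renamed)); // true
    }
}
```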

A way to bypass this is to write, along with each element, its timestamp, so that when you inspect the content of the
file, you can see the timestamp of the first element. You will have to write more data, though.
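
A minimal sketch of that idea in plain Java, without any Flink dependency. The record layout (epoch milliseconds, a tab, then the payload, one record per line) and all names here are assumptions made for the example. In an actual job the same formatting would be done by whatever writer the sink is configured with.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Sketch only: hypothetical record layout "<epochMillis>\t<payload>\n".
public class TimestampedRecords {

    // Writes one element prefixed with its timestamp.
    static void writeRecord(Writer out, long timestampMillis, String payload) throws IOException {
        out.write(Long.toString(timestampMillis));
        out.write('\t');
        out.write(payload);
        out.write('\n');
    }

    // Recovers the timestamp of the first element from the file content.
    static long firstTimestamp(BufferedReader in) throws IOException {
        String line = in.readLine();
        if (line == null) {
            throw new IOException("empty part file");
        }
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IOException("no timestamp prefix in: " + line);
        }
        return Long.parseLong(line.substring(0, tab));
    }

    public static void main(String[] args) throws IOException {
        // An in-memory java.io.StringWriter stands in for a part file.
        StringWriter partFile = new StringWriter();
        writeRecord(partFile, 1507330920000L, "first element");
        writeRecord(partFile, 1507330925000L, "second element");

        BufferedReader reader = new BufferedReader(new StringReader(partFile.toString()));
        System.out.println(firstTimestamp(reader)); // 1507330920000
    }
}
```

The cost is the one mentioned above: every record carries a few extra bytes, but the part file names stay exactly as the sink expects them.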

Does this fit your needs?

Kostas

On Oct 6, 2017, at 11:02 PM, Raja.Aravapalli <[hidden email]> wrote:

 
Hi,
 
I want to override the method “openNewPartFile” in the BucketingSink class so that the part file name includes a timestamp whenever the sink rolls over to a new part file.
 
Can someone share some thoughts on how I can do this?
 
Thanks a ton, in advance. 
 
 
Regards,
Raja.