Re: Architecture question

Posted by Fabian Hueske-2 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Architecture-question-tp18341p18407.html

Hi,

What you are looking for is a BucketingSink that works on event time (the timestamp is encoded in your data).
AFAIK, Flink's BucketingSink was designed to work in processing time, but you can implement a custom Bucketer that assigns each record to a bucket based on a timestamp in the data.
You will probably need to experiment with the parameters that control when open buckets are closed to get good behavior (similar to watermark tuning): close them too early and late records reopen a bucket, close them too late and files stay in progress for a long time.
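
As a rough sketch of the bucketing logic only (plain Java, no Flink dependencies), the helper below maps an event timestamp to a day bucket such as 20180103. The class and method names are made up for illustration, and it assumes the date time field has already been extracted from the Avro record as epoch milliseconds and that days roll over in UTC. A custom Bucketer could call something like this from its getBucketPath method; please check the exact interface against your Flink version.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Flink: maps an event timestamp to a bucket name.
final class EventTimeBuckets {
    // UTC is an assumption; use the zone in which your daily files should roll over.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    private EventTimeBuckets() {}

    /** Returns the day bucket (e.g. 20180103) for an event timestamp in epoch milliseconds. */
    static String bucketFor(long eventTimeMillis) {
        return DAY.format(Instant.ofEpochMilli(eventTimeMillis));
    }

    /** Joins a base directory and the day bucket into the directory for this record. */
    static String bucketPath(String basePath, long eventTimeMillis) {
        // Drop a single trailing slash so the result has exactly one separator.
        String base = basePath.endsWith("/")
                ? basePath.substring(0, basePath.length() - 1)
                : basePath;
        return base + "/" + bucketFor(eventTimeMillis);
    }
}
```

Because the bucket depends only on the record and not on the wall clock, a late record still lands in the directory for its own day. That is also why the settings for closing inactive buckets matter: a bucket that was closed too early gets reopened when a late record arrives.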

Best, Fabian

2018-02-14 22:18 GMT+01:00 robert <[hidden email]>:
I need to grab Avro data from a Kafka topic and write it to the local file
system.

Inside the Avro record there is a date time field. I need to name the file
based on that field (20180103, as an example).


I was thinking of using Flink to read and unpack this generic record, then pass
it to a sink that sorts the records so that each one goes into the right file.

Does anyone have a high-level approach for this?

The bucketing sink looks promising. Are there any examples of Flink solving
this type of problem?

Thanks


