Kafka Sinking from DataSet

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Kafka Sinking from DataSet

Jonny Graham

Hi,

I'm using HadoopInputs.readHadoopFile() to read a Parquet file, which gives me a DataSource which (as far as I can see) is basically a DataSet.

I want to write data from this source into Kafka, but the Kafka sink only works on a DataStream.

 

There's no easy way to convert my DataSet to a DataStream.

 

What's the canonical way to do this in Flink? I could just use a Kafka producer directly on my data (map on the DataSet?), or I could output to a file (after manipulating my dataset) and use something other than Flink to load the data into Kafka. I'm not sure as to what the correct approach is.

 

Ideally I'd like to get exactly-once semantics but I doubt that can be done easily since I will be writing a lot of events to Kafka

 

Thanks,

Jonny


Confidentiality: This communication and any attachments are intended for the above-named persons only and may be confidential and/or legally privileged. Any opinions expressed in this communication are not necessarily those of NICE Actimize. If this communication has come to you in error you must take no action based on it, nor must you copy or show it to anyone; please delete/destroy and inform the sender by e-mail immediately. 
Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
Viruses: Although we have taken steps toward ensuring that this e-mail and attachments are free from any virus, we advise that in keeping with good computing practice the recipient should ensure they are actually virus free.


Confidentiality: This communication and any attachments are intended for the above-named persons only and may be confidential and/or legally privileged. Any opinions expressed in this communication are not necessarily those of NICE Actimize. If this communication has come to you in error you must take no action based on it, nor must you copy or show it to anyone; please delete/destroy and inform the sender by e-mail immediately. 
Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
Viruses: Although we have taken steps toward ensuring that this e-mail and attachments are free from any virus, we advise that in keeping with good computing practice the recipient should ensure they are actually virus free.