data conversion between flink and "other" paradigms

Bill Sparks

Is there any prior art for this? Suppose someone wants to use Flink for processing but at some point needs to call another function, say via JNI/C, that doesn't understand DataSets. How would one go about this? I'm assuming the code would have to convert the data to a common format before calling the function.

Regards,

   Bill.

Re: data conversion between flink and "other" paradigms

Fabian Hueske
Hi Bill,

A DataSet is just a logical concept in Flink. DataSets are often not persisted; their records are simply streamed from one operator to the next. At the moment, there is no way to access an intermediate DataSet of a Flink program directly (this might change in the future).

You can process data in another function by implementing a Java user function (for example a MapPartition function) and sending the data through JNI to a C function. If the C function needs to see the full data set in one call, you must set the parallelism to 1, because otherwise each parallel instance only sees its own partition. Flink's Python API follows a similar approach to ship data from Flink to an external Python process.
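
As a rough illustration of the "common format" step, here is a minimal sketch that packs the records of one partition into a direct, native-order buffer, which a C function could read as a plain double array (on the C side, JNI's GetDirectBufferAddress gives access to such a buffer). It deliberately avoids any Flink dependency so it runs on its own. The class NativeBatch and the methods pack, sumNative, and sumStandIn are hypothetical names invented for this sketch, and the record type (doubles) is an assumption. Inside a real MapPartition function, the Iterable handed to mapPartition would first have to be collected so that the record count is known.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch only: converts one partition's records into a flat buffer that
 * native code can read without knowing anything about Flink types.
 * All names here are hypothetical and not part of any Flink API.
 */
final class NativeBatch {

    /**
     * Copies the values into a direct buffer laid out like a C double[].
     * Throws BufferOverflowException if more than 'count' values are supplied.
     */
    static ByteBuffer pack(Iterable<Double> values, int count) {
        ByteBuffer buf = ByteBuffer.allocateDirect(count * Double.BYTES)
                                   .order(ByteOrder.nativeOrder());
        for (double v : values) {
            buf.putDouble(v);
        }
        buf.flip(); // position back to 0, limit at the end of the written data
        return buf;
    }

    // Hypothetical JNI entry point; a real program would declare it like this
    // and load the C library with System.loadLibrary(...):
    // static native double sumNative(ByteBuffer data, int count);

    /** Pure-Java stand-in for the C function so the sketch runs without a native library. */
    static double sumStandIn(ByteBuffer data, int count) {
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += data.getDouble(i * Double.BYTES); // absolute read, like data[i] in C
        }
        return total;
    }

    public static void main(String[] args) {
        // Stands in for the records of a single partition.
        List<Double> partition = Arrays.asList(1.5, 2.5, 4.0);
        ByteBuffer buf = pack(partition, partition.size());
        System.out.println(sumStandIn(buf, partition.size()));
    }
}
```

A direct buffer is used here because native code can read it in place; passing a double[] through JNI would also work but may involve a copy, depending on the JVM.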

Best, Fabian



In reply to Bill Sparks <[hidden email]>, 2015-07-06 9:30 GMT+02:00


RE: data conversion between flink and "other" paradigms

Bill Sparks

Fabian,

Thanks for the info and the pointer to the Python API. I'll check it out.

-Bill


In reply to Fabian Hueske [[hidden email]], sent Monday, July 06, 2015 3:23 AM