Regarding the use of RichAsyncFunction as a FlatMap

Regarding the use of RichAsyncFunction as a FlatMap

Konstantinos Barmpis
I was wondering if there is a way to create an asynchronous flatmap function in Flink.

As far as I am aware, the asynchronous function only accepts a single result future as its return (which can be the aggregate list of the flatmap, but then partiality is lost as we have to wait for all of the results to be aggregated before returning).
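To make the first option concrete, here is a minimal sketch in plain Java (no Flink dependency) of fanning one input out into several asynchronous calls and combining them into a single future that holds the whole list. The class name, the `query` helper and the string results are hypothetical stand-ins. In an actual Flink `AsyncFunction`, the final callback is where I would expect the combined collection to be handed to the result future; that part is only described in a comment here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class AggregatedAsyncSketch {

    // Hypothetical stand-in for one asynchronous lookup (e.g. a remote query).
    static CompletableFuture<String> query(String input, String suffix) {
        return CompletableFuture.supplyAsync(() -> input + "-" + suffix);
    }

    // Fans one input out into several queries and combines them into ONE future.
    // Nothing is visible to the caller until every partial result has arrived,
    // which is exactly the loss of "partiality" described above.
    static CompletableFuture<List<String>> flatMapAsync(String input) {
        List<CompletableFuture<String>> parts =
                Arrays.asList(query(input, "B"), query(input, "C"));
        return CompletableFuture
                .allOf(parts.toArray(new CompletableFuture[0]))
                // All parts are complete here, so join() does not block.
                .thenApply(ignored -> parts.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        // Assumption: inside a Flink async function, this callback would pass
        // the list to the single result future instead of printing it.
        flatMapAsync("A").thenAccept(System.out::println).join(); // prints [A-B, A-C]
    }
}
```

The results keep the order in which the calls were started, not the order in which they finished, because the list is rebuilt from `parts` once everything has completed.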

Alternatively, using a normal FlatMap seems to require a way to keep the current instance alive as it waits for the results and emits them one at a time, as otherwise Flink may consider it terminated (as the function call returns immediately) and close the channel.

Is there some third option I am unaware of, or a way of using one of the above two differently?

Cheers.




--
Konstantinos Barmpis | Research Associate
White Rose Grid Enterprise Systems Group
Dept. of Computer Science
University of York
Tel: +44 (0) 1904-32 5653
 
RE: Regarding the use of RichAsyncFunction as a FlatMap

Martin, Nick

Regarding the ‘partiality’ you mention in option one, wouldn’t you have to give that up anyway to maintain exactly-once processing? Suppose input message A results in asynchronous queries/futures B and C, and imagine the following series of events:

1. Your function receives A.
2. Asynchronous calls start for B and C.
3. B completes, and its result is immediately emitted.
4. Your job is restarted before C returns.

 

How would Flink restart the job without either dropping C or duplicating B?  I think the only answer is for the function to hold on to results generated by A until they have all completed, then emit them all at once, and I have a suspicion that under the hood, that’s what a flat map is doing.
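The scenario can be played through with a toy model in plain Java. This is only a sketch of the argument, not Flink's actual recovery mechanism: it assumes that input A is replayed after the restart and that anything already emitted has reached a sink that is not rolled back. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ReplaySketch {

    // One processing attempt for an input that yields results B and C.
    // emitEagerly: write each result to the sink as soon as it is ready.
    // failBeforeC: simulate the job being restarted before C returns.
    // Returns true if the attempt completed.
    static boolean attempt(String input, List<String> sink,
                           boolean emitEagerly, boolean failBeforeC) {
        List<String> buffer = new ArrayList<>();
        List<String> target = emitEagerly ? sink : buffer;
        target.add(input + "->B");
        if (failBeforeC) {
            return false; // anything held only in the buffer is lost with the attempt
        }
        target.add(input + "->C");
        if (!emitEagerly) {
            sink.addAll(buffer); // emit all results for this input at once
        }
        return true;
    }

    // Runs one failed attempt followed by a replay of the same input.
    static List<String> runWithOneRestart(boolean emitEagerly) {
        List<String> sink = new ArrayList<>();
        if (!attempt("A", sink, emitEagerly, true)) {
            attempt("A", sink, emitEagerly, false); // replay after restart
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(runWithOneRestart(true));  // prints [A->B, A->B, A->C]
        System.out.println(runWithOneRestart(false)); // prints [A->B, A->C]
    }
}
```

With eager emission B shows up twice after the replay; holding the results back until both have completed avoids the duplicate, at the cost of the partial output.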

 

From: Konstantinos Barmpis [mailto:[hidden email]]
Sent: Thursday, July 12, 2018 8:42 AM
To: [hidden email]
Subject: Regarding the use of RichAsyncFunction as a FlatMap

 
