Re: Updating external service and then processing response

Posted by Michael Latta on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Updating-external-service-and-then-processing-response-tp19851p19859.html

If the external web service call does not modify the state of that external system, all the approaches you list are probably OK.  If it does modify external state, then you want to ensure that on restart the Flink job either does not resend requests to that service or that the service can handle duplicate requests.  In that sense a sink that sends the request is the cleanest, as it represents an export of data to an external system.  The response is then used only to let the sink avoid repeating messages.  If you are sending data that affects the system's state and the response contains values that need to be recorded as results, not just control values, then that could be a separate flow, or you could use the map approach, since from Flink's point of view it is a transform.  This latter case, however, smacks to me of an undesirable side effect, and side effects make error recovery harder.
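
To make the "do not resend on restart" idea concrete, here is a minimal sketch in plain Java. It is not Flink API: IdempotentSender, the in-memory map, and the Function standing in for the web service are all hypothetical names chosen for illustration. In a real job the record of already-sent messages would have to survive a restart, for example in the DB table or in checkpointed keyed state, and there is still a window between the service call and recording the result, so the service itself ideally tolerates duplicates as well.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not Flink API: models a sink that remembers which
// messages it has already sent, so a replay after restart does not resend them.
class IdempotentSender {
    // messageId -> external key returned by the service. Assumption: in a real
    // job this would live in the DB table or in checkpointed state, not in memory.
    private final Map<String, String> sentKeys = new HashMap<>();
    // Stand-in for the web service call: takes a payload, returns the external key.
    private final Function<String, String> service;
    private int callCount = 0;

    IdempotentSender(Function<String, String> service) {
        this.service = service;
    }

    // Returns the external key, calling the service only the first time an id is seen.
    String send(String messageId, String payload) {
        String existing = sentKeys.get(messageId);
        if (existing != null) {
            return existing; // replayed message: skip the external side effect
        }
        callCount++;
        String externalKey = service.apply(payload);
        sentKeys.put(messageId, externalKey);
        return externalKey;
    }

    // Number of times the (simulated) external service was actually invoked.
    int callCount() {
        return callCount;
    }
}

class IdempotentSenderDemo {
    public static void main(String[] args) {
        IdempotentSender sender = new IdempotentSender(p -> "key-for-" + p);
        System.out.println(sender.send("msg-1", "a")); // first delivery calls the service
        System.out.println(sender.send("msg-1", "a")); // replay returns the stored key
        System.out.println(sender.callCount());        // service was invoked once
    }
}
```

The response (the external key) is used here exactly as a control value: it is what lets the sender recognize a replayed message and skip the call.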

Michael

> On Apr 28, 2018, at 8:21 PM, wazza <[hidden email]> wrote:
>
> Hi all,
>
> I need to send a request to an external web service and then store the response in a DB table, and I am wondering how people have approached this or similar problems in the past.
>
> The flow is: Kafka source (msgs only every few seconds)  => filter/map operators => result sent to web service (which updates state in that system) => response stored in DB.
>
> Initially I was thinking of just creating a custom sink which basically: Sends request to webservice  => Get response containing external key => Save key into DB
> This feels to me like basically smashing together 2 separate sinks into 1, and I am not sure if that is a good design or not.
>
> Another option would be to create a RichMapFunction (possibly async function) which does the web service call. My map function can then just return the response which I can then feed into a standard DB sink.
> However, with this approach it feels strange to update an external system in a map() function, but maybe that's ok? Also, I presume to make my map function idempotent I would need to store some state (I can key the messages and use a ValueState) so I don't do duplicate web service calls if there is a failure?
>
> Thoughts?
>
> Thanks in advance.
>