StateFun feedback operator

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

StateFun feedback operator

Martijn de Heus
Hi all,

Thanks a lot for the development of StateFun. It is very cool.

I don’t understand how the feedback operator works exactly and I want to understand how this works because when benchmarking my system the internal messages seem to be a lot slower as opposed to directly invoking a function from an ingress and outputting to an egress and I want to understand why this is.

Specific questions I have are the following:

* How is the feedback stream exactly merged with the input stream from ingresses? What ordering and time is used to merge the two streams? (Also, when is this done?)
* How exactly does this mechanism work with checkpointing?

I tried to find specific documentation on this mechanism, but I could not find any. Could you point me in the right direction for some documentation?

Kind regards,

Martijn de Heus
Reply | Threaded
Open this post in threaded view
|

Re: StateFun feedback operator

Igal Shilman
Hi Martijn,

I'm glad you like it! and we are always happy to learn about new use cases :)

* How is the feedback stream exactly merged with the input stream from ingresses?

First, I'd like to refer you to this talk, that has a peek under the hood part[1] (it starts at ~26min) where I outline how StateFun concepts like ingress, egress, and user functions are mapped to
the Flink streaming graph. I would be happy to follow up if you will have further questions :)

* What ordering and time is used to merge the two streams? (Also, when is this done?)

If you are referring to watermarks, then StateFun currently supports processing time only, and there is no explicit merge of these two streams,
but rather, two separate input channels (main input, and the feedback channel) might enqueue work for the thread that invokes the user code. (very roughly speaking)

(Also, when is this done?)
As soon as there is any input from either the main input (where messages from an ingress might arrive) or messages received from the feedback loop.

* How exactly does this mechanism work with checkpointing?
  The algorithm still uses checkpoint barriers, as received from Flink, but it would also include in the checkpoint, all the elements that are currently in the feedback loop.

I hope this helps,
Igal.



On Fri, Jan 15, 2021 at 9:51 AM Martijn de Heus <[hidden email]> wrote:
Hi all,

Thanks a lot for the development of StateFun. It is very cool.

I don’t understand how the feedback operator works exactly and I want to understand how this works because when benchmarking my system the internal messages seem to be a lot slower as opposed to directly invoking a function from an ingress and outputting to an egress and I want to understand why this is.

Specific questions I have are the following:

* How is the feedback stream exactly merged with the input stream from ingresses? What ordering and time is used to merge the two streams? (Also, when is this done?)
* How exactly does this mechanism work with checkpointing?

I tried to find specific documentation on this mechanism, but I could not find any. Could you point me in the right direction for some documentation?

Kind regards,

Martijn de Heus