I'm not sure I fully understand the scenario you envision. Are you
saying you want to have some sort of window that batches (and
deduplicates) up until a downstream map has finished processing the
previous deduplicated batch, and then the window should emit the new
batch?
If that's what you want, then I would say no, that implementation
strategy is not a good fit for Flink (at present).
With more understanding of your functional requirements, we might be
able to suggest an approach that would be a better fit.
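For reference, the deduplication step you describe below (keeping only the most recent message per TheKey) is simple in isolation; here is a minimal sketch in plain Java, outside of Flink's AggregateFunction API. The Message type and its fields are hypothetical stand-ins for your actual record:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {
    // Hypothetical message shape: a key, an event timestamp, and a payload.
    record Message(String theKey, long timestamp, String payload) {}

    // Fold a batch down to the latest message per key, as your custom
    // session-window aggregator does.
    static Map<String, Message> deduplicate(Iterable<Message> batch) {
        Map<String, Message> latest = new HashMap<>();
        for (Message m : batch) {
            latest.merge(m.theKey(), m,
                (old, cur) -> cur.timestamp() >= old.timestamp() ? cur : old);
        }
        return latest;
    }

    public static void main(String[] args) {
        Map<String, Message> out = deduplicate(List.of(
            new Message("a", 1, "first"),
            new Message("a", 2, "second"),
            new Message("b", 1, "only")));
        System.out.println(out.get("a").payload()); // second
        System.out.println(out.size());             // 2
    }
}
```

The hard part is not this fold but the feedback you want from the downstream map back into the window trigger, which is what does not map cleanly onto Flink's windowing model.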
David
On Fri, Aug 16, 2019 at 10:58 PM Steven Nelson <
[hidden email]> wrote:
>
> Hello!
>
> I think I know the answer to this, but I thought I would go ahead and ask.
>
> We have a process that emits messages to our stream. These messages can include duplicates based on a certain key (we'll call it TheKey). Our Flink job reads the messages, keys by TheKey, and enters a window function. Right now we are using an EventTime Session Window with a custom aggregator, which keeps only the most recent message for each TheKey. It then emits those messages to a map function that does work based on the message.
>
> What we would like is a window function that keeps building up until the downstream map function has completed for the window's key, and then emits.
>
> Is this a pattern that Flink supports?
>
> -Steve
>