I'm trying to get a list of late elements in my Tumbling Windows application and I noticed that I need to use SingleOutputStreamOperator<T> instead of DataStream<T> to get access to the .sideOutputLateData(...) method. Can someone explain what the difference is between SingleOutputStreamOperator and DataStream and why I need to use this for getting the late data? Thanks! Snippet: OutputTag<EventBean> lateEventsTag = new OutputTag<EventBean>("late-events") {}; SingleOutputStreamOperator<EventBean> windowedEvents = eventStream .keyBy("key") .window(TumblingEventTimeWindows.of(Time.seconds(3))) .sideOutputLateData(lateEventsTag) .process(new EventBeanProcessWindowFunction()); DataStream<EventBean> lateEvents = windowedEvents.getSideOutput(lateEventsTag); -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ |
Hi Chris, a `DataStream` represents a stream of events which have the same type. A `SingleOutputStreamOperator` is a subclass of `DataStream` and represents a user defined transformation applied to an input `DataStream` and producing an output `DataStream` (represented by itself). Since you can only add a side output to an operator/user defined transformation, you can only access the side output data from a `SingleOutputStreamOperator` and not from a `DataStream`. In this regard, the `SingleOutputStreamOperator` is just a richer version of the `DataStream` which requires a certain context. Cheers, Till On Tue, Jul 24, 2018 at 1:26 PM chrisr123 <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |