SingleOutputStreamOperator vs DataStream?

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

SingleOutputStreamOperator vs DataStream?

chrisr123

I'm trying to get a list of late elements in my Tumbling Windows application
and I noticed
that I need to use SingleOutputStreamOperator<T> instead of DataStream<T> to
get
access to the .sideOutputLateData(...) method.

Can someone explain what the difference is between
SingleOutputStreamOperator and DataStream
and why I need to use this for getting the late data?
Thanks!

Snippet:
 OutputTag<EventBean> lateEventsTag = new
OutputTag<EventBean>("late-events") {};
  SingleOutputStreamOperator<EventBean> windowedEvents = eventStream
                                .keyBy("key")
                                .window(TumblingEventTimeWindows.of(Time.seconds(3)))
                                .sideOutputLateData(lateEventsTag)
                                .process(new EventBeanProcessWindowFunction());
DataStream<EventBean> lateEvents =
windowedEvents.getSideOutput(lateEventsTag);



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: SingleOutputStreamOperator vs DataStream?

Till Rohrmann
Hi Chris,

a `DataStream` represents a stream of events which have the same type. A `SingleOutputStreamOperator` is a subclass of `DataStream` and represents a user defined transformation applied to an input `DataStream` and producing an output `DataStream` (represented by itself). Since you can only add a side output to an operator/user defined transformation, you can only access the side output data from a `SingleOutputStreamOperator` and not from a `DataStream`. In this regard, the `SingleOutputStreamOperator` is just a richer version of the `DataStream` which requires a certain context.

Cheers,
Till

On Tue, Jul 24, 2018 at 1:26 PM chrisr123 <[hidden email]> wrote:

I'm trying to get a list of late elements in my Tumbling Windows application
and I noticed
that I need to use SingleOutputStreamOperator<T> instead of DataStream<T> to
get
access to the .sideOutputLateData(...) method.

Can someone explain what the difference is between
SingleOutputStreamOperator and DataStream
and why I need to use this for getting the late data?
Thanks!

Snippet:
 OutputTag<EventBean> lateEventsTag = new
OutputTag<EventBean>("late-events") {};
  SingleOutputStreamOperator<EventBean> windowedEvents = eventStream
                                .keyBy("key")
                                .window(TumblingEventTimeWindows.of(Time.seconds(3)))
                                .sideOutputLateData(lateEventsTag)
                                .process(new EventBeanProcessWindowFunction());
DataStream<EventBean> lateEvents =
windowedEvents.getSideOutput(lateEventsTag);



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/