Idiomatic way to split pipeline

avilevi
Hi,
I want to split the output of one of the operators into two pipelines. Since the split method is deprecated, what is the idiomatic way to do that without duplicating the operator?

[Attached image: Screen Shot 2019-11-25 at 10.05.38.png]



Re: Idiomatic way to split pipeline

vino yang
Hi Avi,

As the documentation of DataStream#split says, you can use the "side output" feature to replace it. [1]


Best,
Vino


Re: Idiomatic way to split pipeline

avilevi
Thank you for your quick reply, I appreciate it. But this is not exactly "side output" per se; it is simple splitting. IIUC, side output is more for splitting records by something that differentiates them (lateness, value, etc.). I thought there would be something more idiomatic, but if this is it, then I will go with that.


Re: Idiomatic way to split pipeline

vino yang
Hi Avi,

Side output provides a superset of split's functionality, so anything that can be implemented via split can also be implemented via side output. [1]

Best,
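
For readers who have not used the feature, a minimal Java sketch of side outputs follows. It assumes the Flink 1.x DataStream API is on the classpath. The class name SideOutputSketch, the tag id "odd", and the parity rule are illustrative choices, not something taken from this thread.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSketch {

    // Created as an anonymous subclass so Flink can capture the element type.
    // The id "odd" is a hypothetical name chosen for this sketch.
    static final OutputTag<Integer> ODD = new OutputTag<Integer>("odd") {};

    // Even numbers go to the main output, odd numbers to the side output.
    static SingleOutputStreamOperator<Integer> routeByParity(DataStream<Integer> numbers) {
        return numbers.process(new ProcessFunction<Integer, Integer>() {
            @Override
            public void processElement(Integer value, Context ctx, Collector<Integer> out) {
                if (value % 2 == 0) {
                    out.collect(value);      // main output
                } else {
                    ctx.output(ODD, value);  // side output
                }
            }
        });
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        SingleOutputStreamOperator<Integer> evens = routeByParity(env.fromElements(1, 2, 3, 4));
        DataStream<Integer> odds = evens.getSideOutput(ODD);

        // Each stream can now feed its own downstream pipeline.
        evens.print("even");
        odds.print("odd");

        env.execute("side output sketch");
    }
}
```

The routing decision lives in a single operator, so the upstream work is done once and each record is emitted to exactly one of the two streams.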


Re: Idiomatic way to split pipeline

avilevi
Thanks, I'll check it out. 


Re: Idiomatic way to split pipeline

Arvid Heise-3
Hi Avi,

It seems to me that you do not really need any split feature. As far as I can see in your picture, you want to apply two different windows to the same input data.

In that case, you can simply use two different subgraphs.

stream = ...
stream1 = stream.window(...).....addSink(<sink1>)
stream2 = stream.window(...).....addSink(<sink2>)

In Flink, you can compose arbitrary directed acyclic graphs, so consuming the output of one operator on several downstream operators is completely normal.

Best,

Arvid
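
A slightly fuller version of the pseudocode above might look like the following Java sketch. It assumes the Flink 1.x DataStream API (newer releases take a java.time.Duration where this uses Time). The class name FanOutSketch, the (key, count) tuple type, the window sizes, and the uid strings are all hypothetical and only serve the illustration.

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class FanOutSketch {

    // Attaches two independent windowed subgraphs to the same upstream stream.
    // The upstream operator is defined once; both branches receive every record.
    static void attachBranches(DataStream<Tuple2<String, Integer>> events) {
        events.keyBy(t -> t.f0)
              .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
              .sum(1)
              .uid("sum-10s")   // uids must be unique within a job
              .print("10s");

        events.keyBy(t -> t.f0)
              .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
              .sum(1)
              .uid("sum-1m")    // a different uid for the second branch
              .print("1m");
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder input; a real job would read from an unbounded source.
        DataStream<Tuple2<String, Integer>> events =
                env.fromElements(Tuple2.of("a", 1), Tuple2.of("b", 2), Tuple2.of("a", 3));

        attachBranches(events);

        // Print the job graph as JSON instead of running the job, to show
        // that one source feeds both window operators.
        System.out.println(env.getExecutionPlan());
    }
}
```

Each operator here carries its own uid. Reusing the same uid string for both window operators, for example by building them through one helper that hard-codes it, is one way a job could end up with a non-unique uid.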


Re: Idiomatic way to split pipeline

avilevi
Thanks Arvid,
The problem is that I will get an exception about a non-unique uid on the stream.


Re: Idiomatic way to split pipeline

rmetzger0
Hi Avi,
Can you post the exception with the stack trace here as well?
