CoGgroup Operator Data Sink

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

CoGgroup Operator Data Sink

Mustafa Elbehery
Hi all, 

I wonder if the coGroup operator have the ability to sink two output simultaneously. I am trying to mock it by calling a function inside the operator, in which I sink the first output, and get the second output myself. 

I am not sure if this is the best way, and I would like to hear your suggestions, 

Regards.

--
Mustafa Elbehery
+49(0)15750363097
skype: mustafaelbehery87

Reply | Threaded
Open this post in threaded view
|

Re: CoGgroup Operator Data Sink

rmetzger0
Hi,

you can write the output of a coGroup operator to two sinks:

------\           /---->Sink1
       \         /
        (CoGroup) 
       /        \
------/          \------>Sink2

You can actually write to as many sinks as you want.
Note that the data written to Sink1 and Sink2 will be identical.
If you want to write different data to S1 and S2, you can use a Tuple2 where the first field contains a tag, and the second field contains your data.
Then, you use a filter in front of your Sinks to select the data based on the tag.

------\           /---(Filter)-->Sink1
       \         /
        (CoGroup) 
       /        \
------/          \----(Filter)-->Sink2

So the output of CoGroup could be Tuple2<Integer,YourPojo>, when the integer is 1, it is only written by Sink1, when the integer is 2, its only written by Sink2.




On Tue, Apr 14, 2015 at 10:20 AM, Mustafa Elbehery <[hidden email]> wrote:
Hi all, 

I wonder if the coGroup operator have the ability to sink two output simultaneously. I am trying to mock it by calling a function inside the operator, in which I sink the first output, and get the second output myself. 

I am not sure if this is the best way, and I would like to hear your suggestions, 

Regards.

--
Mustafa Elbehery
<a href="tel:%2B49%280%2915750363097" value="+4915750363097" target="_blank">+49(0)15750363097
skype: mustafaelbehery87


Reply | Threaded
Open this post in threaded view
|

Re: CoGgroup Operator Data Sink

Mustafa Elbehery
Thanks for prompt reply. 

Maybe the expression "Sink" is not suitable to what I need. What if I want to Collect two data sets directly from the coGroup operator. Is there anyway to do so ?!! 

As I might know, the operator has only Collector Object, but I wonder if there is another feature in Flink that supports what I need .

Thanks.

On Tue, Apr 14, 2015 at 11:27 AM, Robert Metzger <[hidden email]> wrote:
Hi,

you can write the output of a coGroup operator to two sinks:

------\           /---->Sink1
       \         /
        (CoGroup) 
       /        \
------/          \------>Sink2

You can actually write to as many sinks as you want.
Note that the data written to Sink1 and Sink2 will be identical.
If you want to write different data to S1 and S2, you can use a Tuple2 where the first field contains a tag, and the second field contains your data.
Then, you use a filter in front of your Sinks to select the data based on the tag.

------\           /---(Filter)-->Sink1
       \         /
        (CoGroup) 
       /        \
------/          \----(Filter)-->Sink2

So the output of CoGroup could be Tuple2<Integer,YourPojo>, when the integer is 1, it is only written by Sink1, when the integer is 2, its only written by Sink2.




On Tue, Apr 14, 2015 at 10:20 AM, Mustafa Elbehery <[hidden email]> wrote:
Hi all, 

I wonder if the coGroup operator have the ability to sink two output simultaneously. I am trying to mock it by calling a function inside the operator, in which I sink the first output, and get the second output myself. 

I am not sure if this is the best way, and I would like to hear your suggestions, 

Regards.

--
Mustafa Elbehery
<a href="tel:%2B49%280%2915750363097" value="+4915750363097" target="_blank">+49(0)15750363097
skype: mustafaelbehery87





--
Mustafa Elbehery
+49(0)15750363097
skype: mustafaelbehery87

Reply | Threaded
Open this post in threaded view
|

Re: CoGgroup Operator Data Sink

Kostas Tzoumas
Each operator has only one output (which can be consumed by multiple downstream operators), so you cannot branch out to two different directions from inside the user code with many collectors. The reasoning is that you can have the same effect with what Robert suggested. 

But perhaps your use case is different; can you not achieve the same result with branching out to two different DataSets as per Robert's suggestion? If this is the case, posting some details on the function would be helpful.

On Tue, Apr 14, 2015 at 11:37 AM, Mustafa Elbehery <[hidden email]> wrote:
Thanks for prompt reply. 

Maybe the expression "Sink" is not suitable to what I need. What if I want to Collect two data sets directly from the coGroup operator. Is there anyway to do so ?!! 

As I might know, the operator has only Collector Object, but I wonder if there is another feature in Flink that supports what I need .

Thanks.

On Tue, Apr 14, 2015 at 11:27 AM, Robert Metzger <[hidden email]> wrote:
Hi,

you can write the output of a coGroup operator to two sinks:

------\           /---->Sink1
       \         /
        (CoGroup) 
       /        \
------/          \------>Sink2

You can actually write to as many sinks as you want.
Note that the data written to Sink1 and Sink2 will be identical.
If you want to write different data to S1 and S2, you can use a Tuple2 where the first field contains a tag, and the second field contains your data.
Then, you use a filter in front of your Sinks to select the data based on the tag.

------\           /---(Filter)-->Sink1
       \         /
        (CoGroup) 
       /        \
------/          \----(Filter)-->Sink2

So the output of CoGroup could be Tuple2<Integer,YourPojo>, when the integer is 1, it is only written by Sink1, when the integer is 2, its only written by Sink2.




On Tue, Apr 14, 2015 at 10:20 AM, Mustafa Elbehery <[hidden email]> wrote:
Hi all, 

I wonder if the coGroup operator have the ability to sink two output simultaneously. I am trying to mock it by calling a function inside the operator, in which I sink the first output, and get the second output myself. 

I am not sure if this is the best way, and I would like to hear your suggestions, 

Regards.

--
Mustafa Elbehery
<a href="tel:%2B49%280%2915750363097" value="+4915750363097" target="_blank">+49(0)15750363097
skype: mustafaelbehery87





--
Mustafa Elbehery
<a href="tel:%2B49%280%2915750363097" value="+4915750363097" target="_blank">+49(0)15750363097
skype: mustafaelbehery87