union stream vs multiple operators

Alexey Trenikhun
Hello,
I have two Kafka topics ("A" and "B") that carry structurally similar data but with different load patterns: for example, hundreds of records per second in the first topic versus 10 records per second in the second. Events are processed using the same algorithm and written to a common sink. Currently my pipeline looks something like this:

Source A->T-\
                       -> Sink
Source B->T-/

Instead of this pipeline, I could union the two streams and send them to a common KeyedProcessFunction T:

Source A-\
                 (union)-> T -> Sink
Source B-/
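
To make the two wirings concrete, here is a minimal plain-Java sketch. It is not Flink code. CountingT is a hypothetical stand-in for T that keeps a per-key counter (roughly what keyed state would hold), the lists stand in for the two topics, and the sink is just a list. One difference it shows, assuming T is stateful and the same key can appear in both topics (neither is stated above): two separate T operators each keep their own state per key, while a single T after the union sees one state per key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UnionVsSeparate {

    /** Hypothetical stand-in for T: counts events per key, much like keyed state would. */
    static final class CountingT {
        private final Map<String, Integer> countByKey = new HashMap<>();

        String process(String key) {
            // merge() returns the updated count for this key
            int count = countByKey.merge(key, 1, Integer::sum);
            return key + "=" + count;
        }
    }

    /** Source A -> T and Source B -> T, both into one sink: two independent T instances. */
    static List<String> separateOperators(List<String> topicA, List<String> topicB) {
        CountingT tForA = new CountingT();
        CountingT tForB = new CountingT();
        List<String> sink = new ArrayList<>();
        for (String key : topicA) sink.add(tForA.process(key));
        for (String key : topicB) sink.add(tForB.process(key));
        return sink;
    }

    /** Source A and Source B unioned, then a single T: one state per key. */
    static List<String> unionThenOperator(List<String> topicA, List<String> topicB) {
        // Simplification: a real union interleaves the inputs in no guaranteed order;
        // here topic A is simply followed by topic B.
        List<String> unioned = new ArrayList<>(topicA);
        unioned.addAll(topicB);
        CountingT t = new CountingT();
        List<String> sink = new ArrayList<>();
        for (String key : unioned) sink.add(t.process(key));
        return sink;
    }

    public static void main(String[] args) {
        List<String> topicA = List.of("k1", "k1", "k2");
        List<String> topicB = List.of("k1");
        System.out.println(separateOperators(topicA, topicB)); // [k1=1, k1=2, k2=1, k1=1]
        System.out.println(unionThenOperator(topicA, topicB)); // [k1=1, k1=2, k2=1, k1=3]
    }
}
```

If T is stateless, or the key spaces of the two topics never overlap, both wirings produce the same records and the choice comes down to job-graph simplicity.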

What are the pros and cons of these approaches?

Thanks,
Alexey

Re: union stream vs multiple operators

Chesnay Schepler
I don't think the first option has any benefit.

On 11/5/2020 1:19 AM, Alexey Trenikhun wrote:
> [...]

Re: union stream vs multiple operators

Alexey Trenikhun
Ok, thank you.


From: Chesnay Schepler <[hidden email]>
Sent: Thursday, November 5, 2020 3:15:28 PM
To: Alexey Trenikhun <[hidden email]>; Flink User Mail List <[hidden email]>
Subject: Re: union stream vs multiple operators
 
I don't think the first option has any benefit.
