I have 2 broadcast streams that carry rules to be applied to a third keyed stream. The connect method of the keyed stream only takes a single broadcast stream. How do I connect the 2 broadcast stream to that single keyed stream.
Do I have 2 connects and thus 2 instances of BroadcastConnextedStream, union them and then apply process through a single SingleOutpitStreamOperator ? The issue I see there are 2 keyBy calls and an additional shuffle before connect is called. To be precise, is there a simple example of applying 2 dissimilar rules through 2 broadcast streams, thus 2 different MapStateDiscriptors, to a single keyed stream without any unnecessary overhead... |
I think I can connect the broadcast streams to the single keyed stream (as in 2 connects ) and access the 2 or n broadcast states through the key in the Broadcast Map in processBroadcastElement , using a single instance of the Processor function. In essence I need to merge the results of the 2 rules careying broadcast stream in a single keyed brooadcast function ..... On Tue, Sep 18, 2018, 8:59 AM Vishal Santoshi <[hidden email]> wrote:
|
In reply to this post by Vishal Santoshi
Hi Vishal,
You could try 1) merging these two rule streams first with the `union` method if they get the same type or 2) connecting them and encapsulate the records from both sides to a unified type (e.g., scala Either). Best, Xingcan > On Sep 18, 2018, at 8:59 PM, Vishal Santoshi <[hidden email]> wrote: > > I have 2 broadcast streams that carry rules to be applied to a third keyed stream. The connect method of the keyed stream only takes a single broadcast stream. How do I connect the 2 broadcast stream to that single keyed stream. > > Do I have 2 connects and thus 2 instances of BroadcastConnextedStream, union them and then apply process through a single SingleOutpitStreamOperator ? The issue I see there are 2 keyBy calls and an additional shuffle before connect is called. > > To be precise, is there a simple example of applying 2 dissimilar rules through 2 broadcast streams, thus 2 different MapStateDiscriptors, to a single keyed stream without any unnecessary overhead... > > > |
I could do that, but I was under the impression that 2 or more disparate broadcast states could be provided to a keyed stream, referenced through a key in the Map State...That would be cleaner as in the fact that 2 different set of rules are to be applied are explictely declared rather then carries inside the datums of a unioned stream...... I will look at second option... On Tue, Sep 18, 2018, 9:15 AM Xingcan Cui <[hidden email]> wrote: Hi Vishal, |
Hi Vishal,
Actually, you could provide multiple MapStateDescriptors for the `broadcast()` method and then use them, separately. Best, Xingcan
|
Options ( here A is the non broadcast stream type and X and Y are the streams to be broadcast types). Note also the the Broadcasts of X and Y happen at different intervals and we require the 2 Rules to be applied in a single execution and run some merge routine on the output in the processElement method, without leaving it for downstream operators. 1. ConnectedStream(X,Y) connectedStream = BroadcastStream<X>.connect(BroadcastStream<Y>); connectedStream.broadcast() ; // broadcast not allowed on connected stream 2. DataStream<X> unionedStream = DataStream<X>.union(DatStream<X>) ;//here X.type = Y.type B = unionedStream.broadcast( ..) ; DataStream<A>.keyBy(..) connect(B); // But here we unifying to a single Descriptor, thus the hint as to which X or Y the rule came from has to be implicit in the the data, not at all declararitive. 3. DataStream<A>.keyBy().connect(DataStream<X>.broadcast( ..) ).process ( new () ) ; DataStream<A>.keyBy().connect(DataStream<Y>.broadcast( ..) ).process ( new () ) // 2 keyBy overhead and a subsequent merge step? 4. MapStateDescriptor<X> one= ..; MapStateDescriptor<Y> two=..; BroadcastStream<X> oneBroadCastStream = DataStream<X>.broadcast(one); BroadcastStream<Y> twoBroadCastStream = DataStream<Y>.broadcast(two); DataStream<A>.keyBy() .connect(oneBroadCastStream). .connect(twoBroadCastStream) // not possible ).process(new KeyedBroadcastProcessFunction(){ processElement(..); processBrodcastElement( Context, Out, X ,Y....) { // here figure out which local operator state to replace } })) I do not see a clean way at all to out in 2 StateDescriptors to a single KeyedBroadcastProcessFunction from 2 ( or more ) MapDescriptors even though I evidently can broadcast each stream and connect each stream independently to the non broadcast stream and access the states independently. I am not sure, that without reducing the 2 Rules to a single type On Tue, Sep 18, 2018 at 9:38 AM Xingcan Cui <[hidden email]> wrote:
|
of course 2. DataStream<X> unionedStream = DataStream<X>.union(DatStream<X>) ;//here X.type = Y.type B = unionedStream.broadcast( ..) ; DataStream<A>.keyBy(..) connect(B); // But here we unifying to a single Descriptor, thus the hint as to which X or Y the rule came from has to be implicit in the the data, not at all declararitive. should work. On Tue, Sep 18, 2018 at 10:51 AM Vishal Santoshi <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |