Merge or minus Dataset API missing

Flavio Pompermaier
Hi to all,

I have a use case where I need to merge two datasets, but I can't find a DataSet API call that does this directly.
I want to execute some function when there's a match, and otherwise pass along whichever element is not null.
At the moment I can do this in a fairly complicated way, using two leftOuterJoin calls plus a union (I want to avoid broadcasting because the dataset could be big). Is there a simpler way?


Best,
Flavio

Re: Merge or minus Dataset API missing

Till Rohrmann

Why don’t you simply use a fullOuterJoin to do that?

Cheers,
Till


In reply to Flavio Pompermaier's post of Fri, Feb 12, 2016 at 4:48 PM.

Re: Merge or minus Dataset API missing

Fabian Hueske-2
In reply to this post by Flavio Pompermaier
Hi Flavio,

If I understood you correctly, you can use a fullOuterJoin.
On a match it gives you both elements; otherwise it gives you the left or the right element, with null in place of the missing one.
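
To make the null handling concrete, here is a minimal plain-Java sketch of that semantics. It does not use Flink at all: the class name FullOuterJoinSketch, the helpers join and merge, and the use of in-memory maps as stand-ins for keyed datasets are all assumptions for illustration. In the DataSet API the same logic would normally live in the JoinFunction passed to fullOuterJoin(...).where(...).equalTo(...).with(...).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical illustration only: emulates full-outer-join semantics on
// in-memory maps. It is not Flink code and not the Flink API.
class FullOuterJoinSketch {

    // Same shape as a join function in a full outer join: either argument
    // may be null when the other side has no matching key.
    static Integer join(Integer left, Integer right, BinaryOperator<Integer> onMatch) {
        if (left == null) {
            return right;                    // only the right side exists
        }
        if (right == null) {
            return left;                     // only the left side exists
        }
        return onMatch.apply(left, right);   // match: run the user function
    }

    // Merges two keyed collections; values are assumed to be non-null.
    static Map<String, Integer> merge(Map<String, Integer> left,
                                      Map<String, Integer> right,
                                      BinaryOperator<Integer> onMatch) {
        Map<String, Integer> result = new LinkedHashMap<>();
        // Keys on the left side, matched or not.
        for (Map.Entry<String, Integer> e : left.entrySet()) {
            result.put(e.getKey(), join(e.getValue(), right.get(e.getKey()), onMatch));
        }
        // Keys that appear only on the right side.
        for (Map.Entry<String, Integer> e : right.entrySet()) {
            if (!left.containsKey(e.getKey())) {
                result.put(e.getKey(), join(null, e.getValue(), onMatch));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new LinkedHashMap<>();
        left.put("a", 1);
        left.put("b", 2);
        Map<String, Integer> right = new LinkedHashMap<>();
        right.put("b", 10);
        right.put("c", 20);
        // "b" is on both sides, so the function (here: sum) is applied to it.
        System.out.println(merge(left, right, Integer::sum)); // {a=1, b=12, c=20}
    }
}
```

The point of the sketch is that a single join function covers all three cases (left only, right only, both), which is what replaces the two leftOuterJoin calls plus the union.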

Best, Fabian


Re: Merge or minus Dataset API missing

Flavio Pompermaier
Ah ok, I didn't know about it! Thanks Till and Fabian!

In reply to Fabian Hueske's post of Fri, Feb 12, 2016 at 5:11 PM.