Join with Default-Value
Posted by Sebastian Neef on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Join-with-Default-Value-tp11566.html
Hi,
Is it possible to assign a "default" value to elements that didn't find a match in a join?
For example I have the following two datasets:
| DataSetA | DataSetB |
-----------------------
| id=1     | id=1     |
| id=2     | id=3     |
| id=5     | id=4     |
| id=6     | id=6     |
When doing a join with:
A.join(B).where(KeySelector(A.id))
         .equalTo(KeySelector(B.id))
The resulting dataset is:
(DataSetA | DataSetB)
---------------------
(id=1     | id=1)
(id=6     | id=6)
What is the best way to assign a default value to the elements id=2 and id=5
from DataSetA, which have no match in DataSetB? I need a result that looks similar to this:
(DataSetA | DataSetB)
---------------------
(id=1     | id=1)
(id=2     | Default)
(id=5     | Default)
(id=6     | id=6)
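
The shape of this result is what is usually called a left outer join: every element of DataSetA is kept, and a placeholder fills the right side when DataSetB has no match. Flink's DataSet API has a leftOuterJoin operator for this kind of problem, but the sketch below deliberately uses only plain Java collections to pin down the expected semantics. The class name LeftJoinWithDefault and the method join are hypothetical, and the sketch assumes the ids in DataSetB are unique.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Flink: illustrates the desired result only.
class LeftJoinWithDefault {

    // Keeps every id from a; pairs it with the same id from b if present,
    // otherwise with defaultValue. Assumes ids in b are unique.
    static List<String> join(List<Integer> a, List<Integer> b, String defaultValue) {
        Set<Integer> bIds = new HashSet<>(b);
        List<String> rows = new ArrayList<>();
        for (Integer id : a) {
            String right = bIds.contains(id) ? "id=" + id : defaultValue;
            rows.add("(id=" + id + " | " + right + ")");
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> dataSetA = Arrays.asList(1, 2, 5, 6);
        List<Integer> dataSetB = Arrays.asList(1, 3, 4, 6);
        // Prints one row per element of dataSetA, in the order of dataSetA.
        join(dataSetA, dataSetB, "Default").forEach(System.out::println);
    }
}
```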
My idea would be to use a .filter to find the elements of DataSetA that are
missing from the joined (DataSetA | DataSetB) result, wrap each of them in a
tuple with a default value, and then .union those tuples with the join result.
But that sounds a bit over-complicated.
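
To make the idea concrete, here is a rough sketch of those three steps (join, filter the unmatched elements, union) on plain Java lists rather than Flink DataSets. The class name FilterThenUnion and the method joinThenFill are hypothetical, and ids in DataSetB are again assumed to be unique.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Flink: mimics join + filter + union on lists.
class FilterThenUnion {

    static List<String> joinThenFill(List<Integer> a, List<Integer> b, String defaultValue) {
        Set<Integer> bIds = new HashSet<>(b);

        // Step 1: inner join, keeping only ids present on both sides.
        List<String> joined = a.stream()
                .filter(bIds::contains)
                .map(id -> "(id=" + id + " | id=" + id + ")")
                .collect(Collectors.toList());

        // Step 2: filter the elements of a that found no partner and
        // wrap each of them in a tuple with the default value.
        List<String> unmatched = a.stream()
                .filter(id -> !bIds.contains(id))
                .map(id -> "(id=" + id + " | " + defaultValue + ")")
                .collect(Collectors.toList());

        // Step 3: union of both partial results (matched rows first here;
        // a distributed union would give no ordering guarantee).
        List<String> result = new ArrayList<>(joined);
        result.addAll(unmatched);
        return result;
    }

    public static void main(String[] args) {
        joinThenFill(Arrays.asList(1, 2, 5, 6), Arrays.asList(1, 3, 4, 6), "Default")
                .forEach(System.out::println);
    }
}
```

The extra pass over DataSetA in step 2 is the part that makes this approach feel heavier than a single outer join.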
Best regards,
Sebastian