Join with Default-Value

Posted by Sebastian Neef on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Join-with-Default-Value-tp11566.html

Hi,

Is it possible to assign a "default" value to elements that have no match
in a join?

For example I have the following two datasets:

| DataSetA | DataSetB |
|----------|----------|
| id=1     | id=1     |
| id=2     | id=3     |
| id=5     | id=4     |
| id=6     | id=6     |

When doing a join with:

A.join(B).where( KeySelector(A.id))
        .equalTo(KeySelector(B.id))

The resulting dataset is:

| (DataSetA | DataSetB) |
|-----------------------|
| (id=1     | id=1)     |
| (id=6     | id=6)     |

What is the best way to assign a default value to the elements id=2 and
id=5 from DataSetA? That is, I need a result that looks similar to this:

| (DataSetA | DataSetB) |
|-----------------------|
| (id=1     | id=1)     |
| (id=2     | Default)  |
| (id=5     | Default)  |
| (id=6     | id=6)     |
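
To make the desired semantics concrete: this is what is usually called a left outer join with a default for the missing right side. Below is a minimal sketch in plain Java collections, not Flink code. The class name `DefaultJoinSketch`, the method `joinWithDefault`, and the `"Default"` placeholder are hypothetical names chosen for illustration, and the sketch assumes ids are unique within DataSetB. Depending on the Flink version, the DataSet API may also offer `leftOuterJoin`, where the join function receives `null` for the missing side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DefaultJoinSketch {
    // Hypothetical placeholder used for elements of A without a match in B.
    static final String DEFAULT = "Default";

    // Emits one pair per element of A: the matching id from B if present,
    // otherwise the default. Assumes ids in B are unique.
    static List<String> joinWithDefault(List<Integer> a, List<Integer> b) {
        Set<Integer> keysInB = new HashSet<>(b);
        List<String> result = new ArrayList<>();
        for (Integer id : a) {
            String right = keysInB.contains(id) ? "id=" + id : DEFAULT;
            result.add("(id=" + id + " | " + right + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> dataSetA = Arrays.asList(1, 2, 5, 6);
        List<Integer> dataSetB = Arrays.asList(1, 3, 4, 6);
        for (String pair : joinWithDefault(dataSetA, dataSetB)) {
            System.out.println(pair);
        }
    }
}
```

Elements that exist only in DataSetB (id=3 and id=4) are dropped, as in the table above.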

My idea would be to get the missing elements of DataSetA with a .filter
against the join result (DataSetA|DataSetB), wrap each of them in a tuple
with a default value, and then .union those tuples with the join result.
But that sounds a bit over-complicated.
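
For comparison, the filter-and-union idea can be sketched in the same way. This is again plain Java rather than Flink code. `FilterUnionSketch` and `joinThenUnion` are hypothetical names, and the sketch assumes unique ids. Note that the unmatched elements end up after the matched ones, so the order differs from the table above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FilterUnionSketch {
    // Hypothetical placeholder used for elements of A without a match in B.
    static final String DEFAULT = "Default";

    static List<String> joinThenUnion(List<Integer> a, List<Integer> b) {
        Set<Integer> keysInB = new HashSet<>(b);

        // Step 1: inner join, keeping only ids present in both inputs.
        List<Integer> matchedIds = new ArrayList<>();
        for (Integer id : a) {
            if (keysInB.contains(id)) {
                matchedIds.add(id);
            }
        }
        List<String> result = new ArrayList<>();
        for (Integer id : matchedIds) {
            result.add("(id=" + id + " | id=" + id + ")");
        }

        // Step 2: filter A down to the ids missing from the join result,
        // pair each with the default, and append (the "union").
        Set<Integer> matched = new HashSet<>(matchedIds);
        for (Integer id : a) {
            if (!matched.contains(id)) {
                result.add("(id=" + id + " | " + DEFAULT + ")");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> dataSetA = Arrays.asList(1, 2, 5, 6);
        List<Integer> dataSetB = Arrays.asList(1, 3, 4, 6);
        for (String pair : joinThenUnion(dataSetA, dataSetB)) {
            System.out.println(pair);
        }
    }
}
```

This variant needs a join, a filter against the join result, and a union, which is the extra work that makes the approach feel over-complicated.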

Best regards,
Sebastian