Re: Cross product of datastream and dataset
Posted by Fabian Hueske-2
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Cross-product-of-datastream-and-dataset-tp10186p10195.html
Hi,
it is not possible to mix the DataSet and DataStream APIs at the moment.
If the DataSet is constant and not too big (which I assume, since otherwise crossing would be extremely expensive), you can load the data into a stateful MapFunction.
For that you can implement a RichFlatMapFunction and read the data in open(). For each incoming record, i.e., each call of flatMap(), you cross it with the records held in the function and immediately evaluate the condition and count. That way you emit only the counts instead of every pair of the cross product.
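To make the open()/flatMap() split concrete, here is a minimal sketch in plain Java. It is not Flink code: CrossAndCount is a hypothetical class that only mirrors the shape of a RichFlatMapFunction, Consumer<Long> stands in for Flink's Collector, and the Integer records and the "greater than" condition are made up for illustration. In a real job you would extend RichFlatMapFunction and move the same two method bodies over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Consumer;

// Hypothetical stand-in, NOT a Flink class: it mirrors the open()/flatMap()
// shape of a RichFlatMapFunction so the sketch runs without Flink.
class CrossAndCount {
    // The constant data set, held in the function instance.
    private final List<Integer> staticData = new ArrayList<>();
    // The condition to evaluate on each (stream record, data set record) pair.
    private final BiPredicate<Integer, Integer> condition;

    CrossAndCount(BiPredicate<Integer, Integer> condition) {
        this.condition = condition;
    }

    // Corresponds to open(): load the data set once, before any record arrives.
    void open(List<Integer> source) {
        staticData.addAll(source);
    }

    // Corresponds to flatMap(): cross one stream record with every loaded
    // record, evaluate the condition right away, and emit only the count.
    void flatMap(Integer record, Consumer<Long> out) {
        long matches = 0;
        for (Integer d : staticData) {
            if (condition.test(record, d)) {
                matches++;
            }
        }
        out.accept(matches);
    }
}

class CrossAndCountDemo {
    public static void main(String[] args) {
        CrossAndCount fn = new CrossAndCount((s, d) -> s > d);
        fn.open(List.of(1, 5, 10));

        List<Long> counts = new ArrayList<>();
        for (int record : new int[] {0, 6, 20}) {
            fn.flatMap(record, counts::add);
        }
        System.out.println(counts); // prints [0, 2, 3]
    }
}
```

Each stream record produces exactly one output value, so the number of emitted records does not depend on the size of the data set.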
If your DataSet is slowly changing, you can use a stateful CoFlatMapFunction, with one input reading the stream and the other updating the data set.
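The two-input variant can be sketched the same way. Again this is plain Java, not Flink code: CrossWithUpdates is a hypothetical class whose flatMap1()/flatMap2() methods mirror the two callbacks of a CoFlatMapFunction, Consumer<Long> stands in for Flink's Collector, and the Integer records and the "greater than" condition are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical stand-in, NOT a Flink class: flatMap1()/flatMap2() mirror the
// two callbacks of a CoFlatMapFunction so the sketch runs without Flink.
class CrossWithUpdates {
    // The slowly changing data set, held in the function instance.
    private final Set<Integer> currentData = new HashSet<>();

    // Input 1: a stream record. Cross it with the current data set and
    // emit only the number of entries that satisfy the condition.
    void flatMap1(Integer record, Consumer<Long> out) {
        long matches = 0;
        for (Integer d : currentData) {
            if (record > d) {
                matches++;
            }
        }
        out.accept(matches);
    }

    // Input 2: an update to the data set. It changes the held data and
    // emits nothing.
    void flatMap2(Integer newEntry, Consumer<Long> out) {
        currentData.add(newEntry);
    }
}

class CrossWithUpdatesDemo {
    public static void main(String[] args) {
        CrossWithUpdates fn = new CrossWithUpdates();
        List<Long> counts = new ArrayList<>();

        fn.flatMap2(1, counts::add);  // update: data set is {1}
        fn.flatMap2(5, counts::add);  // update: data set is {1, 5}
        fn.flatMap1(6, counts::add);  // 6 > 1 and 6 > 5
        fn.flatMap2(3, counts::add);  // update: data set is {1, 3, 5}
        fn.flatMap1(6, counts::add);  // 6 > 1, 6 > 3 and 6 > 5

        System.out.println(counts); // prints [2, 3]
    }
}
```

A stream record is only crossed with the updates that reached the function before it, so the result depends on the order in which the two inputs are processed.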
Hope this helps,
Fabian