I'm looking for something like DataStream.split(), but for DataSets. I'd like to split my streaming data so messages go to different parts of an execution graph, based on arbitrary logic.
DataStream.split() seems to be perfect, except that my source is a CSV file, and I have only found built-in functions for reading CSV files into a DataSet. I've evaluated DataSet.filter(), but as far as I can tell it only lets me emulate a yes/no split, which is too coarse; I would prefer a more fine-grained split. Do you have any suggestions on how I can achieve my arbitrary splitting logic for a) DataSets in general, or b) CSV files?
2017-10-17 10:42 GMT+02:00 Magnus Vojbacke <[hidden email]>:
> I'm looking for something like DataStream.split(), but for DataSets.

Hi Magnus, there is no Split operator on the DataSet API.

DataSet<X> secondSplit = setToSplit.filter(new SplitCondition2());
DataSet<X> thirdSplit = setToSplit.filter(new SplitCondition3());
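
To make the filter-per-branch idea above concrete, here is a minimal plain-Java sketch. It is not Flink code: it uses java.util.stream so it runs without Flink on the classpath, and the names SplitSketch, split and exampleConditions are hypothetical, chosen only for illustration. The point it shows is that an n-way split is just n independent predicates applied to the same input, so the split is not limited to yes/no.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only (not part of Flink).
class SplitSketch {

    // Applies every named predicate to the same input and returns one
    // output list per branch, mirroring one filter() call per split.
    static <T> Map<String, List<T>> split(List<T> input,
                                          Map<String, Predicate<T>> conditions) {
        Map<String, List<T>> branches = new LinkedHashMap<>();
        for (Map.Entry<String, Predicate<T>> entry : conditions.entrySet()) {
            List<T> branch = input.stream()
                    .filter(entry.getValue())
                    .collect(Collectors.toList());
            branches.put(entry.getKey(), branch);
        }
        return branches;
    }

    // Assumed example conditions: three branches keyed by value range.
    static Map<String, Predicate<Integer>> exampleConditions() {
        Map<String, Predicate<Integer>> conditions = new LinkedHashMap<>();
        conditions.put("small", v -> v < 10);
        conditions.put("medium", v -> v >= 10 && v < 100);
        conditions.put("large", v -> v >= 100);
        return conditions;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(3, 42, 7, 250, 10);
        // Prints a map from branch name to the rows that branch accepted.
        System.out.println(split(rows, exampleConditions()));
    }
}
```

In a Flink job, each predicate would instead be written as a FilterFunction and passed to filter() on the same input DataSet, as in the lines above. Note that every branch evaluates its condition against the full input, and nothing forces the conditions to be mutually exclusive, so a record can end up in several branches or in none.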
Thank you, Fabian! If batch semantics are not important to my use case, is there any way to "downgrade" or convert a DataSet to a DataStream?
BR /Magnus
Unfortunately, it's not possible to bridge the gap between the DataSet and DataStream APIs.

2017-10-17 11:05 GMT+02:00 Magnus Vojbacke <[hidden email]>: