Hi,
I am new to Flink, and I'd like to start by using it for some in-memory aggregation in batch mode (in a few months this will be migrated to permanent streaming, hence the choice of Flink). So far I can successfully create the complex key I need by using a KeySelector that returns a hash of the set of fields to "groupBy". I can also read the data from file/DB. Now I want to apply many different reduce functions to different fields (not hardcoded, but read from configuration). Is this possible out of the box? From my research, it seems that only a single reduce function can be applied to a DataSet. The only way I have found so far is to create a single reducer that acts as a container for all the reduce functions I want to apply, and simply loops through them for each record. Is this recommended, or am I missing some basics here?

Many thanks for any advice,
Osh
Hi Osh,

As far as I know, a DataSet source currently cannot be consumed by several different vertices, so you cannot construct the topology you need from the API. I think your approach of merging the different reduce functions into one UDF is feasible. Maybe someone has a better solution. :)

zhijiang
Hi Osh,

You can certainly apply multiple reduce functions to a DataSet. However, you should make sure that the data is partitioned and sorted only once. Moreover, you would end up with multiple data sets that you need to join afterwards.

I think the easier approach is to wrap your functions in a single ReduceFunction. However, you should be aware that the return type of that function needs to be correctly defined. For example, you could use the Row type.

An alternative could be Flink SQL, which supports user-defined scalar and aggregation functions. If you can express your logic in these UDFs, it might be much easier because the optimizer will code-generate the dynamic parts for you.

Best,
Fabian

(In reply to the message from Zhijiang(wangzhijiang999) <[hidden email]>, 2018-06-28 5:23 GMT+02:00.)
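For readers following the thread, here is a minimal sketch of the "single wrapper" idea in plain Java, with no Flink dependency so that it runs on its own. Everything in it is hypothetical: the class name `CompositeReducer`, the `fieldIndex:opName` configuration format, and the use of `double[]` as the record type are assumptions made for illustration only. In a real job the same loop would presumably sit inside a class implementing Flink's `ReduceFunction`, with a properly declared record type such as Row, as suggested above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch: none of these names come from Flink or from the thread.
final class CompositeReducer {

    // One configured aggregation: which field to combine and how.
    static final class FieldOp {
        final int field;
        final DoubleBinaryOperator op;

        FieldOp(int field, DoubleBinaryOperator op) {
            this.field = field;
            this.op = op;
        }
    }

    // Assumed registry from configuration names to combine functions.
    private static final Map<String, DoubleBinaryOperator> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("sum", Double::sum);
        REGISTRY.put("min", Math::min);
        REGISTRY.put("max", Math::max);
    }

    private final List<FieldOp> ops = new ArrayList<>();

    // Parses entries such as "1:sum" (assumed format: fieldIndex:opName).
    CompositeReducer(List<String> config) {
        for (String entry : config) {
            String[] parts = entry.split(":");
            DoubleBinaryOperator op = REGISTRY.get(parts[1]);
            if (op == null) {
                throw new IllegalArgumentException("Unknown op: " + parts[1]);
            }
            ops.add(new FieldOp(Integer.parseInt(parts[0]), op));
        }
    }

    // Same shape as a reduce call: combine two records of one group into one.
    // Fields without a configured op (e.g. the key) are copied from the first record.
    double[] reduce(double[] a, double[] b) {
        double[] out = a.clone();
        for (FieldOp f : ops) {
            out[f.field] = f.op.applyAsDouble(a[f.field], b[f.field]);
        }
        return out;
    }

    public static void main(String[] args) {
        CompositeReducer r = new CompositeReducer(Arrays.asList("1:sum", "2:max"));
        double[] acc = {7, 1.5, 10}; // field 0 stands in for the group key
        acc = r.reduce(acc, new double[] {7, 2.5, 4});
        acc = r.reduce(acc, new double[] {7, 1.0, 12});
        System.out.println(Arrays.toString(acc)); // prints [7.0, 5.0, 12.0]
    }
}
```

Because every configured function is applied in the same pass over a pair of records, the grouped data only needs to be traversed once and no join of separate results is required. The trade-off is that each function must be a pairwise combine on a single field; aggregations that need extra state (an average, for instance) would have to carry that state in additional fields of the record.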