your message is very short,i can not read more.the follow is my guss,
in flink,the dataStream is not for iterative computation,the dataSet would be more well.and fink suggest broadcast mini data,not large.
your can load your model data (it can be from file,or table),before main function,andassignment to variable ,like name=yourModel.
and the dataStream(it is a stream,unscored record,like DataStream[String] or DataStream[yourClass]),
and dataStream.map{x=>
val score = computeScore(x,yourModel)
}
object YourObject {
load your model
val yourModel = ;
def main(){
...............
read unscoreed record,from socket or kafka,or ....
dataStream.map{x=>
val score = computeScore(x,yourModel)
}
......
}
}
----- 原始邮件 -----
发件人:Anchit Jatana <
[hidden email]>
收件人:
[hidden email]主题:How to Broadcast a very large model object (used in iterative scoring in recommendation system) in Flink
日期:2016年09月30日 14点15分
Hi All,
I'm building a recommendation system streaming application for which I need to broadcast a very large model object (used in iterative scoring) among all the task managers performing the operation parallely for the operator
I'm doing an this operation in map1 of CoMapFunction. Please suggest me some way to achieve the broadcasting of the large model variable (something similar to what Spark has with broadcast variables).
Thank you
Regards,
Anchit