How about going for an optional parameter on the InputFormat that controls how many splits each region is divided into? That would be a lightweight way to control the number of splits with low effort (on our side).

2014-11-05 0:01 GMT+01:00 Flavio Pompermaier <[hidden email]>:
Just shared the example at https://github.com/okkam-it/flink-mongodb-test and tweeted about it :)
The next step is to show how to write the result of a Flink job back to Mongo. How can I do that? Can someone help me?

On Wed, Nov 5, 2014 at 1:17 PM, Fabian Hueske <[hidden email]> wrote:
Any help here..?
On Wed, Nov 5, 2014 at 6:39 PM, Flavio Pompermaier <[hidden email]> wrote:
Hi Flavio!

I think the general method is the same as with the inputs: you use the "HadoopOutputFormat" wrapping the "MongoOutputFormat" (https://github.com/mongodb/mongo-hadoop/blob/master/core/src/main/java/com/mongodb/hadoop/mapred/MongoOutputFormat.java).

You can then call:

    DataSet<Tuple2<BSONWritable, BSONWritable>> data = ...;
    data.output(mongoOutput);

Greetings,
Stephan

On Thu, Nov 6, 2014 at 3:41 PM, Flavio Pompermaier <[hidden email]> wrote:
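[For reference, a minimal end-to-end sketch of the wiring Stephan describes above. This is a sketch only: the class name, Mongo URIs, and collection names are assumptions, and the package of Flink's Hadoop-compatibility wrappers differs between Flink versions.]

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    // Depending on the Flink version, these wrappers live in
    // org.apache.flink.hadoopcompatibility.mapred or org.apache.flink.api.java.hadoop.mapred.
    import org.apache.flink.hadoopcompatibility.mapred.HadoopInputFormat;
    import org.apache.flink.hadoopcompatibility.mapred.HadoopOutputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import com.mongodb.hadoop.io.BSONWritable;
    import com.mongodb.hadoop.mapred.MongoInputFormat;
    import com.mongodb.hadoop.mapred.MongoOutputFormat;
    import com.mongodb.hadoop.util.MongoConfigUtil;

    public class MongoRoundTrip {
      public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // One JobConf carries both the input and output collection URIs (URIs are assumptions).
        JobConf conf = new JobConf();
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/test.input");
        MongoConfigUtil.setOutputURI(conf, "mongodb://localhost:27017/test.output");

        // Read (id, document) pairs from Mongo through the mapred MongoInputFormat.
        HadoopInputFormat<BSONWritable, BSONWritable> mongoIn =
            new HadoopInputFormat<BSONWritable, BSONWritable>(
                new MongoInputFormat(), BSONWritable.class, BSONWritable.class, conf);
        DataSet<Tuple2<BSONWritable, BSONWritable>> data = env.createInput(mongoIn);

        // Write the pairs back through the mapred MongoOutputFormat, wrapped
        // in Flink's HadoopOutputFormat as described above.
        HadoopOutputFormat<BSONWritable, BSONWritable> mongoOut =
            new HadoopOutputFormat<BSONWritable, BSONWritable>(
                new MongoOutputFormat<BSONWritable, BSONWritable>(), conf);
        data.output(mongoOut);

        env.execute("mongo round trip");
      }
    }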
I'm trying to do that, but I can't find the proper typing. For example:
    DataSet<String> fin = input.map(new MapFunction<Tuple2<BSONWritable, BSONWritable>, String>() {
      private static final long serialVersionUID = 1L;

      @Override
      public String map(Tuple2<BSONWritable, BSONWritable> record) throws Exception {
        BSONWritable value = record.getField(1);
        BSONObject doc = value.getDoc();
        BasicDBObject jsonld = (BasicDBObject) doc.get("jsonld");
        String type = jsonld.getString("@type");
        return type;
      }
    });

    MongoConfigUtil.setOutputURI(hdIf.getJobConf(), "mongodb://localhost:27017/test.test");
    fin.output(new HadoopOutputFormat<BSONWritable, BSONWritable>(
        new MongoOutputFormat<BSONWritable, BSONWritable>(), hdIf.getJobConf()));

Obviously this doesn't work because I'm emitting Strings while the output format expects BSONWritable. Can you show me a simple working example?

Best,
Flavio

On Thu, Nov 6, 2014 at 3:58 PM, Stephan Ewen <[hidden email]> wrote:
Hi!

Can you:
- either return a BSONWritable from the function,
- or type the output format to String?

The MongoRecordWriter can work with non-BSON objects as well: https://github.com/mongodb/mongo-hadoop/blob/master/core/src/main/java/com/mongodb/hadoop/mapred/output/MongoRecordWriter.java

Stephan

On Thu, Nov 6, 2014 at 4:12 PM, Flavio Pompermaier <[hidden email]> wrote:
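[A sketch of the first suggestion above, reusing the input DataSet and hdIf from the earlier snippet: map to a (key, BSONWritable) pair so the element type matches the output format's type parameters. The key choice and the shape of the output document are made up for illustration, not taken from the thread.]

    // Sketch only: emit writables so the DataSet's element type matches the output format.
    // Additionally needs: import org.apache.hadoop.io.Text;
    DataSet<Tuple2<Text, BSONWritable>> typed = input.map(
        new MapFunction<Tuple2<BSONWritable, BSONWritable>, Tuple2<Text, BSONWritable>>() {
          private static final long serialVersionUID = 1L;

          @Override
          public Tuple2<Text, BSONWritable> map(Tuple2<BSONWritable, BSONWritable> record) throws Exception {
            BSONObject doc = record.f1.getDoc();
            BasicDBObject jsonld = (BasicDBObject) doc.get("jsonld");
            String type = jsonld.getString("@type");

            // Wrap the extracted value in a new BSON document before emitting it
            // (document layout is an illustrative choice).
            BasicDBObject out = new BasicDBObject("type", type);
            return new Tuple2<Text, BSONWritable>(new Text(type), new BSONWritable(out));
          }
        });

    MongoConfigUtil.setOutputURI(hdIf.getJobConf(), "mongodb://localhost:27017/test.test");
    typed.output(new HadoopOutputFormat<Text, BSONWritable>(
        new MongoOutputFormat<Text, BSONWritable>(), hdIf.getJobConf()));

[This matches the HadoopOutputFormat<Text, BSONWritable> typing that the working snippet below ends up using.]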
I managed to write back to Mongo using this:
    MongoConfigUtil.setOutputURI(hdIf.getJobConf(), "mongodb://localhost:27017/test.testData");

    // emit result (this works only locally)
    fin.output(new HadoopOutputFormat<Text, BSONWritable>(
        new MongoOutputFormat<Text, BSONWritable>(), hdIf.getJobConf()));

So I also updated the example at https://github.com/okkam-it/flink-mongodb-test :)

On Thu, Nov 6, 2014 at 5:10 PM, Stephan Ewen <[hidden email]> wrote:
Hi Flavio!

Looks very nice :-) Is the repository going to stay for a while? Can we link to your example from the Flink website?

Stephan

On Fri, Nov 7, 2014 at 5:01 PM, Flavio Pompermaier <[hidden email]> wrote:
Yes, it will stay there as long as it works :)
However, I think it would be even better if you brought it into the official Flink examples!

Best,
Flavio

On Mon, Nov 10, 2014 at 5:06 PM, Stephan Ewen <[hidden email]> wrote: