Hi,
I am doing pure map computation with typical benchmarks like BlackScholes and NBody. I am using local configuration with multiple threads. It seems like, inside the chuck (total size / numThreads) the order is correct. But the ordering between chunks is not correct, giving an incorrect result at the end. What I mean by the order is, the correct result in the correct position of the array. Is there any way to guarantee the result? Many thanks Juan |
Hi, If you use `partitionCustom()` method [1] with custom partitioner, you can guarantee the order of partition.
Regards, Chiwan Park [1] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/java/DataSet.html#partitionCustom(org.apache.flink.api.common.functions.Partitioner,%20int) > On Jul 15, 2015, at 1:10 AM, Juan Fumero <[hidden email]> wrote: > > Hi, > I am doing pure map computation with typical benchmarks like > BlackScholes and NBody. > > I am using local configuration with multiple threads. It seems like, > inside the chuck (total size / numThreads) the order is correct. But the > ordering between chunks is not correct, giving an incorrect result at > the end. > > What I mean by the order is, the correct result in the correct position > of the array. > > Is there any way to guarantee the result? > > Many thanks > Juan > |
Hi Chiwan,
great thanks. Is there any example available? Regards Juan On Wed, 2015-07-15 at 01:19 +0900, Chiwan Park wrote: > Hi, If you use `partitionCustom()` method [1] with custom partitioner, you can guarantee the order of partition. > > Regards, > Chiwan Park > > [1] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/java/DataSet.html#partitionCustom(org.apache.flink.api.common.functions.Partitioner,%20int) > > > On Jul 15, 2015, at 1:10 AM, Juan Fumero <[hidden email]> wrote: > > > > Hi, > > I am doing pure map computation with typical benchmarks like > > BlackScholes and NBody. > > > > I am using local configuration with multiple threads. It seems like, > > inside the chuck (total size / numThreads) the order is correct. But the > > ordering between chunks is not correct, giving an incorrect result at > > the end. > > > > What I mean by the order is, the correct result in the correct position > > of the array. > > > > Is there any way to guarantee the result? > > > > Many thanks > > Juan > > > > > > |
Sure, here is a example [1] of using `partitionCustom()` method in Java API. Scala API is
similar to Java API. You should implement Partitioner<KEY> interface. The interface has a method called partition with two parameters. The first parameter is key value of each record and the second parameter is number of partitions. The partition method must return the number of partition which the record belongs to. The returned value must be between 0 and numPartitions - 1. There are 3 types of `partitionCustom()` method. The difference between three methods is by method of specifying keys. If you want to know more detail of key specifying method in Flink, please see the documentation [2] in Flink homepage. Regards, Chiwan Park [1] https://gist.github.com/chiwanpark/e71d27cc8edae8bc7298 [2] https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#specifying-keys > On Jul 15, 2015, at 2:40 AM, Juan Fumero <[hidden email]> wrote: > > Hi Chiwan, > great thanks. Is there any example available? > > Regards > Juan > > On Wed, 2015-07-15 at 01:19 +0900, Chiwan Park wrote: >> Hi, If you use `partitionCustom()` method [1] with custom partitioner, you can guarantee the order of partition. >> >> Regards, >> Chiwan Park >> >> [1] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/java/DataSet.html#partitionCustom(org.apache.flink.api.common.functions.Partitioner,%20int) >> >>> On Jul 15, 2015, at 1:10 AM, Juan Fumero <[hidden email]> wrote: >>> >>> Hi, >>> I am doing pure map computation with typical benchmarks like >>> BlackScholes and NBody. >>> >>> I am using local configuration with multiple threads. It seems like, >>> inside the chuck (total size / numThreads) the order is correct. But the >>> ordering between chunks is not correct, giving an incorrect result at >>> the end. >>> >>> What I mean by the order is, the correct result in the correct position >>> of the array. >>> >>> Is there any way to guarantee the result? >>> >>> Many thanks >>> Juan >>> >> >> >> >> > > |
Free forum by Nabble | Edit this page |