Hi,
I think there is no difference between JobVertex(A) and JobVertex(B). JobVertex(C) is not shown in the right graph, which may be misleading. For each parallel subtask there should be another intermediate result partition between JobVertex(B) and JobVertex(C), just as there is for JobVertex(A).
Cheers, Zhijiang
|
Hi, if the output is the same, why isn't a single intermediate data set enough?
2017-03-14 14:36 GMT+08:00 Zhijiang(wangzhijiang999) <[hidden email]>:
|
In reply to this post by Zhijiang(wangzhijiang999)
Hi lining,
The JobGraph level describes the logical topology. There is one IntermediateDataSet between each producer and consumer, for example A-IntermediateDataSet-B and A-IntermediateDataSet-D in the left graph. The same holds for B-IntermediateDataSet-C and B-IntermediateDataSet-D, although the IntermediateDataSet between B and D is not shown separately in the left graph.
The ExecutionGraph level describes the physical runtime. There is one IntermediateResultPartition between each pair of connected parallel ExecutionVertex instances, for example A1-IntermediateResultPartition-B1, A1-IntermediateResultPartition-B2, A2-IntermediateResultPartition-B1 and A2-IntermediateResultPartition-B2 in the right graph.
Cheers, Zhijiang
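To make the two levels concrete, here is a small toy sketch that only enumerates the connections listed above. It is not Flink code and does not use any Flink API: the function names, the edge list and the parallelism of 2 per vertex are assumptions for illustration, and it assumes every producer subtask is connected to every consumer subtask, as in the A/B example.

```python
from itertools import product

# Logical edges (producer, consumer) as described for the left graph.
# This is a hand-written toy topology, not something read from Flink.
job_edges = [("A", "B"), ("A", "D"), ("B", "C"), ("B", "D")]

# Assumed parallelism of 2 per vertex (the thread only names A1, A2, B1, B2).
parallelism = {"A": 2, "B": 2, "C": 2, "D": 2}


def intermediate_data_sets(edges):
    """JobGraph level: one logical data set per producer/consumer edge."""
    return [f"{producer}-IntermediateDataSet-{consumer}" for producer, consumer in edges]


def intermediate_result_partitions(edges, parallelism):
    """ExecutionGraph level: one entry per connected pair of parallel subtasks.

    Assumes all-to-all connections between producer and consumer subtasks.
    """
    connections = []
    for producer, consumer in edges:
        producer_ids = range(1, parallelism[producer] + 1)
        consumer_ids = range(1, parallelism[consumer] + 1)
        for i, j in product(producer_ids, consumer_ids):
            connections.append(
                f"{producer}{i}-IntermediateResultPartition-{consumer}{j}"
            )
    return connections


logical = intermediate_data_sets(job_edges)
physical_a_to_b = intermediate_result_partitions([("A", "B")], parallelism)

print(logical)
print(physical_a_to_b)
```

With these assumptions, the four logical edges give four IntermediateDataSet entries, while the single A-to-B edge alone expands into the four subtask-level connections named in the post.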