All: I'm running a Flink 0.10.2 application by submitting it to YARN. I'm using an AWS EMR cluster with 1 master and 10 d2.8xlarge nodes. When I submit the job using: I'm seeing this error:
The error message does not seem to be conveying the correct information. Can someone explain what reasonable values are for taskmanager.network.numberOfBuffers and taskmanager.network.bufferSizeInBytes?

I've read this: https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers
and this: http://stackoverflow.com/questions/33589710/flink-cluster-params-how-to-set

But I am still unclear on the calculation. Is it supposed to be #cores ^ 2 * #machines * 4? In my case that gives 36 ^ 2 * 10 * 4 = 51840.

Thanks in advance for any help you can provide.
Hi Sourigna,

you are using the formula correctly: #cores should be translated into slots per TaskManager (TM), and #machines into the number of TMs. So 36 ^ 2 * 10 * 4 = 51840 appears to be right.

The constant 4 refers to the total number of concurrently active full network shuffles (partitioning or broadcasting). If your job is more complex, e.g., it has several inputs which are joined, reduced, etc., the constant needs to be adapted accordingly.

ExecutionEnvironment env = ...
env.getConfig().setExecutionMode(ExecutionMode.BATCH);

Best, Fabian

2016-03-03 22:27 GMT+01:00 Sourigna Phetsarath <[hidden email]>:
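For anyone who wants to redo the arithmetic with their own numbers, the formula from this thread can be sketched in plain Java as below. The class and method names are made up for illustration and are not part of Flink. The buffer size of 32768 bytes is only an assumed value used to show the memory estimate; check taskmanager.network.bufferSizeInBytes in your own flink-conf.yaml.

```java
// Illustrative helper, not a Flink API: it only evaluates the rule of thumb
// discussed in this thread (slots per TM ^ 2 * number of TMs * shuffles).
final class NetworkBufferEstimate {

    // slotsPerTaskManager: slots per TM (the "#cores" in the formula)
    // taskManagers:        number of TMs (the "#machines" in the formula)
    // concurrentShuffles:  concurrently active full network shuffles (4 in the thread)
    static long estimateBuffers(int slotsPerTaskManager, int taskManagers, int concurrentShuffles) {
        return (long) slotsPerTaskManager * slotsPerTaskManager * taskManagers * concurrentShuffles;
    }

    // Memory taken by that many buffers, for a given buffer size in bytes.
    static long bufferMemoryBytes(long buffers, int bufferSizeInBytes) {
        return buffers * bufferSizeInBytes;
    }

    public static void main(String[] args) {
        int assumedBufferSizeInBytes = 32768; // assumption, see the note above

        long buffers = estimateBuffers(36, 10, 4);
        long bytes = bufferMemoryBytes(buffers, assumedBufferSizeInBytes);

        System.out.println("taskmanager.network.numberOfBuffers: " + buffers);
        System.out.println("approx. buffer memory (MiB): " + bytes / (1024 * 1024));
    }
}
```

With 36 slots, 10 TMs and the constant 4, this gives 51840 buffers, matching the number in the question. A job with more concurrently active shuffles would pass a larger third argument.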