http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Need-help-regarding-Flink-Batch-Application-tp22151p22170.html
Hi Fabian/Chesnay,
Thanks for your quick response. We are using EMR 5.16, which ships Flink 1.5.0.
Source and sink are both S3 (using the flink-s3-fs-hadoop module).
flink run -m yarn-cluster -yn 792 -ys 2 -ytm 14000 -yjm 114736 -p 1584
Parallelism is 1584.
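As a quick sanity check on the numbers in the command above, the requested parallelism equals the total number of slots: 792 TaskManager containers (-yn) with 2 slots each (-ys). A minimal Python sketch of that arithmetic; the variable names are only illustrative and are not Flink settings:

```python
# Values copied from the flink run command above.
task_managers = 792   # -yn: number of TaskManager containers
slots_per_tm = 2      # -ys: slots per TaskManager
parallelism = 1584    # -p: job parallelism

# Total slots available to the job across all TaskManagers.
total_slots = task_managers * slots_per_tm

print("total slots:", total_slots)
print("enough slots for requested parallelism:", total_slots >= parallelism)
```

So every slot is occupied at this parallelism, with no spare slots left over.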
I have tried different values for -yn and -ys, but they did not perform well; the configuration above gives the best performance so far. I am not able to get the execution plan as JSON, so I have added an image from the Flink UI instead.
When creating the cluster on AWS EMR, we use the configuration below:
[{"classification":"hdfs-site","properties":{"dfs.webhdfs.enabled":"True"}},{"classification":"yarn-site","properties":{"yarn.log-aggregation.retain-seconds":"345600","yarn.nodemanager.resource.memory-mb":"116736","yarn.app.mapreduce.am.resource.mb":"2048"}},{"classification":"flink-conf","properties":{"mode":"legacy","akka.lookup.timeout":"120 s","taskmanager.memory.fraction":"0.85","akka.ask.timeout":"120 s","env.java.opts.taskmanager":"-XX:+UseG1GC","akka.startup-timeout":"120 s","akka.client.timeout":"120 s"}}]
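For reference, here is a rough estimate of how the container sizes from the flink run command relate to yarn.nodemanager.resource.memory-mb in the configuration above. This is only a sketch under simplifying assumptions: it ignores YARN rounding container requests up to its allocation increment and any other containers running on a node. The node count it prints is derived arithmetic, not a measured value.

```python
import math

# From the EMR configuration above (yarn-site).
node_memory_mb = 116736      # yarn.nodemanager.resource.memory-mb

# From the flink run command.
tm_memory_mb = 14000         # -ytm: memory per TaskManager container
jm_memory_mb = 114736        # -yjm: memory for the JobManager container
task_managers = 792          # -yn

# Assumption: containers are packed by memory only, with no YARN rounding.
tms_per_node = node_memory_mb // tm_memory_mb
nodes_for_tms = math.ceil(task_managers / tms_per_node)
jm_headroom_mb = node_memory_mb - jm_memory_mb

print("TaskManagers per node:", tms_per_node)
print("nodes needed for TaskManagers:", nodes_for_tms)
print("memory left on the JobManager's node (MB):", jm_headroom_mb)
```

Under these assumptions the JobManager container takes up almost a full node by itself, and the TaskManagers need roughly 99 further nodes.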
Thanks,
Ravi