http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Checking-actual-config-values-used-by-TaskManager-tp6567p6699.html
Hi Ken,
When you're running Yarn, the Flink configuration is created once and
shared among all nodes (JobManager and TaskManagers). Please have a
look at the JobManager tab on the web interface. It shows you the
configuration.
I’ve seen that, but the values displayed don’t match what I’m setting, or what I see in the logs.
I’m running a job using ./bin/flink run, with parameters:
-ytm 20000 \
-yjm 2048 \
-ys 4 \
-p 10 \
-yD taskmanager.network.numberOfBuffers=3000 \
-yD taskmanager.memory.off-heap=true
Here’s a screenshot from the JobManager:
If that doesn’t come through, it’s showing:
jobmanager.heap.mb 256
taskmanager.heap.mb 512
taskmanager.memory.off-heap true
taskmanager.network.numberOfBuffers 3000
taskmanager.numberOfTaskSlots 1
So numberOfBuffers seems right, same with memory.off-heap.
But taskmanager.heap.mb looks like a default value, same for numberOfTaskSlots and jobmanager.heap.mb
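To make that comparison explicit, here is a minimal sketch that diffs the requested values against the ones the UI displays. The mapping from the short flags (-yjm, -ytm, -ys) to config keys is my assumption, taken from the comparison above, and `mismatches` is just a hypothetical helper name, not anything Flink provides.

```python
# Values requested on the command line, keyed by the config name they are
# being compared against (this flag-to-key mapping is an assumption).
requested = {
    "jobmanager.heap.mb": "2048",                   # -yjm 2048
    "taskmanager.heap.mb": "20000",                 # -ytm 20000
    "taskmanager.numberOfTaskSlots": "4",           # -ys 4
    "taskmanager.network.numberOfBuffers": "3000",  # -yD
    "taskmanager.memory.off-heap": "true",          # -yD
}

# Values shown on the JobManager tab, as listed in this message.
displayed = {
    "jobmanager.heap.mb": "256",
    "taskmanager.heap.mb": "512",
    "taskmanager.memory.off-heap": "true",
    "taskmanager.network.numberOfBuffers": "3000",
    "taskmanager.numberOfTaskSlots": "1",
}


def mismatches(requested, displayed):
    """Return {key: (requested, displayed)} for every key whose values differ."""
    return {
        key: (value, displayed.get(key))
        for key, value in requested.items()
        if displayed.get(key) != value
    }


if __name__ == "__main__":
    for key, (want, got) in sorted(mismatches(requested, displayed).items()):
        print(f"{key}: requested {want}, UI shows {got}")
```

Only the two -yD settings agree; the three values passed through the short flags are the ones that differ.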
When I look at my actual job, the settings I’m seeing for number of slots (as an example) match what I’m specifying from the command line.
When I look at the JobManager logs, I see -Xmx1448M, which I guess is an approximation of the 2048 I specified.
And when I look at the TaskManager logs, the JVM settings match what I’d expect (for -ytm 20000, so 15GB direct, and about 5GB for the JVM).
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - JVM Options:
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - -Xms4500m
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - -Xmx4500m
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - -XX:MaxDirectMemorySize=15000m
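As a sanity check on those numbers, here is a small sketch that pulls the memory flags out of the log lines above and adds them up. The regular expression and the `jvm_memory_mb` helper are my own, and they assume the sizes are always given in megabytes, as they are in this log.

```python
import re

LOG = """\
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - JVM Options:
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - -Xms4500m
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - -Xmx4500m
2016-05-05 01:07:16,161 INFO org.apache.flink.yarn.YarnTaskManagerRunner - -XX:MaxDirectMemorySize=15000m
"""

# Matches -Xms, -Xmx or -XX:MaxDirectMemorySize= followed by a size in megabytes.
OPTION = re.compile(r"-(Xms|Xmx|XX:MaxDirectMemorySize=)(\d+)[mM]\b")


def jvm_memory_mb(log_text):
    """Return {option name: megabytes} for the memory flags found in the log."""
    return {
        match.group(1).rstrip("="): int(match.group(2))
        for match in OPTION.finditer(log_text)
    }


if __name__ == "__main__":
    mem = jvm_memory_mb(LOG)
    total = mem["Xmx"] + mem["XX:MaxDirectMemorySize"]
    print(mem)
    print(f"heap + direct = {total} MB of the 20000 MB requested with -ytm")
```

Heap plus direct memory comes to 19500 MB, so these flags account for all but 500 MB of the 20000 MB requested.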
So I guess I’ve got two questions…
1. What is the meaning of the values I’m seeing in the JobManager UI?
2. How do I figure out what value the TaskManager is actually getting for a setting such as -yD taskmanager.tmp.dirs?
Thanks,
— Ken
On Fri, Apr 29, 2016 at 3:18 PM, Ken Krugler <[hidden email]> wrote:
Hi Timur,
On Apr 28, 2016, at 10:40pm, Timur Fayruzov <[hidden email]> wrote:
If you're talking about parameters that were set on JVM startup, then
`ps aux | grep flink` on an EMR slave node should do the trick; that'll
give you the full command line.
No, I’m talking about values that come from flink-conf.yaml.
Maybe there’s no good reason to worry, but in Hadoop land you can have
parameters set via the conf on the client, which in turn get overridden by
values from conf files on the nodes, which you can then override via command
line parameters, which in turn can be changed by the user code.
Plus parameters that can be flagged as final/unmodifiable, and thus some of
the above actually don’t change anything.
So it’s a common issue where what you think you set as a value isn’t
actually being used, and that’s why examining the job conf that was actually
deployed with tasks is critical.
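To illustrate the kind of layering I mean, here is a toy sketch. It is not Hadoop's actual Configuration API; the `resolve` function, the layer names and the `example.*` keys are all hypothetical. Later layers override earlier ones, except that once a layer flags a key as final, later changes to it are silently dropped.

```python
def resolve(layers):
    """Apply (name, settings, finals) layers in order.

    Later layers override earlier ones, but once a layer has marked a key
    as final, any later attempt to set it is recorded and ignored.
    """
    effective, locked, ignored = {}, set(), []
    for name, settings, finals in layers:
        for key, value in settings.items():
            if key in locked:
                ignored.append((name, key, value))
            else:
                effective[key] = value
        locked |= set(finals)
    return effective, ignored


# Hypothetical layers, in the order described above.
layers = [
    ("client conf", {"example.sort.mb": "100", "example.tmp.dir": "/tmp"}, set()),
    ("node conf files", {"example.sort.mb": "200", "example.tmp.dir": "/mnt/tmp"},
     {"example.tmp.dir"}),  # the node marks this key as final
    ("command line", {"example.sort.mb": "400"}, set()),
    ("user code", {"example.tmp.dir": "/scratch"}, set()),
]

if __name__ == "__main__":
    effective, ignored = resolve(layers)
    print("effective:", effective)
    for name, key, value in ignored:
        print(f"ignored: {name} tried to set {key}={value}")
```

In this toy run the user code's value for the tmp dir never takes effect, which is exactly the situation that is invisible unless you can inspect the deployed configuration.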
— Ken
On Thu, Apr 28, 2016 at 9:00 PM, Ken Krugler <[hidden email]> wrote:
Hi all,
I’m running jobs on EMR via YARN, and wondering how to check exactly what
configuration settings are actually being used.
This is mostly for the TaskManager.
I know I can modify the conf/flink-conf.yaml file, and (via the CLI) I can
use -yD param=value.
But my experience with Hadoop makes me want to see the exact values being
used, versus assuming I know what’s been set :)
Thanks,
— Ken
--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
--------------------------