Hi folks,
I have a question about configuring the new memory model introduced in Flink 1.10. Has anyone encountered a similar problem?
I'm trying to use the taskmanager.memory.process.size configuration key with a Mesos session cluster, but I get an error like this:

2020-03-11 11:44:09,771 [main] ERROR org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - Error while starting the TaskManager
org.apache.flink.configuration.IllegalConfigurationException: Failed to create TaskExecutorResourceSpec
    at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:72)
    at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:356)
    at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.<init>(TaskManagerRunner.java:152)
    at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:308)
    at org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner.lambda$main$0(MesosTaskExecutorRunner.java:106)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner.main(MesosTaskExecutorRunner.java:105)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The required configuration option Key: 'taskmanager.memory.task.heap.size' , default: null (fallback keys: []) is not set
    at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkConfigOptionIsSet(TaskExecutorResourceUtils.java:90)
    at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.lambda$checkTaskExecutorResourceConfigSet$0(TaskExecutorResourceUtils.java:84)
    at java.base/java.util.Arrays$ArrayList.forEach(Arrays.java:4390)
    at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkTaskExecutorResourceConfigSet(TaskExecutorResourceUtils.java:84)
    at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:70)
    ... 9 more

But when the task manager is launched, it correctly parses the process memory key:

2020-03-11 11:43:55,376 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - --------------------------------------------------------------------------------
2020-03-11 11:43:55,377 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - Starting MesosTaskExecutorRunner (Version: 1.10.0, Rev:aa4eb8f, Date:07.02.2020 @ 19:18:19 CET)
2020-03-11 11:43:55,377 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - OS current user: root
2020-03-11 11:43:57,347 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-03-11 11:43:57,535 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - JVM: OpenJDK 64-Bit Server VM - AdoptOpenJDK - 11/11.0.2+9
2020-03-11 11:43:57,535 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - Maximum heap size: 746 MiBytes
2020-03-11 11:43:57,535 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - JAVA_HOME: (not set)
2020-03-11 11:43:57,539 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - Hadoop version: 2.6.5
2020-03-11 11:43:57,539 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - JVM Options:
2020-03-11 11:43:57,539 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - -Xmx781818251
2020-03-11 11:43:57,539 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - -Xms781818251
2020-03-11 11:43:57,540 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - -XX:MaxDirectMemorySize=317424929
2020-03-11 11:43:57,540 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - -XX:MaxMetaspaceSize=100663296
2020-03-11 11:43:57,540 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - -Dlog.file=/var/log/flink-session-cluster/taskmanager.log
2020-03-11 11:43:57,540 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - -Dlog4j.configuration=file:/opt/flink/conf/log4j.properties
2020-03-11 11:43:57,540 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - -Dlogback.configurationFile=file:/opt/flink/conf/logback.xml
2020-03-11 11:43:57,540 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - Program Arguments: (none)
2020-03-11 11:43:57,540 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - Classpath: /opt/flink/lib/apache-log4j-extras-1.2.17.jar:/opt/flink/lib/flink-metrics-graphite-1.10.0.jar:/opt/flink/lib/flink-shaded-hadoop-2-uber-2.6.5-8.0.jar:/opt/flink/lib/flink-table-blink_2.12-1.10.0.jar:/opt/flink/lib/flink-table_2.12-1.10.0.jar:/opt/flink/lib/log4j-1.2.17.jar:/opt/flink/lib/slf4j-log4j12-1.7.15.jar:/opt/flink/lib/flink-dist_2.12-1.10.0.jar:
2020-03-11 11:43:57,541 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - --------------------------------------------------------------------------------
2020-03-11 11:43:57,542 [main] INFO org.apache.flink.mesos.entrypoint.MesosTaskExecutorRunner - Registered UNIX signal handlers for [TERM, HUP, INT]
2020-03-11 11:43:57,550 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.process.size, 2g
2020-03-11 11:43:57,550 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.cpu.cores, 2
2020-03-11 11:43:57,551 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 4
2020-03-11 11:43:57,551 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
...

Judging by the docs, specifying the taskmanager.memory.process.size key should be enough to launch the job, but it seems like this value is ignored.
I would appreciate any suggestions.

Regards and thanks in advance,
Alex.
Hi, Alexander
I could not reproduce it in my local environment. Normally, the Mesos RM will calculate all the memory configs and add them to the launch command. Unfortunately, all the logs I could find for this command are at the DEBUG level. Would you mind changing the log level to DEBUG, or sharing anything about the taskmanager launch command that you can find in the current log?

Best,
Yangze Guo
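To make the "RM will calculate all the memory configs" step concrete, here is a rough Python sketch of how the component sizes could be derived from taskmanager.memory.process.size. It is not Flink's code. The fractions and fixed sizes are assumed to be the Flink 1.10.0 defaults (they are not stated anywhere in this thread), and the function name and the dictionary keys other than 'taskmanager.memory.task.heap.size' are made up for illustration.

```python
MIB = 1024 * 1024


def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def derive_memory(process_size_bytes):
    """Derive memory component sizes from the total process size.

    Every constant below is an ASSUMED Flink 1.10.0 default, not a value
    taken from the thread.
    """
    metaspace = 96 * MIB
    jvm_overhead = clamp(0.1 * process_size_bytes, 192 * MIB, 1024 * MIB)
    total_flink = process_size_bytes - metaspace - jvm_overhead

    network = clamp(0.1 * total_flink, 64 * MIB, 1024 * MIB)
    managed = 0.4 * total_flink
    framework_heap = 128 * MIB
    framework_off_heap = 128 * MIB
    task_off_heap = 0

    # Task heap is whatever remains once everything else is subtracted.
    task_heap = (total_flink - network - managed
                 - framework_heap - framework_off_heap - task_off_heap)

    return {
        "taskmanager.memory.task.heap.size": task_heap,
        "jvm_heap": framework_heap + task_heap,                      # -Xmx / -Xms
        "jvm_direct": framework_off_heap + task_off_heap + network,  # -XX:MaxDirectMemorySize
        "jvm_metaspace": metaspace,                                  # -XX:MaxMetaspaceSize
    }


sizes = derive_memory(2 * 1024 * MIB)  # taskmanager.memory.process.size: 2g
for name, value in sizes.items():
    print(f"{name} = {round(value)} bytes")
```

Under these assumptions, a 2g process size gives a JVM heap and direct memory limit within a few dozen bytes of the -Xmx781818251 and -XX:MaxDirectMemorySize=317424929 values in the log above; the small gap is presumably rounding. That is consistent with the resource manager having done the calculation correctly and the problem lying later, in how the results reach the task manager.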
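Hi Alex,

Could you check and post your TM launch command? I suspect that there might be some unrecognized arguments that prevent the rest of the arguments from being parsed.

The TM memory configuration process works as follows:

- The resource manager parses the configuration, checks which options are configured and which are not, and calculates the size of each memory component. (This is where 'taskmanager.memory.process.size' is used.)
- After deriving the memory component sizes, the resource manager generates the launch command for the task managers, with dynamic configurations "-D <key=value>" overwriting the memory component sizes. Therefore, even if you have not configured 'taskmanager.memory.task.heap.size', this config option is expected to be available by the time the TM is launched.
- When a task manager is started, it does not do the calculations again, and directly reads the memory component sizes calculated by the resource manager from the dynamic configurations. That means it does not read 'taskmanager.memory.process.size' and derive the memory component sizes from it again.

One thing that might have caused your problem is that, when MesosTaskExecutorRunner parses the command line arguments (that's where the dynamic configurations are passed in), it stops parsing the rest of the arguments as soon as it meets an unrecognized token. That could be the reason that 'taskmanager.memory.task.heap.size' is missing. You can take a look at the launch command and see if there's anything unexpected before the memory dynamic configurations.

Thank you~

Xintong Song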
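The "stops at an unrecognized token" behaviour can be reproduced with a small Python stand-in. This is not the parser MesosTaskExecutorRunner uses; it relies on Python's standard getopt module, which happens to have the same property of treating everything after the first non-option argument as unparsed. The function name, the stray token, and the 512m value are hypothetical.

```python
import getopt


def parse_dynamic_properties(argv):
    """Collect -D key=value pairs.

    getopt.getopt stops at the first non-option token and returns
    everything from that point on as unparsed leftovers.
    """
    options, unparsed = getopt.getopt(argv, "D:")
    properties = {}
    for _flag, assignment in options:
        key, _sep, value = assignment.partition("=")
        properties[key] = value
    return properties, unparsed


# Well-formed launch arguments: every -D pair is picked up.
good_args = ["-D", "taskmanager.cpu.cores=2",
             "-D", "taskmanager.memory.task.heap.size=512m"]

# Same arguments with one unexpected token in the middle (hypothetical).
bad_args = ["-D", "taskmanager.cpu.cores=2",
            "stray-token",
            "-D", "taskmanager.memory.task.heap.size=512m"]

good_props, good_rest = parse_dynamic_properties(good_args)
bad_props, bad_rest = parse_dynamic_properties(bad_args)

print(good_props)
print(bad_props, bad_rest)
```

With the stray token present, the heap size option never reaches the parsed properties even though it is on the command line, which would produce exactly the "required configuration option ... is not set" symptom.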
Hi Yangze, Xintong,

Thank you for the instant response.

And big thanks for the hint about the TM launch command. It was indeed the problem. I've added my own custom mesos-taskmanager.sh to echo the launch command (I switched logging to DEBUG level, but it didn't really display anything useful). May I suggest adding something like this in a future release?

As for my particular case, the issue was in this mesos-appmaster.sh option:

-Dmesos.resourcemanager.tasks.taskmanager-cmd="/opt/job/custom_launch_tm.sh"

My custom launch script was slicing the argument array incorrectly.

Thanks for the help and regards,
Alex.
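For anyone wrapping the task manager command in the same way, the two ideas above (forward the arguments untouched, and log the full command before launching) can be sketched in Python as follows. The thread does not show the actual script or how it sliced its arguments, so the buggy variant is only a hypothetical off-by-one; all function names and the 512m value are illustrative.

```python
import shlex


def build_launch_command(entrypoint, wrapper_argv):
    """Forward every argument after the script name (wrapper_argv[0]) unchanged."""
    return [entrypoint, *wrapper_argv[1:]]


def build_launch_command_buggy(entrypoint, wrapper_argv):
    """Hypothetical off-by-one slice: silently drops the first real argument."""
    return [entrypoint, *wrapper_argv[2:]]


def format_for_log(command):
    """Render the command as one shell-quoted line, suitable for a log entry."""
    return " ".join(shlex.quote(part) for part in command)


# What a wrapper script might receive (values are made up for illustration).
argv = ["custom_launch_tm.sh", "-D", "taskmanager.memory.task.heap.size=512m"]

correct = build_launch_command("mesos-taskmanager.sh", argv)
broken = build_launch_command_buggy("mesos-taskmanager.sh", argv)

# Log first, launch second, so a bad command is visible before it fails.
print("Launching:", format_for_log(correct))
print("Buggy variant would launch:", format_for_log(broken))
```

In the buggy variant the leading -D is lost, so the task manager would see a bare key=value token instead of an option. Printing the command before launching makes that kind of mistake visible immediately.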
Glad to hear that your issue is fixed.
I'm not sure what you are suggesting to add. Could you be more specific or create a Jira ticket?

Best,
Yangze Guo
Instead of just launching the TM as it works right now, I suggest logging the launch command first, and then launching the TM. But that might be unnecessary, since the use case is rather specific.

Regards,
Alex.
It seems we already have such logs in [1]. If that is the case, +1 for
changing it to INFO level.

[1] https://github.com/apache/flink/blob/663af45c7f403eb6724852915bf2078241927258/flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/LaunchableMesosWorker.java#L341

Best,
Yangze Guo
BTW, the dynamic config will also appear in the TM-side logs [1]. It would
be good to print it at INFO level as well.

[1] https://github.com/apache/flink/blob/663af45c7f403eb6724852915bf2078241927258/flink-mesos/src/main/java/org/apache/flink/mesos/entrypoint/MesosTaskExecutorRunner.java#L77

Best,
Yangze Guo

On Thu, Mar 12, 2020 at 4:06 PM Yangze Guo <[hidden email]> wrote:
>
> It seems we already have such logs in [1]. If that is the case, +1 for
> changing it to INFO level.
>
> [1] https://github.com/apache/flink/blob/663af45c7f403eb6724852915bf2078241927258/flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/LaunchableMesosWorker.java#L341
>
> Best,
> Yangze Guo
>
> On Thu, Mar 12, 2020 at 4:03 PM Alexander Kasyanenko
> <[hidden email]> wrote:
> >
> > Instead of just launching the TM, as happens right now, I suggest logging the launch command first and then launching the TM. But that might be unnecessary, since the use case is rather specific.
> >
> > Regards,
> > Alex.
> >
> > On Thu, Mar 12, 2020 at 16:58, Yangze Guo <[hidden email]> wrote:
> >>
> >> Glad to hear that your issue is fixed.
> >> I'm not sure what you suggest adding. Could you be more specific
> >> or create a Jira ticket?
> >>
> >> Best,
> >> Yangze Guo
> >>
> >>
> >> On Thu, Mar 12, 2020 at 3:51 PM Alexander Kasyanenko
> >> <[hidden email]> wrote:
> >> >
> >> > Hi Yangze, Xintong,
> >> >
> >> > Thank you for the instant response.
> >> >
> >> > And big thanks for the hint about the TM launch command. It was indeed the problem. I added my own custom mesos-taskmanager.sh to echo the launch command (I had switched logging to DEBUG level, but it didn't display anything useful). May I suggest adding something like this in future releases?
> >> >
> >> > As for my particular case, the issue was in this mesos-appmaster.sh option:
> >> >
> >> > -Dmesos.resourcemanager.tasks.taskmanager-cmd="/opt/job/custom_launch_tm.sh"
> >> >
> >> > My custom launch script was slicing the argument array incorrectly.
> >> >
> >> > Thanks for the help and regards,
> >> > Alex.
> >> >
> >> > On Thu, Mar 12, 2020 at 15:46, Xintong Song <[hidden email]> wrote:
> >> >>
> >> >> Hi Alex,
> >> >>
> >> >> Could you check and post your TM launch command? I suspect there might be some unrecognized arguments that prevent the rest of the arguments from being parsed.
> >> >>
> >> >> The TM memory configuration process works as follows:
> >> >>
> >> >> 1. The resource manager parses the configuration, checks which options are configured and which are not, and calculates the size of each memory component. (This is where 'taskmanager.memory.process.size' is used.)
> >> >> 2. After deriving the memory component sizes, the resource manager generates the launch command for the task managers, with dynamic configurations "-D <key=value>" overwriting the memory component sizes. Therefore, even if you have not configured 'taskmanager.memory.task.heap.size', this config option is expected to be available by the time the TM is launched.
> >> >> 3. When a task manager is started, it does not do the calculations again; it directly reads the memory component sizes calculated by the resource manager from the dynamic configurations. That means it does not read 'taskmanager.memory.process.size' and derive the memory component sizes from it again.
> >> >>
> >> >> One thing that might have caused your problem is that when MesosTaskExecutorRunner parses the command-line arguments (that's where the dynamic configurations are passed in), it stops parsing the rest of the arguments as soon as it meets an unrecognized token. That could be the reason 'taskmanager.memory.task.heap.size' is missing. You can take a look at the launch command and see if there's anything unexpected before the memory dynamic configurations.
> >> >>
> >> >> Thank you~
> >> >>
> >> >> Xintong Song
> >> >>
> >> >>
> >> >>
> >> >> On Thu, Mar 12, 2020 at 2:26 PM Yangze Guo <[hidden email]> wrote:
> >> >>>
> >> >>> Hi, Alexander
> >> >>>
> >> >>> I could not reproduce it in my local environment. Normally, the Mesos RM
> >> >>> will calculate all the memory configs and add them to the launch command.
> >> >>> Unfortunately, all the logs I could find for this command are at the
> >> >>> DEBUG level. Would you mind changing the log level to DEBUG or sharing
> >> >>> anything about the taskmanager launch command you could find in the
> >> >>> current log?
> >> >>>
> >> >>>
> >> >>> Best,
> >> >>> Yangze Guo
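
To make the failure mode from this thread concrete (a parser that stops at the first unrecognized token, fed by a launch script that slices its argument array incorrectly), here is a minimal Python sketch. It is not Flink's actual argument parser: `parse_dynamic_properties` is a hypothetical stand-in for the behaviour Xintong describes, and the memory values in `launch_args` are made up for illustration.

```python
from typing import Dict, List

REQUIRED_KEY = "taskmanager.memory.task.heap.size"


def parse_dynamic_properties(args: List[str]) -> Dict[str, str]:
    """Collect "-D key=value" / "-Dkey=value" pairs from an argument list.

    Hypothetical stand-in, not Flink code: parsing stops at the first
    token that is not a well-formed dynamic property, and everything
    after that token is silently ignored.
    """
    props: Dict[str, str] = {}
    i = 0
    while i < len(args):
        token = args[i]
        if token == "-D" and i + 1 < len(args):
            pair = args[i + 1]      # "-D key=value" form
            i += 2
        elif token.startswith("-D") and len(token) > 2:
            pair = token[2:]        # "-Dkey=value" form
            i += 1
        else:
            break                   # unrecognized token: stop parsing
        key, sep, value = pair.partition("=")
        if not sep:
            break                   # malformed pair: stop parsing
        props[key] = value
    return props


# Illustrative arguments of the kind a resource manager might generate
# (the sizes are invented for this example).
launch_args = [
    "-D", "taskmanager.memory.task.heap.size=512m",
    "-D", "taskmanager.memory.managed.size=256m",
]

# A wrapper script that forwards its arguments unchanged keeps every property.
forwarded = parse_dynamic_properties(launch_args)

# A wrapper script that is off by one when slicing hands the parser a bare
# "key=value" token first, so nothing at all is parsed.
sliced = parse_dynamic_properties(launch_args[1:])

print("forwarded has heap size:", REQUIRED_KEY in forwarded)
print("sliced has heap size:   ", REQUIRED_KEY in sliced)
```

In this sketch the sliced argument list loses every dynamic property, which would match the symptom reported above: the TM starts, loads `taskmanager.memory.process.size` from the config file, and then fails because `taskmanager.memory.task.heap.size` never arrived. Echoing the full argument list from a custom launch script before starting the TM, as Alex did, is a quick way to spot this.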