Hi Experts,
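I recently tried to run yarn-application mode on my YARN cluster and ran into a problem related to configuring `execution.target`. After reading the source code and doing some experiments, I found that there is some room for improvement in `FlinkYarnSessionCli` or `AbstractYarnCli`.

My experiments are:

1. Setting `execution.target: yarn-application` in flink-conf.yaml and running `flink run-application -t yarn-application`: the job runs successfully.
   - `FlinkYarnSessionCli` is not active
   - `GenericCLI` is active
2. Setting `execution.target: yarn-per-job` in flink-conf.yaml and running `flink run-application -t yarn-application`: the job fails.
   - it fails due to `ClusterDeploymentException` [1]
   - `FlinkYarnSessionCli` is active
3. Setting `execution.target: yarn-application` in flink-conf.yaml and running `flink run -t yarn-per-job`: the job runs successfully.
   - `FlinkYarnSessionCli` is not active
   - `GenericCLI` is active
4. Setting `execution.target: yarn-per-job` in flink-conf.yaml and running `flink run -t yarn-per-job`: the job runs successfully.
   - `FlinkYarnSessionCli` is active

From `AbstractYarnCli#isActive` [2] and `FlinkYarnSessionCli#isActive` [3], `FlinkYarnSessionCli` is active when `execution.target` is set to `yarn-per-job` or `yarn-session`.

According to the official Flink documentation [4], I thought the 2nd experiment should also work, but it didn't:

> The --target will overwrite the execution.target specified in the config/flink-config.yaml.

The root cause is that `FlinkYarnSessionCli` only overwrites `execution.target` with `yarn-session` or `yarn-per-job` [5], but not with `yarn-application`.

So, my questions are:

1. Should we use `FlinkYarnSessionCli` in case 2?
2. If we should, how can we improve `FlinkYarnSessionCli` so that we can overwrite `execution.target` via `--target`?

One more improvement: the config description for `execution.target` [6] should include `yarn-application` as well.

[1] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L439-L447
[2] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/cli/AbstractYarnCli.java#L54-L66
[3] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L373-L377
[4] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#selecting-deployment-targets
[5] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L397-L413
[6] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/DeploymentOptions.java#L41-L46

best regards,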
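The activation behaviour reported in this message can be modelled with a small sketch. This is only a simplified model built from the observations in this thread, not Flink's actual code: the class `CliSelectionSketch`, its method names, and the assumption that the YARN CLI is consulted before the generic one are all illustrative.

```java
import java.util.Optional;
import java.util.Set;

/** Simplified, hypothetical model of the CLI selection described in this thread. */
public class CliSelectionSketch {

    // Targets for which the YARN CLI reports itself as active, per the thread.
    private static final Set<String> YARN_CLI_TARGETS = Set.of("yarn-per-job", "yarn-session");

    // Models the YARN CLI check: only the value from flink-conf.yaml is consulted,
    // the -t/--target option is not taken into account at this point.
    static boolean yarnCliIsActive(Optional<String> configuredTarget) {
        return configuredTarget.map(YARN_CLI_TARGETS::contains).orElse(false);
    }

    // Models the generic CLI check: active if a target is set in either place.
    static boolean genericCliIsActive(Optional<String> configuredTarget, Optional<String> cliTarget) {
        return configuredTarget.isPresent() || cliTarget.isPresent();
    }

    // Assumption: the YARN CLI is asked first, which matches the reported experiments.
    static String select(String configuredTarget, String cliTarget) {
        Optional<String> conf = Optional.ofNullable(configuredTarget);
        Optional<String> cli = Optional.ofNullable(cliTarget);
        if (yarnCliIsActive(conf)) {
            return "FlinkYarnSessionCli";
        }
        if (genericCliIsActive(conf, cli)) {
            return "GenericCLI";
        }
        return "DefaultCLI";
    }

    public static void main(String[] args) {
        // {execution.target in flink-conf.yaml, value passed via -t}
        String[][] experiments = {
            {"yarn-application", "yarn-application"},
            {"yarn-per-job", "yarn-application"},
            {"yarn-application", "yarn-per-job"},
            {"yarn-per-job", "yarn-per-job"},
        };
        for (String[] e : experiments) {
            System.out.printf("conf=%s, -t %s -> %s%n", e[0], e[1], select(e[0], e[1]));
        }
    }
}
```

In this model the second combination selects the YARN CLI even though `-t yarn-application` was passed, because the check never looks at the command-line value.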
Hi, Tony.
What is the version of your flink-dist? AFAIK, this issue should be addressed in FLINK-15852 [1]. Could you share the client log of case 2 (setting the log level to DEBUG would be better)?

[1] https://issues.apache.org/jira/browse/FLINK-15852

Best,
Yangze Guo

On Sun, Apr 25, 2021 at 11:33 AM Tony Wei wrote:
> Hi Experts,
>
> I recently tried to run yarn-application mode on my yarn cluster, and I had a problem related to configuring `execution.target`.
> After reading the source code and doing some experiments, I found that there should be some room of improvement for `FlinkYarnSessionCli` or `AbstractYarnCli`.
>
> My experiments are:
>
> setting `execution.target: yarn-application` in flink-conf.yaml and run `flink run-application -t yarn-application`: run job successfully.
>
> `FlinkYarnSessionCli` is not active
> `GenericCLI` is active
>
> setting `execution.target: yarn-per-job` in flink-conf.yaml and run `flink run-application -t yarn-application`: run job failed
>
> failed due to `ClusterDeploymentException` [1]
> `FlinkYarnSessionCli` is active
>
> setting `execution.target: yarn-application` in flink-conf.yaml and run `flink run -t yarn-per-job`: run job successfully.
>
> `FlinkYarnSessionCli` is not active
> `GenericCLI` is active
>
> setting `execution.target: yarn-per-job` in flink-conf.yaml and run `flink run -t yarn-per-job`: run job successfully.
>
> `FlinkYarnSessionCli` is active
>
> From `AbstractYarnCli#isActive` [2] and `FlinkYarnSessionCli#isActive` [3], `FlinkYarnSessionCli` will be active when `execution.target` is specified with `yarn-per-job` or `yarn-session`.
>
> According to the flink official document [4], I thought the 2nd experiment should also work well, but it didn't.
>>
>> The --target will overwrite the execution.target specified in the config/flink-config.yaml.
>
> The root cause is that `FlinkYarnSessionCli` only overwrite the `execution.target` with `yarn-session` or `yarn-per-job` [5], but no `yarn-application`.
> So, my question is
>
> should we use `FlinkYarnSessionCli` in case 2?
> if we should, how we can improve `FlinkYarnSessionCli` so that we can overwrite `execution.target` via `--target`?
>
> and one more improvement, the config description for `execution.target` [6] should include `yarn-application` as well.
>
> [1] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L439-L447
> [2] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/cli/AbstractYarnCli.java#L54-L66
> [3] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L373-L377
> [4] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#selecting-deployment-targets
> [5] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L397-L413
> [6] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/DeploymentOptions.java#L41-L46
>
> best regards,
Hi Tony,

I think you are right that Flink's CLI does not behave very consistently at the moment. Case 2 should definitely work, because `-t yarn-application` should overwrite what is defined in the Flink configuration. The problem seems to be that we don't resolve the configuration with respect to the specified command line options before calling into `CustomCommandLine.isActive`. If we first parsed the command line options that can overwrite flink-conf.yaml options and applied them to the configuration, then the custom command lines (assuming that they use the Configuration as the ground truth) should behave consistently.

For your questions:

1. I am not 100% sure. I think the FlinkYarnSessionCli wasn't used on purpose when introducing the yarn application mode.
2. See answer 1.

I think it is a good idea to extend the description of the config option `execution.target`. Do you want to create a ticket and a PR for it?

Cheers,
Till
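The idea of resolving the configuration before asking each command line whether it is active could look roughly like the following. This is a hypothetical sketch, not a proposal for the actual Flink code: the class `ResolveBeforeSelectSketch`, the use of plain maps in place of Flink's `Configuration`, and the selection order are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: merge command-line overrides first, then select a CLI. */
public class ResolveBeforeSelectSketch {

    static final String TARGET_KEY = "execution.target";

    private static final Set<String> YARN_CLI_TARGETS = Set.of("yarn-per-job", "yarn-session");

    // Command-line options take precedence over values loaded from flink-conf.yaml.
    static Map<String, String> resolve(Map<String, String> flinkConf, Map<String, String> cliOptions) {
        Map<String, String> effective = new HashMap<>(flinkConf);
        effective.putAll(cliOptions);
        return effective;
    }

    // The activation check only sees the already resolved configuration.
    static boolean yarnCliIsActive(Map<String, String> effective) {
        String target = effective.get(TARGET_KEY);
        // Null check first: Set.of(...).contains(null) would throw.
        return target != null && YARN_CLI_TARGETS.contains(target);
    }

    static String select(Map<String, String> flinkConf, Map<String, String> cliOptions) {
        Map<String, String> effective = resolve(flinkConf, cliOptions);
        if (yarnCliIsActive(effective)) {
            return "FlinkYarnSessionCli";
        }
        if (effective.containsKey(TARGET_KEY)) {
            return "GenericCLI";
        }
        return "DefaultCLI";
    }

    public static void main(String[] args) {
        // Case 2 from the thread: yarn-per-job in flink-conf.yaml, -t yarn-application on the CLI.
        Map<String, String> flinkConf = Map.of(TARGET_KEY, "yarn-per-job");
        Map<String, String> cliOptions = Map.of(TARGET_KEY, "yarn-application");
        System.out.println(resolve(flinkConf, cliOptions).get(TARGET_KEY));
        System.out.println(select(flinkConf, cliOptions));
    }
}
```

Because every activation check reads the same merged view, the order in which the command lines are consulted no longer changes which target wins.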
Hi, Till,
I agree that we need to resolve the issue by overriding the configuration before selecting the CustomCommandLines. However, IIUC, after FLINK-15852 the GenericCLI should always be the first choice. Could you help me to understand why the FlinkYarnSessionCli can be activated?

Best,
Yangze Guo
I think you are right that the `GenericCLI` should be the first choice. Off the top of my head, I do not remember why `FlinkYarnSessionCli` is still used. Maybe it is there to support some YARN-specific CLI option parsing. I assume it is either an oversight or some parsing has not been completely migrated to the `GenericCLI`.

Cheers,
Till
If the GenericCLI is selected, then `execution.target` should have been overwritten to "yarn-application" in GenericCLI#toConfiguration. It is odd that GenericCLI#isActive returns false, since `execution.target` is defined both in flink-conf.yaml and on the command line.

Best,
Yangze Guo
Hi Till, Yangze,
I think FLINK-15852 should solve my problem. It is my fault: my Flink version is not 100% consistent with the community version, and FLINK-15852 is the change I was missing. Thanks for the information.

best regards,
Hi Till,

I have created the ticket to extend the description of `execution.target`:
https://issues.apache.org/jira/browse/FLINK-22476

best regards,