I’m currently trying to programmatically create a Flink cluster on a given YARN cluster. I’m using the FlinkYarnClientBase class to do this, with some limitations (Flink version 1.0.3).

I want to pass in my own YARN configuration so that I can deploy Flink on different YARN clusters. Currently the configuration appears to be picked up from an environment variable YARN_CONF, or, if I add the YARN configuration files to the classpath, it works for that one cluster.

Is there a way I can dynamically build the YARN configuration and Flink configuration without using the file system? It would be nice if we had the option to use the same Flink client code and have the CLI be a wrapper that adds all the environment variable handling, so that when we need to create these clusters programmatically we can do that too.

I’ve started looking at the new code in Flink master and it seems a little easier than in 1.0.3, but the YarnConfiguration is private in AbstractYarnClusterDescriptor, which makes it difficult to set the configuration programmatically. I’m not sure if the new client code still throws an exception if the Flink config doesn’t exist, even though I could build out the Flink config programmatically.

Is there a recommended approach to allow this?

Benjamin
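The split asked for above (one client that accepts in-memory configuration, plus a thin CLI-style wrapper that is the only code touching the environment) could look roughly like the sketch below. It is only an illustration: `ClusterClient`, `ConfLoader` and `fromEnvironment` are hypothetical names, none of them are Flink or Hadoop classes, and plain maps stand in for the real configuration objects.

```java
import java.util.HashMap;
import java.util.Map;

public class ProgrammaticDeploySketch {

    /** Hypothetical stand-in for a deployment client; NOT a Flink class. */
    static final class ClusterClient {
        private final Map<String, String> yarnConf;
        private final Map<String, String> flinkConf;

        ClusterClient(Map<String, String> yarnConf, Map<String, String> flinkConf) {
            // Copy the maps so later changes by the caller do not leak in.
            this.yarnConf = new HashMap<>(yarnConf);
            this.flinkConf = new HashMap<>(flinkConf);
        }

        /** Summarises which cluster this client would target. */
        String describe() {
            return "rm=" + yarnConf.get("yarn.resourcemanager.address")
                    + " slots=" + flinkConf.get("taskmanager.numberOfTaskSlots");
        }
    }

    /** Hypothetical hook that turns a config directory into key/value pairs. */
    interface ConfLoader {
        Map<String, String> load(String confDir);
    }

    /** CLI-style wrapper: the only place that looks at environment variables. */
    static ClusterClient fromEnvironment(Map<String, String> env,
                                         ConfLoader loader,
                                         Map<String, String> flinkConf) {
        // Variable name taken from the post above; adjust to your setup.
        String confDir = env.get("YARN_CONF");
        if (confDir == null) {
            throw new IllegalStateException("YARN_CONF is not set");
        }
        return new ClusterClient(loader.load(confDir), flinkConf);
    }

    public static void main(String[] args) {
        Map<String, String> flinkConf = new HashMap<>();
        flinkConf.put("taskmanager.numberOfTaskSlots", "4");

        // Programmatic path: build the YARN settings in memory, one map per cluster.
        for (String rm : new String[] {"rm-a:8032", "rm-b:8032"}) {
            Map<String, String> yarnConf = new HashMap<>();
            yarnConf.put("yarn.resourcemanager.address", rm);
            System.out.println(new ClusterClient(yarnConf, flinkConf).describe());
        }
    }
}
```

With this shape, deploying to a second YARN cluster only means building a second map; nothing has to be written to disk or placed on the classpath.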
What you describe seems to be correct and I am not aware of a "recommended approach". It's currently not well supported to programmatically create Flink on YARN clusters (like providing a YarnExecutionEnvironment or so).

I think that Stephan and Max (cc'd) are coordinating some work, which should add support for what you describe. Max worked on the current YARN client refactoring and he can probably share some of his insights regarding this.

On Tue, Aug 2, 2016 at 11:09 PM, Bostow, Ben <[hidden email]> wrote:
> Is there a recommended approach to allow this?
Hi Benjamin,
Apologies for the late reply. In the latest code base, and also in Flink 1.1.1, the Flink configuration doesn't have to be loaded from a file location read from an environment variable, and the client doesn't throw an exception if it can't find the config upfront (phew). Instead, you can set the configuration manually via `YarnClusterDescriptor.setFlinkConfiguration(Configuration config)`.

As for the Yarn configuration, I'm sorry, that is still the same. You're right that we should provide a hook to change the Yarn configuration. Probably just changing the field to 'protected' would let people easily change the configuration. There is no CLI utility to set config variables yet.

I've created a JIRA issue: https://issues.apache.org/jira/browse/FLINK-4416

Cheers,
Max

On Wed, Aug 3, 2016 at 3:29 PM, Ufuk Celebi <[hidden email]> wrote:
> What you describe seems to be correct and I am not aware of a
> "recommended approach".
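To make the "protected field" idea from the reply above concrete, here is a minimal sketch of how such a hook would let a subclass swap in its own YARN settings. It assumes nothing about Flink's actual classes: `BaseDescriptor` and `CustomDescriptor` are hypothetical stand-ins, a plain map replaces `YarnConfiguration`, and the default address is a placeholder value.

```java
import java.util.HashMap;
import java.util.Map;

public class ProtectedConfHookSketch {

    /** Hypothetical stand-in; NOT Flink's AbstractYarnClusterDescriptor. */
    static class BaseDescriptor {
        // 'protected' rather than 'private', so a subclass can adjust it.
        protected Map<String, String> yarnConf = loadDefaults();

        // Stands in for whatever the base class reads from the environment.
        private static Map<String, String> loadDefaults() {
            Map<String, String> conf = new HashMap<>();
            conf.put("yarn.resourcemanager.address", "0.0.0.0:8032"); // placeholder
            return conf;
        }

        String resourceManager() {
            return yarnConf.get("yarn.resourcemanager.address");
        }
    }

    /** Subclass that layers caller-supplied settings over the defaults. */
    static class CustomDescriptor extends BaseDescriptor {
        CustomDescriptor(Map<String, String> overrides) {
            // The base class has already initialised yarnConf at this point.
            yarnConf = new HashMap<>(yarnConf);
            yarnConf.putAll(overrides);
        }
    }

    public static void main(String[] args) {
        Map<String, String> overrides = new HashMap<>();
        overrides.put("yarn.resourcemanager.address", "rm-b:8032");

        System.out.println(new BaseDescriptor().resourceManager());
        System.out.println(new CustomDescriptor(overrides).resourceManager());
    }
}
```

A setter on the descriptor, mirroring `setFlinkConfiguration`, would be an alternative that avoids subclassing; which one Flink ends up offering is for FLINK-4416 to decide.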