Programmatically Creating a Flink Cluster On YARN

Programmatically Creating a Flink Cluster On YARN

Bostow, Ben

I’m trying to programmatically create a Flink cluster on a given YARN cluster. I’m currently using the FlinkYarnClientBase class (Flink version 1.0.3) to do this, with some limitations.


I want to pass in my own YARN configuration so that I can deploy Flink on different YARN clusters. Currently the configuration appears to be picked up from the environment variable YARN_CONF_DIR, or from YARN configuration files added to the classpath, which only works for that one cluster.


Is there a way to dynamically build the YARN configuration and the Flink configuration without using the file system? It would be nice to have the option of using the same Flink client code in both cases: the CLI would be a wrapper that adds all the environment variable handling, and when we need to create clusters programmatically, we could do that too.
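
To make the split I have in mind concrete, here is a minimal sketch using only the JDK. Every class name in it (`ClusterClientSketch`, `CliWrapperSketch`, `ProgrammaticDeploySketch`) is hypothetical and is not part of Flink or Hadoop; plain maps stand in for the real YARN and Flink configuration objects, and the `confDirLoader` parameter is an assumed placeholder for whatever would parse the files in a configuration directory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical core client: every setting is passed in; it never reads env vars or files. */
final class ClusterClientSketch {
    private final Map<String, String> yarnConf;
    private final Map<String, String> flinkConf;

    ClusterClientSketch(Map<String, String> yarnConf, Map<String, String> flinkConf) {
        // Defensive copies, so later changes by the caller do not leak in.
        this.yarnConf = new HashMap<>(yarnConf);
        this.flinkConf = new HashMap<>(flinkConf);
    }

    /** Stand-in for a real deployment call: only reports what would be targeted. */
    String describeDeployment() {
        return "deploy to " + yarnConf.get("yarn.resourcemanager.address")
                + " with " + flinkConf.size() + " Flink setting(s)";
    }
}

/** Hypothetical CLI wrapper: the only layer that looks at the environment. */
final class CliWrapperSketch {
    static ClusterClientSketch fromEnvironment(
            Map<String, String> env,
            Function<String, Map<String, String>> confDirLoader) {
        String confDir = env.get("YARN_CONF_DIR");
        // Without a conf dir the wrapper starts from an empty YARN configuration.
        Map<String, String> yarnConf =
                confDir == null ? new HashMap<String, String>() : confDirLoader.apply(confDir);
        return new ClusterClientSketch(yarnConf, new HashMap<String, String>());
    }
}

final class ProgrammaticDeploySketch {
    public static void main(String[] args) {
        // Programmatic path: build both configurations in memory, one pair per target cluster.
        Map<String, String> yarnConf = new HashMap<>();
        yarnConf.put("yarn.resourcemanager.address", "rm-a.example.com:8032");
        Map<String, String> flinkConf = new HashMap<>();
        flinkConf.put("taskmanager.numberOfTaskSlots", "4");
        System.out.println(new ClusterClientSketch(yarnConf, flinkConf).describeDeployment());
    }
}
```

The point of the layering is that only the wrapper knows about environment variables, so the same client class could serve both the CLI and code that targets several clusters from one JVM.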

 

I’ve started looking at the new code in Flink master. It seems a little easier than in 1.0.3, but the YarnConfiguration field is private in AbstractYarnClusterDescriptor, which makes it difficult to set the configuration programmatically. I’m also not sure whether the new client code still throws an exception if the Flink config doesn’t exist, even though I could build the Flink config programmatically.


Is there a recommended approach to allow this?

 

Benjamin

Re: Programmatically Creating a Flink Cluster On YARN

Ufuk Celebi
What you describe seems to be correct, and I am not aware of a
"recommended approach". Programmatically creating Flink on YARN
clusters (for example, by providing a YarnExecutionEnvironment or
similar) is currently not well supported.

I think that Stephan and Max (cc'd) are coordinating some work that
should add support for what you describe. Max worked on the current
YARN client refactoring and can probably share some of his insights
on this.


On Tue, Aug 2, 2016 at 11:09 PM, Bostow, Ben <[hidden email]> wrote:

> [...]

Re: Programmatically Creating a Flink Cluster On YARN

Maximilian Michels
Hi Benjamin,

Apologies for the late reply. In the latest code base, and also in
Flink 1.1.1, the Flink configuration no longer has to be loaded from a
file location read from an environment variable, and the client doesn't
throw an exception if it can't find the config upfront (phew). Instead,
you can set the configuration manually via
`YarnClusterDescriptor.setFlinkConfiguration(Configuration config)`.
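
As a rough illustration of that behaviour, here is a JDK-only sketch. `DescriptorConfigSketch` is a hypothetical class, not Flink's `YarnClusterDescriptor`; it borrows the setter's name only for readability, uses a plain map in place of Flink's `Configuration`, and the lookup order it shows (explicit configuration first, then whatever was discovered, then an empty configuration) is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical descriptor that mimics only the lookup order described above. */
final class DescriptorConfigSketch {
    private Map<String, String> explicitConfig; // stays null until the setter is called

    /** Named after the real setter, but this class is NOT Flink's YarnClusterDescriptor. */
    void setFlinkConfiguration(Map<String, String> config) {
        this.explicitConfig = new HashMap<>(config);
    }

    /**
     * @param discovered settings found via the environment, or null if none were found
     * @return the explicit configuration if set, else the discovered one, else an empty map
     */
    Map<String, String> effectiveConfiguration(Map<String, String> discovered) {
        if (explicitConfig != null) {
            return explicitConfig;
        }
        // A missing config is not an error: fall back to an empty configuration.
        return discovered != null ? discovered : new HashMap<String, String>();
    }

    public static void main(String[] args) {
        DescriptorConfigSketch descriptor = new DescriptorConfigSketch();
        // Nothing set and nothing discovered: an empty configuration, no exception.
        System.out.println(descriptor.effectiveConfiguration(null).isEmpty());

        Map<String, String> config = new HashMap<>();
        config.put("jobmanager.rpc.port", "6123");
        descriptor.setFlinkConfiguration(config);
        // The configuration built in memory is now the one that gets used.
        System.out.println(descriptor.effectiveConfiguration(null));
    }
}
```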

As for the YARN configuration, I'm sorry to say that is still the same.
You're right that we should provide a hook to change the YARN
configuration. Simply changing the field to 'protected' would probably
be enough to let people change the configuration easily. There is no
CLI utility to set config variables yet. I've created a JIRA issue:
https://issues.apache.org/jira/browse/FLINK-4416
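
To show what the 'protected' change would allow, here is a small JDK-only sketch. Both classes are hypothetical stand-ins (they are not `AbstractYarnClusterDescriptor` or any other Flink class), a plain map replaces the real `YarnConfiguration`, and the `0.0.0.0:8032` value is just a placeholder default chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical base descriptor; a map stands in for the real YARN configuration. */
class BaseDescriptorSketch {
    // 'protected' rather than 'private', so subclasses can adjust the settings.
    protected final Map<String, String> yarnConf = new HashMap<>();

    BaseDescriptorSketch() {
        // Placeholder default, standing in for values normally loaded from files.
        yarnConf.put("yarn.resourcemanager.address", "0.0.0.0:8032");
    }

    String resourceManagerAddress() {
        return yarnConf.get("yarn.resourcemanager.address");
    }
}

/** Hypothetical subclass that points the descriptor at a different cluster. */
final class RemoteClusterDescriptorSketch extends BaseDescriptorSketch {
    RemoteClusterDescriptorSketch(String resourceManagerAddress) {
        // Possible only because the field is visible to subclasses.
        yarnConf.put("yarn.resourcemanager.address", resourceManagerAddress);
    }

    public static void main(String[] args) {
        System.out.println(new BaseDescriptorSketch().resourceManagerAddress());
        System.out.println(
                new RemoteClusterDescriptorSketch("rm-b.example.com:8032").resourceManagerAddress());
    }
}
```

A dedicated setter would be a cleaner hook than subclassing, but the subclass route needs only the one-word visibility change.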

Cheers,
Max

On Wed, Aug 3, 2016 at 3:29 PM, Ufuk Celebi <[hidden email]> wrote:

> [...]