SSL setup for YARN deployment when hostnames are unknown.

SSL setup for YARN deployment when hostnames are unknown.

Jiahui Jiang
Hello Flink!

We are working on turning on REST SSL for YARN deployments. We built a generic orchestration server that can submit Flink clusters to any YARN cluster given the relevant Hadoop configs. But this means we may not know the hostnames the JobManagers can be deployed onto, not even through wildcard DNS names as recommended in the documentation.

I'm wondering whether there is any factory class I can implement that would let me generate a private key and import it into the JM's keystore at runtime?
Or is there any other recommended way to handle the case where we don't know the potential JM hosts at all?

Thank you!


Re: SSL setup for YARN deployment when hostnames are unknown.

Jiahui Jiang
Ping on this 🙂 Is there any way I can run a script, or implement some interface, that runs before the Dispatcher service starts up and dynamically generates the keystore?

Thank you!

In reply to Jiahui Jiang's message of Monday, November 9, 2020 3:19 PM.


Re: SSL setup for YARN deployment when hostnames are unknown.

Matthias
Hi Jiahui,
thanks for reaching out to the mailing list. This is not something I have expertise in. But have you checked out the Flink SSL Setup documentation [1]? Maybe you'd find some help there.

Additionally, I did go through the code a bit: A SecurityContext is loaded during ClusterEntrypoint startup [2]. It supports dynamic loading of security modules. You might have to implement org.apache.flink.runtime.security.contexts.SecurityContextFactory and configure it in your flink-conf.yaml. Is this something that might help you? I'm adding Aljoscha to this thread as he worked on dynamically loading these modules recently.

Best,
Matthias

[2] https://github.com/apache/flink/blob/2c8631a4eb7a247ce8fb4205f838e8c0f8019367/flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/ClusterEntrypoint.java#L170

In reply to Jiahui Jiang's message of Wed, Nov 11, 2020 at 6:17 AM.


Re: SSL setup for YARN deployment when hostnames are unknown.

Jiahui Jiang
Hello Matthias,

Thank you for the links! I did see the documentation and went through the source code. But unfortunately it looks like only a prebuilt keystore is supported for YARN right now.

In terms of dynamically loading security modules, the link you sent seems to be mainly for ZooKeeper's security? I checked the part of the code that sets up SSL for the REST server [1], and it doesn't look like the SslContext creation path is pluggable.



In reply to Matthias Pohl's message of Wednesday, November 11, 2020 3:58 AM.


Re: SSL setup for YARN deployment when hostnames are unknown.

Jiahui Jiang
The issue right now is that we can't dynamically generate a keystore after the YARN application launches but before the JobManager process starts. Do you think the best short-term solution is to hack around `yarn.container-start-command-template` and have it execute a custom script that generates the keystore and then starts the JM process? Will that be allowed given the current Flink architecture?

Thanks!
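
For illustration, the keystore-generation step of such a wrapper could look roughly like the sketch below. It is not part of Flink: it simply shells out to the JDK's `keytool` to create a self-signed PKCS12 keystore for the host it runs on, then reads the certificate back as PEM so it could be handed to whoever needs to trust it. The class name `RuntimeKeystore`, the alias `flink.rest`, the file name `rest.keystore`, the 30-day validity and the placeholder password are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.util.Base64;

class RuntimeKeystore {

    // Hypothetical alias; keytool stores aliases in lower case.
    static final String ALIAS = "flink.rest";

    /** Creates a PKCS12 keystore holding a self-signed certificate for the given host. */
    static Path generate(Path dir, String host, char[] password)
            throws IOException, InterruptedException {
        Path keystore = dir.resolve("rest.keystore");
        // Use the keytool binary that ships with the running JVM.
        String keytool = Paths.get(System.getProperty("java.home"), "bin", "keytool").toString();
        Process process = new ProcessBuilder(
                keytool, "-genkeypair",
                "-alias", ALIAS,
                "-keyalg", "RSA", "-keysize", "2048",
                "-validity", "30",
                "-dname", "CN=" + host,
                "-ext", "SAN=dns:" + host,
                "-storetype", "PKCS12",
                "-keystore", keystore.toString(),
                "-storepass", new String(password)) // keytool needs at least 6 characters
                .inheritIO()
                .start();
        if (process.waitFor() != 0) {
            throw new IOException("keytool exited with a non-zero status");
        }
        return keystore;
    }

    /** Reads the generated certificate back and PEM-encodes it. */
    static String certificatePem(Path keystore, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("PKCS12");
        try (InputStream in = Files.newInputStream(keystore)) {
            ks.load(in, password);
        }
        Certificate cert = ks.getCertificate(ALIAS);
        String body = Base64.getMimeEncoder(64, "\n".getBytes(StandardCharsets.US_ASCII))
                .encodeToString(cert.getEncoded());
        return "-----BEGIN CERTIFICATE-----\n" + body + "\n-----END CERTIFICATE-----\n";
    }

    public static void main(String[] args) throws Exception {
        // May resolve to an IP address if the host has no resolvable name.
        String host = InetAddress.getLocalHost().getCanonicalHostName();
        char[] password = "placeholder-password".toCharArray(); // assumption: supplied securely in practice
        Path keystore = generate(Files.createTempDirectory("flink-ssl"), host, password);
        System.out.println("keystore written to " + keystore);
        System.out.print(certificatePem(keystore, password));
    }
}
```

The keystore path and password would still have to reach the JM's configuration (`security.ssl.rest.keystore` and the related options); how best to do that from a wrapper is not settled in this thread.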

In reply to Jiahui Jiang's message of Wednesday, November 11, 2020 9:09 AM.


Re: SSL setup for YARN deployment when hostnames are unknown.

rmetzger0
Hi Jiahui,

using the yarn.container-start-command-template is indeed a good idea.

I was also wondering whether the Flink YARN client that submits the Flink cluster to YARN has knowledge of the host where the ApplicationMaster gets deployed to. But that doesn't seem to be the case.
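
As a rough sketch of what that could mean in practice, the snippet below builds a value for `yarn.container-start-command-template` that runs a script before the usual start command. The placeholders are the ones Flink documents for this option, and the string in `DEFAULT_TEMPLATE` is assumed to match Flink's default. The script name `./gen-keystore.sh` is hypothetical, and whether YARN's launch shell accepts `&&` here, and whether the same template is also applied to TaskManager containers, would need to be verified against the Flink version in use.

```java
class StartCommandTemplate {

    // Assumed to match Flink's default start command template.
    static final String DEFAULT_TEMPLATE =
            "%java% %jvmmem% %jvmopts% %logging% %class% %args% %redirects%";

    /** Prefixes the template with a shell step that must succeed before the JVM starts. */
    static String withPreStep(String preStep) {
        return preStep + " && " + DEFAULT_TEMPLATE;
    }

    public static void main(String[] args) {
        // gen-keystore.sh is a hypothetical script shipped alongside the job.
        String template = withPreStep("./gen-keystore.sh");
        // Prints a line that could be placed in flink-conf.yaml.
        System.out.println("yarn.container-start-command-template: " + template);
    }
}
```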

In reply to Jiahui Jiang's message of Wed, Nov 11, 2020 at 7:57 PM.


Re: SSL setup for YARN deployment when hostnames are unknown.

Jiahui Jiang
Yeah, there is no wildcard hostname it can use.

I went ahead and started implementing the startup wrapper, but just realized that after generating the key-cert pair in the JM wrapper, we will need to send the cert back to the client.

Another question I have: currently we are using the Flink YARN client. Is there anywhere I can configure it to use a separate SSLContext rather than the current JVM's truststore? For every cluster we submit, the trust context needs to be configured dynamically.

Thank you!
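
To make the question concrete, here is a minimal sketch of the plain-JDK side of it: building an `SSLContext` that trusts only the certificate received from one cluster, without touching the JVM-wide truststore. It uses standard `java.security` and `javax.net.ssl` calls only. Whether the Flink YARN client can be handed such a context is exactly the open question and is not answered here. The class name `PerClusterTrust` and the alias `flink-jm` are assumptions for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

class PerClusterTrust {

    /** Parses a single PEM-encoded X.509 certificate. */
    static Certificate parsePem(String pem) throws Exception {
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        return factory.generateCertificate(
                new ByteArrayInputStream(pem.getBytes(StandardCharsets.US_ASCII)));
    }

    /** Builds an in-memory truststore containing only the given certificate. */
    static KeyStore trustStoreFor(Certificate cert) throws Exception {
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        ks.load(null, null); // start from an empty in-memory store
        ks.setCertificateEntry("flink-jm", cert); // hypothetical alias
        return ks;
    }

    /** Creates an SSLContext whose trust decisions come only from the given store. */
    static SSLContext contextFor(KeyStore trustStore) throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext context = SSLContext.getInstance("TLS");
        // No client key material; default SecureRandom.
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the PEM certificate received from the JM wrapper.
        String pem = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.US_ASCII);
        SSLContext context = contextFor(trustStoreFor(parsePem(pem)));
        System.out.println("SSLContext ready: " + context.getProtocol());
    }
}
```

One context per submitted cluster keeps the trust decisions isolated, so a certificate generated for one JM is never trusted when talking to another.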


In reply to Robert Metzger's message of Thursday, November 12, 2020 2:08 AM.