Hello everybody,
I'm running some tests on how Flink as a long-running YARN session handles security with Kerberos. In particular, I'm running a test where I run Flink on YARN with a service account and then deploy a job via CLI as another user; in the job I'm trying to access a private folder of the former on HDFS but the job fails due to permission issues (the user running the job is actually the one who ran Flink on YARN in the first place, i.e. the service account).

I'm running Flink 1.0.0-RC5, launching the long-running session with:

    bin/yarn-session.sh -n 2 -tm 4096 -s 3

and then running the following command:

    bin/flink run examples/batch/WordCount.jar \
      --input hdfs:///user/stefano.baghino/hamlet.txt \
      --output hdfs:///user/stefano.baghino/hamlet.out

Here are the logs: https://gist.github.com/stefanobaghino/6605ec33a1c4b632fb78

It looks like the YARN session is acting as a proxy for the user instead of receiving a delegation. Is there a way to change this behavior? Is this by design? Is there an interest in implementing the delegation (if it's not already implemented)? Otherwise, is there a workaround, apart from running one-off jobs on YARN?

Thank you so much in advance.

BR,
Stefano Baghino
In the initial description, I meant "I'm trying to access a private folder of the latter", so not the service account. Sorry for the mistake.

On Sun, Mar 6, 2016 at 8:54 PM, Stefano Baghino <[hidden email]> wrote:

BR,
Stefano Baghino
One last note: initially I tried to run the session as the same OS user, running kdestroy and then kinit as the other user, and got this error. When I instead run the job from a different OS session, authenticated with Kerberos as the user who should run the job, I can't connect to the JobManager. I've added a second log with this error to the gist.

On Sun, Mar 6, 2016 at 9:01 PM, Stefano Baghino <[hidden email]> wrote:

BR,
Stefano Baghino
Hi Stefano,
That is currently a limitation of the Kerberos implementation. Kerberos authentication is performed only once, when the Flink cluster is brought up. The YARN session is then tied to a particular user's ticket. Note that you need Hadoop version 2.6.1 or higher to run long-running jobs, because earlier versions have a bug in the Kerberos client that may let the ticket expire.

The workaround you already mentioned is to use a per-job YARN cluster. There is currently no plan to delegate the user token per job, but we could certainly think about implementing this in the future.

https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#kerberos

Cheers,
Max

On Sun, Mar 6, 2016 at 9:27 PM, Stefano Baghino <[hidden email]> wrote:
> One last note: initially I tried to run the session as the same OS user,
> running kdestroy and then kinit with the other user, having this error.
> Trying to run the job in a different OS session, authenticating with
> Kerberos as the user who should run the job, I can't connect to the
> JobManager. I've added a second log with this error to the gist.
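For reference, the per-job workaround mentioned in the reply above might look roughly like the sketch below. It reuses the WordCount command and the sizing from the original post, mapping the session options (-n 2 -tm 4096 -s 3) onto the per-job options of bin/flink (-yn, -ytm, -ys). The principal name and the RUN switch are hypothetical placeholders, not something taken from the thread, and whether the job then reads the folder with the submitting user's permissions depends on the cluster setup. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch only: submit WordCount to its own per-job YARN cluster, so that
# Kerberos authentication happens when this job's cluster is brought up.

# Hypothetical principal; replace with the user who should run the job.
PRINCIPAL="${PRINCIPAL:-stefano.baghino@EXAMPLE.COM}"

# Same job as in the original post, but with "-m yarn-cluster" and the
# per-job equivalents of the session sizing flags.
CMD="bin/flink run -m yarn-cluster -yn 2 -ytm 4096 -ys 3 examples/batch/WordCount.jar --input hdfs:///user/stefano.baghino/hamlet.txt --output hdfs:///user/stefano.baghino/hamlet.out"

if [ "${RUN:-0}" = "1" ]; then
  kinit "$PRINCIPAL"   # obtain a ticket for the submitting user
  klist                # check which principal the ticket cache holds
  $CMD                 # unquoted on purpose: split into command + arguments
else
  echo "$CMD"          # dry run: only show what would be executed
fi
```

Running it with RUN=1 from the Flink directory, as the user who owns the HDFS folder, would perform the actual submission; without RUN it is safe to execute anywhere.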
Ok, thank you for the very detailed explanation!

On Sun, Mar 6, 2016 at 10:02 PM, Maximilian Michels <[hidden email]> wrote:
> Hi Stefano,

BR,
Stefano Baghino