Hello,
I'm running a per-job Flink cluster. The JM is deployed as a Kubernetes Job with restartPolicy: Never, and high availability is KubernetesHaServicesFactory. The job runs fine for some time, configmaps are created, etc. Now, in order to upgrade the Flink job, I'm trying to stop the job
with a savepoint (flink stop $JOB_ID), but the JM exits with code 2. From the log:
{"ts":"2021-02-20T21:34:18.195Z","message":"Terminating cluster entrypoint process StandaloneApplicationClusterEntryPoint with exit code 2.","logger_name":"org.apache.flink.runtime.entrypoint.ClusterEntrypoint","thread_name":"flink-akka.actor.default-dispatcher-2","level":"INFO","level_value":20000,"stack_trace":"java.util.concurrent.ExecutionException:
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://10.96.0.1/api/v1/namespaces/n/configmaps?labelSelector=app%3Dfsp%2Cconfigmap-type%3Dhigh-availability%2Ctype%3Dflink-native-kubernetes. Message: Forbidden!Configured
service account doesn't have access. Service account may have been revoked. configmaps is forbidden: User \"system:serviceaccount:n:fsp\" cannot list resource \"configmaps\" in API group \"\" in the namespace \"n\".\n\tat java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)\n\tat
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)\n\tat org.apache.flink.kubernetes.highavailability.KubernetesHaServices.internalCleanup(KubernetesHaServices.java:142)\n\tat org.apache.flink.runtime.highavailability.AbstractHaServices.closeAndCleanupAllData(AbstractHaServices.java:180)\n\tat
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.stopClusterServices(ClusterEntrypoint.java:378)\n\tat org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$shutDownAsync$3(ClusterEntrypoint.java:467)\n\tat org.apache.flink.runtime.concurrent.FutureUtils.lambda$composeAfterwards$19(FutureUtils.java:704)\n\tat
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)\n\tat java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)\n\tat java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)\n\tat
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)\n\tat org.apache.flink.runtime.concurrent.FutureUtils.lambda$null$18(FutureUtils.java:715)\n\tat java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)\n\tat
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)\n\tat java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)\n\tat java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)\n\tat
org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent.lambda$closeAsyncInternal$3(DispatcherResourceManagerComponent.java:182)\n\tat java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)\n\tat java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)\n\tat
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)\n\tat java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)\n\tat org.apache.flink.runtime.concurrent.FutureUtils$CompletionConjunctFuture.completeFuture(FutureUtils.java:956)\n\tat
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)\n\tat java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)\n\tat java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)\n\tat
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)\n\tat org.apache.flink.runtime.concurrent.FutureUtils.lambda$forwardTo$22(FutureUtils.java:1323)\n\tat java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)\n\tat
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)\n\tat java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://10.96.0.1/api/v1/namespaces/n/configmaps?labelSelector=app%3Dfsp%2Cconfigmap-type%3Dhigh-availability%2Ctype%3Dflink-native-kubernetes.
Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. configmaps is forbidden: User \"system:serviceaccount:n:fsp\" cannot list resource \"configmaps\" in API group \"\" in the namespace \"n\".\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)\n\tat
io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:505)\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)\n\tat
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:412)\n\tat io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:151)\n\tat io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:621)\n\tat
io.fabric8.kubernetes.client.dsl.base.BaseOperation.deleteList(BaseOperation.java:730)\n\tat io.fabric8.kubernetes.client.dsl.base.BaseOperation.delete(BaseOperation.java:655)\n\tat io.fabric8.kubernetes.client.dsl.base.BaseOperation.delete(BaseOperation.java:70)\n\tat
org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$deleteConfigMapsByLabels$10(Fabric8FlinkKubeClient.java:361)\n\tat java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)\n\t... 3 common frames omitted\n"}
The service account (fsp) role has the following rules:
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - update
  - get
  - create
  - watch
  - patch
  - delete
So the service account seems to be allowed to GET configmaps. It also seems the service account was able to create configmaps during the run (no complaints in the log).
Thanks,
Alexey
Adding "list" to the verbs helps. Do I need to add anything else?
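For reference, the GET in the trace targets the configmaps collection with a label selector, and Kubernetes RBAC authorizes that kind of request as the "list" verb rather than "get", which is why the original rules were not enough for the cleanup on shutdown. A minimal sketch of the Role with "list" added follows. The Role name is a hypothetical placeholder, the namespace "n" is taken from the log, and the remaining verbs are copied unchanged from the rules in the original message:

```yaml
# Sketch only: metadata.name is a hypothetical placeholder,
# namespace "n" comes from the error message in the log.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: fsp-configmaps   # hypothetical name
  namespace: n
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - update
  - get
  - list     # added: the HA cleanup lists configmaps by label before deleting them
  - create
  - watch
  - patch
  - delete
```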
From: Alexey Trenikhun <[hidden email]>
Sent: Saturday, February 20, 2021 2:10 PM
To: Flink User Mail List <[hidden email]>
Subject: stop job with Savepoint
Hi Alexey,
The list looks complete to me. Please report back if this is not correct.
On Sat, Feb 20, 2021 at 11:30 PM Alexey Trenikhun <[hidden email]> wrote: