Hi Kevin,
Let me make sure I understand your problem. You have added the trusted keystore to the Flink app image (my-flink-app:0.0.1)
and it could not be loaded, right? Even when you exec into the pod, you cannot find the keystore. That is strange.
I know it is not very convenient to bundle the keystore in the image. Mounting it from a ConfigMap could make things
easier. The reason we still do not have this (making a ConfigMap/PersistentVolume mountable via a Flink config option) is that
the pod template[1] may be a more common way to get this done.
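To make the pod template idea a bit more concrete, here is a rough sketch of what it could look like. It assumes a Flink release that ships pod template support through the `kubernetes.pod-template-file` option (as far as I know that arrived in Flink 1.13, so it would not work on 1.11/1.12). The ConfigMap name `s3-truststore` and the mount path are made-up placeholders.

```shell
# Write a pod template that mounts a ConfigMap into the Flink main container.
# "s3-truststore" and /opt/flink/truststore are hypothetical names.
POD_TEMPLATE="$(mktemp)"
cat > "${POD_TEMPLATE}" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: flink-pod-template
spec:
  containers:
    # Flink merges its own settings into the container with this name.
    - name: flink-main-container
      volumeMounts:
        - name: s3-truststore
          mountPath: /opt/flink/truststore
          readOnly: true
  volumes:
    - name: s3-truststore
      configMap:
        name: s3-truststore
EOF

# Option to append to the "flink run-application" command line.
POD_TEMPLATE_OPT="-Dkubernetes.pod-template-file=${POD_TEMPLATE}"
echo "${POD_TEMPLATE_OPT}"
```

The template only declares the volume and its mount; image, resources and start command would still come from the usual `-Dkubernetes.*` options.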
Best,
Yang
Thanks for reaching out to the Flink community, Kevin. Yes, with Flink 1.12.0 it should be possible to mount secrets with your K8s deployment. From the posted stack trace it is not possible to see what exactly is going wrong. Could you post the complete logs? I am also pulling in Yang Wang, who is most knowledgeable about Flink's K8s integration.
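For reference, a minimal sketch of how a secret could be mounted with the `kubernetes.secrets` option in Flink 1.12, instead of baking the certificate into the image. The secret name `s3-truststore`, the mount path and the file name `truststore.jks` are placeholders; pointing the JVM at the mounted file through `javax.net.ssl.trustStore` is an assumption about how one would consume it, not something confirmed in this thread.

```shell
# Hypothetical names, adjust to your setup.
SECRET_NAME="s3-truststore"
MOUNT_PATH="/opt/flink/truststore"
TRUSTSTORE_FILE="truststore.jks"

# One-time step (needs kubectl access to the target namespace):
#   kubectl -n "${K8S_NAMESPACE}" create secret generic "${SECRET_NAME}" \
#     --from-file="${TRUSTSTORE_FILE}"

# Mounts the secret into the JobManager and TaskManager pods at MOUNT_PATH.
SECRETS_OPT="-Dkubernetes.secrets=${SECRET_NAME}:${MOUNT_PATH}"

# JVM flags that make the mounted file the trust store. Note that this
# replaces the default cacerts, so the file must contain every CA you need.
# Merge these flags into any existing env.java.opts value rather than
# passing env.java.opts twice.
JAVA_OPTS="-Djavax.net.ssl.trustStore=${MOUNT_PATH}/${TRUSTSTORE_FILE} -Djavax.net.ssl.trustStorePassword=changeit"

echo "${SECRETS_OPT}"
echo "-Denv.java.opts=\"${JAVA_OPTS}\""
```

Both printed options would be appended to the `flink run-application` call.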
I think what we need in the native Kubernetes config is the ability to mount custom ConfigMaps, Secrets, and Volumes.
I see that in the upcoming release, Secrets can be mounted.
Hi, I am using MinIO as an S3 mock backend for native K8s.
Everything seems to be fine except that it cannot connect to S3, since the trust store containing the self-signed certificate is not carried over to the Deployment resources.
Below, in order, is how I add the certificate to the trusted keystore using keytool and how I run my app with the built image:
# Extend the application image with the self-signed MinIO certificate.
FROM registry.local/mde/my-flink-app:0.0.1
COPY s3/certs/public.crt $FLINK_HOME/s3-e2e-public.crt
# Import the certificate into the JVM's default trust store.
RUN keytool \
    -noprompt \
    -alias s3-e2e-public \
    -importcert \
    -trustcacerts \
    -keystore $JAVA_HOME/lib/security/cacerts \
    -storepass changeit \
    -file $FLINK_HOME/s3-e2e-public.crt
$FLINK_HOME/bin/flink run-application \
    -t kubernetes-application \
    -Denv.java.opts="-Dkafka.brokers=kafka-external:9092 -Dkafka.schema-registry.url=kafka-schemaregistry:8081" \
    -Dkubernetes.container-start-command-template="%java% %classpath% %jvmmem% %jvmopts% %logging% %class% %args%" \
    -Dkubernetes.cluster-id=${K8S_CLUSTERID} \
    -Dkubernetes.container.image=${DOCKER_REPO}/${ORGANISATION}/${APP_NAME}:${APP_VERSION} \
    -Dkubernetes.namespace=${K8S_NAMESPACE} \
    -Dkubernetes.rest-service.exposed.type=${K8S_RESTSERVICE_EXPOSED_TYPE} \
    -Dkubernetes.taskmanager.cpu=${K8S_TASKMANAGER_CPU} \
    -Dresourcemanager.taskmanager-timeout=3600000 \
    -Dtaskmanager.memory.process.size=${TASKMANAGER_MEMORY_PROCESS_SIZE} \
    -Dtaskmanager.numberOfTaskSlots=${TASKMANAGER_NUMBEROFTASKSLOTS} \
    -Ds3.endpoint=s3:443 \
    -Ds3.access-key=${S3_ACCESSKEY} \
    -Ds3.secret-key=${S3_SECRETKEY} \
    -Ds3.path.style.access=true \
    -Dstate.backend=filesystem \
    -Dstate.checkpoints.dir=s3://${ORGANISATION}/${APP_NAME}/checkpoint \
    -Dstate.savepoints.dir=s3://${ORGANISATION}/${APP_NAME}/savepoint \
    local://${FLINK_HOME}/usrlib/${APP_NAME}-assembly-${APP_VERSION}.jar
However, I get the following error, and I don't see my trusted certificate in keytool when I log in to the pod (it seems the trust store is not carried over):
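For anyone who wants to repeat the check, this is roughly how the alias can be looked up, both in the built image and inside the running pod. It is only a sketch: it reuses the alias, store password and cacerts path from the Dockerfile above, and the `has_alias` helper is just a name made up for illustration.

```shell
# Returns 0 if the alias exists in the given trust store, non-zero otherwise.
# Arguments: <keystore path> <store password> <alias>
has_alias() {
  keytool -list -keystore "$1" -storepass "$2" -alias "$3" >/dev/null 2>&1
}

CACERTS="${JAVA_HOME:-}/lib/security/cacerts"

# Report "missing" when keytool is unavailable or the alias is not found.
if command -v keytool >/dev/null 2>&1 && has_alias "${CACERTS}" changeit s3-e2e-public; then
  RESULT="present"
else
  RESULT="missing"
fi
echo "s3-e2e-public is ${RESULT} in ${CACERTS}"
```

Running it once via `docker run` against the image that `kubernetes.container.image` points to, and once via `kubectl exec` in the JobManager pod, should show whether both see the same trust store.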
Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to create checkpoint storage at checkpoint coordinator side.
at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:305) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:224) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.executiongraph.ExecutionGraph.enableCheckpointing(ExecutionGraph.java:483) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:338) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.scheduler.SchedulerBase.createExecutionGraph(SchedulerBase.java:269) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:242) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.scheduler.SchedulerBase.<init>(SchedulerBase.java:229) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.scheduler.DefaultScheduler.<init>(DefaultScheduler.java:119) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:103) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:284) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:272) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:98) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:40) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.<init>(JobManagerRunnerImpl.java:140) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.dispatcher.DefaultJobManagerRunnerFactory.createJobManagerRunner(DefaultJobManagerRunnerFactory.java:84) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$6(Dispatcher.java:388) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_265]
... 6 more
Caused by: org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on mde: com.amazonaws.SdkClientException: Unable to execute HTTP request: Unrecognized SSL message, plaintext connection?: Unable to execute HTTP request: Unrecognized SSL message, plaintext connection?