Flink Job Cluster Deployment on K8s

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Flink Job Cluster Deployment on K8s

Thad Truman

Hello,

 

I am trying to experiment with the new Flink job cluster on Kubernetes that is available with the Flink 1.6.x release.

 

I am using the instructions here to create the docker image, which is working great.   This image then gets pushed to our Artifactory.

 

I am able to create the job cluster service using this helm chart.

 

However when I try to deploy the job cluster job using the helm chart below (based on this one):

 

apiVersion: batch/v1
kind: Job
metadata:
  name:
flink-job-cluster
spec:
  template:
    metadata:
      labels:
        app:
flink
       
component: job-cluster
   
spec:
      imagePullSecrets:
     
- name: artifactory-docker-registry
     
restartPolicy: OnFailure
     
containers:
     
- name: flink-job-cluster
       
image: {ImageOnArtifactory}
       
args: ["job-cluster", "--job-classname", "job.jar", "-Djobmanager.rpc.address=flink-job-cluster",
              
"-Dparallelism.default=1", "-Dblob.server.port=6124", "-Dquery.server.ports=6125"]
       
ports:
       
- containerPort: 6123
         
name: rpc
        -
containerPort: 6124
         
name: blob
        -
containerPort: 6125
         
name: query
        -
containerPort: 8081
          
name: ui

 

I get this error on the pod:

 

%d [%thread] %-5level %logger - %msg%n org.apache.flink.util.FlinkException: Could not load the provied entrypoint class.                                                                                               

at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.createPackagedProgram(StandaloneJobClusterEntryPoint.java:102) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                       

at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.retrieveJobGraph(StandaloneJobClusterEntryPoint.java:84) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                             

at org.apache.flink.runtime.entrypoint.JobClusterEntrypoint.createDispatcher(JobClusterEntrypoint.java:107) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                   

at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startClusterComponents(ClusterEntrypoint.java:353) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                  

at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:232) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                              

at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:190) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                   

at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                             

at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:189) [flink-dist_2.11-1.6.1.jar:1.6.1]                                                                              

at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:170) [flink-dist_2.11-1.6.1.jar:1.6.1]

Caused by: java.lang.ClassNotFoundException: job.jar                                                                                                                                                                    

at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_111-internal]                                                                                                                           

at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_111-internal]                                                                                                                                 

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) ~[?:1.8.0_111-internal]                                                                                                                         

at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_111-internal]                                                                                                                                 

at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.createPackagedProgram(StandaloneJobClusterEntryPoint.java:99) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                       

... 8 more

 

 

When I launch and attach the docker image, job.jar exists in /opt/flink and there is a symbolic link in /opt/flink/lib.

 

Any ideas as to why job.jar can’t be found?

 

Our flink version is 1.6.1.  We are using the flink-1.6.1-bin-scala_2.11 distribution.

 

Any help would be much appreciated.

 

Thanks,

 

Thad Truman | Software Engineer | Neovest, Inc.

A:

T:

E:

1145 S 800 E, Ste 310 Orem, UT 84097

+1 801 900 2480

[hidden email]

 

Support Desk: [hidden email] / +1 800 433 4276

 

Alt logo for white backgrounds (Grey Flat)2

This email is confidential and subject to important disclaimers and conditions including on offers for purchase or sale of securities accuracy and completeness of information viruses confidentiality legal privilege and legal entity disclaimers available at www.neovest.com/disclosures.html

 

 

Reply | Threaded
Open this post in threaded view
|

Re: Flink Job Cluster Deployment on K8s

Joey Echeverria
Try replacing the job.jar in the args in your helm chart with the classname for your job rather than the name of the jar file.

-Joey

On Oct 18, 2018, at 9:21 AM, Thad Truman <[hidden email]> wrote:

Hello,
 
I am trying to experiment with the new Flink job cluster on Kubernetes that is available with the Flink 1.6.x release.
 
I am using the instructions here to create the docker image, which is working great.   This image then gets pushed to our Artifactory.
 
I am able to create the job cluster service using this helm chart.
 
However when I try to deploy the job cluster job using the helm chart below (based on this one):
 
apiVersion: batch/v1
kind: Job
metadata:
  name: 
flink-job-cluster
spec:
  template:
    metadata:
      labels:
        app: 
flink
        
component: job-cluster
    
spec:
      imagePullSecrets:
      
- name: artifactory-docker-registry
      
restartPolicy: OnFailure
      
containers:
      
- name: flink-job-cluster
        
image: {ImageOnArtifactory}
        
args: ["job-cluster", "--job-classname", "job.jar", "-Djobmanager.rpc.address=flink-job-cluster",
               
"-Dparallelism.default=1", "-Dblob.server.port=6124", "-Dquery.server.ports=6125"]
        
ports:
        
- containerPort: 6123
          
name: rpc
        - 
containerPort: 6124
          
name: blob
        - 
containerPort: 6125
          
name: query
        - 
containerPort: 8081
          
name: ui
 
I get this error on the pod:
 
%d [%thread] %-5level %logger - %msg%n org.apache.flink.util.FlinkException: Could not load the provied entrypoint class.                                                                                               
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.createPackagedProgram(StandaloneJobClusterEntryPoint.java:102) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                       
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.retrieveJobGraph(StandaloneJobClusterEntryPoint.java:84) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                             
at org.apache.flink.runtime.entrypoint.JobClusterEntrypoint.createDispatcher(JobClusterEntrypoint.java:107) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                   
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startClusterComponents(ClusterEntrypoint.java:353) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                  
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:232) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                              
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:190) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                   
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                                                             
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:189) [flink-dist_2.11-1.6.1.jar:1.6.1]                                                                              
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:170) [flink-dist_2.11-1.6.1.jar:1.6.1]
Caused by: java.lang.ClassNotFoundException: job.jar                                                                                                                                                                    
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_111-internal]                                                                                                                           
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_111-internal]                                                                                                                                 
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) ~[?:1.8.0_111-internal]                                                                                                                         
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_111-internal]                                                                                                                                 
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.createPackagedProgram(StandaloneJobClusterEntryPoint.java:99) ~[flink-dist_2.11-1.6.1.jar:1.6.1]                                       
... 8 more
 
 
When I launch and attach the docker image, job.jar exists in /opt/flink and there is a symbolic link in /opt/flink/lib.
 
Any ideas as to why job.jar can’t be found?
 
Our flink version is 1.6.1.  We are using the flink-1.6.1-bin-scala_2.11 distribution.
 
Any help would be much appreciated.
 
Thanks,
 

Thad Truman | Software Engineer | Neovest, Inc.

A:
T:
E:
1145 S 800 E, Ste 310 Orem, UT 84097
+1 801 900 2480
 
Support Desk: [hidden email] / +1 800 433 4276
 
<image001.png>
This email is confidential and subject to important disclaimers and conditions including on offers for purchase or sale of securities accuracy and completeness of information viruses confidentiality legal privilege and legal entity disclaimers available at www.neovest.com/disclosures.html