How to deploy dynamically generated flink jobs?

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

How to deploy dynamically generated flink jobs?

Alexander Bagerman

Hi,

I am trying to build a functionality to dynamically configure a flink job (Java) in my code based on some additional metadata and submit it to a flink running in a session cluster.

Flink version is 1.11.2

The problem I have is how to provide a packed job to the cluster. When I am trying the following code

StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(hostName, hostPort);
... configuring job workflow here...
env.execute(jobName);

I am getting ClassNotFoundException stating that code for my mapping functions did not make it to the cluster. Which makes sense.

What would be the right way to deploy dynamically configured flink jobs which are not packaged as a jar file but rather generated ad-hoc?

Thanks

Reply | Threaded
Open this post in threaded view
|

Re: How to deploy dynamically generated flink jobs?

Yun Gao
Hi Alexander, 

The signature of the createRemoteEnvironment is 
public static StreamExecutionEnvironment createRemoteEnvironment(
String host, int port, String... jarFiles);
Which could also ship the jars to execute to remote cluster. Could you have a try to also pass the jar files to the remote environment ?

Best,
 Yun
------------------------------------------------------------------
Sender:Alexander Bagerman<[hidden email]>
Date:2020/10/29 10:43:16
Recipient:<[hidden email]>
Theme:How to deploy dynamically generated flink jobs?

Hi,

I am trying to build a functionality to dynamically configure a flink job (Java) in my code based on some additional metadata and submit it to a flink running in a session cluster.

Flink version is 1.11.2

The problem I have is how to provide a packed job to the cluster. When I am trying the following code

StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(hostName, hostPort);
... configuring job workflow here...
env.execute(jobName);

I am getting ClassNotFoundException stating that code for my mapping functions did not make it to the cluster. Which makes sense.

What would be the right way to deploy dynamically configured flink jobs which are not packaged as a jar file but rather generated ad-hoc?

Thanks


Reply | Threaded
Open this post in threaded view
|

Re: How to deploy dynamically generated flink jobs?

Alexander Bagerman
I did try it but this option seems to be for a third party jar. In my case I would need to specify/ship a jar that contains the code where job is being constracted. I'm not clear of
1. how to point to the containg jar 
2. how to test such a submission from my project running in Eclipse 
Alex 

On Wed, Oct 28, 2020 at 8:21 PM Yun Gao <[hidden email]> wrote:
Hi Alexander, 

The signature of the createRemoteEnvironment is 
public static StreamExecutionEnvironment createRemoteEnvironment(
String host, int port, String... jarFiles);
Which could also ship the jars to execute to remote cluster. Could you have a try to also pass the jar files to the remote environment ?

Best,
 Yun
------------------------------------------------------------------
Sender:Alexander Bagerman<[hidden email]>
Date:2020/10/29 10:43:16
Recipient:<[hidden email]>
Theme:How to deploy dynamically generated flink jobs?


Hi,

I am trying to build a functionality to dynamically configure a flink job (Java) in my code based on some additional metadata and submit it to a flink running in a session cluster.

Flink version is 1.11.2

The problem I have is how to provide a packed job to the cluster. When I am trying the following code

StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(hostName, hostPort);
... configuring job workflow here...
env.execute(jobName);

I am getting ClassNotFoundException stating that code for my mapping functions did not make it to the cluster. Which makes sense.

What would be the right way to deploy dynamically configured flink jobs which are not packaged as a jar file but rather generated ad-hoc?

Thanks


Reply | Threaded
Open this post in threaded view
|

Re: Re: How to deploy dynamically generated flink jobs?

Yun Gao
Hi Alexander,

 From my side I still think it should be reasonable to have a jar that contains the code that are running in the clients and also shipped to the cluster. Then this jar could also be included in the shipping jar list.

 For the second issue, similarly I think you may first build the project to get the jar containing the code, then fill the path of the generated jar in to test the submitting.

Best,
 Yun


------------------Original Mail ------------------
Sender:Alexander Bagerman <[hidden email]>
Send Date:Thu Oct 29 11:38:45 2020
Recipients:Yun Gao <[hidden email]>
CC:Flink ML <[hidden email]>
Subject:Re: How to deploy dynamically generated flink jobs?
I did try it but this option seems to be for a third party jar. In my case I would need to specify/ship a jar that contains the code where job is being constracted. I'm not clear of
1. how to point to the containg jar 
2. how to test such a submission from my project running in Eclipse 
Alex 

On Wed, Oct 28, 2020 at 8:21 PM Yun Gao <[hidden email]> wrote:
Hi Alexander, 

The signature of the createRemoteEnvironment is 
public static StreamExecutionEnvironment createRemoteEnvironment(
String host, int port, String... jarFiles);
Which could also ship the jars to execute to remote cluster. Could you have a try to also pass the jar files to the remote environment ?

Best,
 Yun
------------------------------------------------------------------
Sender:Alexander Bagerman<[hidden email]>
Date:2020/10/29 10:43:16
Recipient:<[hidden email]>
Theme:How to deploy dynamically generated flink jobs?


Hi,

I am trying to build a functionality to dynamically configure a flink job (Java) in my code based on some additional metadata and submit it to a flink running in a session cluster.

Flink version is 1.11.2

The problem I have is how to provide a packed job to the cluster. When I am trying the following code

StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(hostName, hostPort);... configuring job workflow here...env.execute(jobName);

I am getting ClassNotFoundException stating that code for my mapping functions did not make it to the cluster. Which makes sense.

What would be the right way to deploy dynamically configured flink jobs which are not packaged as a jar file but rather generated ad-hoc?

Thanks


Reply | Threaded
Open this post in threaded view
|

Re: Re: How to deploy dynamically generated flink jobs?

Alexander Bagerman
Thanks, Yun. Makes sense. How would you reference a jar file from inside of another jar for such invocation? 
In my case I would have an interactive application - spring boot web app - where the job would be configured and StreamExecutionEnvironment.execute(jobName) is called. 
Spring app is a runnable fat jar with my "flink" jar packaged along with other jars. How would I specify location to the jar so that StreamExecutionEnvironment can find it?
Thanks,
Alex

On Wed, Oct 28, 2020 at 11:03 PM Yun Gao <[hidden email]> wrote:
Hi Alexander,

 From my side I still think it should be reasonable to have a jar that contains the code that are running in the clients and also shipped to the cluster. Then this jar could also be included in the shipping jar list.

 For the second issue, similarly I think you may first build the project to get the jar containing the code, then fill the path of the generated jar in to test the submitting.

Best,
 Yun


------------------Original Mail ------------------
Sender:Alexander Bagerman <[hidden email]>
Send Date:Thu Oct 29 11:38:45 2020
Recipients:Yun Gao <[hidden email]>
CC:Flink ML <[hidden email]>
Subject:Re: How to deploy dynamically generated flink jobs?
I did try it but this option seems to be for a third party jar. In my case I would need to specify/ship a jar that contains the code where job is being constracted. I'm not clear of
1. how to point to the containg jar 
2. how to test such a submission from my project running in Eclipse 
Alex 

On Wed, Oct 28, 2020 at 8:21 PM Yun Gao <[hidden email]> wrote:
Hi Alexander, 

The signature of the createRemoteEnvironment is 
public static StreamExecutionEnvironment createRemoteEnvironment(
String host, int port, String... jarFiles);
Which could also ship the jars to execute to remote cluster. Could you have a try to also pass the jar files to the remote environment ?

Best,
 Yun
------------------------------------------------------------------
Sender:Alexander Bagerman<[hidden email]>
Date:2020/10/29 10:43:16
Recipient:<[hidden email]>
Theme:How to deploy dynamically generated flink jobs?


Hi,

I am trying to build a functionality to dynamically configure a flink job (Java) in my code based on some additional metadata and submit it to a flink running in a session cluster.

Flink version is 1.11.2

The problem I have is how to provide a packed job to the cluster. When I am trying the following code

StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(hostName, hostPort);... configuring job workflow here...env.execute(jobName);

I am getting ClassNotFoundException stating that code for my mapping functions did not make it to the cluster. Which makes sense.

What would be the right way to deploy dynamically configured flink jobs which are not packaged as a jar file but rather generated ad-hoc?

Thanks


Reply | Threaded
Open this post in threaded view
|

Re: Re: Re: How to deploy dynamically generated flink jobs?

Yun Gao
Hi Alexander, 

      Sorry I might not fully understand the issue, do you means the "flink" jar is the same jar with the spring app fat jar, or they are not the same jar? As a whole, I think the parameter value we need for jarFiles is the absolute path of the jar file. We might need some logic to decide the path to the jar files. For example, if the "flink" jar containing the UDF is the same to the spring app fat jar containing the execute call, we might use method like [1] to find the containing jar, otherwise we might need some mappings from the job name to its flink jar. 

Best,
 Yun

[1] https://github.com/apache/hadoop/blob/8ee6bc2518bfdf7ad257cc1cf3c73f4208c49fc0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ClassUtil.java#L38
------------------Original Mail ------------------
Sender:Alexander Bagerman <[hidden email]>
Send Date:Fri Oct 30 04:49:59 2020
Recipients:Yun Gao <[hidden email]>
CC:Flink ML <[hidden email]>
Subject:Re: Re: How to deploy dynamically generated flink jobs?
Thanks, Yun. Makes sense. How would you reference a jar file from inside of another jar for such invocation? 
In my case I would have an interactive application - spring boot web app - where the job would be configured and StreamExecutionEnvironment.execute(jobName) is called. 
Spring app is a runnable fat jar with my "flink" jar packaged along with other jars. How would I specify location to the jar so that StreamExecutionEnvironment can find it?
Thanks,
Alex