The jobs are related to each other in the sense that we have a configurable pipeline with optional steps that can be enabled/disabled (and thus we build a single big jar).

Because of this, our application REST service also works as a job scheduler and uses the job server as a proxy towards Flink: when one step ends (this is what is signalled back to the application REST service after Flink's env.execute() returns), our application tells the job server to execute the next job of the pipeline on the cluster.

Of course this is a "dirty" solution (we should really use a workflow scheduler like Airflow, Luigi, or similar), but we wanted to keep things as simple as possible for the moment. In the future, if our customers ever want to improve this part, we will probably integrate our application with one of the dedicated job schedulers listed before. I don't know whether any of them already integrates with Flink: when we started coding our frontend application (2 years ago), none of them did.

Best,
Flavio

On Tue, Jul 23, 2019 at 10:40 AM Jeff Zhang <[hidden email]> wrote:

Thanks Flavio, I get most of your points except one:
- Get the list of jobs contained in a jar (ideally this is true for every engine, not just Spark or Flink)
Just curious how you submit jobs via the REST API: if there are multiple jobs in one jar, do you need to submit the jar once and then submit the jobs multiple times? And is there any relationship between the jobs in the same jar?

On Tue, Jul 23, 2019 at 4:01 PM Flavio Pompermaier <[hidden email]> wrote:

Hi Jeff, the point about the manifest is really to have a way to list multiple main classes in the jar (without the need to inspect every Java class, or forcing a 1-to-1 relationship between jar and job like it is now). My requirements were driven by the UI we're using in our framework:
- Get the list of jobs contained in a jar (ideally this is true for every engine, not just Spark or Flink)
- Get the list of required/optional parameters for each job
- Besides whether it is optional, each parameter should include a help description, a type (to validate the input), a default value, and a set of choices (when there's a limited number of options available)
- Obviously the job server should be able to submit/run/cancel/monitor a job and upload/delete the uploaded jars
- The job server should not depend on any target-platform dependency (Spark or Flink) beyond the REST client: at the moment the Flink REST client requires a lot of core libs (precisely because it needs to submit the job graph/plan)
- In our vision, the Flink client should be something like Apache Livy (https://livy.apache.org/)
- One of the biggest limitations we face when running a Flink job through the REST API is that the job can't do anything after env.execute(), while we need to call an external service to signal that the job has ended, plus some other details (see the sketch right after this list)
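To make that last point concrete, this is the kind of thing each pipeline step would ideally be able to do (just a sketch: notifyPipelineStepEnded stands for a hypothetical callback to our application REST service):

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // ... the actual step logic ...
    JobExecutionResult result = env.execute("pipeline-step");

    // Never reached when the jar is submitted through the Flink REST API,
    // which is exactly the limitation described above:
    notifyPipelineStepEnded(result.getJobID(), result.getNetRuntime());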
Best,
Flavio

On Tue, Jul 23, 2019 at 3:44 AM Jeff Zhang <[hidden email]> wrote:

Hi Flavio,

Based on the discussion in the tickets you mentioned above, the program-class attribute was a mistake and the community intends to replace it with main-class. Deprecating the Program interface is part of the work on the new Flink client API.

IIUC, your requirements are not so complicated. We can implement them in the new Flink client API. How about listing your requirements, and let's discuss how we can support them there. BTW, I guess most of your requirements are based on your Flink job server, so it would be helpful if you could provide more info about it. Thanks.

On Mon, Jul 22, 2019 at 8:59 PM Flavio Pompermaier <[hidden email]> wrote:

Hi Tison,

We use a modified version of the Program interface to enable a web UI to properly detect and run the Flink jobs contained in a jar, together with their parameters. As stated in [1], we detect multiple main classes per jar by handling an extra comma-separated Manifest entry (i.e. 'Main-classes'); a sketch of such an entry is shown below.
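For illustration, the manifest of such a jar would contain an entry like this (the class names are made up):

    Manifest-Version: 1.0
    Main-classes: com.acme.jobs.IngestJob,com.acme.jobs.DedupJob,com.acme.jobs.ExportJob

and the job server can read it back with the standard java.util.jar API (again, just a sketch):

    import java.io.IOException;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;
    import java.util.jar.JarFile;

    public class JobJarInspector {
        // Lists the job entry points declared in the jar's 'Main-classes' attribute.
        public static List<String> listJobClasses(String jarPath) throws IOException {
            try (JarFile jar = new JarFile(jarPath)) {
                String attr = jar.getManifest().getMainAttributes().getValue("Main-classes");
                return attr == null ? Collections.emptyList() : Arrays.asList(attr.split(","));
            }
        }
    }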
As mentioned in the discussion on the dev ML, our revised Program interface looks like this:

    public interface FlinkJob {
        String getDescription();
        List<FlinkJobParameter> getParameters();
        boolean isStreamingOrBatch();
    }

    public class FlinkJobParameter {
        private String paramName;
        private String paramType = "string";
        private String paramDesc;
        private String paramDefaultValue;
        private Set<String> choices;
        private boolean mandatory;
    }
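For example, a job exposed to our UI would implement the interface roughly like this (a hypothetical job class; it assumes plain setters on FlinkJobParameter, which are omitted above):

    import java.util.Collections;
    import java.util.List;

    public class DeduplicateJob implements FlinkJob {

        @Override
        public String getDescription() {
            return "Removes duplicate records from the input dataset";
        }

        @Override
        public List<FlinkJobParameter> getParameters() {
            // Describe the single (hypothetical) parameter of this job
            FlinkJobParameter input = new FlinkJobParameter();
            input.setParamName("inputPath");
            input.setParamType("string");
            input.setParamDesc("Path of the dataset to deduplicate");
            input.setMandatory(true);
            return Collections.singletonList(input);
        }

        @Override
        public boolean isStreamingOrBatch() {
            return false; // a batch job
        }
    }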
I've also opened some JIRA issues related to this topic:

Best,
Flavio

On Mon, Jul 22, 2019 at 1:46 PM Zili Chen <[hidden email]> wrote:

Hi guys,

We want to have an accurate idea of how many people are implementing Flink jobs based on the interface Program, and how they actually implement it.

The reason I ask for this survey is the thread [1], where we noticed this code path is stale and less useful than it should be. As an interface marked @PublicEvolving, it was originally aimed at serving as a user-facing interface. Thus, before deprecating or dropping it, we'd like to see whether there are users implementing their jobs based on this interface (org.apache.flink.api.common.Program) and, if there are any, we are curious about how it is used.

If few or no Flink users rely on this interface, we would propose deprecating or dropping it.

I really appreciate your time and your insight.
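For context, a minimal implementation of the interface under survey would look roughly like this (a sketch using the legacy DataSet API; the job logic is a placeholder):

    import org.apache.flink.api.common.Plan;
    import org.apache.flink.api.common.Program;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.io.DiscardingOutputFormat;

    public class MyLegacyJob implements Program {

        @Override
        public Plan getPlan(String... args) {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            env.fromElements(1, 2, 3)
               .map(new MapFunction<Integer, Integer>() {
                   @Override
                   public Integer map(Integer value) {
                       return value * 2;
                   }
               })
               .output(new DiscardingOutputFormat<Integer>());
            // createProgramPlan() extracts the Plan that env.execute() would
            // otherwise submit, so the framework can obtain it without running the job
            return env.createProgramPlan();
        }
    }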
--
Best Regards
Jeff Zhang