Hi all,

We want to get an accurate idea of how many people implement Flink jobs based on the interface Program, and how they actually implement them. The reason I ask for this survey is the thread [1], where we noticed this code path is stale and less useful than it should be. Since it is an interface marked @PublicEvolving, it was originally aimed at serving as a user-facing interface. Thus, before deprecating or dropping it, we'd like to see whether there are users implementing their jobs based on this interface (org.apache.flink.api.common.Program) and, if there are any, we are curious how it is used. If few or no Flink users rely on this interface, we would propose deprecating or dropping it.

I really appreciate your time and your insight.
Hi Tison,

we use a modified version of the Program interface to enable a web UI to properly detect and run the Flink jobs contained in a jar, together with their parameters. As stated in [1], we detect multiple main classes per jar by handling an extra comma-separated Manifest entry (i.e. 'Main-classes'). As mentioned in the discussion on the dev ML, our revised Program interface looks like this:

public interface FlinkJob {
  String getDescription();
  List<FlinkJobParameter> getParameters();
  boolean isStreamingOrBatch();
}

public class FlinkJobParameter {
  private String paramName;
  private String paramType = "string";
  private String paramDesc;
  private String paramDefaultValue;
  private Set<String> choices;
  private boolean mandatory;
}

I've also opened some JIRA issues related to this topic.

Best,
Flavio

On Mon, Jul 22, 2019 at 1:46 PM Zili Chen <[hidden email]> wrote:
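To make the 'Main-classes' idea concrete: below is a minimal sketch of how such a comma-separated Manifest attribute could be parsed. Note that 'Main-classes' is a custom entry from Flavio's setup, not a standard jar attribute, and the class and method names here are hypothetical illustrations.

```java
import java.util.Arrays;
import java.util.List;
import java.util.jar.Manifest;

public class MainClassesReader {
    // Custom (non-standard) Manifest entry, as described in the thread.
    static final String MAIN_CLASSES = "Main-classes";

    // Split the comma-separated attribute into a list of class names,
    // tolerating whitespace around the commas.
    static List<String> parseMainClasses(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(MAIN_CLASSES);
        if (value == null || value.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(value.trim().split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        // In the real setup the Manifest would come from the uploaded jar;
        // here we build one in memory for illustration.
        Manifest m = new Manifest();
        m.getMainAttributes().putValue(MAIN_CLASSES, "com.acme.JobA, com.acme.JobB");
        System.out.println(parseMainClasses(m)); // [com.acme.JobA, com.acme.JobB]
    }
}
```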
Hi Flavio,

Based on the discussion in the tickets you mentioned above, the program-class attribute was a mistake and the community intends to replace it with main-class. Deprecating the Program interface is part of the work on the new Flink client API.

IIUC, your requirements are not so complicated. We can implement them in the new Flink client API. How about listing your requirements, and let's discuss how we can support them there? BTW, I guess most of your requirements are driven by your Flink job server, so it would be helpful if you could provide more info about it.

Thanks

On Mon, Jul 22, 2019 at 8:59 PM, Flavio Pompermaier <[hidden email]> wrote:
Best Regards
Jeff Zhang
Hi Jeff,

the point of the Manifest entry is really to have a way to list multiple main classes in the jar (without the need to inspect every Java class, or forcing a 1-to-1 relationship between jar and job as it is now). My requirements were driven by the UI we're using in our framework.
Best,
Flavio

On Tue, Jul 23, 2019 at 3:44 AM Jeff Zhang <[hidden email]> wrote:
Thanks Flavio, I get most of your points except one.

Just curious to know how you submit jobs via the REST API: if there are multiple jobs in one jar, do you submit the jar once and then submit the jobs multiple times? And is there any relationship between the jobs in the same jar?

On Tue, Jul 23, 2019 at 4:01 PM, Flavio Pompermaier <[hidden email]> wrote:
Best Regards
Jeff Zhang
The jobs are somehow related to each other, in the sense that we have a configurable pipeline with optional steps you can enable or disable (and thus we create a single big jar). Because of this, our application's REST service also works as a job scheduler and uses the job server as a proxy towards Flink: when one step ends (this is what is signalled back, after env.execute(), from Flink to the application REST service), our application tells the job server to execute the next job of the pipeline on the cluster.

Of course this is a "dirty" solution (we should use a workflow scheduler like Airflow or Luigi or similar), but we wanted to keep things as simple as possible for the moment. In the future, if our customers ever want to improve this part, we will probably integrate our application with a dedicated job scheduler like the ones listed before. I don't know whether any of them are already integrated with Flink; when we started coding our frontend application (two years ago) none of them were.

Best,
Flavio

On Tue, Jul 23, 2019 at 10:40 AM Jeff Zhang <[hidden email]> wrote:
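The "application as scheduler" approach described above can be reduced to a simple sequential loop: submit a step, wait for its completion signal, then submit the next. This is an illustrative sketch only; the submit function stands in for the real call to the job server (which in turn would hit Flink's REST API), and all names are hypothetical.

```java
import java.util.List;
import java.util.function.Function;

public class PipelineRunner {
    // Run the pipeline steps in order. `submit` is a stand-in for the
    // job-server call that blocks until the Flink job signals completion
    // and returns whether the step succeeded.
    public static int runPipeline(List<String> steps, Function<String, Boolean> submit) {
        int completed = 0;
        for (String step : steps) {
            if (!submit.apply(step)) {
                break; // stop the pipeline on the first failed step
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        // Fake submitter that "succeeds" for every step.
        int done = runPipeline(List.of("ingest", "transform", "export"), s -> true);
        System.out.println(done + " steps completed"); // 3 steps completed
    }
}
```

A dedicated workflow scheduler would replace this loop with a DAG of tasks, retries, and backfills, which is exactly why the author calls the in-house variant a temporary solution.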
IIUC, the list of jobs contained in the jar means the jobs you defined in the pipeline. Then I don't think it is Flink's responsibility to maintain the job list info; it is the job scheduler that defines the pipeline, so the job scheduler should maintain the job list.

On Tue, Jul 23, 2019 at 5:23 PM, Flavio Pompermaier <[hidden email]> wrote:
Best Regards
Jeff Zhang
I agree, but you have to know which jar a job is contained in: when you upload the jar to our application, you immediately know the fully qualified name of the job class and which jar it belongs to. I think that when you upload a jar to Flink, Flink should list all the available jobs inside it (IMHO); it could be a single main class (as it is now) or multiple classes (IMHO).

On Tue, Jul 23, 2019 at 12:13 PM Jeff Zhang <[hidden email]> wrote:
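Besides a custom Manifest entry, one standard JDK mechanism for enumerating "jobs" packaged in a jar is java.util.ServiceLoader: each job implements a shared interface and the jar registers the implementations under META-INF/services. This is a sketch of that alternative, not something the thread's participants actually built; FlinkJob here is the hypothetical contract from Flavio's message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class JobDiscovery {
    // Hypothetical job contract, mirroring the revised Program interface
    // described earlier in the thread.
    public interface FlinkJob {
        String getDescription();
    }

    // Collect the class names of every FlinkJob implementation registered
    // via META-INF/services on the given class loader (e.g. a
    // URLClassLoader opened over an uploaded jar).
    public static List<String> listJobs(ClassLoader cl) {
        List<String> names = new ArrayList<>();
        for (FlinkJob job : ServiceLoader.load(FlinkJob.class, cl)) {
            names.add(job.getClass().getName());
        }
        return names;
    }
}
```

The trade-off versus the Manifest entry: ServiceLoader can instantiate the jobs (so a UI could also read descriptions and parameters), while the Manifest approach only needs the class names and avoids loading any user code.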