Hi,
What is the best way to prevent launching two jobs with the same name concurrently? Instead of doing a check in the script that starts the Flink job, I would prefer the submission to fail if another job with the same name is in progress (an exception or something like that). David
Hi David, Internally, Flink identifies jobs by job id, not by job name, so it only checks whether two jobs have the same job id. If you submit the job via the CLI [1], I'm afraid there is no built-in way to do this: the job id is generated randomly at submission time and has nothing to do with the job name. However, if you submit the job via the REST API [2], there is an option to specify the job id when submitting, so you can generate the job id yourself. Regards, Dian [1] https://ci.apache.org/projects/flink/flink-docs-master/ops/cli.html [2] https://ci.apache.org/projects/flink/flink-docs-master/monitoring/rest_api.html#jars-jarid-run
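As a rough illustration of Dian's suggestion, one way to get a stable job id from a job name is to hash the name into the 32-hex-character format Flink uses for job ids, so that resubmitting the same name always produces the same id. This is only a sketch; the job name and the choice of MD5 (which conveniently yields 16 bytes) are my own, not anything Flink provides:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DeterministicJobId {

    // Derive a 32-hex-character id (the textual format of Flink's JobID)
    // deterministically from a job name by hashing it with MD5 (16 bytes).
    static String jobIdFromName(String jobName) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(jobName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(32);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String id = jobIdFromName("my-streaming-job");
        System.out.println(id);
        // The same name always yields the same id, so submitting twice with
        // this id lets the cluster itself reject the duplicate submission.
        System.out.println(id.equals(jobIdFromName("my-streaming-job")));
    }
}
```

The generated string could then be passed as the job id field of the REST submission; the cluster would refuse a second job with an id that is already in use.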
The situation is as Dian said: Flink identifies jobs by job id instead of job name. However, I still think it is a valid question whether Flink could alternatively identify jobs by job name and leave the work of distinguishing jobs by name to users. The advantages of this approach include a more readable display and interaction, as well as reducing some hardcoded handling of job ids; for example, we always set the job id to new JobID(0, 0) in standalone per-job mode to get the same ZK path. Best, tison. Dian Fu <[hidden email]> wrote on Mon, Sep 23, 2019, at 10:55 AM:
Hi,
Thanks for your replies. Yes, it would be useful to have a way to define the job id; I would then have been able to derive the job id from the job name, for example. At the moment we do not use the REST API but the CLI to submit our jobs on YARN. Nevertheless, I can implement a little trick: at startup, query the REST API and throw an exception if a job with the same name is already running. Question: is there a way to retrieve the JobManager URI from my code, or should I provide it as a parameter? Thanks. David
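David's trick could look roughly like the sketch below. It assumes the launcher has already fetched and parsed the response of the JobManager's /jobs/overview REST endpoint; to keep the example self-contained, each job entry is modeled as a plain map with "name" and "state" keys, and the actual HTTP call and JSON parsing are omitted:

```java
import java.util.List;
import java.util.Map;

public class DuplicateJobCheck {

    // Throws if a RUNNING job with the given name already exists among the
    // job entries reported by the cluster. Finished or failed jobs, which
    // the REST API also reports, do not block a new submission.
    static void failIfAlreadyRunning(String jobName, List<Map<String, String>> jobs) {
        for (Map<String, String> job : jobs) {
            if (jobName.equals(job.get("name")) && "RUNNING".equals(job.get("state"))) {
                throw new IllegalStateException(
                        "A job named '" + jobName + "' is already running");
            }
        }
    }

    public static void main(String[] args) {
        List<Map<String, String>> jobs = List.of(
                Map.of("name", "etl-job", "state", "RUNNING"),
                Map.of("name", "old-job", "state", "FINISHED"));

        failIfAlreadyRunning("old-job", jobs); // finished jobs do not block
        try {
            failIfAlreadyRunning("etl-job", jobs);
        } catch (IllegalStateException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

The startup script would call this check right before submission and let the exception abort the launch.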
Hi David, you could use Flink's RestClusterClient and call #listJobs to obtain the list of jobs being executed on the cluster (note that it will also report finished jobs). By providing a properly configured Configuration (e.g. loading flink-conf.yaml via GlobalConfiguration#loadConfiguration), it will automatically detect where the JobManager is running (e.g. via ZooKeeper if HA is enabled, or it picks up the configured JobManager address from the configuration). Of course, you could also provide the JobManager address as a parameter. Cheers, Till
Thanks Till,
Perfect, I am going to use RestClusterClient with listJobs. It should work perfectly for my needs. Cheers, David
My simple workaround: I always start the applications from the same machine via the CLI and simply take a file-system lock around the check-if-job-is-already-running and job-launching steps. This is of course a possible single point of failure, since it relies on one machine starting the jobs, but it works in my current environment.
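Theo's file-system lock could be sketched with java.nio's FileChannel#tryLock; the lock file path and class name here are arbitrary choices for illustration:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LaunchLock {

    // Try to take an exclusive OS-level lock on a well-known file before
    // submitting the job; returns null if another launcher process on this
    // machine already holds it, in which case we abort instead of submitting
    // a duplicate.
    static FileLock tryAcquire(Path lockFile) throws IOException {
        FileChannel channel = FileChannel.open(
                lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = channel.tryLock(); // null if held by another process
        if (lock == null) {
            channel.close();
        }
        return lock;
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Path.of(System.getProperty("java.io.tmpdir"), "flink-launch.lock");
        FileLock lock = tryAcquire(lockFile);
        if (lock == null) {
            System.out.println("another launch is in progress, aborting");
            return;
        }
        try {
            System.out.println("lock acquired, safe to check and submit the job");
            // ... check running jobs and submit here ...
        } finally {
            lock.release();
        }
    }
}
```

Note that tryLock only guards against other processes: a second overlapping lock attempt from within the same JVM throws OverlappingFileLockException instead of returning null, so the launcher script should run one JVM per launch attempt.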
Best regards, Theo ----- Original message ----- From: "David Morin" <[hidden email]> To: "user" <[hidden email]> Sent: Monday, 23 September 2019, 17:21:17 Subject: Re: How to prevent from launching 2 jobs at the same time