Is there any way to get the ExecutionGraph of a Job


Debasish Ghosh
Hello -

I am trying to build an API that can start, control and stop a Flink Job programmatically.

When I do an executionEnv.execute(), where executionEnv is a StreamExecutionEnvironment, I get back a JobExecutionResult. I find no way to stop the job (say, based on some timeout) from a JobExecutionResult.

One class that could help me is ExecutionGraph, which has APIs like stop() and scheduleForExecution(). But how do I get an ExecutionGraph from executionEnv.execute()? All I want is to submit the job for execution and then be able to stop it after some timeout. Or, if the job fails, I would like to run some cleanup.

What is the idiomatic way to design such APIs in Flink?

regards.
Re: Is there any way to get the ExecutionGraph of a Job

Yun Gao
Hi Debasish,

     You cannot get the ExecutionGraph, since it resides in the JobMaster, which does not run in the same process as the client.

     In my opinion, you currently cannot stop the job or query its status through the Client API. The community is currently working on enhancing the Client API [1], so this should become available in the future.

     An alternative is to monitor the job's status via the REST API and cancel the job by calling the undocumented REST endpoint <JM_ADDR>/jobs/<job id>/yarn-cancel (which may be removed in the future).
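The suggestion above can be sketched as a small polling loop. This is a hypothetical illustration, not Flink's Client API: the JobManager address, the poll interval, and the assumption that the /jobs/&lt;job id&gt; response carries a "state" field are all mine, and the yarn-cancel endpoint is undocumented and may disappear.

```python
# Sketch: watch a job over the Flink REST API and cancel it after a timeout.
import json
import time
import urllib.request

# States after which the job cannot be canceled anymore (assumed set).
TERMINAL_STATES = {"FINISHED", "CANCELED", "FAILED"}

def status_url(jm_addr, job_id):
    # GET <JM_ADDR>/jobs/<job id> returns the job's status as JSON.
    return f"http://{jm_addr}/jobs/{job_id}"

def cancel_url(jm_addr, job_id):
    # The undocumented yarn-cancel endpoint mentioned above.
    return f"http://{jm_addr}/jobs/{job_id}/yarn-cancel"

def is_terminal(state):
    return state in TERMINAL_STATES

def job_state(jm_addr, job_id):
    # Assumes the response JSON exposes the current state under "state".
    with urllib.request.urlopen(status_url(jm_addr, job_id)) as resp:
        return json.load(resp)["state"]

def stop_after(jm_addr, job_id, timeout_s, poll_s=5.0):
    """Poll the job; cancel it if it is still running after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_terminal(job_state(jm_addr, job_id)):
            return  # job already ended on its own
        time.sleep(poll_s)
    urllib.request.urlopen(cancel_url(jm_addr, job_id))
```

A cleanup-on-failure hook would follow the same shape: inspect the final state returned by job_state and branch on "FAILED".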




Best
Yun Gao

------------------------------------------------------------------
From: Debasish Ghosh <[hidden email]>
Send Time: 2019 Jun. 3 (Mon.) 17:32
To: user <[hidden email]>
Subject: Is there any way to get the ExecutionGraph of a Job


Re: Is there any way to get the ExecutionGraph of a Job

Debasish Ghosh
Thanks a lot for the clarification.

On Tue, Jun 4, 2019 at 8:37 AM Yun Gao <[hidden email]> wrote:


