Post-processing batch JobExecutionResult

spoganshev
Because OptimizerPlanEnvironment.execute() throws an exception on its last line,
there is no way to post-process a batch job's execution result, for example:

JobExecutionResult r = env.execute(); // execute batch job
analyzeResult(r); // this will never get executed due to plan optimization

https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/program/OptimizerPlanEnvironment.java#L54

Is there any way to allow such post-processing in batch jobs?




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Post-processing batch JobExecutionResult

Zhu Zhu
Hi spoganshev,

The OptimizerPlanEnvironment is only for creating the optimized plan, as described in its javadoc:
"An {@link ExecutionEnvironment} that never executes a job but only creates the optimized plan."
Its execute() is invoked with some internal handling so that it only generates the optimized plan and does not actually submit a job.
Some other execution environment will execute the job instead.
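
To make the effect on user code concrete, here is a minimal plain-Java sketch of the behavior described above. It does not use Flink: `Env`, `Result`, `PlanOnlyEnv`, `ExecutingEnv` and `AbortSignal` are hypothetical stand-ins, and modeling the abort as an `Error` subclass is an assumption about how a plan-only environment might unwind out of the user's program, not a copy of the real class.

```java
public class PlanOnlySketch {

    /** Hypothetical stand-in for a job result; not Flink's JobExecutionResult. */
    static final class Result {
        final long runtimeMillis;

        Result(long runtimeMillis) {
            this.runtimeMillis = runtimeMillis;
        }
    }

    /** Hypothetical stand-in for an execution environment. */
    interface Env {
        Result execute();
    }

    /** Assumed abort signal, thrown once the plan has been captured. */
    static final class AbortSignal extends Error {
    }

    /** Plan-only environment: records the plan, then aborts instead of returning. */
    static final class PlanOnlyEnv implements Env {
        boolean planCaptured = false;

        @Override
        public Result execute() {
            planCaptured = true;      // stands in for compiling the optimized plan
            throw new AbortSignal();  // nothing after execute() in the caller runs
        }
    }

    /** Executing environment: runs the job and returns a result. */
    static final class ExecutingEnv implements Env {
        @Override
        public Result execute() {
            return new Result(42L);
        }
    }

    /** Runs a tiny "user program" and reports whether post-processing was reached. */
    static boolean postProcessingRan(Env env) {
        boolean ran = false;
        try {
            Result r = env.execute();
            ran = (r != null);        // stands in for analyzeResult(r)
        } catch (AbortSignal expected) {
            // The caller that only wanted the plan swallows the signal here.
        }
        return ran;
    }

    public static void main(String[] args) {
        System.out.println("plan-only: " + postProcessingRan(new PlanOnlyEnv()));
        System.out.println("executing: " + postProcessingRan(new ExecutingEnv()));
    }
}
```

With the plan-only stand-in, the line that plays the role of analyzeResult(r) is skipped; with the executing stand-in, it runs.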

How did you create your ExecutionEnvironment?
For DataSet jobs, it should usually be created as follows:
"final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();"

Thanks,
Zhu Zhu

spoganshev <[hidden email]> wrote on Fri, Sep 6, 2019 at 11:39 PM:
Because OptimizerPlanEnvironment.execute() throws an exception on its last line,
there is no way to post-process a batch job's execution result, for example:

JobExecutionResult r = env.execute(); // execute batch job
analyzeResult(r); // this will never get executed due to plan optimization

https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/program/OptimizerPlanEnvironment.java#L54

Is there any way to allow such post-processing in batch jobs?





Re: Post-processing batch JobExecutionResult

tison
Hi spoganshev,

If you deploy in per-job mode, OptimizerPlanEnvironment is used, and thus,
as you pointed out, there is _no_ way to post-process the JobExecutionResult.
The community regards this as a shortcoming and is working on an enhancement
that lets you get a JobClient as the return value of #execute in all deployment
and execution modes. Take a look at [1] and [2] for a preview, and feel free to describe
your requirements so that a following version can satisfy them.

Besides, if you deploy in session mode, which might be more natural for batch cases,
ContextEnvironment is currently used; it executes normally and returns the
JobExecutionResult, which you can make use of.

To sum up, you can try session mode deployment to see if it satisfies your
requirements for post-processing.

Best,
tison.


Zhu Zhu <[hidden email]> wrote on Sat, Sep 7, 2019 at 12:07 AM:
Hi spoganshev,

The OptimizerPlanEnvironment is only for creating the optimized plan, as described in its javadoc:
"An {@link ExecutionEnvironment} that never executes a job but only creates the optimized plan."
Its execute() is invoked with some internal handling so that it only generates the optimized plan and does not actually submit a job.
Some other execution environment will execute the job instead.

How did you create your ExecutionEnvironment?
For DataSet jobs, it should usually be created as follows:
"final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();"

Thanks,
Zhu Zhu


Re: Post-processing batch JobExecutionResult

tison
Besides, if you submit the job through the Jar Run REST API,
OptimizerPlanEnvironment is also used. So again, there is _no_
post-processing support at the moment.


Zili Chen <[hidden email]> wrote on Sat, Sep 7, 2019 at 12:51 AM:
Hi spoganshev,

If you deploy in per-job mode, OptimizerPlanEnvironment is used, and thus,
as you pointed out, there is _no_ way to post-process the JobExecutionResult.
The community regards this as a shortcoming and is working on an enhancement
that lets you get a JobClient as the return value of #execute in all deployment
and execution modes. Take a look at [1] and [2] for a preview, and feel free to describe
your requirements so that a following version can satisfy them.

Besides, if you deploy in session mode, which might be more natural for batch cases,
ContextEnvironment is currently used; it executes normally and returns the
JobExecutionResult, which you can make use of.

To sum up, you can try session mode deployment to see if it satisfies your
requirements for post-processing.

Best,
tison.



Re: Post-processing batch JobExecutionResult

tison

Zili Chen <[hidden email]> wrote on Sat, Sep 7, 2019 at 12:52 AM:
Besides, if you submit the job through the Jar Run REST API,
OptimizerPlanEnvironment is also used. So again, there is _no_
post-processing support at the moment.

