env.execute() ?


env.execute() ?

Esa Heikkinen

Hi,

Is env.execute() mandatory at the end of an application? Is it possible to run the application without it?

I found some examples where it is missing.

Best, Esa


Re: env.execute() ?

Fabian Hueske-2
Hi,

It is mandatory for all DataStream programs and most DataSet programs.

Exceptions are DataSet.print() and DataSet.collect().
Both methods are defined on DataSet (the batch API) and call execute() internally.
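
To make this concrete, here is a minimal sketch (not from the thread; the class and job name are made up) showing that a DataStream pipeline is only a lazily built job graph until execute() is called:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExecuteExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // This only declares the pipeline; nothing has run yet.
        env.fromElements("a", "b", "c").print();

        // For DataStream programs this call is required to actually submit and run the job.
        env.execute("execute-example");
    }
}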

Best, Fabian


RE: env.execute() ?

Esa Heikkinen

Hi,

Is there only one env.execute() in an application?

Is it an unstoppable forever loop?

Or can I stop env.execute(), do something, and after that restart it?

Best, Esa

Re: env.execute() ?

Rong Rong
Hi Esa,

According to the Flink documentation [1], what you specify before env.execute() defines the job graph:
"Once you specified the complete program you need to trigger the program execution by calling execute()".

execute() can be finite or infinite, depending on whether your data source is finite and on whether you interrupt the program.
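
As an illustrative aside (not part of the original reply; the hostname and port below are placeholders), whether execute() returns depends on the source:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FiniteVsInfinite {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Bounded source: execute() returns once the three elements have been processed.
        env.fromElements(1, 2, 3).print();

        // Unbounded source (uncomment to try): execute() blocks until the job is
        // cancelled or fails, because the socket never signals end-of-input.
        // env.socketTextStream("localhost", 9999).print();

        env.execute("finite-vs-infinite");
    }
}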

Best,
Rong




Re: env.execute() ?

Shuyi Chen
Hi Esa, 

I think having more than one env.execute() is an anti-pattern in Flink.

env.execute() behaves differently depending on the environment. In the local case it generates the Flink job graph and starts a local mini cluster in the background to run it directly. In the remote case it generates the Flink job graph and submits it to a remote cluster, e.g. one running on YARN/Mesos; depending on the options, the local process either stays attached to the job on the remote cluster or detaches from it. So it is not a simple "unstoppable forever loop", and I don't think "stop env.execute(), do something, and after that restart it" will work in general.

But I think you can take a look at savepoints [1] and checkpoints [2] in Flink. With savepoints, you can stop the running job, do something else, and then restart from the savepoint to resume processing.
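
As a hedged aside (not part of the original reply): checkpointing is configured inside the program, while savepoints are typically triggered from outside the running job, e.g. with bin/flink savepoint <jobId> and resumed with bin/flink run -s <savepointPath>. A minimal configuration sketch, with an arbitrarily chosen interval and a placeholder source:

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointedJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Take a checkpoint every 60 seconds with exactly-once guarantees (interval is arbitrary).
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        // Placeholder unbounded source; host and port are made up.
        env.socketTextStream("localhost", 9999).print();

        env.execute("checkpointed-job");
    }
}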



Thanks
Shuyi


RE: env.execute() ?

Esa Heikkinen

Hi,

Ok, thanks for the clarification. But can savepoints only be controlled from the command line (or a script)? Or is it possible to do it internally, in sync with the application?

Esa

Re: env.execute() ?

Shuyi Chen
I think you might be looking for the functionality provided by the ClusterClient [1]. But I am not sure I fully understand what you mean by "do internally in sync with the application". If the ClusterClient is not what you want, maybe you can give a concrete use case so we can help better.
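
As a hedged illustration (not from the thread, and not the ClusterClient API itself): one simple way to drive savepoints from application-side code is to shell out to the Flink CLI. The flink binary path, job ID and target directory below are placeholders:

import java.io.IOException;

public class SavepointTrigger {

    // Triggers a savepoint for a running job by invoking the Flink CLI.
    public static void triggerSavepoint(String jobId) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "/opt/flink/bin/flink", "savepoint", jobId, "hdfs:///flink/savepoints")
                .inheritIO()
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("flink savepoint failed with exit code " + exitCode);
        }
    }
}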



RE: env.execute() ?

Esa Heikkinen

My final target is to implement the application described in the attachment. I don't know why it is so hard for me (maybe because I am too much of a beginner with Flink). It may be difficult to build an "upper level" state machine outside of the streams in Flink, because everything in Flink is so stream-oriented? I was thinking every execution step of the state machine could use its own env.execute(). Is that a good or bad idea, or is it impossible? I already asked this before on this mailing list, but the answer was that it is a "piece of cake" and that I should do my homework, with no details.
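
For context, a hedged sketch (purely illustrative, not the analyzer from the attachment; the event names and pattern are made up) of how a simple two-step state machine can be expressed inside a single job with the CEP library, so that only one env.execute() is needed:

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.util.List;
import java.util.Map;

public class CepStateMachineSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder event stream; in a real analyzer this would be the parsed log trace.
        DataStream<String> events = env.fromElements("START", "WORK", "END");

        // A two-state "machine": a START event eventually followed by an END event.
        Pattern<String, String> pattern = Pattern.<String>begin("start")
                .where(new SimpleCondition<String>() {
                    @Override
                    public boolean filter(String value) {
                        return value.equals("START");
                    }
                })
                .followedBy("end")
                .where(new SimpleCondition<String>() {
                    @Override
                    public boolean filter(String value) {
                        return value.equals("END");
                    }
                });

        CEP.pattern(events, pattern)
                .select(new PatternSelectFunction<String, String>() {
                    @Override
                    public String select(Map<String, List<String>> match) {
                        return "matched: " + match;
                    }
                })
                .print();

        // Still a single job graph and a single execute() call.
        env.execute("cep-state-machine-sketch");
    }
}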

 

I would be very grateful for assistance. I can even pay a little money if anyone implements it (using CEP?). Actually, I am a PhD student at Tampere University of Technology (Finland) and I have selected Flink as a benchmark for my (very simple) analyzer, which is very state-machine-oriented. I don't know whether that was a good or bad choice, but it is very hard to find a suitable analyzer for comparison.

 

Best, Esa

 


Attachment: Data_trace_processing_example.pdf (382K)