env.execute() ?


env.execute() ?

Esa Heikkinen

Hi,

Is env.execute() mandatory at the end of an application? Is it possible to run the application without it?

I found some examples where it is missing.

Best, Esa


Re: env.execute() ?

Fabian Hueske-2
Hi,

It is mandatory for all DataStream programs and most DataSet programs.

Exceptions are DataSet.print() and DataSet.collect().
Both methods are defined on DataSet (the batch API) and call execute() internally.
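
To make this concrete, here is a minimal sketch (not from the thread; the class and job name are made up) showing that a DataStream pipeline is only a lazily built job graph until execute() is called:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExecuteExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // This only declares the pipeline; nothing has run yet.
        env.fromElements("a", "b", "c").print();

        // For DataStream programs this call is required to actually submit and run the job.
        env.execute("execute-example");
    }
}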

Best, Fabian


RE: env.execute() ?

Esa Heikkinen

Hi,

Is there only one env.execute() in an application?

Is it an unstoppable forever loop?

Or can I stop env.execute(), do something, and after that restart it?

Best, Esa

Re: env.execute() ?

Rong Rong
Hi Esa,

According to the Flink documentation [1], what you specify before env.execute() defines the job graph:
"Once you specified the complete program you need to trigger the program execution by calling execute()".

execute() can be finite or infinite, depending on whether your data source is finite and on whether you interrupt the program.
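
As an illustrative aside (not part of the original reply; the hostname and port below are placeholders), whether execute() returns depends on the source:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FiniteVsInfinite {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Bounded source: execute() returns once the three elements have been processed.
        env.fromElements(1, 2, 3).print();

        // Unbounded source (uncomment to try): execute() blocks until the job is
        // cancelled or fails, because the socket never signals end-of-input.
        // env.socketTextStream("localhost", 9999).print();

        env.execute("finite-vs-infinite");
    }
}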

Best,
Rong




Re: env.execute() ?

Shuyi Chen
Hi Esa, 

I think having more than one env.execute() is an anti-pattern in Flink.

env.execute() behaves differently depending on the environment. In the local case it generates the Flink job graph and starts a local mini cluster in the background to run it directly. In the remote case it generates the Flink job graph and submits it to a remote cluster, e.g. one running on YARN/Mesos; depending on the options, the local process either stays attached to the job on the remote cluster or detaches from it. So it is not a simple "unstoppable forever loop", and I don't think "stop env.execute(), do something, and after that restart it" will work in general.

But I think you can take a look at savepoints [1] and checkpoints [2] in Flink. With savepoints, you can stop the running job, do something else, and then restart from the savepoint to resume processing.
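
As a hedged aside (not part of the original reply): checkpointing is configured inside the program, while savepoints are typically triggered from outside the running job, e.g. with bin/flink savepoint <jobId> and resumed with bin/flink run -s <savepointPath>. A minimal configuration sketch, with an arbitrarily chosen interval and a placeholder source:

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointedJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Take a checkpoint every 60 seconds with exactly-once guarantees (interval is arbitrary).
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        // Placeholder unbounded source; host and port are made up.
        env.socketTextStream("localhost", 9999).print();

        env.execute("checkpointed-job");
    }
}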



Thanks
Shuyi


RE: env.execute() ?

Esa Heikkinen

Hi,

Ok, thanks for the clarification. But can savepoints only be controlled from the command line (or a script)? Or is it possible to do it internally, in sync with the application?

Esa

Re: env.execute() ?

Shuyi Chen
I think you might be looking for the functionality provided by the ClusterClient [1]. But I am not sure I fully understand what you mean by "do internally in sync with the application". If the ClusterClient is not what you want, maybe you can give a concrete use case so we can help better.
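
As a hedged illustration (not from the thread, and not the ClusterClient API itself): one simple way to drive savepoints from application-side code is to shell out to the Flink CLI. The flink binary path, job ID and target directory below are placeholders:

import java.io.IOException;

public class SavepointTrigger {

    // Triggers a savepoint for a running job by invoking the Flink CLI.
    public static void triggerSavepoint(String jobId) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "/opt/flink/bin/flink", "savepoint", jobId, "hdfs:///flink/savepoints")
                .inheritIO()
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("flink savepoint failed with exit code " + exitCode);
        }
    }
}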



RE: env.execute() ?

Esa Heikkinen

My final target is to implement the application described in the attachment. I don't know why it is so hard for me (maybe because I am too much of a beginner with Flink). It may be difficult to build an "upper level" state machine outside of the streams in Flink, because everything in Flink is so stream-oriented? I was thinking every execution step of the state machine could use its own env.execute(). Is that a good or bad idea, or is it impossible? I already asked this before on this mailing list, but the answer was that it is a "piece of cake" and that I should do my homework, with no details.
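
For context, a hedged sketch (purely illustrative, not the analyzer from the attachment; the event names and pattern are made up) of how a simple two-step state machine can be expressed inside a single job with the CEP library, so that only one env.execute() is needed:

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.util.List;
import java.util.Map;

public class CepStateMachineSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder event stream; in a real analyzer this would be the parsed log trace.
        DataStream<String> events = env.fromElements("START", "WORK", "END");

        // A two-state "machine": a START event eventually followed by an END event.
        Pattern<String, String> pattern = Pattern.<String>begin("start")
                .where(new SimpleCondition<String>() {
                    @Override
                    public boolean filter(String value) {
                        return value.equals("START");
                    }
                })
                .followedBy("end")
                .where(new SimpleCondition<String>() {
                    @Override
                    public boolean filter(String value) {
                        return value.equals("END");
                    }
                });

        CEP.pattern(events, pattern)
                .select(new PatternSelectFunction<String, String>() {
                    @Override
                    public String select(Map<String, List<String>> match) {
                        return "matched: " + match;
                    }
                })
                .print();

        // Still a single job graph and a single execute() call.
        env.execute("cep-state-machine-sketch");
    }
}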

 

I would be very grateful for assistance. I can even pay a little money if anyone implements it (using CEP?). Actually, I am a PhD student at Tampere University of Technology (Finland) and I have selected Flink as a benchmark for my (very simple) analyzer, which is very state-machine-oriented. I don't know whether that was a good or bad choice, but it is very hard to find a suitable analyzer for comparison.

 

Best, Esa

 


Attachment: Data_trace_processing_example.pdf (382K)