Running job in detached mode via ClusterClient.


Piotr Szczepanek
Hey,

I have a question about submitting jobs in detached mode. We are trying to submit jobs via YarnClusterClient in detached mode; the first one runs correctly, but when we submit a second job we get this exception:
Multiple environments cannot be created in detached mode
        at org.apache.flink.client.program.ContextEnvironmentFactory.createExecutionEnvironment(ContextEnvironmentFactory.java:67)
        at org.apache.flink.api.java.ExecutionEnvironment.getExecutionEnvironment(ExecutionEnvironment.java:1060)

What we are doing: we create an instance of YarnClusterClient and submit a PackagedProgram through it to a Flink container running on YARN. Inside the jar used by the PackagedProgram we obtain the ExecutionEnvironment as follows:
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

By default the submission is made in attached mode, but we faced problems with our threads hanging after job submission, which is why we decided to try the detached approach.
However, it seems that we cannot submit more than one job in detached mode because of this check in ContextEnvironmentFactory:
if (isDetached && lastEnvCreated != null) {
    throw new InvalidProgramException("Multiple environments cannot be created in detached mode");
}
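For illustration, here is a minimal, self-contained sketch of how that guard behaves (the class and field names here are hypothetical, not Flink's actual source): in detached mode, the factory remembers the first environment it hands out and rejects every later request.

```java
// Sketch of the detached-mode guard: only one execution environment may be
// created, because a detached client returns right after submission and
// cannot track any further jobs from the same program.
public class DetachedGuardSketch {
    static Object lastEnvCreated = null;
    static boolean isDetached = true;

    static Object createExecutionEnvironment() {
        if (isDetached && lastEnvCreated != null) {
            throw new IllegalStateException(
                "Multiple environments cannot be created in detached mode");
        }
        lastEnvCreated = new Object();
        return lastEnvCreated;
    }

    public static void main(String[] args) {
        createExecutionEnvironment(); // first job: accepted
        try {
            createExecutionEnvironment(); // second job: rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This mirrors why the second `getExecutionEnvironment()` call in the submitted jar fails while the first succeeds.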

Do you have any clue why submitting more than one job in detached mode is denied, or how we could submit our jobs in a different way?

Regards,
Piotr

Re: Running job in detached mode via ClusterClient.

Till Rohrmann
Hi Piotr,

the reason why you cannot submit multiple jobs to a job cluster is that a job cluster is only responsible for a single job. If you want to submit multiple jobs, then you need to start a session cluster.

In attached mode this is currently still possible because, under the hood, we start a session cluster. The reason is that some batch jobs actually consist of multiple parts (e.g. when the print or collect methods are used). In the future, every part should start a dedicated job cluster.
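As a sketch of the session-cluster route described above (exact flags vary by Flink version, and the application id and jar names below are placeholders): start one long-running YARN session, then submit each job to it in detached mode.

```shell
# Start a long-running YARN session cluster in detached mode; Flink prints
# the YARN application id of the session on startup.
./bin/yarn-session.sh -d

# Submit any number of jobs to that session, each detached.
# -yid attaches the submission to the given YARN session;
# application_XXXX_YYYY and the jar names are placeholders.
./bin/flink run -d -yid application_XXXX_YYYY first-job.jar
./bin/flink run -d -yid application_XXXX_YYYY second-job.jar
```

Since the session cluster outlives each individual job, this avoids the one-environment restriction that a detached per-job submission runs into.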

Cheers,
Till

On Tue, Oct 2, 2018 at 11:21 AM Piotr Szczepanek <[hidden email]> wrote: