Hi all,
I'm researching docker/k8s deployment possibilities for Flink 1.9.1. I'm after reading/watching [1][2][3][4]. Currently we do think that we will try go with Job Cluster approach although we would like to know what is the community trend with this? We would rather not deploy more than one job per Flink cluster. Anyways, I was wondering about few things: 1. How can I change the number of task slots per task manager for Job and Session Cluster? In my case I'm running docker on VirtualBox where I have 4 CPUs assigned to this machine. However each task manager is spawned with only one task slot for Job Cluster. With Session Cluster however, on the same machine, each task manager is spawned with 4 task slots. In both cases Flink's UI shows that each Task manager has 4 CPUs. 2. How can I resubmit job if I'm using a Job Cluster. I'm referring this use case [5]. You may say that I have to start the job again but with different arguments. What is the procedure for this? I'm using checkpoints btw. Should I kill all task manager containers and rerun them with different parameters? 3. How I can resubmit job using Session Cluster? 4. How I can provide log config for Job/Session cluster? I have a case, where I changed log level and log format in log4j.properties and this is working fine on local (IDE) environment. However when I build the fat jar, and ran a Job Cluster based on this jar it seams that my log4j properties are not passed to the cluster. I see the original format and original (INFO) level. Thanks, [1] https://youtu.be/w721NI-mtAA [2] https://youtu.be/WeHuTRwicSw [3] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html [4] https://github.com/apache/flink/blob/release-1.9/flink-container/docker/README.md [5] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Job-claster-scalability-td32027.html -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ |
Hi KristoffSC,

Glad to hear that you are looking to run Flink on containers. You are right: Flink supports both session and per-job clusters in container environments. The differences are the job submission process and the isolation. For a session cluster you do not need to build your own image; for a per-job cluster your job jar and all its dependencies must be bundled into the image. If you run one job per Flink cluster, per-job is the better choice: it is a one-step submission, and different jobs get better isolation.

1. You can use the `taskmanager.numberOfTaskSlots` option to set the number of slots per TaskManager. By default it is 1, which is why you get only one slot per TaskManager.

2. For a per-job cluster, you need to destroy the Flink cluster and start a new one, cleaning up all of the Flink resources. Using a container orchestration framework (e.g. Kubernetes) makes this easier. The standalone per-job cluster supports savepoint arguments, so you can resume from a savepoint [1].

3. If you want to resubmit a job to an existing session cluster, you just need to cancel it and then submit it again.

4. You need to update the log level in $FLINK_HOME/conf/log4j.properties.

Best,

KristoffSC <[hidden email]> wrote on Thu, Jan 9, 2020 at 10:16 PM:
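To illustrate point 1, here is a minimal sketch of the relevant `flink-conf.yaml` entry. The value 4 matches the 4 CPUs assigned to the VirtualBox machine described above; your desired slot count may differ:

```yaml
# flink-conf.yaml -- read at TaskManager startup, so changing it
# requires restarting the TaskManager containers.
# One slot per CPU assigned to the VM (4 in the setup described above):
taskmanager.numberOfTaskSlots: 4
```

With the Docker images you can either bake this file into the image or mount a modified conf directory over the image's Flink conf directory, so that both Job and Session Cluster containers pick it up.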
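As a sketch of the resubmission flows in points 2 and 3: the job id, savepoint directory, image name, jar name, and job class below are placeholders, not taken from your setup. The session-cluster commands use the standard `flink` CLI; the per-job command assumes the flink-container style image [4], whose entrypoint accepts a savepoint argument:

```shell
# Session cluster: take a savepoint while cancelling, then resubmit
# the job (possibly with different program arguments) from it.
flink cancel -s /savepoints <jobId>        # prints the savepoint path
flink run -s /savepoints/savepoint-abc123 my-job.jar --new-arg value

# Per-job (Job Cluster): tear down the old cluster's containers, then
# start a new job-cluster container resuming from the savepoint.
docker run my-flink-job-image job-cluster \
    --job-classname com.example.MyJob \
    --fromSavepoint /savepoints/savepoint-abc123
```

Since you already use checkpoints, note that a savepoint is the explicit, user-triggered variant of the same mechanism and is the recommended way to carry state across a redeploy with changed arguments.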
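For point 4, the key detail is that the cluster JVMs read $FLINK_HOME/conf/log4j.properties, so a log4j.properties bundled inside the fat jar is ignored; that explains why the IDE (which uses the classpath copy) behaves differently from the Job Cluster. A sketch of an adjusted file, assuming the standard log4j 1.x console setup that Flink 1.9 ships with:

```properties
# $FLINK_HOME/conf/log4j.properties on the cluster -- the copy inside
# the job fat jar has no effect on JobManager/TaskManager logging.
log4j.rootLogger=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n
```

As with the slot configuration, you can either bake this file into your image or mount it into the container over the image's conf/log4j.properties.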