(DEPRECATED) Apache Flink User Mailing List archive.

Flink on Kubernetes Vs Flink Natively on Kubernetes

Pankaj Chand

Flink on Kubernetes Vs Flink Natively on Kubernetes

Hi all,

I want to run Flink, Spark and other processing engines on a single Kubernetes cluster.

From the Flink documentation, I did not understand the difference between:

(1) Running Flink on Kubernetes, versus (2) Running Flink natively on Kubernetes.

Could someone please explain the difference between the two, and when you would use each option?

Thank you,

Pankaj

Xintong Song

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Hi Pankaj,

"Running Flink on Kubernetes" refers to the old way, which basically deploys a Flink standalone cluster on Kubernetes. We use scripts to run the Flink Master and TaskManager processes inside Kubernetes containers. In this mode, Flink is not aware of whether it is running in containers or directly on physical machines, and it does not interact with the Kubernetes Master. The Flink Master reactively accepts all registered TaskManagers, whose number is determined by the Kubernetes replica count.

"Running Flink natively on Kubernetes" refers to deploying Flink as a Kubernetes Job. The Flink Master interacts with the Kubernetes Master and actively requests pods/containers, as it does on Yarn/Mesos.
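
To make the contrast concrete, here is a rough sketch in Python. It is only a toy model under stated assumptions, not Flink's actual code or API: the class names `StandaloneMaster` and `NativeMaster`, the pod names, and the slot counts are all hypothetical. The standalone master only accepts whatever TaskManagers register with it, while the native master works out how many pods it needs and asks for them itself.

```python
from dataclasses import dataclass, field


@dataclass
class StandaloneMaster:
    """Toy standalone master: passively accepts TaskManagers that register."""
    task_managers: list = field(default_factory=list)

    def register(self, tm_name: str) -> None:
        # The master has no say in how many TaskManagers exist.
        self.task_managers.append(tm_name)

    def slots(self, slots_per_tm: int) -> int:
        return len(self.task_managers) * slots_per_tm


@dataclass
class NativeMaster:
    """Toy native master: actively requests pods when it needs more slots."""
    slots_per_tm: int
    pods: list = field(default_factory=list)

    def request_slots(self, needed: int) -> list:
        # Ceiling division: pods required to provide `needed` slots.
        required_pods = -(-needed // self.slots_per_tm)
        new_pods = []
        while len(self.pods) < required_pods:
            name = f"taskmanager-{len(self.pods) + 1}"  # hypothetical pod name
            self.pods.append(name)
            new_pods.append(name)
        return new_pods


# Standalone: the replica count is fixed outside Flink (hypothetical value).
replicas = 2
standalone = StandaloneMaster()
for i in range(replicas):
    standalone.register(f"taskmanager-{i + 1}")
print(standalone.slots(slots_per_tm=2))

# Native: the master itself decides how many pods to ask for.
native = NativeMaster(slots_per_tm=2)
print(native.request_slots(needed=5))
```

In the standalone case, getting more slots means changing the replica count from outside Flink. In the native case, the request originates from the master.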

Thank you~

Xintong Song

On Mon, Mar 16, 2020 at 4:03 PM Pankaj Chand <[hidden email]> wrote:

Hi all,

I want to run Flink, Spark and other processing engines on a single Kubernetes cluster.

From the Flink documentation, I did not understand the difference between:
(1) Running Flink on Kubernetes, versus (2) Running Flink natively on Kubernetes.

Could someone please explain the difference between the two, and when you would use each option?

Thank you,

Pankaj

Xintong Song

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Forgot to mention that "running Flink natively on Kubernetes" is newly introduced and is only available for Flink 1.10 and above.

Thank you~

Xintong Song

On Mon, Mar 16, 2020 at 5:40 PM Xintong Song <[hidden email]> wrote:

Hi Pankaj,

"Running Flink on Kubernetes" refers to the old way, which basically deploys a Flink standalone cluster on Kubernetes. We use scripts to run the Flink Master and TaskManager processes inside Kubernetes containers. In this mode, Flink is not aware of whether it is running in containers or directly on physical machines, and it does not interact with the Kubernetes Master. The Flink Master reactively accepts all registered TaskManagers, whose number is determined by the Kubernetes replica count.

"Running Flink natively on Kubernetes" refers to deploying Flink as a Kubernetes Job. The Flink Master interacts with the Kubernetes Master and actively requests pods/containers, as it does on Yarn/Mesos.

Thank you~
Xintong Song

Pankaj Chand

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Hi Xintong,

Thank you for the explanation!

If I run Flink "natively" on Kubernetes, will I also be able to run Spark on the same Kubernetes cluster, or will the Kubernetes cluster be reserved for Flink only?

Thank you!

Pankaj

On Mon, Mar 16, 2020 at 5:41 AM Xintong Song <[hidden email]> wrote:

Forgot to mention that "running Flink natively on Kubernetes" is newly introduced and is only available for Flink 1.10 and above.

Thank you~
Xintong Song

Yang Wang

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Hi Pankaj,

Just as Xintong said, the biggest difference between Flink on Kubernetes and the native integration is dynamic resource allocation: the latter has an embedded K8s client and communicates with the K8s API server directly to allocate and release JM/TM pods.

With either way of running Flink on K8s, you do not need to reserve the whole cluster for Flink. Flink can run alongside other workloads (e.g. Spark, TensorFlow, etc.). The K8s cluster can guarantee the isolation.
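
As a rough illustration of sharing one cluster, here is a toy first-fit placement in Python. This is not the real Kubernetes scheduler, and the node names, pod names, and CPU numbers are made up for the example. The only point is that pods from different engines can be packed onto the same nodes according to the resources each one requests.

```python
def place(pods, nodes):
    """Toy first-fit placement by CPU request.

    pods:  list of (pod_name, cpu_request) tuples
    nodes: dict of node_name -> CPU capacity
    Returns a dict of pod_name -> node_name (None if nothing fits).
    """
    free = dict(nodes)  # remaining CPU per node
    placement = {}
    for name, cpu in pods:
        for node, available in free.items():
            if cpu <= available:
                free[node] = available - cpu
                placement[name] = node
                break
        else:
            # No node has enough free CPU; the pod is left unplaced.
            placement[name] = None
    return placement


# Hypothetical two-node cluster shared by Flink and Spark pods.
nodes = {"node-a": 4, "node-b": 4}
pods = [
    ("flink-jobmanager", 1),
    ("flink-taskmanager-1", 2),
    ("spark-driver", 1),
    ("spark-executor-1", 2),
    ("flink-taskmanager-2", 2),
]

for pod, node in place(pods, nodes).items():
    print(pod, "->", node)
```

In this sketch the Flink and Spark pods end up side by side on both nodes, and neither engine owns the cluster.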

Best,

Yang

On Mon, Mar 16, 2020 at 5:51 PM Pankaj Chand <[hidden email]> wrote:

Hi Xintong,

Thank you for the explanation!

If I run Flink "natively" on Kubernetes, will I also be able to run Spark on the same Kubernetes cluster, or will the Kubernetes cluster be reserved for Flink only?

Thank you!

Pankaj

Pankaj Chand

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Thank you, Yang and Xintong!

Best,

Pankaj

On Mon, Mar 16, 2020, 9:27 PM Yang Wang <[hidden email]> wrote:

Hi Pankaj,

Just as Xintong said, the biggest difference between Flink on Kubernetes and the native integration is dynamic resource allocation: the latter has an embedded K8s client and communicates with the K8s API server directly to allocate and release JM/TM pods.

With either way of running Flink on K8s, you do not need to reserve the whole cluster for Flink. Flink can run alongside other workloads (e.g. Spark, TensorFlow, etc.). The K8s cluster can guarantee the isolation.

Best,
Yang
