Flink on Kubernetes Vs Flink Natively on Kubernetes

Pankaj Chand
Hi all,

I want to run Flink, Spark and other processing engines on a single Kubernetes cluster.

From the Flink documentation, I could not work out the difference between:
(1) running Flink on Kubernetes, and (2) running Flink natively on Kubernetes.

Could someone please explain the difference between the two, and when to use each option?

Thank you,

Pankaj
Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Xintong Song
Hi Pankaj,

"Running Flink on Kubernetes" refers to the older approach, which deploys a Flink standalone cluster on Kubernetes. Scripts start the Flink Master and TaskManager processes inside Kubernetes containers. In this mode, Flink is not aware of whether it is running in containers or directly on physical machines, and it does not interact with the Kubernetes Master. The Flink Master reactively accepts every TaskManager that registers, and the number of TaskManagers is determined by the Kubernetes replica count.

"Running Flink natively on Kubernetes" refers to deploying Flink as a Kubernetes Job. The Flink Master interacts with the Kubernetes Master and actively requests pods/containers, as it does on YARN/Mesos.

Thank you~

Xintong Song
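
To make the contrast in the reply above concrete, here is a toy Python model of the two modes. It is only a sketch and not Flink code: the class names, method names, and pod names are all hypothetical, and it assumes every TaskManager offers the same number of slots.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandaloneMaster:
    """Reactive mode: accepts whatever TaskManagers register, never asks for more."""
    slots_per_tm: int
    task_managers: List[str] = field(default_factory=list)

    def register(self, tm_name: str) -> None:
        # TaskManagers are started outside Flink (e.g. by the replica count).
        self.task_managers.append(tm_name)

    def total_slots(self) -> int:
        return len(self.task_managers) * self.slots_per_tm

    def can_run(self, parallelism: int) -> bool:
        return parallelism <= self.total_slots()


@dataclass
class NativeMaster(StandaloneMaster):
    """Active mode: requests extra pods itself when a job needs more slots."""
    pods_requested: int = 0

    def ensure_slots(self, parallelism: int) -> int:
        # Work out how many TaskManagers are missing and "request" that many pods.
        needed_tms = math.ceil(parallelism / self.slots_per_tm)
        missing = max(0, needed_tms - len(self.task_managers))
        for _ in range(missing):
            self.pods_requested += 1
            self.register(f"tm-pod-{self.pods_requested}")
        return missing


# Standalone: two TaskManagers exist because the replica count was set to 2.
standalone = StandaloneMaster(slots_per_tm=2)
for i in range(2):
    standalone.register(f"tm-{i}")

# Native: the master starts with no TaskManagers and asks for what the job needs.
native = NativeMaster(slots_per_tm=2)
newly_requested = native.ensure_slots(parallelism=6)
```

In this model the standalone master can only report whether the existing slots are enough, while the native master closes the gap by requesting pods.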



Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Xintong Song
Forgot to mention that "running Flink natively on Kubernetes" is newly introduced and is only available for Flink 1.10 and above.


Thank you~

Xintong Song



Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Pankaj Chand
Hi Xintong,

Thank you for the explanation!

If I run Flink "natively" on Kubernetes, will I still be able to run Spark on the same Kubernetes cluster, or will the cluster be reserved for Flink only?

Thank you!

Pankaj

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Yang Wang
Hi Pankaj,

As Xintong said, the biggest difference between Flink on Kubernetes and the native
integration is dynamic resource allocation: the latter has an embedded K8s
client and communicates with the K8s API server directly to allocate and release JM/TM
pods.

With either way of running Flink on K8s, you do not need to reserve the whole
cluster for Flink. Flink can run alongside other workloads (e.g. Spark, TensorFlow),
and the K8s cluster can guarantee the isolation.


Best,
Yang

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

Pankaj Chand
Thank you, Yang and Xintong!

Best,

Pankaj
