k8s job cluster using StatefulSet

k8s job cluster using StatefulSet

Alexey Trenikhun
Hello,
The Flink documentation suggests using Deployments to deploy the JM and TM for a Kubernetes job cluster. Are there any known potential issues with using StatefulSets instead? It seems a StatefulSet guarantees a unique JM during upgrade/rollback, while with a Deployment there could be multiple JM pods at once (e.g. 1 terminating and 1 running).

Thanks,
Alexey

Re: k8s job cluster using StatefulSet

Arvid Heise-3
Hi Alexey,

I don't immediately see any issue with using StatefulSets.

That said, if you are starting a new setup, I'd recommend using one of the K8s operators or Ververica's community edition [1], as they may solve further issues that you might run into in the future.


On Mon, Aug 10, 2020 at 11:22 PM Alexey Trenikhun <[hidden email]> wrote:
Hello,
The Flink documentation suggests using Deployments to deploy the JM and TM for a Kubernetes job cluster. Are there any known potential issues with using StatefulSets instead? It seems a StatefulSet guarantees a unique JM during upgrade/rollback, while with a Deployment there could be multiple JM pods at once (e.g. 1 terminating and 1 running).

Thanks,
Alexey


--

Arvid Heise | Senior Java Developer


Follow us @VervericaData

--

Join Flink Forward - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--

Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng   

Re: k8s job cluster using StatefulSet

Yang Wang
Hi Alexey,

Actually, StatefulSets could also be used to start the JobManager and TaskManager.

So why does the Flink documentation suggest using Deployments?
* StatefulSets require persistent volumes to be available in the K8s cluster. That is not always the case,
  especially for unmanaged (self-built) K8s clusters.
* Flink uses ZooKeeper and distributed storage (S3, GFS, etc.) for fault tolerance. If you start multiple
  JobManagers, leader election and leader retrieval are done via ZooKeeper, and the metadata is also
  stored in ZooKeeper. So a StatefulSet is not needed to provide anything more.
* The local data of the JobManager and TaskManager is ephemeral. It can be discarded after a crash.
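
As a rough illustration of the ZooKeeper-based setup described in the list above, the sketch below renders the relevant `flink-conf.yaml` entries. The option keys are Flink's documented high-availability settings; the function name and the quorum, bucket, and cluster-id values are hypothetical placeholders, not taken from this thread.

```python
def flink_zookeeper_ha_conf(quorum, storage_dir, cluster_id):
    """Render flink-conf.yaml lines for ZooKeeper-based JobManager HA.

    All argument values are supplied by the caller; nothing here is
    specific to the setup discussed in the thread.
    """
    entries = {
        # Leader election and leader retrieval go through ZooKeeper.
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": quorum,
        # JobManager metadata is written to distributed storage;
        # ZooKeeper keeps pointers to it.
        "high-availability.storageDir": storage_dir,
        "high-availability.cluster-id": cluster_id,
    }
    return "\n".join(f"{key}: {value}" for key, value in entries.items())


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(flink_zookeeper_ha_conf(
        quorum="zk-0.zk:2181,zk-1.zk:2181,zk-2.zk:2181",
        storage_dir="s3://my-bucket/flink/ha",
        cluster_id="/my-job-cluster",
    ))
```

With this configuration the JobManager pod itself keeps no state that must survive a restart, which is why a plain Deployment is sufficient in this setup.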


Best,
Yang




Arvid Heise <[hidden email]> wrote on Thu, Aug 13, 2020 at 4:38 PM:
Hi Alexey,

I don't immediately see any issue with using StatefulSets.

That said, if you are starting a new setup, I'd recommend using one of the K8s operators or Ververica's community edition [1], as they may solve further issues that you might run into in the future.



Re: k8s job cluster using StatefulSet

Jan Lukavský

Hi Alexey,

I'm using a StatefulSet for the JM exactly as you describe (a Deployment for the TM is just fine). The main advantage is that you don't need distributed storage for JM fault tolerance, because you can use a persistent volume mount (provided your cloud provider offers it as a fault-tolerant volume). You still need ZooKeeper (AFAIK), but removing that requirement might be a feature request, as ZooKeeper is not actually necessary in this case. If this could be incorporated into the mentioned Flink operators, that would be great. :-)
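
A minimal sketch of what such a JM StatefulSet could look like, built as a Python dict and dumped as JSON (which `kubectl apply -f` accepts). The field names follow the Kubernetes `apps/v1` StatefulSet API; the function name, image tag, volume name, mount path, and storage size are hypothetical, and the container command/args for the actual job are left out.

```python
import json


def jobmanager_statefulset(name, image, storage, mount_path):
    """Build a single-replica StatefulSet manifest with a persistent volume.

    Assumes a headless Service called `name` already exists, since a
    StatefulSet refers to one via `serviceName`.
    """
    labels = {"app": name, "component": "jobmanager"}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "serviceName": name,
            # One replica: at most one JM pod with this identity at a time.
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "jobmanager",
                        "image": image,
                        # The volume name must match the claim template below.
                        "volumeMounts": [
                            {"name": "ha-data", "mountPath": mount_path}
                        ],
                    }],
                },
            },
            # One PersistentVolumeClaim is created per pod and is
            # re-attached when the pod is rescheduled.
            "volumeClaimTemplates": [{
                "metadata": {"name": "ha-data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    manifest = jobmanager_statefulset(
        name="flink-jobmanager",
        image="flink:1.11",
        storage="1Gi",
        mount_path="/flink-ha",
    )
    print(json.dumps(manifest, indent=2))
```

The TaskManagers can stay in an ordinary Deployment, since only the JM needs the stable identity and the volume.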

Best,

 Jan

On 8/14/20 5:09 AM, Yang Wang wrote:
Hi Alexey,

Actually, StatefulSets could also be used to start the JobManager and TaskManager.

So why does the Flink documentation suggest using Deployments?
* StatefulSets require persistent volumes to be available in the K8s cluster. That is not always the case,
  especially for unmanaged (self-built) K8s clusters.
* Flink uses ZooKeeper and distributed storage (S3, GFS, etc.) for fault tolerance. If you start multiple
  JobManagers, leader election and leader retrieval are done via ZooKeeper, and the metadata is also
  stored in ZooKeeper. So a StatefulSet is not needed to provide anything more.
* The local data of the JobManager and TaskManager is ephemeral. It can be discarded after a crash.


Best,
Yang





Re: k8s job cluster using StatefulSet

Alexey Trenikhun
In reply to this post by Yang Wang
Thank you Arvid and Yang!



Re: k8s job cluster using StatefulSet

Yang Wang
Hi Jan,

Thanks for your valuable response. Actually, for both the standalone and the native K8s integration,
there are two ways to achieve HA:
* StatefulSet + FilesystemHAService (with leader election/retrieval removed) [1]
* Deployment + NativeK8sHaService (native K8s leader election and a ConfigMap to store metadata) [2]

We already have some tickets for them. Hopefully we can make progress in the next release cycle.



Best,
Yang


Alexey Trenikhun <[hidden email]> wrote on Sat, Aug 15, 2020 at 12:26 PM:
Thank you Arvid and Yang!

