Re: Running Kubernetes on Flink with Savepoint

Posted by rmetzger0 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Running-Kubernetes-on-Flink-with-Savepoint-tp35849p35962.html

Hi Matt,

Sorry for the late reply. Why are you using the "flink-docker" Helm example instead of the setups described at https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/kubernetes.html or https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html ?
I don't think the Helm charts you mention are actively maintained or recommended for production use.

To create a savepoint in Flink, you need to trigger it via the JobManager's REST API (regardless of how the cluster is deployed). I expect you'll have to build some tooling that triggers a savepoint before the job is shut down or upgraded.
See also: https://ci.apache.org/projects/flink/flink-docs-master/monitoring/rest_api.html#jobs-jobid-savepoints
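
As a rough illustration of what such tooling could look like, here is a minimal Python sketch that triggers a savepoint through the REST endpoint linked above and polls until it completes. The function names, the polling interval and the timeout are made up for this example. The request and response fields ("target-directory", "cancel-job", "request-id", "status", "operation") follow my reading of the REST API docs, so please check them against the Flink version you run.

```python
import json
import time
import urllib.request


def savepoint_trigger_request(base_url, job_id, target_dir, cancel_job=False):
    """Build the POST request that asks the JobManager to start a savepoint."""
    body = json.dumps({"target-directory": target_dir, "cancel-job": cancel_job})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/jobs/{job_id}/savepoints",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def savepoint_location(status_response):
    """Return the savepoint path once completed, or None while still in progress.

    Raises RuntimeError if the completed operation reports a failure cause.
    """
    if status_response["status"]["id"] != "COMPLETED":
        return None
    operation = status_response.get("operation", {})
    if "failure-cause" in operation:
        raise RuntimeError(f"savepoint failed: {operation['failure-cause']}")
    return operation["location"]


def take_savepoint(base_url, job_id, target_dir,
                   poll_seconds=2.0, timeout_seconds=300.0):
    """Trigger a savepoint and block until the JobManager reports its location."""
    request = savepoint_trigger_request(base_url, job_id, target_dir)
    with urllib.request.urlopen(request) as response:
        trigger_id = json.load(response)["request-id"]

    status_url = f"{base_url.rstrip('/')}/jobs/{job_id}/savepoints/{trigger_id}"
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        with urllib.request.urlopen(status_url) as response:
            location = savepoint_location(json.load(response))
        if location is not None:
            return location
        time.sleep(poll_seconds)
    raise TimeoutError("savepoint did not complete in time")
```

An upgrade script could call take_savepoint("http://<jobmanager-host>:8081", "<job-id>", "<savepoint-directory>") right before tearing down the old deployment, and then hand the returned path to whatever starts the new job. The host, job ID and directory are placeholders for your own values.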

Best,
Robert



On Wed, Jun 10, 2020 at 2:48 PM Matt Magsombol <[hidden email]> wrote:
We're currently using this template: https://github.com/docker-flink/examples/tree/master/helm/flink to run Flink on Kubernetes as a job-specific cluster (with the small tweak of specifying our job class as the cluster's main class).


How would I go about setting up savepoints, so that we can modify our currently running jobs (for example, to add pipes to the Flink job) without losing our state? The reason is that our state has a 1 day TTL, and deploying updated code without the state means rebuilding it from scratch.

From the documentation, I see that I'd need to run some sort of command. That is hard to do consistently if we're using the Helm charts from the link above.

I found this email thread discussing a particular problem with savepoints + Kubernetes, but it doesn't quite explain how to set this up with Helm: https://lists.apache.org/thread.html/4299518f4da2810aa88fe6b21f841880b619f3f8ac264084a318c034%40%3Cuser.flink.apache.org%3E


In that thread, hasun@zendesk mentions that "We always make a savepoint before we shutdown the job-cluster. So the savepoint is always the latest. When we fix a bug or change the job graph, it can resume well."

This is exactly the use case I'm trying to address. Other than specifying configs, are there any additional parameters that I'd need to add in Helm so that the job restores from the latest savepoint on startup?