Hi all,

As the subject states, I am proposing to temporarily remove support for changing the parallelism of a job via the following syntax [1]:

    ./bin/flink modify [job-id] -p [new-parallelism]

This is an experimental feature that we introduced with the first rollout of FLIP-6 (Flink 1.5). However, this feature comes with a few caveats:

    * Rescaling does not work with HA enabled [2]
    * New parallelism is not persisted, i.e., after a JobManager restart, the job will be recovered with the initial parallelism

Due to the above-mentioned issues, I believe that currently nobody uses "modify -p" to rescale their jobs in production. Moreover, the rescaling feature stands in the way of our current efforts to rework Flink's scheduling [3]. I therefore propose to remove the rescaling code for the time being. Note that it will still be possible to change the parallelism by taking a savepoint and restoring the job with a different parallelism [4].

Any comments and suggestions will be highly appreciated.

Best,
Gary

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/cli.html
[2] https://issues.apache.org/jira/browse/FLINK-8902
[3] https://issues.apache.org/jira/browse/FLINK-10429
[4] https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/state/savepoints.html#what-happens-when-i-change-the-parallelism-of-my-program-when-restoring
|
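For readers following the thread, the savepoint-based alternative mentioned in the proposal can be sketched roughly as below. This is only an illustration, not part of the proposal: the helper name `print_rescale_commands` and all argument values are hypothetical placeholders, and the script merely prints the two CLI invocations (`cancel -s` and `run -s ... -p ...`) so they can be reviewed before being run by hand. It also assumes the new parallelism does not exceed the job's configured maximum parallelism.

```shell
#!/bin/sh
# Sketch only: every argument is a placeholder supplied by the caller.
# The function prints the commands instead of executing them.
print_rescale_commands() {
  job_id="$1"            # ID of the running job
  savepoint_dir="$2"     # target directory for the savepoint
  savepoint_path="$3"    # concrete savepoint path reported by step 1
  new_parallelism="$4"   # desired parallelism after the restart
  job_jar="$5"           # jar of the job to resubmit

  # Step 1: cancel the job and take a savepoint in one go.
  echo "./bin/flink cancel -s $savepoint_dir $job_id"
  # Step 2: resubmit from that savepoint with the new parallelism.
  echo "./bin/flink run -s $savepoint_path -p $new_parallelism $job_jar"
}

# Example call with made-up values.
print_rescale_commands my-job-id /tmp/savepoints /tmp/savepoints/savepoint-example 8 my-job.jar
```

Unlike "modify -p", the parallelism chosen this way belongs to a fresh job submission, so it does not depend on the rescaling code under discussion.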
Sounds reasonable to me. If it is a broken feature, then there is not much value in it.

On Tue, Apr 23, 2019 at 7:50 PM Gary Yao <[hidden email]> wrote:
> Hi all,
|
Hi Gary,

+1 to removing it for now. In fact, some users are not aware that it is still experimental, and they ask quite a lot about the problems it causes.

Best,
Paul Lam
|
+1 for temporarily removing support for the modify command.

Eventually, we have to add it again in order to support auto scaling. The next time we add it, we should address the known limitations.

Cheers,
Till

On Wed, Apr 24, 2019 at 9:06 AM Paul Lam <[hidden email]> wrote:
|
The idea is to also remove the rescaling code in the JobMaster. This will make it easier to remove the ExecutionGraph reference from the JobMaster which is needed for the scheduling rework [1].

[1] https://issues.apache.org/jira/browse/FLINK-12231

On Wed, Apr 24, 2019 at 12:14 PM Shuai Xu <[hidden email]> wrote:
> Will we only remove command support in client side or the code in job
|
Since there were no objections so far, I will proceed with removing the code [1].

[1] https://issues.apache.org/jira/browse/FLINK-12312

On Wed, Apr 24, 2019 at 1:38 PM Gary Yao <[hidden email]> wrote:
|