Hi all, I'm reading Flink's documentation and am curious about the fault tolerance of batch jobs. It seems that when one task execution fails, the whole job is restarted. Is that true? If so, isn't it impractical to deploy large Flink batch jobs?
-- Liu, Renjie Software Engineer, MVAD
Hi, yes, this is indeed true. We had some plans for resolving this, but they never materialised because of the focus on stream processing. We might unite the two in the future, and then you would get fault-tolerant batch and stream processing in the same API.
Best, Aljoscha
On Wed, 15 Feb 2017 at 09:28 Renjie Liu <[hidden email]> wrote:
Hi Aljoscha,
Could you share your plans for resolving it?
Best, Anton
From: Aljoscha Krettek [mailto:[hidden email]]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures
This FLIP may help.
On Thu, Feb 16, 2017 at 7:34 PM Anton Solovev <[hidden email]> wrote:
-- Liu, Renjie Software Engineer, MVAD
Hi, this is the reason I gave up using Flink for my current project and went back to the traditional Hadoop framework.
2017-02-17 10:56 GMT+08:00 Renjie Liu <[hidden email]>:
Best regards
Sili Liu
Yes, it is really a critical problem for large batch jobs, because unexpected failures are common. We are already working on implementing the ideas described in FLIP-1 and hope to contribute them to Flink in the coming months.
Best, Zhijiang
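To put a rough number on why this matters for large jobs, here is a back-of-the-envelope sketch. It is a toy model, not Flink code: it assumes every task fails independently with the same probability, and it idealises fine-grained recovery as rerunning only the failed task (FLIP-1 itself may restart more than that). The function names and parameters are made up for illustration.

```python
def expected_job_attempts(num_tasks: int, p_fail: float) -> float:
    """Expected number of whole-job attempts if any task failure restarts the job.

    Assumption: each task fails independently with probability p_fail,
    so one attempt succeeds with probability (1 - p_fail) ** num_tasks.
    """
    p_job_ok = (1.0 - p_fail) ** num_tasks
    return 1.0 / p_job_ok


def expected_task_attempts(p_fail: float) -> float:
    """Expected attempts per task if only the failed task is rerun (idealised)."""
    return 1.0 / (1.0 - p_fail)


if __name__ == "__main__":
    for n in (10, 1_000, 10_000):
        whole = expected_job_attempts(n, 0.001)
        single = expected_task_attempts(0.001)
        print(f"tasks={n}: whole-job attempts={whole:.2f}, per-task attempts={single:.4f}")
```

Under these assumptions the expected number of whole-job attempts grows exponentially with the number of tasks, while the per-task retry cost stays constant.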
@Anton, these are the ideas I was mentioning (in the FLIP), and I'm afraid I have nothing more to add.
On Fri, 17 Feb 2017 at 06:26 wangzhijiang999 <[hidden email]> wrote: