If I clean the ZooKeeper data, it runs fine. But the next time the jobmanager fails and is redeployed, the error occurs again.

------------------ Original message ------------------
From: "Vijay Bhaskar" <[hidden email]>
Sent: Thursday, November 28, 2019, 3:05 PM
To: "曾祥才" <[hidden email]>
Subject: Re: JobGraphs not cleaned up in HA mode

Again it could not find the state store file: "Caused by: java.io.FileNotFoundException: /flink/ha/submittedJobGraph0c6bcff01199". Check why it is unable to find it. The better approach is: clean up the ZooKeeper state, check your configuration, correct it, and restart the cluster. Otherwise it always picks up the corrupted state from ZooKeeper and will never restart.

Regards
Bhaskar

On Thu, Nov 28, 2019 at 11:51 AM 曾祥才 <[hidden email]> wrote:
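A minimal sketch of the cleanup step Bhaskar suggests, assuming the paths from the configuration posted in this thread (high-availability.zookeeper.path.root: /flink/risk-insight, high-availability.cluster-id: cluster-test) and a ZooKeeper 3.5+ CLI (older CLIs use `rmr` instead of `deleteall`); adjust host and paths to your setup:

```shell
# Assumed paths, taken from the configuration posted in this thread:
ZK_ROOT=/flink/risk-insight      # high-availability.zookeeper.path.root
CLUSTER_ID=cluster-test          # high-availability.cluster-id
HA_PATH="${ZK_ROOT}/${CLUSTER_ID}"

# With the cluster stopped, recursively delete the stale HA znodes
# (uncomment to actually run against your ZooKeeper quorum):
#   zkCli.sh -server <zk-host>:2181 deleteall "$HA_PATH"
echo "would delete znode subtree: $HA_PATH"
```

Run this only with the cluster fully stopped; deleting the HA subtree while a jobmanager is live would discard state it is still using.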
Can you share the Flink configuration once?

Regards
Bhaskar

On Thu, Nov 28, 2019 at 12:09 PM 曾祥才 <[hidden email]> wrote:
The config (/flink is the NAS directory):

jobmanager.rpc.address: flink-jobmanager
taskmanager.numberOfTaskSlots: 16
web.upload.dir: /flink/webUpload
blob.server.port: 6124
jobmanager.rpc.port: 6123
taskmanager.rpc.port: 6122
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
high-availability: zookeeper
high-availability.cluster-id: /cluster-test
high-availability.storageDir: /flink/ha
high-availability.zookeeper.quorum: ****:2181
high-availability.jobmanager.port: 6123
high-availability.zookeeper.path.root: /flink/risk-insight
high-availability.zookeeper.path.checkpoints: /flink/zk-checkpoints
state.backend: filesystem
state.checkpoints.dir: file:///flink/checkpoints
state.savepoints.dir: file:///flink/savepoints
state.checkpoints.num-retained: 2
jobmanager.execution.failover-strategy: region
jobmanager.archive.fs.dir: file:///flink/archive/history
Anyone have the same problem? Please help, thanks.
Hi,

Why do you not use HDFS directly?

Best,
Vino

On Thu, Nov 28, 2019 at 6:48 PM, 曾祥才 <[hidden email]> wrote:
Hi, is there any difference (for me, using NAS is more convenient to test with currently)? From the docs, it seems HDFS, S3, NFS, etc. will all be fine.
One more thing: you configured

high-availability.cluster-id: /cluster-test

It should be:

high-availability.cluster-id: cluster-test

I don't think this is a major issue, but in case it helps, you can check. Can you check one more thing: is checkpointing happening or not? Were you able to see the chk-* folders under the checkpoint directory?

Regards
Bhaskar

On Thu, Nov 28, 2019 at 5:00 PM 曾祥才 <[hidden email]> wrote:
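A self-contained sketch of the check Bhaskar describes. It uses a temporary mock layout so it can run anywhere; in practice point CHECKPOINT_DIR at the state.checkpoints.dir from this thread's configuration (file:///flink/checkpoints). The "exampleJobId" directory name is a placeholder, not a real job ID.

```shell
# Mock setup: in practice set CHECKPOINT_DIR=/flink/checkpoints instead.
CHECKPOINT_DIR=$(mktemp -d)

# Flink writes one subdirectory per job ID, containing chk-<n> folders;
# "exampleJobId" and "chk-42" here are placeholders for the mock.
mkdir -p "$CHECKPOINT_DIR/exampleJobId/chk-42"

# The actual check: are any chk-* folders present under any job directory?
found=$(ls -d "$CHECKPOINT_DIR"/*/chk-* 2>/dev/null | wc -l)
echo "chk-* folders found: $found"
```

If the count is zero on the real directory, checkpointing is not completing, which is worth ruling out before debugging the HA recovery path.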
The chk-* directory is not found. I think it is missing because the jobmanager removes it automatically, but why is the job graph still in ZooKeeper?
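One way to see why a job graph "stays in ZooKeeper": Flink keeps a znode per submitted job graph that points at a serialized file under high-availability.storageDir, and if the file is gone from the NAS while the znode survives, recovery fails with the FileNotFoundException seen earlier in this thread. A sketch of inspecting both sides, assuming this thread's paths and Flink's default jobgraphs sub-path (high-availability.zookeeper.path.jobgraphs); adjust to your setup:

```shell
# Assumed paths from this thread's configuration; "jobgraphs" is the
# default value of high-availability.zookeeper.path.jobgraphs.
ZK_JOBGRAPHS=/flink/risk-insight/cluster-test/jobgraphs
HA_STORAGE=/flink/ha

# Inspect both sides (uncomment to run against your cluster):
#   zkCli.sh -server <zk-host>:2181 ls "$ZK_JOBGRAPHS"   # znodes Flink still tracks
#   ls "$HA_STORAGE"/submittedJobGraph*                  # files actually on the NAS
echo "compare znodes under $ZK_JOBGRAPHS with files in $HA_STORAGE"
```

A znode with no matching submittedJobGraph* file on the shared storage is exactly the corrupted state Bhaskar suggested cleaning up.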