Hello,

When HA is enabled in the Flink cluster and I have to submit a job via the Flink CLI, then the flink-conf.yaml of the Flink CLI should contain these properties:

high-availability: zookeeper
high-availability.cluster-id: flink
high-availability.zookeeper.path.root: flink
high-availability.storageDir: <some path>
high-availability.zookeeper.quorum: <zookeeper IP:port>
---------- Forwarded message ----------
From: Sampath Bhat <[hidden email]>
Date: Fri, Jul 13, 2018 at 3:18 PM
Subject: Flink CLI properties with HA
To: user <[hidden email]>
Hi Sampath,

The Flink CLI needs to retrieve the JobManager leader address, so it needs access to the HA-specific configuration. When HA is implemented on top of ZooKeeper, the leader address is fetched from ZooKeeper.

The config item high-availability.storageDir is mainly used for storage (job graphs, checkpoints and so on). The actual data is stored under this path and is used for recovery; ZooKeeper only stores a state handle.

---
Thanks.
vino.

2018-07-16 15:28 GMT+08:00 Sampath Bhat <[hidden email]>:
Hi vino,

Should the Flink CLI have access to the path mentioned in high-availability. If my Flink cluster is on a set of machines and I submit my job from the Flink CLI on another, independent machine by giving the necessary details, will the CLI try to access high-availability. I'm aware that the Flink client will connect to ZooKeeper to get the leader address and the information necessary for job submission, but my confusion is with high-availability.

On Mon, Jul 16, 2018 at 2:44 PM, vino yang <[hidden email]> wrote:
Hi Sampath,

It seems that the Flink CLI, for a standalone cluster, does not access high-availability.storageDir. What is the exception stack trace in your environment?

Thanks,
vino.

2018-07-17 15:08 GMT+08:00 Sampath Bhat <[hidden email]>:
Hi Sampath,

technically the client does not need to know the `high-availability.storageDir` to submit a job. However, due to how we construct the ZooKeeperHaServices it is still needed. The reason behind this is that we use the same services for the server and the client. Thus, the implementation needs to know the storageDir in both cases.

The way it should be done is to split the HighAvailabilityServices up into client and server services. The former would then not depend on `high-availability.storageDir`.

Cheers,
Till

On Tue, Jul 17, 2018 at 1:31 PM vino yang <[hidden email]> wrote:
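To make the split Till describes more concrete, here is a minimal, language-neutral sketch in Python. It is not Flink code: `ClientHaServices`, `ServerHaServices`, `create_client_services` and `create_server_services` are hypothetical names invented for illustration, and the only things taken from the thread are the configuration keys. The point it tries to show is that a client-side factory can validate only the ZooKeeper-related options, while the server-side factory additionally requires `high-availability.storageDir`.

```python
from dataclasses import dataclass
from typing import Mapping


# Hypothetical model of the proposed split; none of these names are Flink APIs.
@dataclass(frozen=True)
class ClientHaServices:
    """What a submitting client needs: enough to look up the leader."""
    quorum: str
    root_path: str
    cluster_id: str


@dataclass(frozen=True)
class ServerHaServices:
    """What the cluster needs: leader lookup plus a recovery storage path."""
    client: ClientHaServices
    storage_dir: str


def _require(conf: Mapping[str, str], key: str) -> str:
    """Return the value for key, or fail loudly if it is missing or empty."""
    value = conf.get(key)
    if not value:
        raise ValueError(f"missing required option: {key}")
    return value


def create_client_services(conf: Mapping[str, str]) -> ClientHaServices:
    # Only ZooKeeper-related options are read; storageDir is never touched.
    return ClientHaServices(
        quorum=_require(conf, "high-availability.zookeeper.quorum"),
        root_path=_require(conf, "high-availability.zookeeper.path.root"),
        cluster_id=_require(conf, "high-availability.cluster-id"),
    )


def create_server_services(conf: Mapping[str, str]) -> ServerHaServices:
    # The server side additionally depends on the storage directory.
    return ServerHaServices(
        client=create_client_services(conf),
        storage_dir=_require(conf, "high-availability.storageDir"),
    )


if __name__ == "__main__":
    # Placeholder values; a CLI-side config without storageDir.
    cli_conf = {
        "high-availability.zookeeper.quorum": "zk-host:2181",
        "high-availability.zookeeper.path.root": "flink",
        "high-availability.cluster-id": "flink",
    }
    print(create_client_services(cli_conf))
    try:
        create_server_services(cli_conf)
    except ValueError as err:
        print(f"server-side construction fails as expected: {err}")
```

With a shared factory, as described above for ZooKeeperHaServices, the client would effectively go through the server-side path and therefore need the `storageDir` option even though it never reads from that location.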
Vino,

I'm not getting any error, but my suspicion was that if I don't specify the `high-availability.storageDir` property on the Flink CLI side, then the CLI will not be able to submit a job to the (HA-enabled) Flink cluster. But if I provide this property on the CLI side, the job submission will be successful even though the CLI cannot access the path mentioned in `high-availability.storageDir`. So I wanted to understand the underlying implementation.

Till,

Thank you for the reply. It clarified my doubt.

On Tue, Jul 17, 2018 at 6:03 PM, Till Rohrmann <[hidden email]> wrote: