Hi,

I was going through the Flink SQL client code and came across a flow where flink-conf.yaml is loaded into the Configuration object as a prerequisite for the SQL client to start. I can see that the configuration file has properties pertaining to the Flink cluster. As far as I understand, the SQL client only requires the JobManager host and port to connect, which this configuration file has. The configuration file also has other properties, which confuse me a bit. They are as below:

# The heap size for the JobManager JVM
jobmanager.heap.size: 1024m

# The heap size for the TaskManager JVM
taskmanager.heap.size: 1024m

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 1

# The parallelism used for programs that did not specify and other parallelism.
parallelism.default: 1

Are the above properties even required by the SQL client? As far as I understand, the cluster is already up and running, so it will presumably already have allocated resources. Or is this configuration specific to the current session, with the JobManager allocating resources for the current request based on this config file?

Can I get some help understanding this part? I am trying to extend the SQL client to create an API-based client for my platform requirement. Awaiting your response.

Regards,
Dipanjan
Hi,

I was reading through the Flink docs, and my understanding is that each application will have its own instance of the JobManager and TaskManagers, so every application needs an initial configuration that defines the application topology to be drawn in the Flink cluster. In other words, every application will have a separate flink-conf.yaml that specifically defines the Flink topology for that application. Am I correct in my understanding? Please kindly confirm. Waiting for your response.

Regards,
Dipanjan
On Saturday, September 14, 2019, 02:03:32 PM GMT+5:30, Dipanjan Mazumder <[hidden email]> wrote:
Hi Dipanjan,

not every configuration option in flink-conf.yaml is relevant for the SQL client. If you submit to an already existing cluster, then the client only needs to know the address and the port of the cluster or, if the cluster is using high availability, where ZooKeeper is running. However, in the general case, the Flink SQL client can also deploy a new per-job mode cluster just for your job. In order to do this, it needs to know cluster-specific configuration such as the memory or the number of slots.

The flink-conf.yaml does not contain any information about the executed topology. This information is contained in the JobGraph, which is submitted by the client to a cluster.

Cheers,
Till

On Sun, Sep 15, 2019 at 9:37 AM Dipanjan Mazumder <[hidden email]> wrote:
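To illustrate the distinction in Till's reply, here is a minimal Java sketch that reads a flat "key: value" file in the style of flink-conf.yaml and separates the options a client would need to reach an existing cluster from the cluster-sizing options that only matter when a new per-job cluster is deployed. It does not use Flink's own Configuration classes. The class name FlinkConfSketch, the helper methods, and the exact grouping of keys in CONNECTION_KEYS are assumptions made for this example; the key names themselves are standard Flink options, but which ones a given setup actually needs depends on the deployment.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Flink.
class FlinkConfSketch {

    // Assumed grouping: options a client needs to find an existing cluster.
    static final List<String> CONNECTION_KEYS = Arrays.asList(
            "jobmanager.rpc.address", "jobmanager.rpc.port",
            "rest.address", "rest.port",
            "high-availability", "high-availability.zookeeper.quorum");

    // Parses flat "key: value" lines, ignoring comments and blank lines.
    static Map<String, String> parseFlatYaml(String text) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String rawLine : text.split("\n")) {
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int sep = line.indexOf(": ");
            if (sep <= 0) {
                continue; // not a "key: value" entry
            }
            conf.put(line.substring(0, sep).trim(), line.substring(sep + 2).trim());
        }
        return conf;
    }

    // Keeps only the entries listed in CONNECTION_KEYS.
    static Map<String, String> connectionOnly(Map<String, String> conf) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String key : CONNECTION_KEYS) {
            if (conf.containsKey(key)) {
                out.put(key, conf.get(key));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "jobmanager.rpc.address: localhost",
                "jobmanager.rpc.port: 6123",
                "# The heap size for the JobManager JVM",
                "jobmanager.heap.size: 1024m",
                "taskmanager.heap.size: 1024m",
                "taskmanager.numberOfTaskSlots: 1",
                "parallelism.default: 1");
        Map<String, String> conf = parseFlatYaml(sample);
        System.out.println("All options:       " + conf);
        System.out.println("Needed to connect: " + connectionOnly(conf));
    }
}
```

With the sample above, only the two jobmanager.rpc.* entries end up in the connection subset. The heap, slot, and parallelism entries are the kind of settings that become relevant when the client deploys a per-job cluster, and none of them describe the job topology, which travels in the JobGraph.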