Unable to run flink job in dataproc cluster with jobmanager provided


Sourabh Mehta
Hi Team,

I'm exploring Flink for one of my use cases and am facing some issues while running a Flink job in cluster mode. Below are the steps I followed to set up and run the job in cluster mode:

2. After setting up the cluster, I could see that the Flink session had started, and I could open its UI.

3. Submitted the job from the Dataproc master node using the command below:

sudo HADOOP_CONF_DIR=/etc/hadoop/conf /usr/lib/flink/bin/flink run -m yarn-cluster -yid application_1592311654771_0001 -class com.sm.flink.FlinkDriver /usr/lib/flink/lib/flink-1.0.10-sm-SNAPSHOT.jar hdfs://cluster-flink-poc-m:8020/user/flink/rocksdb/

After running the command, I see that the job started successfully, but it created a local mini cluster and ran in local mode. I don't see any jobs submitted to the JobManager, and I also see 0 task managers in the UI.

Can someone please help me understand what is happening here? Do let me know what input is required to investigate this.


Chesnay Schepler
Are you by any chance creating a local environment via (Stream)ExecutionEnvironment#createLocalEnvironment?

In reply to Sourabh Mehta's message of 17/06/2020 17:05.


Sourabh Mehta
No, I am not.

In reply to Chesnay Schepler's message of Wed, 17 Jun 2020 at 10:48 PM.


Till Rohrmann
Hi Sourabh,

Do you have access to the cluster logs? They could be helpful for debugging the problem. Which version of Flink are you using?

Cheers,
Till

In reply to Sourabh Mehta's message of Wed, Jun 17, 2020 at 7:39 PM.


Sourabh Mehta
Hi,
The application is using Flink 1.10.0, but the cluster is set up on 1.9.0.

Yes, I do have access. Please find the startup logs from the cluster below:


2020-06-17 11:28:18,989 INFO  org.apache.shaded.flink.table.module.ModuleManager            - Got FunctionDefinition equals from module core
2020-06-17 11:28:20,538 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,538 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.rpc.port, 6123
2020-06-17 11:28:20,538 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-17 11:28:20,538 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: parallelism.default, 1
2020-06-17 11:28:20,539 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-06-17 11:28:20,539 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.rpc.address, cluster-flink-poc-m
2020-06-17 11:28:20,539 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.heap.mb, 12288
2020-06-17 11:28:20,539 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.heap.mb, 12288
2020-06-17 11:28:20,540 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.numberOfTaskSlots, 4
2020-06-17 11:28:20,540 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: parallelism.default, 28
2020-06-17 11:28:20,540 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.network.numberOfBuffers, 2048
2020-06-17 11:28:20,540 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf
2020-06-17 11:28:20,550 INFO  org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  - The configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) required for local execution is not set, setting it to its default value 1.7976931348623157E308
2020-06-17 11:28:20,552 INFO  org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  - The configuration option Key: 'taskmanager.memory.task.heap.size' , default: null (fallback keys: []) required for local execution is not set, setting it to its default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO  org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  - The configuration option Key: 'taskmanager.memory.task.off-heap.size' , default: 0 bytes (fallback keys: []) required for local execution is not set, setting it to its default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO  org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  - The configuration option Key: 'taskmanager.memory.network.min' , default: 64 mb (fallback keys: [{key=taskmanager.network.memory.min, isDeprecated=true}]) required for local execution is not set, setting it to its default value 64 mb
2020-06-17 11:28:20,553 INFO  org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  - The configuration option Key: 'taskmanager.memory.network.max' , default: 1 gb (fallback keys: [{key=taskmanager.network.memory.max, isDeprecated=true}]) required for local execution is not set, setting it to its default value 64 mb
2020-06-17 11:28:20,553 INFO  org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  - The configuration option Key: 'taskmanager.memory.managed.size' , default: null (fallback keys: [{key=taskmanager.memory.size, isDeprecated=true}]) required for local execution is not set, setting it to its default value 128 mb
2020-06-17 11:28:20,558 INFO  org.apache.shaded.flink.runtime.minicluster.MiniCluster       - Starting Flink Mini Cluster
2020-06-17 11:28:20,561 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,561 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.rpc.port, 6123
2020-06-17 11:28:20,561 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.heap.size, 1024m
2020-06-17 11:28:20,561 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.heap.size, 1024m
2020-06-17 11:28:20,561 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-17 11:28:20,561 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: parallelism.default, 1
2020-06-17 11:28:20,561 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-06-17 11:28:20,562 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.rpc.address, cluster-flink-poc-m
2020-06-17 11:28:20,562 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: jobmanager.heap.mb, 12288
2020-06-17 11:28:20,562 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.heap.mb, 12288
2020-06-17 11:28:20,562 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.numberOfTaskSlots, 4
2020-06-17 11:28:20,562 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: parallelism.default, 28
2020-06-17 11:28:20,563 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: taskmanager.network.numberOfBuffers, 2048
2020-06-17 11:28:20,563 INFO  org.apache.shaded.flink.configuration.GlobalConfiguration     - Loading configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf
2020-06-17 11:28:20,563 INFO  org.apache.shaded.flink.runtime.minicluster.MiniCluster       - Starting Metrics Registry
2020-06-17 11:28:20,610 INFO  org.apache.shaded.flink.runtime.metrics.MetricRegistryImpl    - No metrics reporter configured, no metrics will be exposed/reported.
2020-06-17 11:28:20,610 INFO  org.apache.shaded.flink.runtime.minicluster.MiniCluster       - Starting RPC Service(s)
2020-06-17 11:28:20,976 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2020-06-17 11:28:21,070 INFO  org.apache.shaded.flink.runtime.rpc.akka.AkkaRpcServiceUtils  - Trying to start actor system at :0
2020-06-17 11:28:21,115 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2020-06-17 11:28:21,131 INFO  akka.remote.Remoting                                          - Starting remoting
2020-06-17 11:28:21,279 INFO  akka.remote.Remoting                                          - Remoting started; listening on addresses :[akka.tcp://flink-metrics@<<IP:PORT>>]
2020-06-17 11:28:21,283 INFO  org.apache.shaded.flink.runtime.rpc.akka.AkkaRpcServiceUtils  - Actor system started at akka.tcp://flink-metrics@<<IP:PORT>>



Note: I have removed a few IP addresses from the log.

In reply to Till Rohrmann's message of Thu, Jun 18, 2020 at 12:08 AM.
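As a reading aid for the startup log posted in this message: the Flink logger names carry the prefix `org.apache.shaded.flink` rather than the stock `org.apache.flink`, and the `MiniCluster` lines show that an embedded local cluster was started. Below is a minimal Python sketch that flags both signals in a saved log. The function name `scan_flink_log` and the two-line sample are illustrative assumptions, not part of Flink, and the regular expression only assumes the line layout seen in this thread.

```python
import re

# Layout assumed from the log in this thread:
# "<date> <time> <LEVEL>  <logger>   - <message>"
LOG_LINE = re.compile(r"^\S+ \S+ +(?P<level>[A-Z]+) +(?P<logger>\S+) +- (?P<msg>.*)$")

# Prefix observed in this thread's logger names (hypothetical for other builds).
RELOCATED_LOGGER_PREFIX = "org.apache.shaded.flink."


def scan_flink_log(text):
    """Report whether a log shows relocated Flink classes and a MiniCluster start."""
    relocated = False
    mini_cluster = False
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip blank lines and anything not in the expected layout
        logger = match.group("logger")
        if logger.startswith(RELOCATED_LOGGER_PREFIX):
            relocated = True
        if logger.endswith(".minicluster.MiniCluster"):
            mini_cluster = True
    return {"relocated_flink_classes": relocated, "mini_cluster_started": mini_cluster}


# Two lines copied from the log above.
sample = (
    "2020-06-17 11:28:20,558 INFO  org.apache.shaded.flink.runtime.minicluster.MiniCluster"
    "       - Starting Flink Mini Cluster\n"
    "2020-06-17 11:28:20,976 INFO  akka.event.slf4j.Slf4jLogger"
    "                                  - Slf4jLogger started\n"
)
print(scan_flink_log(sample))
```

If both flags come back true, the job most likely ran inside its own embedded cluster using Flink classes bundled in the user jar, rather than against the session started on YARN.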


Chesnay Schepler
Is your user-jar packaging and relocating Flink classes? If so, then your job does not actually operate against the classes provided by the cluster, which, well, just wouldn't work.

In reply to Sourabh Mehta's message of 18/06/2020 09:34.
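One way to check the question about packaged and relocated Flink classes is to list the entries of the user jar. The sketch below counts `.class` entries under the stock `org/apache/flink/` package and under a relocated package. The relocated prefix `org/apache/shaded/flink/` is an assumption taken from the logger names in the posted log, and `count_flink_classes` and `make_demo_jar` are hypothetical helper names. The demo archive is built in memory so that the example runs without a real jar.

```python
import io
import zipfile

STOCK_PREFIX = "org/apache/flink/"
# Assumed from the logger names in this thread; adjust for other builds.
RELOCATED_PREFIX = "org/apache/shaded/flink/"


def count_flink_classes(jar, relocated_prefix=RELOCATED_PREFIX):
    """Count .class entries under the stock and relocated Flink packages.

    `jar` may be a filesystem path or a binary file-like object.
    """
    counts = {"stock": 0, "relocated": 0}
    with zipfile.ZipFile(jar) as archive:
        for name in archive.namelist():
            if not name.endswith(".class"):
                continue  # ignore resources, manifests, directories
            if name.startswith(STOCK_PREFIX):
                counts["stock"] += 1
            elif name.startswith(relocated_prefix):
                counts["relocated"] += 1
    return counts


def make_demo_jar():
    """Build a tiny in-memory archive that mimics a jar with relocated Flink classes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("org/apache/shaded/flink/runtime/minicluster/MiniCluster.class", b"")
        archive.writestr("com/sm/flink/FlinkDriver.class", b"")
    buf.seek(0)
    return buf


print(count_flink_classes(make_demo_jar()))
```

Against the real job jar, passing its path instead of the demo archive should give the same kind of summary. A non-zero count in either bucket suggests the jar bundles its own copy of Flink, in which case the usual remedy is to stop shading Flink and mark the core Flink dependencies as `provided` so that the job runs against the cluster's classes.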