Using two different configurations for StreamExecutionEnvironment

Using two different configurations for StreamExecutionEnvironment

Georgi Stoyanov

Hi, folks

 

We have a util class that creates a StreamExecutionEnvironment with some checkpoint configuration for us. The settings for externalized checkpoints and the state backend make the job fail when it is run locally from IntelliJ. As a workaround we currently comment out those two settings, but I don’t like that.
So far I have found that I can simply use the ‘createLocalEnvironment’ method, but I don’t yet have a good idea of how to fit it into the existing logic nicely (in a way that doesn’t look like a hack).
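
For context, the util does roughly the following (simplified; the class name, checkpoint interval, and HDFS path below are just placeholders):

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EnvironmentUtil {

   public static StreamExecutionEnvironment create() {
      StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
      env.enableCheckpointing(60_000);
      // These are the two settings we currently have to comment out for local runs:
      env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
      env.getCheckpointConfig().enableExternalizedCheckpoints(
            CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
      return env;
   }
}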

 

If you have some ideas or know-how from your projects, please share them 😊

 

Kind Regards,

Georgi

Re: Using two different configurations for StreamExecutionEnvironment

Chesnay Schepler
I suggest running jobs in the IDE only as tests. This allows you to use various Flink utilities to set up a cluster with your desired configuration, while keeping these details out of the actual job code.

If you are using 1.5-SNAPSHOT, have a look at the MiniClusterResource class. It can be used as a JUnit Rule and takes care of everything needed to run any job against the cluster it sets up internally.

For previous versions, have your test class extend StreamingMultipleProgramsTestBase.

The test method would simply call the main method of your job.
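
For illustration, a minimal test along these lines on 1.4 could look like the following sketch (MyStreamingJob stands in for your job class):

import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase;
import org.junit.Test;

public class MyJobITCase extends StreamingMultipleProgramsTestBase {

   @Test
   public void runJob() throws Exception {
      // The base class installs a TestStreamEnvironment as the context, so the job's
      // StreamExecutionEnvironment.getExecutionEnvironment() call picks up the test cluster.
      MyStreamingJob.main(new String[0]);
   }
}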


RE: Using two different configurations for StreamExecutionEnvironment

Georgi Stoyanov

 

Hi Chesnay

Thanks for the suggestion, but this doesn’t sound like a good option to me, since I prefer local debugging to remote debugging.

My question sounds like a really common thing, and the people behind Flink must have thought about it.

Of course, I’m really new to the field of stream processing, and maybe I don’t completely understand the processes here.

 

 

 



Re: Using two different configurations for StreamExecutionEnvironment

Chesnay Schepler
The jobs are still executed in a local JVM along with the cluster, so local debugging is still possible.


RE: Using two different configurations for StreamExecutionEnvironment

Georgi Stoyanov

Sorry for the lame question, but how can I achieve that?

I’m on 1.4. I created a new class that extends StreamingMultipleProgramsTestBase, called super(), and then called the main method.
The thing is that my util class sets up everything related to checkpoints (including the HDFS folder and the other things described earlier in the thread), and I still can’t run it because of “CheckpointConfig says to persist periodic checkpoints, but no checkpoint directory has been configured. You can configure configure one via key 'state.checkpoints.dir'”.
 
Regards
Georgi

 



Re: Using two different configurations for StreamExecutionEnvironment

Chesnay Schepler
Ah, my bad: the StreamingMultipleProgramsTestBase doesn't allow setting the configuration...

I went ahead and wrote you a utility class that your test class should extend. The configuration for the cluster is passed through the constructor.

import org.apache.flink.configuration.ConfigConstants;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.minicluster.LocalFlinkMiniCluster;
import org.apache.flink.streaming.util.TestStreamEnvironment;
import org.apache.flink.test.util.AbstractTestBase;
import org.apache.flink.test.util.TestBaseUtils;
import org.junit.After;
import org.junit.Before;

public class MyMultipleProgramsTestBase extends AbstractTestBase {

   private LocalFlinkMiniCluster cluster;

   public MyMultipleProgramsTestBase(Configuration config) {
      super(config);
   }

   @Before
   public void setup() throws Exception {
      // Start a local mini cluster with the configuration passed to the constructor.
      cluster = TestBaseUtils.startCluster(config, true);
      int numTm = config.getInteger(ConfigConstants.LOCAL_NUMBER_TASK_MANAGER, ConfigConstants.DEFAULT_LOCAL_NUMBER_TASK_MANAGER);
      int numSlots = config.getInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 1);
      // Let StreamExecutionEnvironment.getExecutionEnvironment() return an environment
      // that submits jobs to this cluster.
      TestStreamEnvironment.setAsContext(cluster, numTm * numSlots);
   }

   @After
   public void teardown() throws Exception {
      TestStreamEnvironment.unsetAsContext();
      stopCluster(cluster, TestBaseUtils.DEFAULT_TIMEOUT);
   }
}
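
A test could then pass the checkpoint configuration through the constructor and call the job's main method, for example like this sketch (MyStreamingJob and the local checkpoint path are placeholders):

import org.apache.flink.configuration.Configuration;
import org.junit.Test;

public class CheckpointedJobITCase extends MyMultipleProgramsTestBase {

   public CheckpointedJobITCase() {
      super(createConfig());
   }

   private static Configuration createConfig() {
      Configuration config = new Configuration();
      // Use a local directory instead of the HDFS path when running from the IDE.
      config.setString("state.checkpoints.dir", "file:///tmp/flink-checkpoints");
      return config;
   }

   @Test
   public void runJob() throws Exception {
      // Runs against the mini cluster started by the base class above.
      MyStreamingJob.main(new String[0]);
   }
}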
