Flink Yarn Cluster - Jobs Isolation


Flink Yarn Cluster - Jobs Isolation

Eran Twili

Hi,

 

At my company we are interested in running Flink jobs on a YARN cluster (on AWS EMR).

We understand that there are 2 ways ('modes') to execute Flink jobs on a YARN cluster.

We must have the jobs run concurrently!

From what we understand so far, these are the options:

  1. Start a long-running YARN session, to which we'll submit jobs.
  2. Run each job as a 'single job'.
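For readers comparing the two options, here is a minimal sketch of the commands each one typically involves. It assumes the `yarn-session.sh` and `flink run -m yarn-cluster` entry points and the `-d` (detached) flag as shipped with Flink 1.x releases from around the time of this thread; the `flink_home` path and jar names are hypothetical, and flags should be checked against the CLI documentation for your version.

```python
import shlex


def session_commands(job_jars, flink_home="."):
    """Option 1: start one detached YARN session, then submit every job to it.

    Assumption: `yarn-session.sh -d` starts the session in detached mode and
    a later `flink run` picks up that session from the client's environment.
    """
    start = [f"{flink_home}/bin/yarn-session.sh", "-d"]
    submits = [[f"{flink_home}/bin/flink", "run", "-d", jar] for jar in job_jars]
    return [start] + submits


def per_job_commands(job_jars, flink_home="."):
    """Option 2: each submission brings up its own Flink cluster on YARN.

    Assumption: `-m yarn-cluster` selects the per-job ('single job') mode.
    """
    return [
        [f"{flink_home}/bin/flink", "run", "-m", "yarn-cluster", "-d", jar]
        for jar in job_jars
    ]


if __name__ == "__main__":
    jars = ["job-a.jar", "job-b.jar"]  # hypothetical job jars
    for cmd in session_commands(jars) + per_job_commands(jars):
        print(shlex.join(cmd))
```

In the sketch, option 1 issues a single session start followed by one submission per job, while option 2 issues one independent cluster launch per job.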

We searched the web to understand the differences and consequences of each option (we read through flink-yarn-setup and FLIP-6, along with many other references), but couldn't find clear, comprehensive information.

 

In the 'session' mode:

  1. Does running multiple jobs in a single session mean there's no job isolation?
  2. Will all jobs run in the same JVM?
  3. Can we define a different classpath for each job in this mode?

In the 'single job' mode:

  1. Can we run multiple jobs concurrently?
  2. Is there complete job isolation by default, or do we need to configure it (different JVM/classpath)?

 

Overall, what are the different implications in terms of resource management, security, and monitoring?

Another question: what is the difference between multiple sessions running a single job each vs. multiple 'single job' executions?

 

We'd be very thankful if someone could provide some answers or a reference to comprehensive documentation on these subjects.

 

Regards,

Eran

 

 


Confidentiality: This communication and any attachments are intended for the above-named persons only and may be confidential and/or legally privileged. Any opinions expressed in this communication are not necessarily those of NICE Actimize. If this communication has come to you in error you must take no action based on it, nor must you copy or show it to anyone; please delete/destroy and inform the sender by e-mail immediately. 
Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
Viruses: Although we have taken steps toward ensuring that this e-mail and attachments are free from any virus, we advise that in keeping with good computing practice the recipient should ensure they are actually virus free.




Re: Flink Yarn Cluster - Jobs Isolation

Jamie Grier-2

Yes, they will run concurrently and be completely isolated from each other.

-Jamie


On Sun, Jan 27, 2019 at 6:08 AM Eran Twili <[hidden email]> wrote:

Hi,

 

At my company we are interested in running Flink jobs on a YARN cluster (on AWS EMR).

We understand that there are 2 ways ('modes') to execute Flink jobs on a YARN cluster.

We must have the jobs run concurrently!

From what we understand so far, these are the options:

  1. Start a long-running YARN session, to which we'll submit jobs.
  2. Run each job as a 'single job'.

We searched the web to understand the differences and consequences of each option (we read through flink-yarn-setup and FLIP-6, along with many other references), but couldn't find clear, comprehensive information.

 

In the 'session' mode:

  1. Does running multiple jobs in a single session mean there's no job isolation?
  2. Will all jobs run in the same JVM?
  3. Can we define a different classpath for each job in this mode?

In the 'single job' mode:

  1. Can we run multiple jobs concurrently?
  2. Is there complete job isolation by default, or do we need to configure it (different JVM/classpath)?

 

Overall, what are the different implications in terms of resource management, security, and monitoring?

Another question: what is the difference between multiple sessions running a single job each vs. multiple 'single job' executions?

 

We'd be very thankful if someone could provide some answers or a reference to comprehensive documentation on these subjects.

 

Regards,

Eran

 

 

