Re: Running Flink in Google Cloud Platform (GCP) - can Flink be truly elastic?

Dawid Wysakowicz-2
Hi Alexander,

I've redirected your question to the user mailing list. The community
list is intended for "Broader community discussions related to meetups,
conferences, blog posts and job offers".

The quick answer to your question is that dynamic scaling of Flink jobs is
a work in progress. Maybe Gary or Till (cc'ed) can share some more details
on that topic.

Best,

Dawid


On 21/09/18 17:25, [hidden email] wrote:

> Hi
>
> I'm trying to understand what it means to run a Flink cluster inside the Google Cloud Platform and whether it can act in an "elastic" way: if the cluster needs more resources to accommodate a sudden increase in demand or in the number of Flink jobs, will GCP automatically detect this and spin up more Task Managers to provide extra task slots?
>
> If we consider the following two simple use cases, how would GCP address them?
>
>
> 1)     No free task slots to run new Flink jobs
>
> 2)     A slow Flink job needs increased parallelism to improve throughput
>
> Currently, we'd handle the above use cases by:
>
>
> 1)     Detecting that the job failed due to "no free slots" by checking the exception text, scheduling the addition of a new task manager, and rerunning the job once task slots are available.
>
> 2)     Monitoring the speed of the job ourselves, stopping the job, specifying which components (operators) in the stream require an increase in parallelism (for example via job properties), then relaunching the job; if not enough slots were available, we'd have to consider adding extra task managers.
>
>
> So my question is...can Google Cloud Platform (GCP) automatically launch extra TMs to handle the above?
>
> If we proposed to run a Flink cluster in a GCP container, can GCP make Flink behave dynamically elastic in the same way that Google DataFlow apparently can?
>
> Regards
>
>
> Alex
>
>
> The Royal Bank of Scotland plc. Registered in Scotland No 83026. Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB. The Royal Bank of Scotland is authorised by the Prudential Regulation Authority, and regulated by the Financial Conduct Authority and Prudential Regulation Authority. The Royal Bank of Scotland N.V. is authorised and regulated by the De Nederlandsche Bank and has its seat at Amsterdam, the Netherlands, and is registered in the Commercial Register under number 33002587. Registered Office: Gustav Mahlerlaan 350, Amsterdam, The Netherlands. The Royal Bank of Scotland N.V. and The Royal Bank of Scotland plc are authorised to act as agent for each other in certain jurisdictions.
>
> National Westminster Bank Plc.  Registered in England No. 929027.  Registered Office: 135 Bishopsgate, London EC2M 3UR.  National Westminster Bank Plc is authorised by the Prudential Regulation Authority, and regulated by the Financial Conduct Authority and the Prudential Regulation Authority.
>
> The Royal Bank of Scotland plc and National Westminster Bank Plc are authorised to act as agent for each other.
>
> This e-mail message is confidential and for use by the addressee only.  If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer.  Internet e-mails are not necessarily secure.  The Royal Bank of Scotland plc, The Royal Bank of Scotland N.V., National Westminster Bank Plc or any affiliated entity (RBS or us) does not accept responsibility for changes made to this message after it was sent.  RBS may monitor e-mails for business and operational purposes.  By replying to this message you understand that the content of your message may be monitored.
>
> Whilst all reasonable care has been taken to avoid the transmission of viruses, it is the responsibility of the recipient to ensure that the onward transmission, opening or use of this message and any attachments will not adversely affect its systems or data.  No responsibility is accepted by RBS in this regard and the recipient should carry out such virus and other checks as it considers appropriate.
>
> Visit our website at www.rbs.com <http://www.rbs.com/>
>


Konstantin Knauf
Hi Alexander,

Broadly speaking, what you are doing right now is in line with what is currently possible with Apache Flink. Can you share a little more information about your setup (K8s or Flink standalone? Job mode or session mode?)? You might find Gary's Flink Forward [1] talk interesting. He demonstrates how a Flink job automatically scales out when it is given more resources by the resource manager, e.g. Kubernetes. But this is still a work in progress.
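
The sizing step of the manual workflow described in the original question (add task managers when there are not enough free slots) can be sketched roughly as below. This is only an illustration: the function and parameter names are made up for this example, and it does not call any Flink or GCP API. It just works out how many extra task managers would cover the missing slots, assuming every task manager offers the same number of slots.

```python
import math


def extra_task_managers(required_slots: int, free_slots: int, slots_per_tm: int) -> int:
    """Return how many additional task managers are needed to cover a slot shortfall.

    All names here are hypothetical; the inputs would have to come from your
    own monitoring of the cluster.
    """
    if slots_per_tm <= 0:
        raise ValueError("slots_per_tm must be positive")
    # Slots still missing after using the ones that are already free.
    missing = max(0, required_slots - free_slots)
    # Each new task manager contributes slots_per_tm slots, so round up.
    return math.ceil(missing / slots_per_tm)


if __name__ == "__main__":
    # A job needing 8 slots, 3 currently free, 4 slots per task manager.
    print(extra_task_managers(required_slots=8, free_slots=3, slots_per_tm=4))
```

Actually starting the task managers and resubmitting the job remains a separate step that depends on your deployment.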

Best,

Konstantin



--

Konstantin Knauf | Solution Architect

data Artisans

Follow us @dataArtisans

--

Join Flink Forward - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

--

Data Artisans GmbH | Stresemannstr. 121A,10963 Berlin, Germany
data Artisans, Inc. | 1161 Mission Street, San Francisco, CA-94103, USA

--

Data Artisans GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen

Till Rohrmann
Hi Alexander,

The issue for the reactive mode, the mode which reacts to newly available resources and scales the job up accordingly, is here: https://issues.apache.org/jira/browse/FLINK-10407. It does not contain a lot of details yet, but we are actively working on publishing the corresponding design document soon. See also https://issues.apache.org/jira/browse/FLINK-10404, which is related to the reactive mode.
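
To make the idea concrete, here is a rough sketch of what "scaling to the available resources" could mean. It is not the design from the JIRA issue (which is not published yet) and it is not Flink code. It simply assumes that the job would use every slot the cluster currently offers, capped by a maximum parallelism; all names are hypothetical.

```python
def reactive_parallelism(task_managers: int, slots_per_tm: int, max_parallelism: int) -> int:
    """Illustrative only: parallelism a job might run at given the current resources.

    Assumption (not taken from the design document): the job takes all
    available slots, but never more than max_parallelism.
    """
    if task_managers < 0 or slots_per_tm <= 0 or max_parallelism <= 0:
        raise ValueError("invalid cluster description")
    available_slots = task_managers * slots_per_tm
    return min(available_slots, max_parallelism)


if __name__ == "__main__":
    # Growing the cluster from 3 to 5 task managers with 4 slots each.
    print(reactive_parallelism(3, 4, 128))
    print(reactive_parallelism(5, 4, 128))
```

Under this assumption, adding task managers (for example through the resource manager) would be the only thing needed to scale the job out.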

Cheers,
Till
