How does Flink handle backpressure in EMR

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

How does Flink handle backpressure in EMR

Nguyen, Michael

Hello all,

 

How does Flink handle backpressure (caused by an increase in traffic) in a Flink job when it’s being hosted in an EMR cluster? Does Flink detect the backpressure and auto-scales the EMR cluster to handle the workload to relieve the backpressure? Once the backpressure is gone, then the EMR cluster would scale back down?

 

Thanks,

Michael

 

 

Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

r_khachatryan
Hi Michael

Flink *does* detect backpressure but currently, it only propagates it back
to sources.
And so it doesn't support auto-scaling.

Regards,
Roman


Nguyen, Michael wrote
> How does Flink handle backpressure (caused by an increase in traffic) in a
> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
> backpressure and auto-scales the EMR cluster to handle the workload to
> relieve the backpressure? Once the backpressure is gone, then the EMR
> cluster would scale back down?





--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

Piotr Nowojski-3
Hi Michael,

As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.

Piotrek

> On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
>
> Hi Michael
>
> Flink *does* detect backpressure but currently, it only propagates it back
> to sources.
> And so it doesn't support auto-scaling.
>
> Regards,
> Roman
>
>
> Nguyen, Michael wrote
>> How does Flink handle backpressure (caused by an increase in traffic) in a
>> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
>> backpressure and auto-scales the EMR cluster to handle the workload to
>> relieve the backpressure? Once the backpressure is gone, then the EMR
>> cluster would scale back down?
>
>
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

Nguyen, Michael
Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to the sources?

Also, if one of my Flink operators is processing records too slowly and is getting further away from the latest record of my source data stream, is there a way to detect this slow processing in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <[hidden email] on behalf of [hidden email]> wrote:

    [External]
   
   
    Hi Michael,
   
    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.
   
    Piotrek
   
    > On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0
   
   

Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

r_khachatryan
@Michael,
Could you please describe your topology with which operators being slow, back-pressured and probably skews in sources?

Regards,
Roman


On Thu, Dec 5, 2019 at 6:20 PM Nguyen, Michael <[hidden email]> wrote:
Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to the sources?

Also, if one of my Flink operators is processing records too slowly and is getting further away from the latest record of my source data stream, is there a way to detect this slow processing in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <[hidden email] on behalf of [hidden email]> wrote:

    [External]


    Hi Michael,

    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.

    Piotrek

    > On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0



Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

Nguyen, Michael

Hi Roman,

 

So right now we have a couple Flink jobs that consumes data from one Kinesis data stream. These jobs vary from a simple dump into a PostgreSQL table to calculating anomalies in a 30 minute window.

 

One large scenario we were worried about was what if one of our jobs was taking a long time to process the Kinesis stream data? How would we detect this scenario from within our Flink job?

 

We do not want our Flink jobs to lag too far from the latest point in our Kinesis stream as we are trying to deliver information in (near) real-time.

 

From: Khachatryan Roman <[hidden email]>
Date: Thursday, December 5, 2019 at 9:47 AM
To: Michael Nguyen <[hidden email]>
Cc: Piotr Nowojski <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR

 

[External]

 

@Michael,

Could you please describe your topology with which operators being slow, back-pressured and probably skews in sources?

 

Regards,

Roman

 

 

On Thu, Dec 5, 2019 at 6:20 PM Nguyen, Michael <[hidden email]> wrote:

Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to the sources?

Also, if one of my Flink operators is processing records too slowly and is getting further away from the latest record of my source data stream, is there a way to detect this slow processing in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <[hidden email] on behalf of [hidden email]> wrote:

    [External]


    Hi Michael,

    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.

    Piotrek

    > On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0


Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

Piotr Nowojski-3
Hi,

If you are using event time and watermarks, you can monitor the delays using `currentInputWatermark` metric [1]. If not (or alternatively), this blog post [2] describes how to check back pressure status [2] for Flink up to 1.9. In Flink 1.10 there will be an additional new metric for that [3].

Piotrek

[3] https://issues.apache.org/jira/browse/FLINK-14813

On 5 Dec 2019, at 19:11, Nguyen, Michael <[hidden email]> wrote:

Hi Roman,
 
So right now we have a couple Flink jobs that consumes data from one Kinesis data stream. These jobs vary from a simple dump into a PostgreSQL table to calculating anomalies in a 30 minute window.
 
One large scenario we were worried about was what if one of our jobs was taking a long time to process the Kinesis stream data? How would we detect this scenario from within our Flink job?
 
We do not want our Flink jobs to lag too far from the latest point in our Kinesis stream as we are trying to deliver information in (near) real-time.
 
From: Khachatryan Roman <[hidden email]>
Date: Thursday, December 5, 2019 at 9:47 AM
To: Michael Nguyen <[hidden email]>
Cc: Piotr Nowojski <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR
 
[External]
 
@Michael, 
Could you please describe your topology with which operators being slow, back-pressured and probably skews in sources?
 
Regards,
Roman
 
 
On Thu, Dec 5, 2019 at 6:20 PM Nguyen, Michael <[hidden email]> wrote:

Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to the sources? 

Also, if one of my Flink operators is processing records too slowly and is getting further away from the latest record of my source data stream, is there a way to detect this slow processing in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <[hidden email] on behalf of [hidden email]> wrote:

    [External]


    Hi Michael,

    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.

    Piotrek

    > On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0


Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

Nguyen, Michael

Hi Piotrek,

 

For the second article, I understand I can monitor the backpressure status via the Flink Web UI. Can I refer to the same metrics in my Flink jobs itself? For example, can I put in an if statement to check for when outPoolUsage reaches 100%?

 

Thank you,

Michael

 

From: Piotr Nowojski <[hidden email]>
Date: Thursday, December 5, 2019 at 10:27 AM
To: Michael Nguyen <[hidden email]>
Cc: Khachatryan Roman <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR

 

[External]

 

Hi,

 

If you are using event time and watermarks, you can monitor the delays using `currentInputWatermark` metric [1]. If not (or alternatively), this blog post [2] describes how to check back pressure status [2] for Flink up to 1.9. In Flink 1.10 there will be an additional new metric for that [3].

 

Piotrek

 

[3] https://issues.apache.org/jira/browse/FLINK-14813



On 5 Dec 2019, at 19:11, Nguyen, Michael <[hidden email]> wrote:

 

Hi Roman,

 

So right now we have a couple Flink jobs that consumes data from one Kinesis data stream. These jobs vary from a simple dump into a PostgreSQL table to calculating anomalies in a 30 minute window.

 

One large scenario we were worried about was what if one of our jobs was taking a long time to process the Kinesis stream data? How would we detect this scenario from within our Flink job?

 

We do not want our Flink jobs to lag too far from the latest point in our Kinesis stream as we are trying to deliver information in (near) real-time.

 

From: Khachatryan Roman <[hidden email]>
Date: Thursday, December 5, 2019 at 9:47 AM
To: Michael Nguyen <[hidden email]>
Cc: Piotr Nowojski <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR

 

[External]

 

@Michael, 

Could you please describe your topology with which operators being slow, back-pressured and probably skews in sources?

 

Regards,

Roman

 

 

On Thu, Dec 5, 2019 at 6:20 PM Nguyen, Michael <[hidden email]> wrote:

Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to the sources? 

Also, if one of my Flink operators is processing records too slowly and is getting further away from the latest record of my source data stream, is there a way to detect this slow processing in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <[hidden email] on behalf of [hidden email]> wrote:

    [External]


    Hi Michael,

    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.

    Piotrek

    > On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0

 

Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

Piotr Nowojski-3
Hi,

You can find information how to use metrics here [1]. I don’t think there is a straightforward way to access them from within a job. You could access them via JMX when using JMXReporter or you can implement some custom reporter, that could expose the metrics via localhost connections or some static (:S) variables.

Piotrek

[1] https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html

On 5 Dec 2019, at 19:40, Nguyen, Michael <[hidden email]> wrote:

Hi Piotrek,
 
For the second article, I understand I can monitor the backpressure status via the Flink Web UI. Can I refer to the same metrics in my Flink jobs itself? For example, can I put in an if statement to check for when outPoolUsage reaches 100%?
 
Thank you,
Michael
 
From: Piotr Nowojski <[hidden email]>
Date: Thursday, December 5, 2019 at 10:27 AM
To: Michael Nguyen <[hidden email]>
Cc: Khachatryan Roman <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR
 
[External]
 
Hi, 
 
If you are using event time and watermarks, you can monitor the delays using `currentInputWatermark` metric [1]. If not (or alternatively), this blog post [2] describes how to check back pressure status [2] for Flink up to 1.9. In Flink 1.10 there will be an additional new metric for that [3].
 
Piotrek
 


On 5 Dec 2019, at 19:11, Nguyen, Michael <[hidden email]> wrote:
 
Hi Roman,
 
So right now we have a couple Flink jobs that consumes data from one Kinesis data stream. These jobs vary from a simple dump into a PostgreSQL table to calculating anomalies in a 30 minute window.
 
One large scenario we were worried about was what if one of our jobs was taking a long time to process the Kinesis stream data? How would we detect this scenario from within our Flink job?
 
We do not want our Flink jobs to lag too far from the latest point in our Kinesis stream as we are trying to deliver information in (near) real-time.
 
From: Khachatryan Roman <[hidden email]>
Date: Thursday, December 5, 2019 at 9:47 AM
To: Michael Nguyen <[hidden email]>
Cc: Piotr Nowojski <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR
 
[External]
 
@Michael, 
Could you please describe your topology with which operators being slow, back-pressured and probably skews in sources?
 
Regards,
Roman
 
 
On Thu, Dec 5, 2019 at 6:20 PM Nguyen, Michael <[hidden email]> wrote:

Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to the sources? 

Also, if one of my Flink operators is processing records too slowly and is getting further away from the latest record of my source data stream, is there a way to detect this slow processing in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <[hidden email] on behalf of [hidden email]> wrote:

    [External]


    Hi Michael,

    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.

    Piotrek

    > On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0


Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle backpressure in EMR

Nguyen, Michael

Great! I’ll try it out – thank you Piotrek.

 

Michael

 

From: Piotr Nowojski <[hidden email]> on behalf of Piotr Nowojski <[hidden email]>
Date: Thursday, December 5, 2019 at 11:03 AM
To: Michael Nguyen <[hidden email]>
Cc: Khachatryan Roman <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR

 

[External]

 

Hi,

 

You can find information how to use metrics here [1]. I don’t think there is a straightforward way to access them from within a job. You could access them via JMX when using JMXReporter or you can implement some custom reporter, that could expose the metrics via localhost connections or some static (:S) variables.

 

Piotrek

 

[1] https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html



On 5 Dec 2019, at 19:40, Nguyen, Michael <[hidden email]> wrote:

 

Hi Piotrek,

 

For the second article, I understand I can monitor the backpressure status via the Flink Web UI. Can I refer to the same metrics in my Flink jobs itself? For example, can I put in an if statement to check for when outPoolUsage reaches 100%?

 

Thank you,

Michael

 

From: Piotr Nowojski <[hidden email]>
Date: Thursday, December 5, 2019 at 10:27 AM
To: Michael Nguyen <[hidden email]>
Cc: Khachatryan Roman <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR

 

[External]

 

Hi, 

 

If you are using event time and watermarks, you can monitor the delays using `currentInputWatermark` metric [1]. If not (or alternatively), this blog post [2] describes how to check back pressure status [2] for Flink up to 1.9. In Flink 1.10 there will be an additional new metric for that [3].

 

Piotrek

 




On 5 Dec 2019, at 19:11, Nguyen, Michael <[hidden email]> wrote:

 

Hi Roman,

 

So right now we have a couple Flink jobs that consumes data from one Kinesis data stream. These jobs vary from a simple dump into a PostgreSQL table to calculating anomalies in a 30 minute window.

 

One large scenario we were worried about was what if one of our jobs was taking a long time to process the Kinesis stream data? How would we detect this scenario from within our Flink job?

 

We do not want our Flink jobs to lag too far from the latest point in our Kinesis stream as we are trying to deliver information in (near) real-time.

 

From: Khachatryan Roman <[hidden email]>
Date: Thursday, December 5, 2019 at 9:47 AM
To: Michael Nguyen <[hidden email]>
Cc: Piotr Nowojski <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: How does Flink handle backpressure in EMR

 

[External]

 

@Michael, 

Could you please describe your topology with which operators being slow, back-pressured and probably skews in sources?

 

Regards,

Roman

 

 

On Thu, Dec 5, 2019 at 6:20 PM Nguyen, Michael <[hidden email]> wrote:

Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to the sources? 

Also, if one of my Flink operators is processing records too slowly and is getting further away from the latest record of my source data stream, is there a way to detect this slow processing in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <[hidden email] on behalf of [hidden email]> wrote:

    [External]


    Hi Michael,

    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our roadmap but it requires quite a bit of preliminary work to happen before.

    Piotrek

    > On 5 Dec 2019, at 15:32, r_khachatryan <[hidden email]> wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0