Adding and removing operations after execute

adamlehenbauer
Hi, I'm exploring using Flink to replace an in-house micro-batch application. Many of the features and concepts are perfect for what I need, but the biggest gap is that there doesn't seem to be a way to add new operations at runtime after execute().

What is the preferred approach for adding new operations, windows, etc. to a running application? Should I start multiple execution contexts?

Re: Adding and removing operations after execute

Kostas Kloudas
Hi,

The best way to do this is to use a Flink feature called savepoints (the savepoints section of the Flink documentation has the details).

In a nutshell, a savepoint takes a consistent snapshot of the state of your job at the moment you trigger it, and you can resume execution from that point.

Using this, you write your initial job, and whenever you want to add a new operator you take a savepoint; after adding the operator, you start the new job from the point where the old job stopped. In addition, the old job can keep running if you still need it, so there is no downtime.
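
For illustration, here is a minimal sketch of what this looks like in practice (the operator name, the source, and the CLI paths below are placeholders, not something from this thread). Assigning explicit IDs with uid() is what lets Flink map the savepointed state back onto the operators of the modified job:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SavepointFriendlyJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)   // placeholder source
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String line) {
                   return line.toLowerCase();
               }
           })
           .uid("normalize-map")   // stable ID so this operator's state survives job changes
           .print();

        // Operators added in a later version of the job get their own uids;
        // operators whose uids are unchanged pick their state up from the savepoint.
        env.execute("savepoint-friendly-job");
    }
}

The CLI side of the workflow is then roughly: trigger a savepoint with "bin/flink savepoint <jobId>", note the path it prints, and submit the modified jar with "bin/flink run -s <savepointPath> modified-job.jar".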

If this does not cover your use case, it would be helpful to share some more information about 
what exactly you want to do, so that we can figure out a solution that fits your needs.

Kostas

Re: Adding and removing operations after execute

Jamie Grier
Hi Adam,

Another way to do this, depending on your exact requirements, could be to consume a second stream that essentially "configures" the operators that make up the Flink job, thus dynamically altering the behavior of the job at runtime. Whether or not this approach is feasible really depends on exactly what you're trying to accomplish, though. For some users this type of approach works very well.
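
As a rough sketch of that pattern (all names, types, and the threshold rule below are invented for the example; in a real job the current configuration would typically live in checkpointed state):

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// One input is the data stream, the other is the "configuration" stream.
public class ConfigurableFilter implements CoFlatMapFunction<Long, Long, Long> {

    private long threshold = 0L;   // latest configuration; starts permissive

    @Override
    public void flatMap1(Long value, Collector<Long> out) {
        // data path: apply whatever the latest configuration says
        if (value >= threshold) {
            out.collect(value);
        }
    }

    @Override
    public void flatMap2(Long newThreshold, Collector<Long> out) {
        // control path: update the behavior of the running operator
        threshold = newThreshold;
    }
}

Wired together with something like dataStream.connect(controlStream).flatMap(new ConfigurableFilter()), where controlStream carries the configuration updates.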

However, if you really need to add new operators to the running job, that's currently not possible with Flink. The best approach there is exactly as Kostas said.

-Jamie


--

Jamie Grier
data Artisans, Director of Applications Engineering


Re: Adding and removing operations after execute

Aljoscha Krettek
This blog post goes in the direction of what Jamie suggested: https://techblog.king.com/rbea-scalable-real-time-analytics-king/. The folks at King developed a system where users can dynamically inject scripts written in Groovy into a running general-purpose Flink job.
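
Very roughly, that approach combines the control-stream idea with on-the-fly script compilation. The sketch below is an invented illustration of the pattern using Groovy's embedding API, not code from RBEA; the stream types and the "event" variable name are assumptions:

import groovy.lang.Binding;
import groovy.lang.GroovyShell;
import groovy.lang.Script;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// flatMap1 sees data records, flatMap2 sees Groovy script text submitted by users.
public class ScriptedProcessor implements CoFlatMapFunction<String, String, String> {

    private String scriptText = "event";   // default script: pass the record through
    private transient Script compiled;     // compiled lazily, rebuilt when a new script arrives

    @Override
    public void flatMap1(String event, Collector<String> out) {
        if (compiled == null) {
            compiled = new GroovyShell().parse(scriptText);
        }
        Binding binding = new Binding();
        binding.setVariable("event", event);   // expose the record to the script
        compiled.setBinding(binding);
        Object result = compiled.run();
        if (result != null) {
            out.collect(result.toString());
        }
    }

    @Override
    public void flatMap2(String newScript, Collector<String> out) {
        // a new user script arrived on the control stream; recompile on the next record
        scriptText = newScript;
        compiled = null;
    }
}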
