Adding and removing operations after execute

adamlehenbauer
Hi, I'm exploring using Flink to replace an in-house micro-batch application. Many of the features and concepts are perfect for what I need, but the biggest gap is that there doesn't seem to be a way to add new operations at runtime after execute().

What is the preferred approach for adding new operations, windows, etc. to a running application? Should I start multiple execution contexts?

Re: Adding and removing operations after execute

Kostas Kloudas
Hi,

The best way to do this is to use a Flink feature called savepoints (the savepoints section of the Flink documentation has the details).

In a nutshell, a savepoint takes a consistent snapshot of the state of your job at the moment you trigger it, and you can resume execution from that point.

Using this, you write your initial job, and whenever you want to add a new operator you take a savepoint; after adding the operator, you start the new job from the point where the old job stopped. In addition, the old job can keep running if you still need it, so there is no downtime.
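
For illustration, here is a minimal sketch of what this looks like in practice (the operator name, the source, and the CLI paths below are placeholders, not something from this thread). Assigning explicit IDs with uid() is what lets Flink map the savepointed state back onto the operators of the modified job:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SavepointFriendlyJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)   // placeholder source
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String line) {
                   return line.toLowerCase();
               }
           })
           .uid("normalize-map")   // stable ID so this operator's state survives job changes
           .print();

        // Operators added in a later version of the job get their own uids;
        // operators whose uids are unchanged pick their state up from the savepoint.
        env.execute("savepoint-friendly-job");
    }
}

The CLI side of the workflow is then roughly: trigger a savepoint with "bin/flink savepoint <jobId>", note the path it prints, and submit the modified jar with "bin/flink run -s <savepointPath> modified-job.jar".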

If this does not cover your use case, it would be helpful to share some more information about 
what exactly you want to do, so that we can figure out a solution that fits your needs.

Kostas

Re: Adding and removing operations after execute

Jamie Grier
Hi Adam,

Another way to do this, depending on your exact requirements, could be to consume a second stream that essentially "configures" the operators that make up the Flink job, thus dynamically altering the behavior of the job at runtime. Whether or not this approach is feasible really depends on exactly what you're trying to accomplish, though. For some users this type of approach works very well.
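
As a rough sketch of that pattern (all names, types, and the threshold rule below are invented for the example; in a real job the current configuration would typically live in checkpointed state):

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// One input is the data stream, the other is the "configuration" stream.
public class ConfigurableFilter implements CoFlatMapFunction<Long, Long, Long> {

    private long threshold = 0L;   // latest configuration; starts permissive

    @Override
    public void flatMap1(Long value, Collector<Long> out) {
        // data path: apply whatever the latest configuration says
        if (value >= threshold) {
            out.collect(value);
        }
    }

    @Override
    public void flatMap2(Long newThreshold, Collector<Long> out) {
        // control path: update the behavior of the running operator
        threshold = newThreshold;
    }
}

Wired together with something like dataStream.connect(controlStream).flatMap(new ConfigurableFilter()), where controlStream carries the configuration updates.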

However, if you really need to add new operators to the running job, that's currently not possible with Flink. The best approach there is exactly as Kostas said.

-Jamie


--

Jamie Grier
data Artisans, Director of Applications Engineering


Re: Adding and removing operations after execute

Aljoscha Krettek
This blog post goes in the direction of what Jamie suggested: https://techblog.king.com/rbea-scalable-real-time-analytics-king/. The folks at King developed a system where users can dynamically inject scripts written in Groovy into a running general-purpose Flink job.
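
Very roughly, that approach combines the control-stream idea with on-the-fly script compilation. The sketch below is an invented illustration of the pattern using Groovy's embedding API, not code from RBEA; the stream types and the "event" variable name are assumptions:

import groovy.lang.Binding;
import groovy.lang.GroovyShell;
import groovy.lang.Script;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// flatMap1 sees data records, flatMap2 sees Groovy script text submitted by users.
public class ScriptedProcessor implements CoFlatMapFunction<String, String, String> {

    private String scriptText = "event";   // default script: pass the record through
    private transient Script compiled;     // compiled lazily, rebuilt when a new script arrives

    @Override
    public void flatMap1(String event, Collector<String> out) {
        if (compiled == null) {
            compiled = new GroovyShell().parse(scriptText);
        }
        Binding binding = new Binding();
        binding.setVariable("event", event);   // expose the record to the script
        compiled.setBinding(binding);
        Object result = compiled.run();
        if (result != null) {
            out.collect(result.toString());
        }
    }

    @Override
    public void flatMap2(String newScript, Collector<String> out) {
        // a new user script arrived on the control stream; recompile on the next record
        scriptText = newScript;
        compiled = null;
    }
}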
