hot deployment of stream processing(event at a time) jobs


hot deployment of stream processing(event at a time) jobs

igor.berman
Hi,
I have a simple job that consumes events from a Kafka topic and processes them with filter/flat-map only (i.e. no aggregation, no windows, no private state).
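(For context, a minimal sketch of such a stateless job, assuming the Flink 1.0-era DataStream API and the Kafka 0.8 connector; the broker address, topic name, and group id are placeholders:)

```java
import java.util.Properties;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
import org.apache.flink.util.Collector;

public class StatelessKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("zookeeper.connect", "localhost:2181"); // required by the 0.8 consumer
        props.setProperty("group.id", "my-consumer-group");       // placeholder group id

        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer08<>("events", new SimpleStringSchema(), props));

        events
            .filter(line -> !line.isEmpty())                 // drop empty records
            .flatMap(new FlatMapFunction<String, String>() { // e.g. split each record into tokens
                @Override
                public void flatMap(String value, Collector<String> out) {
                    for (String token : value.split("\\s+")) {
                        out.collect(token);
                    }
                }
            })
            .print(); // stand-in sink; a real job would write to Kafka, a DB, etc.

        env.execute("stateless kafka job");
    }
}
```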

The most important constraint in my setup is to continue processing no matter what; i.e. stopping for a few seconds to cancel the job and restart it with a new version is not an option, because that pause would take a few seconds.

I was thinking about the Blue/Green deployment concept: start a new job with the new fat jar while the old job is still running, and then eventually cancel the old job.

How will Flink handle such a scenario? What will happen to event-processing semantics during the transition period?

I know that core Kafka has a consumer rebalancing mechanism, but I'm not too familiar with it.

Any thoughts would be highly appreciated.

Thanks in advance,
Igor




Re: hot deployment of stream processing(event at a time) jobs

Ufuk Celebi
I think you are looking for the savepoints feature:
https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/savepoints.html

The general idea is to trigger a savepoint, start the second job from
this savepoint (reading from the same topic), and then eventually
cancel the first job. Depending on the sink, the second job may need
to produce its results to a different endpoint, though.
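(Operationally, that handover can be sketched with the Flink CLI; the job IDs, savepoint path, and jar name below are placeholders:)

```shell
# 1. Take a savepoint of the running (old) job; the CLI prints the savepoint path.
bin/flink savepoint <old-job-id>

# 2. Start the new job from that savepoint; both jobs now read the topic.
bin/flink run -s <savepoint-path> new-fat-jar.jar

# 3. Once the new job is confirmed healthy, cancel the old one.
bin/flink cancel <old-job-id>
```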

Does this help?

– Ufuk

On Thu, May 19, 2016 at 8:18 AM, Igor Berman <[hidden email]> wrote:
