(DEPRECATED) Apache Flink User Mailing List archive.

Batch Processing as Streaming


tambunanw

Batch Processing as Streaming

Hi All,

I see that the way batch processing works in Flink is quite different from Spark. In Flink, it is all built on the streaming engine.

I have a couple of questions:

1. Is there any support for checkpointing in batch processing as well, or is that only for streaming?

2. I want to ask about the operator lifecycle. Are operators short-lived or long-lived? Are there any docs where I can read more about this?

Cheers

Stephan Ewen

Re: Batch Processing as Streaming

Hi!

I am actually working on getting some more docs out there. There is a lack right now, agreed.

Concerning your questions:

(1) Batch programs basically recover from the data sources right now. Checkpointing as in the streaming case does not happen for batch programs. We have branches that materialize the intermediate streams and apply backtracking logic for batch programs, but they are not merged into the master at this point.

(2) Streaming operators and user functions are long-lived. They are started once and live until the end of the stream or until a machine failure.

Greetings,

Stephan
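
The two points in the reply above can be sketched with a toy model. This is not Flink code: the class CountingOperator, the function run_batch_with_restart, and the fail_at parameter are hypothetical names invented for illustration, and the sketch assumes the data source can be re-read from the start. It shows an operator that is opened once, processes many records, and is closed once, plus a batch-style recovery that discards partial results and starts again from the source instead of restoring a checkpoint.

```python
# Toy model only: none of these names are Flink APIs.


class CountingOperator:
    """A long-lived operator: opened once, fed many records, closed once."""

    def __init__(self):
        self.open_calls = 0
        self.close_calls = 0
        self.count = 0  # operator state kept across records

    def open(self):
        self.open_calls += 1

    def process(self, record):
        self.count += 1
        return (record, self.count)

    def close(self):
        self.close_calls += 1


def run_batch_with_restart(source, fail_at=None):
    """Batch-style recovery: on failure, start over from the data source.

    `source` must be re-readable (here, a list). `fail_at` is a hypothetical
    knob that simulates a machine failure at that record index during the
    first attempt only.
    """
    attempts = 0
    while True:
        attempts += 1
        op = CountingOperator()  # assumption: a restart creates a fresh operator
        op.open()
        out = []
        try:
            for i, record in enumerate(source):
                if attempts == 1 and fail_at is not None and i == fail_at:
                    raise RuntimeError("simulated machine failure")
                out.append(op.process(record))
            return out, attempts
        except RuntimeError:
            continue  # drop partial output; no checkpoint to resume from
        finally:
            op.close()


if __name__ == "__main__":
    result, attempts = run_batch_with_restart(["a", "b", "c"], fail_at=2)
    print(result, attempts)
```

In this sketch the failed first attempt contributes nothing to the result: the second attempt re-reads every record, which is the cost of recovering from the source rather than from a checkpoint.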

On Thu, Jul 2, 2015 at 11:48 AM, tambunanw <[hidden email]> wrote:

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Batch-Processing-as-Streaming-tp1909.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.

tambunanw

Re: Batch Processing as Streaming

Thanks Stephan,

That's clear!

Cheers

On Thu, Jul 2, 2015 at 6:13 PM, Stephan Ewen <[hidden email]> wrote:

Welly Tambunan
Triplelands

http://weltam.wordpress.com

http://www.triplelands.com