FYI: Blog Post on Flink's Streaming Performance and Fault Tolerance

Stephan Ewen
Hi all!

We just published a blog post about how streaming fault tolerance mechanisms have evolved, and what kind of performance Flink achieves with its checkpointing mechanism.

I think it is a pretty interesting read for anyone interested in Flink or data streaming in general.

The blog post talks about:

  - Fault tolerance techniques, from acknowledgements, through micro-batches, to transactional updates and distributed snapshots.

  - Performance of Flink: throughput, latency, and tradeoffs.

  - A "chaos monkey" experiment in which the computation remains strongly consistent even while workers are periodically killed.


Comments welcome!

Greetings,
Stephan


Re: FYI: Blog Post on Flink's Streaming Performance and Fault Tolerance

Stephan Ewen

On Wed, Aug 5, 2015 at 4:11 PM, Stephan Ewen <[hidden email]> wrote:
Re: FYI: Blog Post on Flink's Streaming Performance and Fault Tolerance

hawin
Great job, guys!

Let me read it carefully.

On Wed, Aug 5, 2015 at 7:25 AM, Stephan Ewen <[hidden email]> wrote:

Re: FYI: Blog Post on Flink's Streaming Performance and Fault Tolerance

Ankur Chauhan
Pretty awesome piece. 

On Aug 5, 2015, at 10:10, Hawin Jiang <[hidden email]> wrote:
