Benchmarking streaming frameworks

Benchmarking streaming frameworks

Giselle van Dongen

Dear users of Streaming Technologies,

As a PhD student in big data analytics, I am currently in the process of
compiling a list of benchmarks (to test multiple streaming frameworks) in
order to create an expanded benchmarking suite. The benchmark suite is being
developed as a part of my current work at Ghent University.

The included frameworks at this time are, in no particular order, Spark,
Flink, Kafka (Streams), Storm (Trident) and Drizzle. Any pointers to
previous work or relevant benchmarks would be appreciated.

Best regards,
Giselle van Dongen

Re: Benchmarking streaming frameworks

Christophe Salperwyck

2017-03-23 11:09 GMT+01:00 Giselle van Dongen <[hidden email]>:

Dear users of Streaming Technologies,

As a PhD student in big data analytics, I am currently in the process of
compiling a list of benchmarks (to test multiple streaming frameworks) in
order to create an expanded benchmarking suite. The benchmark suite is being
developed as a part of my current work at Ghent University.

The included frameworks at this time are, in no particular order, Spark,
Flink, Kafka (Streams), Storm (Trident) and Drizzle. Any pointers to
previous work or relevant benchmarks would be appreciated.

Best regards,
Giselle van Dongen


Re: Benchmarking streaming frameworks

Felix Neutatz
Hi,

Our team has already created a benchmarking framework for batch processing (including MR, YARN, Spark, Flink); maybe you would like to extend it for streaming: https://github.com/peelframework/peel

Best regards,
Felix

On Mar 23, 2017 11:51, "Christophe Salperwyck" <[hidden email]> wrote:


Re: Benchmarking streaming frameworks

Michael Noll
A recent one is "Analytics on Fast Data: Main-Memory Database Systems versus Modern Streaming Systems" (http://db.in.tum.de/~kipf/papers/fastdata.pdf)

For the record, the paper above does not yet cover the fact that the Kafka project now includes native stream processing capabilities, namely the Kafka Streams API.

-Michael


On Thu, Mar 23, 2017 at 2:00 PM, Felix Neutatz <[hidden email]> wrote:

Re: Benchmarking streaming frameworks

Dominik Safaric
In reply to this post by Giselle van Dongen
Dear Giselle,

Various stream processing engine benchmarks already exist. Here are a few that I believe are worth mentioning:

Regards,
Dominik
