Re: long runtime

Posted by Stephan Ewen on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/long-runtime-tp104p105.html

Hi!

Offhand, that is not easy to say. It depends on your algorithm and on how much data replication it does.

We'd need a bit of time to look into the code. It would help if you could roughly sketch the algorithm for us and give us a breakdown of how much time is spent in each operator (for example, a screenshot of the runtime web monitor).

Greetings,
Stephan


On Wed, Sep 24, 2014 at 6:18 PM, Florian Hönicke <[hidden email]> wrote:
Hello :)

My Flink program is extremely slow.
I implemented a set similarity join in Flink (Mass-Join).
Furthermore, I implemented a local version in Java.
I compared both implementations.
The local version needs one minute to process a 500 MB dataset.
My Flink program needs 5 minutes (cluster: 11 nodes, 20 000 MB RAM).
I use Flink version 0.6.
What could be the cause?

I would welcome your response,
Florian Hönicke