(DEPRECATED) Apache Flink User Mailing List archive.

in-memory optimization


Robert Schwarzenberg

in-memory optimization

Hello,

I have a question regarding Flink's loop-awareness with respect to
loop-invariant datasets.

Does Flink serialize the DataSet 'points' in line 85 of

https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/scala/org/apache/flink/examples/scala/clustering/KMeans.scala

on each iteration, or are there in-memory optimizations in place?

Thanks for your help!

Regards,
Robert

Ufuk Celebi

Re: in-memory optimization

Loop-invariant data should be kept in Flink's managed memory in
serialized form (in a custom hash table). That means the data is not
read back from the CSV file on each iteration, but because it is kept
in serialized form, it needs to be deserialized again on access.

CC'ing Fabian to double check...
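
For reference, here is a minimal sketch of the pattern in question,
assuming the Scala DataSet API (flink-scala on the classpath). It is not
the KMeans example itself: the object name LoopInvariantSketch, the
Nearest function and the one-dimensional sample data are made up for
illustration. The point is only that 'points' is defined outside the
step function passed to iterate(), so it is the loop-invariant input,
while 'current' changes every superstep.

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.scala._
import org.apache.flink.configuration.Configuration

import scala.collection.JavaConverters._

// Hypothetical 1-D stand-in for the KMeans example; names are illustrative.
object LoopInvariantSketch {

  // Assigns each point to the nearest centroid taken from the broadcast set.
  class Nearest extends RichMapFunction[Double, (Int, Double, Long)] {
    private var centroids: Seq[(Int, Double)] = _

    override def open(parameters: Configuration): Unit = {
      centroids = getRuntimeContext
        .getBroadcastVariable[(Int, Double)]("centroids")
        .asScala
        .toSeq
    }

    override def map(p: Double): (Int, Double, Long) = {
      val (id, _) = centroids.minBy { case (_, c) => math.abs(c - p) }
      (id, p, 1L)
    }
  }

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Loop-invariant input: defined once, outside the step function.
    val points: DataSet[Double] = env.fromElements(1.0, 1.5, 2.0, 8.0, 9.0, 9.5)

    // Initial value of the data set that changes between supersteps.
    val centroids: DataSet[(Int, Double)] = env.fromElements((0, 0.0), (1, 10.0))

    val result = centroids.iterate(10) { current =>
      points
        .map(new Nearest).withBroadcastSet(current, "centroids")
        .groupBy(0)
        .reduce((a, b) => (a._1, a._2 + b._2, a._3 + b._3)) // sum and count per centroid
        .map(t => (t._1, t._2 / t._3))                      // new centroid = mean
    }

    result.print()
  }
}
```

The step function is only used to build the dataflow plan, so 'points'
is not re-created by user code on each iteration. How the runtime holds
it between supersteps is what the explanation above describes.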

On Mon, Apr 24, 2017 at 4:20 PM, Robert Schwarzenberg
<[hidden email]> wrote:

> Hello,
>
> I have a question regarding the loop-awareness of Flink wrt invariant
> datasets.
>
> Does Flink serialize the DataSet 'points' in line 85
>
> https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/scala/org/apache/flink/examples/scala/clustering/KMeans.scala
>
> each iteration or are there in-memory optimization procedures in place?
>
> Thanks for your help!
>
> Regards,
> Robert