Java parallel streams vs IterativeStream

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Java parallel streams vs IterativeStream

nragon
In my map functions i have an object containing a list which must be changed, executing some logic.
So, considering java 8 parallel streams would it be worth to use them or does IterativeStreams offer a better performance without java 8 streams parallel overhead?

Thanks
Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

Aljoscha Krettek
Hi,

How would you use IterativeStream? In Flink IterativeStream is a pipeline-level concept whereas your problem seems to be scoped to one user function.

Best,
Aljoscha

> On 12. Jun 2017, at 19:17, nragon <[hidden email]> wrote:
>
> In my map functions i have an object containing a list which must be changed,
> executing some logic.
> So, considering java 8 parallel streams would it be worth to use them or
> does IterativeStreams offer a better performance without java 8 streams
> parallel overhead?
>
> Thanks
>
>
>
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Java-parallel-streams-vs-IterativeStream-tp13655.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

nragon
Iterate until all elements were changed perhaps. But just wanted to know if there areimplementations out there using java 8 streams, in cases where you want to parallelize a map function even if it is function scoped.
So, in my case, if the computation for each list element is to heavy, how can one parallelize it?
Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

Aljoscha Krettek
I think for that you would unpack to List of values, for example with a FlatMap<List<T>, T>. This would emit each element of the list as a separate element. Then, downstream operations can operate on each element individually and you will exploit parallelism in the cluster.

Best,
Aljoscha

> On 13. Jun 2017, at 10:49, nragon <[hidden email]> wrote:
>
> Iterate until all elements were changed perhaps. But just wanted to know if
> there areimplementations out there using java 8 streams, in cases where you
> want to parallelize a map function even if it is function scoped.
> So, in my case, if the computation for each list element is to heavy, how
> can one parallelize it?
>
>
>
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Java-parallel-streams-vs-IterativeStream-tp13655p13677.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

nragon
This post was updated on .
That would work but after FlatMap<List<T>, T> I would have to downstream all elements into one.

Map function would go from
for (Attribute dimension : this.dimensions) {
      this.transformed.setParameterNoCheck(dimension.getName(), dimension.get(this.context));
    }
return this.transformed

to 

this.dimensions.parallelStream().forEach(dimension -> this.transformed.setParameterNoCheck(dimension.getName(), dimension.get(this.context)));
return this.transformed
Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

Aljoscha Krettek
I see, I’m afraid that is not easily possible except by doing a custom stateful function that waits for all elements to arrive and combines them again.

Another thing you could look at is the async I/O operator: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/asyncio.html. You would not necessarily use it for I/O but spawn threads for processing your data in parallel.

Best,
Aljoscha

On 13. Jun 2017, at 11:20, nragon <[hidden email]> wrote:

That would work but after FlatMap<List&lt;T>, T> I would have to downstream
all elements into one.




--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Java-parallel-streams-vs-IterativeStream-tp13655p13685.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

nragon
I've mentioned java 8 stream beacuse avoids leaving map, thus decreasing network io, if not chained, and takes advantage of multiple cpus. Guess will have to test it.
Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

Aljoscha Krettek
Ah yes, I forgot Java8 streams. That could probably be your best option. Yes!

> On 14. Jun 2017, at 16:25, nragon <[hidden email]> wrote:
>
> I've mentioned java 8 stream beacuse avoids leaving map, thus decreasing
> network io, if not chained, and takes advantage of multiple cpus. Guess will
> have to test it.
>
>
>
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Java-parallel-streams-vs-IterativeStream-tp13655p13735.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Reply | Threaded
Open this post in threaded view
|

Re: Java parallel streams vs IterativeStream

nragon
I'll test with java 8 streams.

Thanks,
Nuno