aggregation problem

Riccardo Diomedi
Hi everybody

In a DeltaIteration I have a DataSet<Tuple3<K, V, HashSet<K>>> where, at a certain point of the iteration, I need to compute two counts: the total number of tuples, and the total number of distinct elements across the HashSets of all tuples. I then need to send both values to the ConvergenceCriterion function.

Example:

this is the content of my DataSet:
(1,2,[2,3])
(2,1,[3,4])
(3,2,[4,5])

I should get:
first count: 3 (tuples 1, 2, 3)
second count: 4 (distinct elements 2, 3, 4, 5)
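
To make the two expected values concrete, here is a minimal sketch in plain Java collections, outside Flink, that reproduces both counts for the example data. It is only an illustration of the intended result, not the Flink aggregator code: the class CountSketch, the Entry type (a stand-in for Tuple3<K, V, HashSet<K>> with K = V = Integer) and the method names are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CountSketch {

    /** Hypothetical stand-in for Tuple3<K, V, HashSet<K>>, with K = V = Integer. */
    static final class Entry {
        final int key;
        final int value;
        final Set<Integer> elements;

        Entry(int key, int value, Integer... elements) {
            this.key = key;
            this.value = value;
            this.elements = new HashSet<>(Arrays.asList(elements));
        }
    }

    /** First count: the number of tuples. */
    static long countTuples(List<Entry> data) {
        return data.size();
    }

    /** Second count: the number of distinct elements in the union of all sets. */
    static long countDistinctElements(List<Entry> data) {
        Set<Integer> union = new HashSet<>();
        for (Entry e : data) {
            union.addAll(e.elements);
        }
        return union.size();
    }

    /** The three tuples from the example: (1,2,[2,3]), (2,1,[3,4]), (3,2,[4,5]). */
    static List<Entry> exampleData() {
        return Arrays.asList(
            new Entry(1, 2, 2, 3),
            new Entry(2, 1, 3, 4),
            new Entry(3, 2, 4, 5));
    }

    public static void main(String[] args) {
        List<Entry> data = exampleData();
        System.out.println("first count: " + countTuples(data));            // 3
        System.out.println("second count: " + countDistinctElements(data)); // 4
    }
}
```

Note that the second count is taken over the union of the sets, so 3 and 4 are each counted once even though they appear in two tuples.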

I tried to run the DataSet through a flatMap and use an aggregator there, putting a HashSet into the Aggregator, but it didn't work!

Do you have any suggestions?

Thanks,

Riccardo

Re: aggregation problem

Vasiliki Kalavri
Hi Riccardo,

Can you please be a bit more specific? What do you mean by "it didn't work"? Did it crash? Did it give you a wrong value? Something else?

-Vasia.

On 28 April 2016 at 16:52, Riccardo Diomedi <[hidden email]> wrote:
Hi everybody

In a DeltaIteration I have a DataSet<Tuple3<K, V, HashSet<K>>> where, at a certain point of the iteration, I need to compute two counts: the total number of tuples, and the total number of distinct elements across the HashSets of all tuples. I then need to send both values to the ConvergenceCriterion function.

Example:

this is the content of my DataSet:
(1,2,[2,3])
(2,1,[3,4])
(3,2,[4,5])

I should get:
first count: 3 (tuples 1, 2, 3)
second count: 4 (distinct elements 2, 3, 4, 5)

I tried to run the DataSet through a flatMap and use an aggregator there, putting a HashSet into the Aggregator, but it didn't work!

Do you have any suggestions?

Thanks,

Riccardo