Get DataSet sum


Giacomo Licari
Hi guys,
how can I obtain the sum of all items (integer or double) in a DataSet?

Do I have to use Flink Iterators? And how?

Thank you,
Giacomo

Re: Get DataSet sum

Maximilian Michels
Hi Giacomo,

If you have your data stored in a Tuple inside a DataSet, then a call to dataSet.sum(int field) should do it.
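For the tuple case, a minimal sketch could look like the following. It assumes the Flink Java DataSet API is on the classpath; the class name TupleSumExample and the sample data are made up for illustration. Only the summed field of the result is read here.

```java
import java.util.List;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class TupleSumExample {

    // Sums tuple field 1 (the Integer) over a small, made-up DataSet.
    public static int sumSecondField() throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> data = env.fromElements(
                new Tuple2<String, Integer>("a", 1),
                new Tuple2<String, Integer>("b", 2),
                new Tuple2<String, Integer>("c", 3));

        // sum(int field) aggregates the tuple field at the given position
        DataSet<Tuple2<String, Integer>> sum = data.sum(1);

        // collect() runs the program and returns the result to the client
        List<Tuple2<String, Integer>> result = sum.collect();
        return result.get(0).f1;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumSecondField());
    }
}
```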

Best,
Max

On Tue, Apr 28, 2015 at 2:52 PM, Giacomo Licari <[hidden email]> wrote:
Hi guys,
how can I obtain the sum of all items (integer or double) in a DataSet?

Do I have to use Flink Iterators? And how?

Thank you,
Giacomo


Re: Get DataSet sum

Fabian Hueske
You can also use Reduce to compute a sum on any data type (e.g., an Integer field in a POJO).
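A rough sketch of the POJO case, assuming the Flink Java DataSet API is on the classpath. The Item class, its fields, and the class name PojoSumExample are hypothetical and only serve as an illustration; the reduce function combines two items at a time by adding their quantity fields.

```java
import java.util.List;

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class PojoSumExample {

    // Hypothetical POJO: public class, public no-argument constructor, public fields.
    public static class Item {
        public String name;
        public int quantity;

        public Item() {}

        public Item(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }
    }

    // Adds up the quantity field of all items in a small, made-up DataSet.
    public static int totalQuantity() throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Item> items = env.fromElements(
                new Item("a", 1), new Item("b", 2), new Item("c", 3));

        // reduce is applied pairwise until a single Item remains
        DataSet<Item> total = items.reduce(new ReduceFunction<Item>() {
            @Override
            public Item reduce(Item left, Item right) {
                return new Item("total", left.quantity + right.quantity);
            }
        });

        // collect() runs the program and returns the result to the client
        List<Item> result = total.collect();
        return result.get(0).quantity;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(totalQuantity());
    }
}
```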

2015-04-28 15:25 GMT+02:00 Maximilian Michels <[hidden email]>:
Hi Giacomo,

If you have your data stored in a Tuple inside a DataSet, then a call to dataSet.sum(int field) should do it.

Best,
Max


Re: Get DataSet sum

Giacomo Licari
Hi Fabian,
is it possible to assign the reduce result to a POJO variable?
At the moment, inside the reduce function, I'm passing the final count to a global variable.

Example:
double X = DataSet<Double> myDataSet.GroupReduce(new MyReducer());

On Tue, Apr 28, 2015 at 9:54 PM, Fabian Hueske <[hidden email]> wrote:
You can also use Reduce to compute a sum on any data type (e.g., an Integer field in a POJO).


Re: Get DataSet sum

Fabian Hueske
Hi Giacomo,

A DataSet is just a logical construct to define data flows. It does not actually hold any data.

Here's a code snippet that sums some Integers and returns the result to the client program:

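import java.util.List;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.DataSet;

DataSet<Integer> data = ...

// sum: reduce combines the elements pairwise until a single value remains
DataSet<Integer> sum = data.reduce(new ReduceFunction<Integer>() {
  @Override
  public Integer reduce(Integer v1, Integer v2) { return v1 + v2; }
});

// fetch result back to the client; collect() triggers execution of the program
List<Integer> values = sum.collect(); // returns a list because a DataSet may contain more than one element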
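
Because a reduce without grouping leaves a single element for non-empty input, the collected list can be unwrapped into a plain variable on the client, which avoids the global variable. A small sketch in plain Java; the helper single and the class SingleResult are hypothetical and not part of Flink, and the list below is only a stand-in for the result of sum.collect():

```java
import java.util.Arrays;
import java.util.List;

public class SingleResult {

    // Hypothetical helper (not a Flink API): returns the only element of a collected result.
    public static <T> T single(List<T> values) {
        if (values.size() != 1) {
            throw new IllegalStateException(
                    "expected exactly one element, got " + values.size());
        }
        return values.get(0);
    }

    public static void main(String[] args) {
        // Stand-in for the list returned by sum.collect()
        List<Integer> values = Arrays.asList(6);

        // The sum is now an ordinary local variable in the client program
        int total = single(values);
        System.out.println(total);
    }
}
```

Checking the size first makes an empty input, or an accidental missing reduce step, fail loudly instead of silently returning a partial value.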


Let me know if you have more questions.

Cheers, Fabian

2015-04-29 10:48 GMT+02:00 Giacomo Licari <[hidden email]>:
Hi Fabian,
is it possible to assign the reduce result to a POJO variable?
At the moment, inside the reduce function, I'm passing the final count to a global variable.

Example:
double X = DataSet<Double> myDataSet.GroupReduce(new MyReducer());
