Multi-field "sum" function just like "keyBy"

Multi-field "sum" function just like "keyBy"

Rami Al-Isawi
Hi,

Is there any reason why "keyBy" accepts multiple fields, while "sum", for example, does not?

-Rami
Disclaimer: This message and any attachments thereto are intended solely for the addressed recipient(s) and may contain confidential information. If you are not the intended recipient, please notify the sender by reply e-mail and delete the e-mail (including any attachments thereto) without producing, distributing or retaining any copies thereof. Any review, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient(s) is prohibited. Thank you.

Re: Multi-field "sum" function just like "keyBy"

Gábor Gévay
Hello,

In the case of "sum", you can just specify them one after the other, like:

stream.sum(1).sum(2)

This works because the sums over the two fields are independent of each
other. In the case of "keyBy", however, information from both fields is
needed at the same time to produce the key.

Best,
Gábor




Re: Multi-field "sum" function just like "keyBy"

Rami Al-Isawi
Thanks Gábor, but the first sum call returns a SingleOutputStreamOperator, and I could not call sum on that again. Could you tell me how you managed to do the following?

stream.sum().sum()

Regards,
-Rami


Re: Multi-field "sum" function just like "keyBy"

Gábor Gévay
Ah, sorry, you are right. You could also call keyBy again before the
second sum, but maybe someone else has a better idea.

Best,
Gábor




Re: Multi-field "sum" function just like "keyBy"

Jamie Grier
I'm assuming what you're trying to do is essentially sum over two different fields of your data.  I would do this with my own ReduceFunction.


stream
  .keyBy("someKey")
  .reduce(new CustomReduceFunction()) // sum whatever fields you want and return the result
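
A minimal sketch of what the logic inside such a CustomReduceFunction might look like. It is written in plain Java so it runs without Flink on the classpath: the Event type, its field names, and the sumByKey helper are all hypothetical, and the loop over a list is only a stand-in for keyBy("someKey").reduce(...). In Flink itself, the body of SUM_BOTH_FIELDS would go into the reduce(value1, value2) method of a ReduceFunction<Event>.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class MultiFieldSum {

    // Hypothetical event type; the key and field names are illustrative only.
    public static final class Event {
        public final String someKey;
        public final long field1;
        public final long field2;

        public Event(String someKey, long field1, long field2) {
            this.someKey = someKey;
            this.field1 = field1;
            this.field2 = field2;
        }
    }

    // Same shape as ReduceFunction<Event>.reduce(a, b): combine two values
    // that share a key into one value, summing both fields in a single pass.
    public static final BinaryOperator<Event> SUM_BOTH_FIELDS =
        (a, b) -> new Event(a.someKey, a.field1 + b.field1, a.field2 + b.field2);

    // Plain-Java stand-in for keyBy("someKey").reduce(...): groups the events
    // by key and folds each group with SUM_BOTH_FIELDS.
    public static Map<String, Event> sumByKey(List<Event> events) {
        Map<String, Event> result = new LinkedHashMap<>();
        for (Event e : events) {
            result.merge(e.someKey, e, SUM_BOTH_FIELDS);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("a", 1, 10),
            new Event("b", 2, 20),
            new Event("a", 3, 30));
        for (Event e : sumByKey(events).values()) {
            System.out.println(e.someKey + " " + e.field1 + " " + e.field2);
        }
    }
}
```

Note that this stand-in only produces the final value per key, whereas a reduce on a keyed stream emits an updated running result for each incoming element. The combining logic is the same in both cases.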

I think it does make sense that Flink could provide a generic sum function that could sum over multiple fields, though.

-Jamie





--

Jamie Grier
data Artisans, Director of Applications Engineering


Re: Multi-field "sum" function just like "keyBy"

Rami Al-Isawi
Thanks Jamie. Yes, your assumption is correct.

I can use keyBy as follows:
stream.keyBy("pojo.field1", "pojo.field2", …)
It would make sense to be able to use sum, for example, on more than one field in the same way:
stream.sum("pojo.field1", "pojo.field2", …)

I have created a Jira issue for it; hopefully it will get picked up someday.

-Rami

