Suggestion for top 'k' products

sandeep6
Hi All,

I'm trying to use Flink for a use case where I want to see my top-selling products in time windows in near real time (windows of 1-2 minutes are fine). I guess this is the most common use case for streaming APIs in e-commerce. I see that I can iterate over the records in a windowed stream and do the sorting myself. I'm wondering if that's the best way. Is there any built-in sort functionality that I missed in the Flink docs?

Thanks,
Sandeep

Re: Suggestion for top 'k' products

Suneel Marthi
A simple way is to populate a priority queue of max size 'k' and implement a comparator on your records. That would ensure that you always have the top 'k' records at any instant in time.
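
For illustration, here is a minimal sketch of the bounded priority queue idea in plain Java. It does not use any Flink API; the class name `TopK` and the method `topK` are hypothetical, and the sketch assumes the records of one window can be iterated over once. The queue is a min-heap under the comparator, so its head is always the weakest of the current top 'k' and is the one evicted when a better record arrives.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Flink: keeps only the k largest records seen.
final class TopK {
    private TopK() {}

    /** Returns the k largest records according to cmp, largest first. */
    static <T> List<T> topK(Iterable<T> records, int k, Comparator<? super T> cmp) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap under cmp: the head is the smallest of the current top k.
        PriorityQueue<T> heap = new PriorityQueue<>(k, cmp);
        for (T record : records) {
            if (heap.size() < k) {
                heap.offer(record);
            } else if (cmp.compare(record, heap.peek()) > 0) {
                // The new record beats the weakest kept record, so replace it.
                heap.poll();
                heap.offer(record);
            }
        }
        // The heap's internal order is not sorted, so sort the survivors explicitly.
        List<T> result = new ArrayList<>(heap);
        result.sort((a, b) -> cmp.compare(b, a));
        return result;
    }
}
```

With n records in the window this takes O(n log k) time and O(k) memory, instead of sorting all n records. For products, the comparator would compare whatever sales figure each record carries.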

On Mon, Mar 13, 2017 at 1:25 PM, Meghashyam Sandeep V <[hidden email]> wrote:
Hi All,

I'm trying to use Flink for a use case where I want to see my top-selling products in time windows in near real time (windows of 1-2 minutes are fine). I guess this is the most common use case for streaming APIs in e-commerce. I see that I can iterate over the records in a windowed stream and do the sorting myself. I'm wondering if that's the best way. Is there any built-in sort functionality that I missed in the Flink docs?

Thanks,
Sandeep


Re: Suggestion for top 'k' products

Suneel Marthi

On Mon, Mar 13, 2017 at 1:29 PM, Suneel Marthi <[hidden email]> wrote:
A simple way is to populate a priority queue of max size 'k' and implement a comparator on your records. That would ensure that you always have the top 'k' records at any instant in time.


Re: Suggestion for top 'k' products

sandeep6
Thanks Suneel. Exactly what I was looking for.

On Mon, Mar 13, 2017 at 10:31 AM, Suneel Marthi <[hidden email]> wrote:


Re: Suggestion for top 'k' products

sandeep6
Is there an equivalent of Spark's 'takeOrdered' function in Flink? If I implement a function to order the messages in the stream, I'm not sure whether it is executed in distributed mode, that is, by splitting the data across the available task manager nodes and then evaluating the function on each.

Thanks,
Sandeep
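
As a rough illustration of the distributed pattern the question describes, the sketch below computes a local top 'k' per partition and then merges the partial results, which is broadly how Spark's `takeOrdered` behaves. It is plain Java rather than Flink code: the class `PartitionedTopK`, its methods, and the use of nested lists to stand in for partitions are all assumptions made for the example, and whether Flink offers a built-in equivalent is not settled in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: nested lists stand in for data split across parallel tasks.
final class PartitionedTopK {
    private PartitionedTopK() {}

    /** Keeps the k largest records of one partition (result is unordered). */
    static <T> List<T> localTopK(List<T> partition, int k, Comparator<? super T> cmp) {
        PriorityQueue<T> heap = new PriorityQueue<>(cmp); // min-heap under cmp
        for (T record : partition) {
            heap.offer(record);
            if (heap.size() > k) {
                heap.poll(); // evict the smallest so at most k records remain
            }
        }
        return new ArrayList<>(heap);
    }

    /** Merges the per-partition candidates into the global top k, largest first. */
    static <T> List<T> globalTopK(List<List<T>> partitions, int k, Comparator<? super T> cmp) {
        List<T> candidates = new ArrayList<>();
        for (List<T> partition : partitions) {
            // In a real job each partition would be processed by a separate task.
            candidates.addAll(localTopK(partition, k, cmp));
        }
        List<T> merged = localTopK(candidates, k, cmp);
        merged.sort((a, b) -> cmp.compare(b, a));
        return merged;
    }
}
```

The merge step only ever sees at most 'k' records per partition, so the final, non-parallel stage stays small even when the partitions themselves are large.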

On Mon, Mar 13, 2017 at 11:22 AM, Meghashyam Sandeep V <[hidden email]> wrote:
Thanks Suneel. Exactly what I was looking for.
