Flink keyed stream windows

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Flink keyed stream windows

Garvit Sharma
Hi,

I am working on a use case where I have a stream of users active locations and I want to store(by hitting an HTTP API) the latest active location for each of the unique users every 30 minutes.

Since I have a lot of unique users(rpm 1.5 million), how to use Flink's timed windows on keyed stream to solve this problem.

Please help!

Thanks,

--

Garvit Sharma
github.com/garvitlnmiit/

No Body is a Scholar by birth, its only hard work and strong determination that makes him master.
Reply | Threaded
Open this post in threaded view
|

Re: Flink keyed stream windows

Garvit Sharma
Clarification: Its 30 Seconds not 30 minutes.

On Mon, Aug 13, 2018 at 3:20 PM Garvit Sharma <[hidden email]> wrote:
Hi,

I am working on a use case where I have a stream of users active locations and I want to store(by hitting an HTTP API) the latest active location for each of the unique users every 30 minutes.

Since I have a lot of unique users(rpm 1.5 million), how to use Flink's timed windows on keyed stream to solve this problem.

Please help!

Thanks,

--

Garvit Sharma
github.com/garvitlnmiit/

No Body is a Scholar by birth, its only hard work and strong determination that makes him master.


--

Garvit Sharma
github.com/garvitlnmiit/

No Body is a Scholar by birth, its only hard work and strong determination that makes him master.
Reply | Threaded
Open this post in threaded view
|

Re: Flink keyed stream windows

vino yang
Hi Garvit,

Please refer to the Flink official documentation for the window description. [1]
In this scenario, you should use Tumbling Windows. [2]
If you want to call your own API to handle the window, you can extend the process window function to achieve your needs.[3]


Thanks, vino.

Garvit Sharma <[hidden email]> 于2018年8月13日周一 下午5:53写道:
Clarification: Its 30 Seconds not 30 minutes.

On Mon, Aug 13, 2018 at 3:20 PM Garvit Sharma <[hidden email]> wrote:
Hi,

I am working on a use case where I have a stream of users active locations and I want to store(by hitting an HTTP API) the latest active location for each of the unique users every 30 minutes.

Since I have a lot of unique users(rpm 1.5 million), how to use Flink's timed windows on keyed stream to solve this problem.

Please help!

Thanks,

--

Garvit Sharma
github.com/garvitlnmiit/

No Body is a Scholar by birth, its only hard work and strong determination that makes him master.


--

Garvit Sharma
github.com/garvitlnmiit/

No Body is a Scholar by birth, its only hard work and strong determination that makes him master.
Reply | Threaded
Open this post in threaded view
|

Re: Flink keyed stream windows

Ken Krugler
Hi Garvit,

One other point - once you start making HTTP requests, you likely want to use an AsyncFunction to avoid the inefficiencies of your process spending most of its time waiting for the remote server to handle the request.

Which means emitting the results (user + active location) from the window function, and then processing them in a downstream AsyncFunction.

The other choice is to multi-thread your custom process window function, but then reliably recovering from errors becomes challenging.

— Ken


On Aug 13, 2018, at 4:26 AM, vino yang <[hidden email]> wrote:

Hi Garvit,

Please refer to the Flink official documentation for the window description. [1]
In this scenario, you should use Tumbling Windows. [2]
If you want to call your own API to handle the window, you can extend the process window function to achieve your needs.[3]


Thanks, vino.

Garvit Sharma <[hidden email]> 于2018年8月13日周一 下午5:53写道:
Clarification: Its 30 Seconds not 30 minutes.

On Mon, Aug 13, 2018 at 3:20 PM Garvit Sharma <[hidden email]> wrote:
Hi,

I am working on a use case where I have a stream of users active locations and I want to store(by hitting an HTTP API) the latest active location for each of the unique users every 30 minutes.

Since I have a lot of unique users(rpm 1.5 million), how to use Flink's timed windows on keyed stream to solve this problem.

Please help!

Thanks,
----------------------------------------


--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra