Hi,
I am working on a use case where I have a stream of users' active locations, and I want to store (by calling an HTTP API) the latest active location for each unique user every 30 minutes. Since I have a lot of unique users (rpm 1.5 million), how can I use Flink's time windows on a keyed stream to solve this problem? Please help!

Thanks,
Garvit Sharma
github.com/garvitlnmiit/
No Body is a Scholar by birth, its only hard work and strong determination that makes him master.
Clarification: it's 30 seconds, not 30 minutes.

On Mon, Aug 13, 2018 at 3:20 PM Garvit Sharma <[hidden email]> wrote:
Hi Garvit,

Please refer to the official Flink documentation for a description of windows [1]. In this scenario, you should use tumbling windows [2]. If you want to call your own API to handle the contents of each window, you can extend ProcessWindowFunction [3].

[1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/windows.html#windows

Thanks,
vino.

Garvit Sharma <[hidden email]> wrote on Mon, Aug 13, 2018 at 5:53 PM:
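To make the tumbling-window suggestion concrete, here is a minimal plain-Java sketch of the semantics only: events are bucketed into non-overlapping 30-second windows, and within each window the most recent event per user is kept. It deliberately does not use the Flink API; `LatestLocationPerWindow`, `LocationEvent`, and the field names are hypothetical names made up for this illustration. In a Flink job the same idea would roughly correspond to a `keyBy` on the user id followed by a 30-second tumbling window and a reduce or `ProcessWindowFunction` that keeps the newest element.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration (not Flink code) of "latest location per user per window". */
final class LatestLocationPerWindow {

    /** Hypothetical event type: one location report for one user. */
    static final class LocationEvent {
        final String userId;
        final String location;
        final long timestampMillis;

        LocationEvent(String userId, String location, long timestampMillis) {
            this.userId = userId;
            this.location = location;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Window size from the thread: 30 seconds. */
    static final long WINDOW_MILLIS = 30_000L;

    /** Start of the tumbling window that contains the given timestamp. */
    static long windowStart(long timestampMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, WINDOW_MILLIS);
    }

    /**
     * Groups events by window start and, inside each window, keeps only the
     * event with the largest timestamp for every user. On equal timestamps
     * the event that appears later in the input wins.
     */
    static Map<Long, Map<String, LocationEvent>> latestPerUserPerWindow(List<LocationEvent> events) {
        Map<Long, Map<String, LocationEvent>> windows = new TreeMap<>();
        for (LocationEvent e : events) {
            Map<String, LocationEvent> perUser =
                windows.computeIfAbsent(windowStart(e.timestampMillis), k -> new LinkedHashMap<>());
            perUser.merge(e.userId, e,
                (old, cur) -> cur.timestampMillis >= old.timestampMillis ? cur : old);
        }
        return windows;
    }
}
```

Because only one element per user survives each window, the number of HTTP calls per 30 seconds is bounded by the number of active users rather than by the raw event rate.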
Hi Garvit,

One option is to emit the results (user + active location) from the window function and then process them in a downstream AsyncFunction. The other choice is to multi-thread your custom process window function, but then reliably recovering from errors becomes challenging.

-- Ken
--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra
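As a rough illustration of the "emit from the window, then write asynchronously downstream" idea in Ken's reply, the following plain-Java sketch issues writes without blocking on each response while capping the number of requests in flight. It is not Flink code: `AsyncLocationWriter` and the `sender` callback are hypothetical stand-ins, and a real job would more likely put the HTTP call inside Flink's Async I/O operator (an `AsyncFunction` applied via `AsyncDataStream`), whose capacity setting plays a similar role to the semaphore here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.function.BiFunction;

/** Plain-Java illustration (not Flink code) of bounded asynchronous writes. */
final class AsyncLocationWriter {

    // Hypothetical stand-in for the HTTP call: (userId, location) -> success flag.
    private final BiFunction<String, String, Boolean> sender;
    private final ExecutorService pool;
    // Caps the number of requests that may be pending at the same time.
    private final Semaphore inFlight;

    AsyncLocationWriter(BiFunction<String, String, Boolean> sender, int threads, int maxInFlight) {
        this.sender = sender;
        this.pool = Executors.newFixedThreadPool(threads);
        this.inFlight = new Semaphore(maxInFlight);
    }

    /**
     * Submits one write. The caller blocks only while maxInFlight requests are
     * already pending (simple back-pressure). A failed call completes the
     * future with false instead of propagating the exception.
     */
    CompletableFuture<Boolean> write(String userId, String location) {
        inFlight.acquireUninterruptibly();
        try {
            return CompletableFuture
                .supplyAsync(() -> sender.apply(userId, location), pool)
                .exceptionally(error -> false)
                .whenComplete((ok, error) -> inFlight.release());
        } catch (RuntimeException e) {
            // Submission itself failed (for example, the pool was shut down).
            inFlight.release();
            throw e;
        }
    }

    /** Stops accepting new work; already submitted writes still finish. */
    void close() {
        pool.shutdown();
    }
}
```

Mapping failures to `false` keeps the sketch short; what to do with failed writes (retry, dead-letter output, failing the job) is a design decision the thread does not settle.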