Implementing CountWindow in Window Join and continuous joining for 2 datastreams


Tay Zhen Shen

Hi, I'm currently working on a simple stock market analysis with Flink. I need the sum of the last 100 elements (count window, slide size 10) and the sum of the last 20 elements (count window, slide size 10). I realised that I have to calculate the two sums on two separate streams, so I did. Now I have two streams: one containing the sums over 100 elements and the other containing the sums over 20 elements. I want to join the two DataStreams into one. I'm using join, but as far as I can tell it only supports time windows and tumbling windows. Is there any other function that I can use to solve my problem?


Example:

records:

date,price1,price2

date,price1,price2


sum to become:

date,sum_of_price1_for_100_element,sum_of_price2_for_20_element
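
To make the setup above concrete, here is a minimal, dependency-free Java sketch of a sliding count window sum. It is only a model of the behaviour, not Flink code: the class SlidingCountSum and its add method are hypothetical names introduced for illustration. It assumes the semantics of Flink's countWindow(size, slide) on a keyed stream, where a result is emitted every `slide` elements over at most the last `size` elements.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of Flink: models a sliding count window sum.
class SlidingCountSum {
    private final int size;   // maximum elements per window, e.g. 100 or 20
    private final int slide;  // emit a result every `slide` elements, e.g. 10
    private final Deque<Double> buffer = new ArrayDeque<>();
    private long seen = 0;

    SlidingCountSum(int size, int slide) {
        this.size = size;
        this.slide = slide;
    }

    // Adds one price; returns the window sum when the window fires, else null.
    Double add(double price) {
        buffer.addLast(price);
        if (buffer.size() > size) {
            buffer.removeFirst(); // evict the oldest element
        }
        seen++;
        if (seen % slide != 0) {
            return null; // window does not fire on this element
        }
        double sum = 0.0;
        for (double p : buffer) {
            sum += p;
        }
        return sum;
    }

    public static void main(String[] args) {
        SlidingCountSum sum100 = new SlidingCountSum(100, 10);
        SlidingCountSum sum20 = new SlidingCountSum(20, 10);
        for (int i = 1; i <= 30; i++) {
            Double a = sum100.add(1.0);
            Double b = sum20.add(1.0);
            if (a != null) {
                // Both windows share the slide, so they fire on the same element.
                System.out.println(i + "," + a + "," + b);
            }
        }
    }
}
```

Because both windows use the same slide of 10, they fire on the same input elements in this model, so each 100-element sum has exactly one matching 20-element sum.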

Re: Implementing CountWindow in Window Join and continuous joining for 2 datastreams

Fabian Hueske-2
Hi,

If I understood your problem correctly, you want to join two records, one from each windowed stream.
You can do this by keying and connecting the two streams and applying a stateful CoFlatMapFunction or CoProcessFunction to join them.

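DataStream<X> windowed1 = ...; // sums over the 100-element count window
DataStream<Y> windowed2 = ...; // sums over the 20-element count window

// x and y are key selectors that extract the join key from each record
windowed1.keyBy(x).connect(windowed2.keyBy(y)).flatMap(new YourJoiningCoFlatMapFunction());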
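
As a rough illustration of the joining logic, here is a dependency-free Java sketch that mirrors the two callbacks of a CoFlatMapFunction (flatMap1 and flatMap2). The class KeyedJoinSketch is hypothetical, and the sketch assumes that the join key is the date and that each date appears once per stream. The HashMaps stand in for what would be keyed state (for example a ValueState inside a RichCoFlatMapFunction) in a real Flink job, and the output list stands in for the Collector.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, dependency-free model of a stateful two-input join.
class KeyedJoinSketch {
    // Per-key buffers; in Flink these would be keyed state, not maps.
    private final Map<String, Double> pendingSum100 = new HashMap<>();
    private final Map<String, Double> pendingSum20 = new HashMap<>();

    // Called for each record of the first stream (100-element sums).
    void flatMap1(String date, double sum100, List<String> out) {
        Double sum20 = pendingSum20.remove(date);
        if (sum20 == null) {
            pendingSum100.put(date, sum100); // other side not seen yet: buffer
        } else {
            out.add(date + "," + sum100 + "," + sum20); // both sides present: emit
        }
    }

    // Called for each record of the second stream (20-element sums).
    void flatMap2(String date, double sum20, List<String> out) {
        Double sum100 = pendingSum100.remove(date);
        if (sum100 == null) {
            pendingSum20.put(date, sum20); // other side not seen yet: buffer
        } else {
            out.add(date + "," + sum100 + "," + sum20); // both sides present: emit
        }
    }

    public static void main(String[] args) {
        KeyedJoinSketch join = new KeyedJoinSketch();
        List<String> out = new ArrayList<>();
        join.flatMap1("2018-02-23", 1050.0, out); // buffered, nothing emitted
        join.flatMap2("2018-02-23", 210.0, out);  // match found, joined record emitted
        System.out.println(out);
    }
}
```

Whichever side arrives first is buffered under its key, and the joined record is emitted and the buffer cleared as soon as the matching record from the other stream arrives, so the result does not depend on arrival order.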

Best, Fabian


2018-02-24 7:58 GMT+01:00 Tay Zhen Shen <[hidden email]>:
