(DEPRECATED) Apache Flink User Mailing List archive.

streaming join implementation

Posted by Henry Cai on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/streaming-join-implementation-tp6095.html

Hi,

We are evaluating different streaming platforms. For a typical join between two streams:

SELECT a.*, b.*
FROM a JOIN b
ON a.id = b.id

How does Flink implement the join? The matching record from either stream can arrive late; we consider the join valid as long as the event times of records a and b fall on the same day.

I think some streaming platforms (e.g. Google Cloud Dataflow) store the records from both streams in a K/V lookup store and do the lookup later. Is this how Flink implements the streaming join?

If we need to store all the records in a state store, that is going to be a lot of records for a day.
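
To make the question concrete, here is a minimal sketch of the store-and-lookup approach I have in mind, written in plain Python rather than against any platform's API. It is not a claim about how Flink or Dataflow work internally. The names DayJoin, on_a, on_b, expire and state_size are hypothetical, and I am assuming each record is a dict with an "id" field and an event-time "ts" field in milliseconds, with days taken in UTC.

```python
from collections import defaultdict
from datetime import datetime, timezone


def event_day(ts_millis):
    """Return the UTC calendar day of an event-time timestamp in milliseconds."""
    return datetime.fromtimestamp(ts_millis / 1000, tz=timezone.utc).date()


class DayJoin:
    """Hypothetical symmetric join: buffer both streams per (id, day) and
    probe the opposite buffer whenever a record arrives."""

    def __init__(self):
        # State: (id, day) -> records seen so far on each side.
        self.left = defaultdict(list)
        self.right = defaultdict(list)

    def on_a(self, record):
        key = (record["id"], event_day(record["ts"]))
        self.left[key].append(record)
        # Records of b that arrived earlier for the same id and day match now.
        return [(record, b) for b in self.right.get(key, [])]

    def on_b(self, record):
        key = (record["id"], event_day(record["ts"]))
        self.right[key].append(record)
        return [(a, record) for a in self.left.get(key, [])]

    def expire(self, watermark_millis):
        """Drop buffered state for days that ended before the watermark's day."""
        cutoff = event_day(watermark_millis)
        for buf in (self.left, self.right):
            for key in [k for k in buf if k[1] < cutoff]:
                del buf[key]

    def state_size(self):
        """Number of records currently held across both buffers."""
        return sum(len(v) for v in self.left.values()) + sum(
            len(v) for v in self.right.values()
        )
```

In this sketch every record of both streams stays buffered until expire is called with a watermark past the end of its day, which is exactly the state size I am worried about.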