streaming join implementation
Posted by
Henry Cai on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/streaming-join-implementation-tp6095.html
Hi,
We are evaluating different streaming platforms. For a typical join between two streams
select a.*, b.*
FROM a, b
How does flink implement the join? The matching record from either stream can come late, we consider it's a valid join as long as the event time for record a and b are in the same day.
I think some streaming platform (e.g. google data flow) will store the records from both streams in a K/V lookup store and later do the lookup. Is this how flink implement the streaming join?
If we need to store all the records in a state store, that's going to be a lots of records for a day.