Best pattern for achieving stream enrichment (side-input) from a large static source
Posted by Nimrod Hauser
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Best-pattern-for-achieving-stream-enrichment-side-input-from-a-large-static-source-tp25771.html
Hello,
We're using Flink on a high-velocity data stream, and we're looking for the best way to enrich it from a large static source (originating from Parquet files, which are rarely updated).
The enrichment source weighs a few GB, which is why we want to avoid techniques such as broadcast streams: broadcast state cannot be keyed, so it has to be duplicated in full for every Flink operator instance that uses it.
We have started looking into merging streams with DataSets, or using the Table API, but any best practice that has worked for others would be greatly appreciated.
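To make the memory concern concrete, here is a minimal, Flink-free sketch of the trade-off. It is only an illustration under stated assumptions: the parallelism of 4, the `user-N` keys, the `segment` field, and every function name are hypothetical, and the CRC32-based `partition_for` is a stand-in for key partitioning, not Flink's actual key-group assignment.

```python
import zlib

def partition_for(key: str, parallelism: int) -> int:
    # Deterministic stand-in for hash partitioning by key (NOT Flink's algorithm).
    return zlib.crc32(key.encode("utf-8")) % parallelism

def broadcast_layout(reference: dict, parallelism: int) -> list:
    # Broadcast-style: every parallel instance keeps a full copy of the source.
    return [dict(reference) for _ in range(parallelism)]

def keyed_layout(reference: dict, parallelism: int) -> list:
    # Keyed-style: each instance keeps only the rows whose key hashes to it.
    parts = [dict() for _ in range(parallelism)]
    for key, value in reference.items():
        parts[partition_for(key, parallelism)][key] = value
    return parts

def enrich(event: dict, parts: list) -> dict:
    # Route the event to the instance that owns its key, then look up locally.
    key = event["id"]
    local = parts[partition_for(key, len(parts))]
    return {**event, "enrichment": local.get(key)}

# Hypothetical reference data standing in for the Parquet-backed source.
reference = {f"user-{i}": {"segment": i % 3} for i in range(1000)}

broadcast = broadcast_layout(reference, 4)
keyed = keyed_layout(reference, 4)

broadcast_rows = sum(len(p) for p in broadcast)  # 4000: one full copy per instance
keyed_rows = sum(len(p) for p in keyed)          # 1000: each row stored exactly once

enriched = enrich({"id": "user-42", "value": 7}, keyed)
```

With the keyed layout the total footprint stays at one copy of the source regardless of parallelism, at the cost of having to route both the events and the reference rows by the same key. In Flink terms this would roughly correspond to keying both inputs by the join key and keeping the reference rows in keyed state, though whether that fits our case is exactly what we are asking.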
I'm attaching a simple chart for convenience.
Thank you very much,
Nimrod.