Re: data enrichment with SQL use case
Posted by
Fabian Hueske-2 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/data-enrichment-with-SQL-use-case-tp19520p19689.html
Hi Miki,
Sorry for the late response.
There are basically two ways to implement an enrichment join as in your use case.
1) Keep the meta data in the database and implement a job that reads the stream from Kafka and queries the database in an AsyncIO operator for every stream record. This is the easier implementation, but it sends one query to the DB for each streamed record.
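To make the first pattern concrete, here is a minimal, Flink-independent sketch of per-record asynchronous enrichment (what Flink's AsyncIO operator does for you). The in-memory META_DB dictionary and the record/field names are purely illustrative stand-ins for a real database and schema:

```python
import asyncio

# Hypothetical in-memory "database" standing in for the real metadata store.
META_DB = {"u1": "gold", "u2": "silver"}

async def lookup(key):
    # Simulate an asynchronous round trip to the database.
    await asyncio.sleep(0)
    return META_DB.get(key, "unknown")

async def enrich(records):
    # Issue one query per streamed record, as the AsyncIO operator would,
    # overlapping the lookups instead of blocking on each one.
    tiers = await asyncio.gather(*(lookup(r["user"]) for r in records))
    return [dict(r, tier=t) for r, t in zip(records, tiers)]

stream = [{"user": "u1", "amount": 10}, {"user": "u3", "amount": 5}]
enriched = asyncio.run(enrich(stream))
```

The key property is that the lookups overlap instead of stalling the stream, but the database still sees one query per record.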
2) Replicate the meta data into Flink state and join the streamed records with the state. This solution is more complex because you need to propagate updates of the meta data (if there are any) into the Flink state. At the moment, Flink lacks a few features for a good implementation of this approach, but there are some workarounds that help in certain cases.
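The second pattern can be sketched as two handlers sharing local state, which is roughly what a Flink connected-streams job with keyed state does: one input carries metadata updates that are written into state, the other carries records that are joined against it. The dict standing in for keyed state and all names are illustrative, and this ignores the hard parts (ordering of updates vs. records, checkpointing) that the workarounds address:

```python
# Local "state" standing in for Flink keyed state.
state = {}

def on_meta_update(key, value):
    # Propagate a metadata change into the replicated state.
    state[key] = value

def on_record(record):
    # Join the streamed record with the currently replicated metadata.
    meta = state.get(record["user"])
    return dict(record, tier=meta if meta is not None else "unknown")

on_meta_update("u1", "gold")
joined = on_record({"user": "u1", "amount": 10})
on_meta_update("u1", "platinum")  # a later update overrides earlier state
joined2 = on_record({"user": "u1", "amount": 3})
```

Note that a record is always joined against whatever metadata has arrived so far, which is exactly why update propagation and ordering make this approach tricky.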
Note that Flink's SQL support does not offer advantages for either of these approaches. You should use the DataStream API (and possibly ProcessFunctions).
I'd go for the first approach if one query per record is feasible.
Let me know if you need to tackle the second approach and I can give some details on the workarounds I mentioned.
Best, Fabian