Enrich streaming data with small lookup data that slowly changes over time

Mu Kong
Hi community,

I have a stream of traffic data with a service_id in it.
I'm enriching this data with a map of (service_id, service_name) pairs, which has only 10 to 20 entries and is read from a config file.

The problem I'm facing now is that this map changes from time to time, and I don't want to redeploy the application just to change the map in the config file.

Is there an existing solution for this problem?

Thanks in advance!

Best regards,
Mu

Re: Enrich streaming data with small lookup data that slowly changes over time

Jark Wu-3
Hi Mu,

Flink SQL supports dimension table joins, and there are two ways to do one.
If the data is in a database (e.g. MySQL, HBase), you can use approach [1] to look it up in real time, so the stream is always enriched with fresh data.
If the data is in a log stream (change stream), you can use approach [2] to join it.
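
As a rough illustration of the idea behind approach [1] (this is plain Python, not Flink API, and it assumes [1] means a processing-time lookup against the external table): each record looks up the dimension data when it is processed, and a small cache reloads the map once it is older than a TTL, so changes show up without a redeploy. The names `CachedLookup`, `enrich`, `ttl_seconds` and the injectable `clock` are hypothetical and exist only for this sketch.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple


class CachedLookup:
    """Hypothetical helper: processing-time lookup backed by a TTL cache."""

    def __init__(
        self,
        load: Callable[[], Dict[int, str]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load            # reads the full (service_id -> service_name) map
        self._ttl = ttl_seconds      # how long a loaded snapshot stays valid
        self._clock = clock          # injectable so the sketch can be tested
        self._table: Dict[int, str] = {}
        self._loaded_at: Optional[float] = None

    def get(self, service_id: int) -> Optional[str]:
        now = self._clock()
        # Reload on first use, or when the cached snapshot has expired.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._table = dict(self._load())  # copy, so later source edits need a reload
            self._loaded_at = now
        return self._table.get(service_id)


def enrich(
    events: Iterable[Tuple[int, str]], lookup: CachedLookup
) -> Iterator[Tuple[int, Optional[str], str]]:
    """Attach service_name to each (service_id, payload) event; None if unknown."""
    for service_id, payload in events:
        yield service_id, lookup.get(service_id), payload


if __name__ == "__main__":
    source = {1: "checkout"}   # stands in for the external table (assumption)
    fake_now = [0.0]           # fake clock, so the demo does not have to sleep
    lookup = CachedLookup(lambda: source, ttl_seconds=60, clock=lambda: fake_now[0])

    print(list(enrich([(1, "GET /cart")], lookup)))

    source[1] = "checkout-v2"  # the map changes in the external store
    fake_now[0] = 61.0         # the cached snapshot is now older than the TTL
    print(list(enrich([(1, "GET /pay")], lookup)))
```

The trade-off in this sketch is staleness versus load: a shorter TTL picks up changes sooner but reads the external store more often. With only 10 to 20 pairs, reloading the whole map on expiry is cheap.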

Best,
Jark



Re: Enrich streaming data with small lookup data that slowly changes over time

Mu Kong
Hi Jark Wu,

Thanks for your help!
I took a quick look at the documentation, and it seems method [1] fits my purpose better.
I'll take a deeper look at it.

Thank you very much!!

Best,
Mu

