Mixing Batch & Streaming

Francis Aranda
Hi everyone,

Is there any way of mixing DataStreams and DataSets? For example, enriching messages from a DataStream with precalculated information held in a DataSet.

Thanks in advance!

Re: Mixing Batch & Streaming

Fabian Hueske-2
Hi,

This is not supported yet. However, the feature is on our roadmap and has been requested a few times, so I hope somebody will pick it up soon.

If the static data set is small enough, you can read the full data set (e.g., from a file) in the open() method of a RichFlatMapFunction, build a hash table, and do a hash join in the flatMap() method.
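
A minimal sketch of this hash-join pattern, written as plain Java so it runs without Flink on the classpath. The class name `EnrichingFlatMap`, the `key,value` line format, and the in-memory list standing in for the file are all assumptions made for illustration. In an actual job the class would extend RichFlatMapFunction, open() would take Flink's Configuration argument, and flatMap() would emit through a Collector rather than a List.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for a rich flat-map function that joins against static data. */
class EnrichingFlatMap {
    // Build side of the join: filled once in open(), before any event is processed.
    private final Map<String, String> lookup = new HashMap<>();
    // Hypothetical stand-in for the lines of the static file ("key,info" per line).
    private final List<String> staticLines;

    EnrichingFlatMap(List<String> staticLines) {
        this.staticLines = staticLines;
    }

    /** Called once per task instance. Real code would read the file here. */
    void open() {
        for (String line : staticLines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                lookup.put(parts[0], parts[1]);
            }
        }
    }

    /** Probe side: events are "key,payload". Unmatched events are dropped (inner join). */
    void flatMap(String event, List<String> out) {
        String key = event.split(",", 2)[0];
        String info = lookup.get(key);
        if (info != null) {
            out.add(event + "," + info);
        }
    }
}
```

Each parallel task instance would hold its own full copy of the table, which is why this only suits static data that fits comfortably in memory.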

Best, Fabian

In reply to the message from Don Frascuchon, 2016-01-28 8:54 GMT+01:00.


Re: Mixing Batch & Streaming

Nick Dimiduk
If the dataset is too large for a file, you can put it behind a service and have your stream operators query the service for enrichment. You can even support updates to that dataset in a style very similar to the "lambda architecture" discussed elsewhere.
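
A rough sketch of what such a lookup could look like from inside a stream operator, in plain Java. The `ServiceEnricher` class, the `Function<String, String>` standing in for the service client, and the small LRU cache in front of it are all assumptions for illustration; the cache is an addition to reduce per-event round trips, not something the pattern requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Enriches events by querying an external service, with a bounded LRU cache in front. */
class ServiceEnricher {
    // Hypothetical service client: returns the enrichment for a key, or null if unknown.
    private final Function<String, String> service;
    private final Map<String, String> cache;

    ServiceEnricher(Function<String, String> service, final int maxEntries) {
        this.service = service;
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns "payload,info" when the service knows the key, otherwise the payload unchanged. */
    String enrich(String key, String payload) {
        String info = cache.get(key);
        if (info == null) {
            info = service.apply(key);
            if (info != null) {
                cache.put(key, info);
            }
        }
        return info == null ? payload : payload + "," + info;
    }
}
```

If the data behind the service can change, cached entries go stale, so a real operator would need an expiry policy or would have to skip the cache for keys that must be fresh.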

In reply to the message from Fabian Hueske, Thursday, January 28, 2016.


Re: Mixing Batch & Streaming

sskhiri
Nick, Fabian,

Is there any update on this point?
Is the pattern described by Nick still the only way to enrich an event with information from a DB? Is there any way to load a table through a table and query it from the stream?

Thanks,

Sabri.

Re: Mixing Batch & Streaming

Jeyhun Karimov
Hi all,

We are currently working on this issue, to make mixing DataStream windows with DataSets efficient.
In the meantime, the simplest solution would be to write each window to HDFS as a sequential file and then run the computation on that data source as a DataSet.
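
To make the idea concrete, here is a small plain-Java sketch under loud assumptions: a local directory stands in for HDFS, plain text files stand in for the sequential files, and the `WindowSpill` class with its tumbling-window assignment is hypothetical. In a Flink setup the first method would correspond to a windowed stream writing to a file sink, and the second to a batch job reading the finished file as a DataSet.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WindowSpill {
    /**
     * Streaming side: assign each event to a tumbling window by its
     * (non-negative) timestamp and write one file per window, in window order.
     */
    static List<Path> writeWindows(long[] timestamps, int[] values,
                                   long windowMillis, Path dir) throws IOException {
        Map<Long, List<String>> windows = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long start = timestamps[i] - (timestamps[i] % windowMillis);
            windows.computeIfAbsent(start, k -> new ArrayList<>())
                   .add(Integer.toString(values[i]));
        }
        List<Path> files = new ArrayList<>();
        for (Map.Entry<Long, List<String>> w : windows.entrySet()) {
            Path file = dir.resolve("window-" + w.getKey() + ".txt");
            Files.write(file, w.getValue()); // one value per line
            files.add(file);
        }
        return files;
    }

    /** Batch side: treat a finished window file as a bounded input and aggregate it. */
    static int sumWindow(Path file) throws IOException {
        int sum = 0;
        for (String line : Files.readAllLines(file)) {
            sum += Integer.parseInt(line);
        }
        return sum;
    }
}
```

The batch computation can only start once a window's file is complete, so this approach trades latency for simplicity.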




In reply to the message from sskhiri, Fri, Mar 4, 2016 at 4:05 PM.