Join two streams using a count-based window


Nikos R. Katsipoulakis
Hello all,

First, I have a question posted at http://stackoverflow.com/questions/37732978/join-two-streams-using-a-count-based-window. I am re-posting it on the mailing list in case some of you are not on SO.

In addition, I would like to know how Flink differs from other streaming engines in the granularity at which data is transported and processed. To be more precise, I am aware that Storm sends tuples over Netty (by filling up queues) and that a Bolt's logic is executed per tuple. Spark employs micro-batches to simulate streaming, and (I am not entirely certain) each task processes one micro-batch. What about Flink? How are tuples transferred and processed? Any explanation and/or article/blog post/link is more than welcome.

Thanks

--
Nikos R. Katsipoulakis, 
Department of Computer Science 
University of Pittsburgh

Re: Join two streams using a count-based window

Matthias J. Sax
I just posted an answer on SO.
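
The answer itself is not quoted here, so as a rough sketch only (not the
exact SO answer; the event types, the key field, and the window size N are
illustrative assumptions): a count-based join can be expressed by keying
both streams on the join key, connecting them, and buffering the last N
elements of each side in a CoFlatMapFunction:

import java.util.ArrayDeque;
import java.util.Deque;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class CountWindowJoin {

    // Keeps the last N elements of each input (per parallel subtask) and
    // joins every new element against the buffered elements of the other side.
    public static class LastNJoin implements
            CoFlatMapFunction<Tuple2<String, Integer>,
                              Tuple2<String, Integer>,
                              Tuple2<Integer, Integer>> {

        private final int n;
        private final Deque<Tuple2<String, Integer>> left = new ArrayDeque<>();
        private final Deque<Tuple2<String, Integer>> right = new ArrayDeque<>();

        public LastNJoin(int n) {
            this.n = n;
        }

        @Override
        public void flatMap1(Tuple2<String, Integer> value,
                             Collector<Tuple2<Integer, Integer>> out) {
            if (left.size() == n) {
                left.removeFirst();              // count-based eviction of the oldest element
            }
            left.addLast(value);
            for (Tuple2<String, Integer> other : right) {
                if (value.f0.equals(other.f0)) { // match on the join key
                    out.collect(Tuple2.of(value.f1, other.f1));
                }
            }
        }

        @Override
        public void flatMap2(Tuple2<String, Integer> value,
                             Collector<Tuple2<Integer, Integer>> out) {
            if (right.size() == n) {
                right.removeFirst();
            }
            right.addLast(value);
            for (Tuple2<String, Integer> other : left) {
                if (value.f0.equals(other.f0)) {
                    out.collect(Tuple2.of(other.f1, value.f1));
                }
            }
        }
    }

    // keyBy both inputs on field 0 so that matching keys meet in the same
    // parallel instance, then connect them and apply the join function.
    public static DataStream<Tuple2<Integer, Integer>> lastNJoin(
            DataStream<Tuple2<String, Integer>> s1,
            DataStream<Tuple2<String, Integer>> s2,
            int n) {
        return s1.keyBy(0)
                 .connect(s2.keyBy(0))
                 .flatMap(new LastNJoin(n));
    }
}

Note that the deques are plain instance fields, so the buffers are held per
parallel subtask (not per key) and are not covered by Flink's fault
tolerance; for a production job, keyed state would be the more robust choice.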

Regarding the other questions: Flink processes tuples one by one and does some
internal buffering. You might be interested in
https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
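
As a minimal sketch of what that buffering means in practice (the timeout
value below is arbitrary, not a recommendation), the latency/throughput
trade-off can be tuned via the buffer timeout:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BufferTimeoutExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // -1 : flush a network buffer only when it is full (maximum throughput)
        //  0 : flush after every record (minimum latency, highest overhead)
        // >0 : flush a partially filled buffer at the latest after this many milliseconds
        env.setBufferTimeout(10);
    }
}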

-Matthias




Re: Join two streams using a count-based window

Nikos R. Katsipoulakis
Thank you very much, Matthias! The link you provided is also very helpful.

Cheers,
Nikos

--
Nikos R. Katsipoulakis, 
Department of Computer Science 
University of Pittsburgh