[Discuss] Ordering of Records in Stream

Vinay Patil
Hi,

I have a few questions about record ordering.

I have two different streams, stream1 and stream2, and the elements in each of them are in order.

1) When I do a keyBy on each of these streams, will the order be maintained? (Every group will be sent to only one task manager.)
My understanding is that the records will stay in order within a group; please correct me if I am wrong.

2) After the keyBy on both streams, I do a co-group to get the matching and non-matching records. Will the order be maintained here as well, since this also works on a KeyedStream?
I am using EventTime, with AscendingTimestampExtractor to generate timestamps and watermarks.

3) Next, I want to perform a sequence check on the matching_nonMatchingStream I get from 2), using map/flatMap.
Do I need to perform the keyBy again here, or, if I keep the operators chained, will the matching_nonMatchingStream run in the same TaskManager?
My understanding is that chaining will work here, but I am getting confused; please correct me.

4) slotSharingGroup: can you please explain this in more detail?
According to the docs: "Sets the slot sharing group of this operation. Parallel instances of operations that are in the same slot sharing group will be co-located in the same TaskManager slot, if possible."


Regards,
Vinay Patil
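
As a rough illustration of question 1), the plain-Java sketch below simulates hash partitioning of a single ordered input without using Flink at all. It relies on the assumption, taken from Flink's documentation, that order is preserved between one sending and one receiving parallel instance. The `Event` class, `partitionByKey`, `isOrderedPerKey`, and the modulo-hash routing are hypothetical simplifications for illustration, not Flink's actual partitioner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyByOrderSketch {

    /** Hypothetical record type: a key plus a per-key sequence number. */
    public static final class Event {
        public final String key;
        public final int seq;

        public Event(String key, int seq) {
            this.key = key;
            this.seq = seq;
        }
    }

    /**
     * Routes each record, in arrival order, to one of `parallelism` partitions
     * chosen by key hash. Simplification: Flink's real partitioner is different.
     */
    public static List<List<Event>> partitionByKey(List<Event> input, int parallelism) {
        List<List<Event>> partitions = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Event e : input) {
            // All records with the same key land in the same partition.
            int target = Math.floorMod(e.key.hashCode(), parallelism);
            partitions.get(target).add(e);
        }
        return partitions;
    }

    /** True if, in every partition, each key's sequence numbers strictly increase. */
    public static boolean isOrderedPerKey(List<List<Event>> partitions) {
        for (List<Event> partition : partitions) {
            Map<String, Integer> lastSeq = new HashMap<>();
            for (Event e : partition) {
                Integer previous = lastSeq.put(e.key, e.seq);
                if (previous != null && previous >= e.seq) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // One ordered input, standing in for stream1 read by a single sender.
        List<Event> stream1 = List.of(
                new Event("a", 1), new Event("b", 1),
                new Event("a", 2), new Event("c", 1),
                new Event("b", 2), new Event("a", 3));
        List<List<Event>> partitions = partitionByKey(stream1, 2);
        System.out.println("ordered per key: " + isOrderedPerKey(partitions));
    }
}
```

The sketch covers only a single sender. With several parallel senders feeding the same receiver, the interleaving between senders is not determined, so per-key order would hold only if each key comes from one sender.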

Re: [Discuss] Ordering of Records

rmetzger0
These questions are already being answered in a parallel thread on Stack Overflow: http://stackoverflow.com/questions/38354713/ordering-of-records-in-stream


On Tue, Jul 12, 2016 at 7:12 PM, vinay patil <[hidden email]> wrote:
> [original post, quoted in full]

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Discuss-Ordering-of-Records-tp7933.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.

Re: [Discuss] Ordering of Records

Vinay Patil
Hi Robert,

That is the same question; I posted it on Stack Overflow as well since I was not getting any reply here :)

Regards,
Vinay Patil

On Thu, Jul 14, 2016 at 6:55 PM, rmetzger0 [via Apache Flink User Mailing List archive.] <[hidden email]> wrote:
> [previous message, quoted in full]