Events B33/35 in Parallel Streams Diagram

Neil Derraugh
Hi,

I’m confused about the meaning of event(s?) B33 and B35 in the Parallel Streams Diagram (https://ci.apache.org/projects/flink/flink-docs-master/dev/event_time.html#watermarks-in-parallel-streams). Why are there two events with the same id on the diagram? Is this supposed to represent an event emitted twice from the source with differing timestamps?

Thanks,
Neil

Re: Events B33/35 in Parallel Streams Diagram

Fabian Hueske-2
Hi Neil,

"B" refers only to the key part of the record; the number is the timestamp (as you assumed). The payload of the record is not displayed in the figure. So B35 and B31 are two different records with an identical key.
The keyBy() operation sends all records with the same key to the same subtask.
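
To make the routing concrete, here is a minimal plain-Java sketch of the idea, not Flink code. The class name KeyRoutingSketch, the Event type, and the "hashCode modulo parallelism" rule are all assumptions made for illustration; Flink's actual assignment goes through key groups rather than a bare modulo. The point is only that the subtask is derived from the key alone, so two records with key "B" and different timestamps end up together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KeyRoutingSketch {

    /** A record as drawn in the figure: key plus event timestamp (payload omitted). */
    public static final class Event {
        public final String key;
        public final long timestamp;

        public Event(String key, long timestamp) {
            this.key = key;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return key + timestamp;
        }
    }

    /** Hypothetical routing rule: depends on the key only, never on the timestamp. */
    public static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /** Groups events by the subtask their key maps to, preserving arrival order. */
    public static Map<Integer, List<Event>> route(List<Event> events, int parallelism) {
        Map<Integer, List<Event>> bySubtask = new TreeMap<>();
        for (Event e : events) {
            int subtask = subtaskFor(e.key, parallelism);
            bySubtask.computeIfAbsent(subtask, k -> new ArrayList<>()).add(e);
        }
        return bySubtask;
    }

    public static void main(String[] args) {
        // Two distinct records share the key "B" but carry different timestamps.
        List<Event> events = List.of(new Event("B", 35), new Event("A", 30), new Event("B", 31));
        route(events, 2).forEach((subtask, evs) ->
                System.out.println("subtask " + subtask + ": " + evs));
    }
}
```

Running main prints one line per subtask; both "B" records always appear on the same line, whatever their timestamps.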

Does that answer your question?

Best, Fabian

Re: Events B33/35 in Parallel Streams Diagram

Neil Derraugh
Hi Fabian,

Yes. Thanks! I think it would be helpful to indicate that on the graph: call it “key” or “key_id” instead of just “id”, since it is in fact the key of the stream and not the id of the event. It probably seems trivial, but I struggled with this one, haha. I’ll submit a PR for the docs if there’s interest.

Neil

Re: Events B33/35 in Parallel Streams Diagram

Fabian Hueske-2
Sure, that would be great!
Thanks!
