Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

潘 功森

Hi all,


I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.

And then I changed parallelism and different with map, and windows fired.

I used Flink 1.3.2.

Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.


Best regards,

September

Reply | Threaded
Open this post in threaded view
|

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

Timo Walther
Hi,

did you set your time characteristics to even-time?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Regards,
Timo

Am 25.04.18 um 05:15 schrieb 潘 功森:

> Hi all,
>
>
> I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.
>
> And then I changed parallelism and different with map, and windows fired.
>
> I used Flink 1.3.2.
>
> Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.
>
>
> Best regards,
>
> September


Reply | Threaded
Open this post in threaded view
|

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

Fabian Hueske-2
Hi,

This sounds like one of the partitions does not receive data. Watermark generation is data driven, i.e., the watermark can only advance if the TimestampAndWatermarkAssigner sees events.
By changing the parallelism between the map and the assigner, the events are shuffled across and hence there is no "empty" partition anymore.

I would check if one instance of your sources does not emit events.

Best, Fabian

2018-04-25 10:43 GMT+02:00 Timo Walther <[hidden email]>:
Hi,

did you set your time characteristics to even-time?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Regards,
Timo

Am 25.04.18 um 05:15 schrieb 潘 功森:

Hi all,


I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.

And then I changed parallelism and different with map, and windows fired.

I used Flink 1.3.2.

Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.


Best regards,

September



Reply | Threaded
Open this post in threaded view
|

答复: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

潘 功森
In reply to this post by Timo Walther

Yes.




发件人: Timo Walther <[hidden email]>
发送时间: 2018年4月25日 8:43
收件人: [hidden email]
主题: Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?
 
Hi,

did you set your time characteristics to even-time?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Regards,
Timo

Am 25.04.18 um 05:15 schrieb 潘 功森:
> Hi all,
>
>
> I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.
>
> And then I changed parallelism and different with map, and windows fired.
>
> I used Flink 1.3.2.
>
> Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.
>
>
> Best regards,
>
> September


Reply | Threaded
Open this post in threaded view
|

答复: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

潘 功森
In reply to this post by Fabian Hueske-2

The event is running all the time in order,I don't know why one of the partitions does not receive data if not change parallelism?




发件人: Fabian Hueske <[hidden email]>
发送时间: 2018年4月25日 10:56
收件人: Timo Walther
抄送: user
主题: Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?
 
Hi,

This sounds like one of the partitions does not receive data. Watermark generation is data driven, i.e., the watermark can only advance if the TimestampAndWatermarkAssigner sees events.
By changing the parallelism between the map and the assigner, the events are shuffled across and hence there is no "empty" partition anymore.

I would check if one instance of your sources does not emit events.

Best, Fabian

2018-04-25 10:43 GMT+02:00 Timo Walther <[hidden email]>:
Hi,

did you set your time characteristics to even-time?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Regards,
Timo

Am 25.04.18 um 05:15 schrieb 潘 功森:

Hi all,


I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.

And then I changed parallelism and different with map, and windows fired.

I used Flink 1.3.2.

Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.


Best regards,

September



Reply | Threaded
Open this post in threaded view
|

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

Michael Latta
If you are using keyed messages in Kafka, or keyed streams in flink, then only partitions that get hashed to the proper value will get data.  If not keyed messages, then yes they should all get data.

Michael

On Apr 25, 2018, at 8:25 PM, 潘 功森 <[hidden email]> wrote:

The event is running all the time in order,I don't know why one of the partitions does not receive data if not change parallelism?



发件人: Fabian Hueske <[hidden email]>
发送时间: 2018年4月25日 10:56
收件人: Timo Walther
抄送: user
主题: Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?
 
Hi,

This sounds like one of the partitions does not receive data. Watermark generation is data driven, i.e., the watermark can only advance if the TimestampAndWatermarkAssigner sees events.
By changing the parallelism between the map and the assigner, the events are shuffled across and hence there is no "empty" partition anymore.

I would check if one instance of your sources does not emit events.

Best, Fabian

2018-04-25 10:43 GMT+02:00 Timo Walther <[hidden email]>:
Hi,

did you set your time characteristics to even-time?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Regards,
Timo

Am 25.04.18 um 05:15 schrieb 潘 功森:

Hi all,


I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.

And then I changed parallelism and different with map, and windows fired.

I used Flink 1.3.2.

Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.


Best regards,

September

Reply | Threaded
Open this post in threaded view
|

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

gerryzhou
Hi 潘,
could you please check the number of kafka's partitions, I think if the {{number of kafka partition}} <  {{parallelism of source node}}) then there can be some idle parallel which won't recevice any data...

Best Regards,
Sihua Zhou



On 04/26/2018 10:44[hidden email] wrote:
If you are using keyed messages in Kafka, or keyed streams in flink, then only partitions that get hashed to the proper value will get data.  If not keyed messages, then yes they should all get data.

Michael

On Apr 25, 2018, at 8:25 PM, 潘 功森 <[hidden email]> wrote:

The event is running all the time in order,I don't know why one of the partitions does not receive data if not change parallelism?



发件人: Fabian Hueske <[hidden email]>
发送时间: 2018年4月25日 10:56
收件人: Timo Walther
抄送: user
主题: Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?
 
Hi,

This sounds like one of the partitions does not receive data. Watermark generation is data driven, i.e., the watermark can only advance if the TimestampAndWatermarkAssigner sees events.
By changing the parallelism between the map and the assigner, the events are shuffled across and hence there is no "empty" partition anymore.

I would check if one instance of your sources does not emit events.

Best, Fabian

2018-04-25 10:43 GMT+02:00 Timo Walther <[hidden email]>:
Hi,

did you set your time characteristics to even-time?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Regards,
Timo

Am 25.04.18 um 05:15 schrieb 潘 功森:

Hi all,


I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.

And then I changed parallelism and different with map, and windows fired.

I used Flink 1.3.2.

Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.


Best regards,

September

Reply | Threaded
Open this post in threaded view
|

答复: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

潘 功森

I got you.

Thanks

 

发件人: [hidden email]
发送时间: 2018426 10:50
收件人: [hidden email]
抄送: [hidden email]; [hidden email]; [hidden email]; [hidden email]
主题: Re: Why assignTimestampsAndWatermarks parallelism same as mapit will not fired

 

Hi ,

could you please check the number of kafka's partitions, I think if the {{number of kafka partition}} <  {{parallelism of source node}}) then there can be some idle parallel which won't recevice any data...

 

Best Regards,

Sihua Zhou

 

 

 

On 04/26/2018 10:44[hidden email] wrote

If you are using keyed messages in Kafka, or keyed streams in flink, then only partitions that get hashed to the proper value will get data.  If not keyed messages, then yes they should all get data.

 

Michael



On Apr 25, 2018, at 8:25 PM, 功森 <[hidden email]> wrote:

 

The event is running all the time in orderI don't know why one of the partitions does not receive data if not change parallelism?

 

发件人: Fabian Hueske <[hidden email]>
发送时间: 2018425 10:56
收件人: Timo Walther
抄送: user
主题: Re: Why assignTimestampsAndWatermarks parallelism same as mapit will not fired

 

Hi,

This sounds like one of the partitions does not receive data. Watermark generation is data driven, i.e., the watermark can only advance if the TimestampAndWatermarkAssigner sees events.

By changing the parallelism between the map and the assigner, the events are shuffled across and hence there is no "empty" partition anymore.

I would check if one instance of your sources does not emit events.

Best, Fabian

 

2018-04-25 10:43 GMT+02:00 Timo Walther <[hidden email]>:

Hi,

did you set your time characteristics to even-time?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Regards,
Timo

Am 25.04.18 um 05:15 schrieb
功森:

 

Hi all,


I use the same parallelism between map and assignTimestampsAndWatermarks , and it not fired, I saw the extractTimestamp and generateWatermark all is fine, but watermark is always not change and keep as min long value.

And then I changed parallelism and different with map, and windows fired.

I used Flink 1.3.2.

Is it a Flink bug?or others can give me why it not fired. It troubled me the whole day.


Best regards,

September