Kafka watermarks

Kafka watermarks

nragon
When consuming from Kafka, should we use BOOTE inside the consumer or after it?
Thanks

Re: Kafka watermarks

Nico Kruber
Can you clarify a bit more what you want to achieve? Also, what is "BOOTE"?


Nico

On Tuesday, 20 June 2017 13:45:06 CEST nragon wrote:

> When consuming from kafka should we use BOOTE inside consumer or after?
> Thanks



Re: Kafka watermarks

nragon
So, in order to work with event time, I have two options: inside the Kafka consumer or after the Kafka consumer.
For the first I can use:

FlinkKafkaConsumer09<DataParameterMap> consumer = ...;
consumer.assignTimestampsAndWatermarks(...);

The other option:

FlinkKafkaConsumer09<DataParameterMap> consumer = ...;
DataStream<DataParameterMap> dataStream = env.addSource(consumer);
dataStream.assignTimestampsAndWatermarks(...);

Any recommendation?

Re: Kafka watermarks

Nico Kruber
According to the javadoc of
FlinkKafkaConsumerBase#assignTimestampsAndWatermarks():

"Specifies an {@link AssignerWithPunctuatedWatermarks} to emit watermarks in a
punctuated manner. The watermark extractor will run per Kafka partition,
watermarks will be merged across partitions in the same way as in the Flink
runtime, when streams are merged.

When a subtask of a FlinkKafkaConsumer source reads multiple Kafka partitions,
the streams from the partitions are unioned in a "first come first serve"
fashion. Per-partition characteristics are usually lost that way. For example,
if the timestamps are strictly ascending per Kafka partition, they will not be
strictly ascending in the resulting Flink DataStream, if the parallel source
subtask reads more than one partition.

Running timestamp extractors / watermark generators directly inside the Kafka
source, per Kafka partition, allows users to let them exploit the per-
partition characteristics."

Thus, if you can leverage Kafka per-partition characteristics, do it there,
otherwise it probably does not matter.
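The effect described above can be illustrated with a self-contained sketch in plain Java (no Flink dependency; the partition contents and the alternating merge order are made up for the example): two partitions whose timestamps are each strictly ascending lose that property once unioned, while the overall watermark is the minimum of the per-partition watermarks, which is also how Flink combines watermarks of merged streams.

```java
import java.util.ArrayList;
import java.util.List;

public class WatermarkMergeDemo {
    public static void main(String[] args) {
        // Two Kafka-like partitions, each with strictly ascending event timestamps.
        long[] p0 = {1, 5, 9};
        long[] p1 = {100, 101, 102};

        // "First come first serve" union (simulated here as simple alternation):
        // the merged stream is no longer ascending.
        List<Long> union = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            union.add(p0[i]);
            union.add(p1[i]);
        }
        System.out.println("union ascending? " + isAscending(union)); // false

        // Per-partition watermarks are merged by taking the minimum, so the
        // slowest partition determines overall event-time progress.
        long wm0 = p0[p0.length - 1]; // partition 0 has seen up to t=9
        long wm1 = p1[p1.length - 1]; // partition 1 has seen up to t=102
        long merged = Math.min(wm0, wm1);
        System.out.println("merged watermark: " + merged); // 9
    }

    static boolean isAscending(List<Long> ts) {
        for (int i = 1; i < ts.size(); i++) {
            if (ts.get(i) <= ts.get(i - 1)) return false;
        }
        return true;
    }
}
```

This is why assigning timestamps per partition (inside the consumer) lets an ascending-timestamp extractor keep working, while the same extractor applied after the union would see out-of-order timestamps.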


Nico

On Tuesday, 20 June 2017 17:46:23 CEST nragon wrote:

> So, in order to work with event time, I have two options: inside the Kafka
> consumer or after the Kafka consumer.



Re: Kafka watermarks

nragon
I guess using BoundedOutOfOrdernessTimestampExtractor inside the consumer will work.
Thanks
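A minimal sketch of that approach, assuming a Flink 1.3-era setup with FlinkKafkaConsumer09, a String-valued topic named "example-topic", a broker at localhost:9092, and records whose payload starts with a millisecond timestamp — the topic, broker, group id, and record layout are all assumptions for illustration, and the job needs the Flink streaming and flink-connector-kafka-0.9 dependencies on the classpath:

```java
import java.util.Properties;

import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaWatermarkJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker
        props.setProperty("group.id", "example-group");           // assumed group id

        FlinkKafkaConsumer09<String> consumer =
            new FlinkKafkaConsumer09<>("example-topic", new SimpleStringSchema(), props);

        // Assign timestamps and watermarks *inside* the consumer, i.e. per Kafka
        // partition, allowing up to 10 seconds of out-of-orderness per partition.
        consumer.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(10)) {
                @Override
                public long extractTimestamp(String element) {
                    // Assumed record layout: "<timestampMillis>,<payload>"
                    return Long.parseLong(element.split(",", 2)[0]);
                }
            });

        DataStream<String> stream = env.addSource(consumer);
        // ... event-time windowing etc. on `stream` ...
        env.execute("Kafka watermarks example");
    }
}
```

With this, the extractor runs per partition as described in the javadoc above; assigning it on the DataStream after addSource() instead would run it on the already-unioned stream.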