Why FlinkKafkaConsumer08 does not support exactly once or at least once semantic?

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Why FlinkKafkaConsumer08 does not support exactly once or at least once semantic?

徐涛
Hi All,
In document of Flink 1.6, it says that "Before 0.9 Kafka did not provide any mechanisms to guarantee at-least-once or exactly-once semantics”
I read the source code of FlinkKafkaConsumer08, and the comment says:
“Please note that Flink snapshots the offsets internally as part of its distributed checkpoints. The offsets
 * committed to Kafka / ZooKeeper are only to bring the outside view of progress in sync with Flink's view
 * of the progress. That way, monitoring and other jobs can get a view of how far the Flink Kafka consumer
 * has consumed a topic"
Obviously, the kafka partition offsets are checkpointed periodically. And when some error happens, the data are read from kafka, continued from the checkpointed offset. Then source and other operator states restart from the same checkpoint. Then why does the document say “Before 0.9 Kafka did not provide any mechanisms to guarantee at-least-once or exactly-once semantics” ?

Thanks a lot.


Best
Henry
Reply | Threaded
Open this post in threaded view
|

Re: Why FlinkKafkaConsumer08 does not support exactly once or at least once semantic?

Stefan Richter
Hi,

I think this part of the documentation is talking about KafkaProducer, and you are reading in the source code of KafkaConsumer.

Best,
Stefan

Am 20.09.2018 um 10:48 schrieb 徐涛 <[hidden email]>:

Hi All,
In document of Flink 1.6, it says that "Before 0.9 Kafka did not provide any mechanisms to guarantee at-least-once or exactly-once semantics”
I read the source code of FlinkKafkaConsumer08, and the comment says:
“Please note that Flink snapshots the offsets internally as part of its distributed checkpoints. The offsets
 * committed to Kafka / ZooKeeper are only to bring the outside view of progress in sync with Flink's view
 * of the progress. That way, monitoring and other jobs can get a view of how far the Flink Kafka consumer
 * has consumed a topic"
Obviously, the kafka partition offsets are checkpointed periodically. And when some error happens, the data are read from kafka, continued from the checkpointed offset. Then source and other operator states restart from the same checkpoint. Then why does the document say “Before 0.9 Kafka did not provide any mechanisms to guarantee at-least-once or exactly-once semantics” ?

Thanks a lot.


Best
Henry

Reply | Threaded
Open this post in threaded view
|

Re: Why FlinkKafkaConsumer08 does not support exactly once or at least once semantic?

徐涛
Hi Stefan,
I didn`t notice that in the webpage..  Thanks a lot for your help!
That makes me understand the delivery guarantees more clearly.

Best
Henry


在 2018年9月20日,下午5:02,Stefan Richter <[hidden email]> 写道:

Hi,

I think this part of the documentation is talking about KafkaProducer, and you are reading in the source code of KafkaConsumer.

Best,
Stefan

Am 20.09.2018 um 10:48 schrieb 徐涛 <[hidden email]>:

Hi All,
In document of Flink 1.6, it says that "Before 0.9 Kafka did not provide any mechanisms to guarantee at-least-once or exactly-once semantics”
I read the source code of FlinkKafkaConsumer08, and the comment says:
“Please note that Flink snapshots the offsets internally as part of its distributed checkpoints. The offsets
 * committed to Kafka / ZooKeeper are only to bring the outside view of progress in sync with Flink's view
 * of the progress. That way, monitoring and other jobs can get a view of how far the Flink Kafka consumer
 * has consumed a topic"
Obviously, the kafka partition offsets are checkpointed periodically. And when some error happens, the data are read from kafka, continued from the checkpointed offset. Then source and other operator states restart from the same checkpoint. Then why does the document say “Before 0.9 Kafka did not provide any mechanisms to guarantee at-least-once or exactly-once semantics” ?

Thanks a lot.


Best
Henry