> We have noticed that the Kafka offset auto-commit functionality seems to stop
> working after it encounters a timeout. It appears in the logs like this:
>
> 2018-03-04 07:02:54,779 INFO
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Marking
> the coordinator kafka06:9092 (id: 2147483641 rack: null) dead for group
> consumergroup01
> 2018-03-04 07:02:54,780 WARN
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator -
> Auto-commit of offsets {topic01-24=OffsetAndMetadata{offset=153237895,
> metadata=''}} failed for group consumergroup01: Offset commit failed with a
> retriable exception. You should retry committing offsets. The underlying
> error was: The request timed out.
>
> After this message is logged, the job commits no further offsets until it
> is restarted (and if the Flink process ends abnormally, the offsets are
> never committed).
>
> This is with Flink 1.4.0, which uses kafka-clients 0.11.0.2. We are using
> the default Kafka client settings for enable.auto.commit (true) and
> auto.commit.interval.ms (5000). We are not using Flink checkpointing, so the
> offset commit mode is OffsetCommitMode.KAFKA_PERIODIC (not
> OffsetCommitMode.ON_CHECKPOINTS).
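
For readers following along, here is a minimal sketch of how the settings described above could map to a commit mode. It is an illustration only, not Flink's actual code: the class CommitModeSketch, the method select, and the commitOnCheckpoints flag are hypothetical names, and the DISABLED mode is an assumption added for completeness. Only KAFKA_PERIODIC and ON_CHECKPOINTS come from the post.

```java
import java.util.Properties;

/** Hypothetical sketch of commit-mode selection; not Flink's real classes. */
class CommitModeSketch {

    /** KAFKA_PERIODIC and ON_CHECKPOINTS are named in the post; DISABLED is assumed. */
    enum Mode { DISABLED, ON_CHECKPOINTS, KAFKA_PERIODIC }

    /**
     * Picks a commit mode from the Kafka client properties and the job's
     * checkpointing settings (both flags are illustrative parameters).
     */
    static Mode select(Properties kafkaProps,
                       boolean checkpointingEnabled,
                       boolean commitOnCheckpoints) {
        // Defaults quoted in the post: enable.auto.commit=true, interval=5000 ms.
        boolean autoCommit = Boolean.parseBoolean(
                kafkaProps.getProperty("enable.auto.commit", "true"));
        long intervalMs = Long.parseLong(
                kafkaProps.getProperty("auto.commit.interval.ms", "5000"));

        if (checkpointingEnabled) {
            // With checkpointing on, the Kafka client's own auto-commit is not used;
            // offsets are committed by the job when a checkpoint completes.
            return commitOnCheckpoints ? Mode.ON_CHECKPOINTS : Mode.DISABLED;
        }
        // Without checkpointing, committing is left to the Kafka client's
        // periodic auto-commit, provided it is enabled with a positive interval.
        return (autoCommit && intervalMs > 0) ? Mode.KAFKA_PERIODIC : Mode.DISABLED;
    }
}
```

Under these assumptions, the setup in the post (default properties, no checkpointing) selects KAFKA_PERIODIC, so every commit depends on the Kafka client's auto-commit path, which is the path that logged the timeout. Enabling checkpointing would select ON_CHECKPOINTS instead; whether that avoids the stall is the open question.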
>
> Has anyone else encountered this?
>
> If so, does enabling checkpointing resolve the issue, given that
> Kafka09Fetcher.doCommitInternalOffsetsToKafka is then called from the Flink
> code?