questions regarding offset

avilevi
Hi Guys,
As I understand it, the Kafka offset is kept as part of the checkpoint and persisted in the state (please correct me if I'm wrong).

1. If I copy my persisted state to another cluster (with different Kafka servers as well), how is the offset handled?
2. In a stateless job, how is the offset managed, given that nothing is persisted? I mean with respect to exactly-once guarantees, recovery, and so on.

BR
Avi
Re: questions regarding offset

Dawid Wysakowicz-2
Hi Avi,

Yes, you are right. Kafka offsets are kept in state.

Ad 1. If you try to restore state in a completely different
environment and the offsets are no longer compatible, the restore will
most probably fail, because the job cannot determine up to which point
the records have already been processed.
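
To make "no longer compatible" concrete, here is a minimal sketch in plain Java. It is a hypothetical model, not Flink code: the class `RestoredOffsetCheck`, the `Range` type, the topic name `events`, and the `"topic-partition"` string keys are all made up for illustration. It assumes the restored offsets are identified only by topic and partition, so nothing in them says which cluster they were read from.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not Flink code: checks whether offsets restored from a
// checkpoint still fall inside the logs of a target Kafka cluster.
final class RestoredOffsetCheck {

    /** Offsets currently available for one partition on the target cluster. */
    static final class Range {
        final long earliest;
        final long latest;

        Range(long earliest, long latest) {
            this.earliest = earliest;
            this.latest = latest;
        }
    }

    // Assumption: restored entries are keyed by "topic-partition" only, so the
    // same key on another cluster may refer to a completely different log.
    static boolean isCompatible(Map<String, Long> restored, Map<String, Range> target) {
        for (Map.Entry<String, Long> entry : restored.entrySet()) {
            Range range = target.get(entry.getKey());
            if (range == null) {
                return false; // the partition does not exist on the target cluster
            }
            long offset = entry.getValue();
            if (offset < range.earliest || offset > range.latest) {
                return false; // the offset points outside the available log
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Long> restored = new HashMap<>();
        restored.put("events-0", 1500L);

        Map<String, Range> originalCluster = new HashMap<>();
        originalCluster.put("events-0", new Range(0L, 2000L));

        Map<String, Range> freshCluster = new HashMap<>();
        freshCluster.put("events-0", new Range(0L, 300L));

        System.out.println(isCompatible(restored, originalCluster)); // true
        System.out.println(isCompatible(restored, freshCluster));    // false
    }
}
```

Passing this check would only show that the offset exists on the target cluster. It would not show that the same records sit at that offset, which holds only if the topic was copied with its offsets preserved.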

Ad 2. What do you mean by a stateless job? Do you mean a job with
checkpoints disabled? If so, the job does not checkpoint Kafka
offsets. They might still be committed back to Kafka, depending on the
internal Kafka consumer configuration [1]. On failover, the job then
starts from the configured start position [2].
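
The order of precedence described here can be sketched in plain Java. This is a hypothetical model rather than the connector's implementation: `StartOffsetSketch`, `StartPosition`, and `resolve` are illustrative names, and the fallback to the latest offset when no committed group offset exists is an assumption that mirrors Kafka's default `auto.offset.reset=latest`.

```java
import java.util.OptionalLong;

// Hypothetical model, not Flink code: picks the offset a consumer resumes from
// for one partition after a failover.
final class StartOffsetSketch {

    enum StartPosition { GROUP_OFFSETS, EARLIEST, LATEST }

    static long resolve(OptionalLong checkpointed,
                        OptionalLong committedInKafka,
                        StartPosition configured,
                        long earliest,
                        long latest) {
        // With checkpoints enabled, the offset restored from state wins and
        // the configured start position is ignored.
        if (checkpointed.isPresent()) {
            return checkpointed.getAsLong();
        }
        // Without a checkpointed offset, the configured start position decides.
        switch (configured) {
            case EARLIEST:
                return earliest;
            case LATEST:
                return latest;
            default:
                // GROUP_OFFSETS: use what was committed back to Kafka, if anything.
                // Assumption: fall back to the latest offset otherwise.
                return committedInKafka.orElse(latest);
        }
    }

    public static void main(String[] args) {
        // Checkpoints enabled: resume from the checkpointed offset.
        System.out.println(resolve(OptionalLong.of(1500), OptionalLong.of(1400),
                StartPosition.GROUP_OFFSETS, 0, 2000)); // 1500
        // Checkpoints disabled: resume from the offset committed to Kafka.
        System.out.println(resolve(OptionalLong.empty(), OptionalLong.of(1400),
                StartPosition.GROUP_OFFSETS, 0, 2000)); // 1400
    }
}
```

In the second case, records between the last committed offset and the point of failure are read again, so the committed offset alone does not give exactly-once processing.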

Best,

Dawid


[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/connectors/kafka.html#kafka-consumers-offset-committing-behaviour-configuration

[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/connectors/kafka.html#kafka-consumers-start-position-configuration


On 28/03/2019 06:51, Avi Levi wrote:

> Hi Guys,
> I understood that offset is kept as part of the checkpoint and
> persisted in the state (please correct me if I'm wrong)
>
> 1. If I copy my persisted state to another cluster (different kafka
> servers as well) how is the offset handled ? 
> 2. In a stateless job how is the offset managed ? since there is no
> persistency . I mean in aspect of exactly once, recovery ...
>
> BR
> Avi


Re: questions regarding offset

avilevi
Thanks for answering. Please see my comments below.

On Thu, Mar 28, 2019 at 12:32 PM Dawid Wysakowicz <[hidden email]> wrote:
> Hi Avi,
>
> Yes, you are right. Kafka offsets are kept in state.
>
> Ad. 1 If you try to restore a state in a completely different
> environment, and offsets are no longer compatible it will most probably
> fail as it won't be able to derive up to which point we already
> processed the records.

So there is no way to move state between clusters? I thought the offsets were also managed by job ID, but I guess I was wrong.

> Ad.2 What do you mean by stateless job? Do you mean a job with
> checkpoints disabled? If so then the job does not checkpoint kafka
> offsets. They might be committed back to Kafka based on the internal
> Kafka consumer configuration[1]. So in case of failover it will use
> given start position configuration[2].

By stateless I mean a job that does not need to persist any state of its own but still has checkpoints enabled.
