Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic


Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

Vishal Santoshi
It seems that if

* I have a regex pattern for consuming a bunch of topics
* auto-create is turned on

then even if I do this:

* suspend the pipe with a savepoint (SP)
* delete the topic
* start the pipe

it will recreate the topic, because the restored state has it and the check is against the pattern rather than against the cluster metadata. If auto-create is turned off, it will (I think) throw an exception instead.


The question, then, is: how can we safely delete a topic from our Kafka cluster without either a recreation (auto-create on) or an exception (auto-create off) when restarting from a savepoint whose restored state contains that topic?
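For reference, this is roughly what the setup looks like; a minimal sketch, with an illustrative pattern, group id, and broker address:

import java.util.Properties;
import java.util.regex.Pattern;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PatternConsumerSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // illustrative
        props.setProperty("group.id", "hdfs-sink");            // illustrative

        // Subscribe to every topic whose name matches the pattern.
        // Note: auto.create.topics.enable is a broker-side setting; with it on,
        // fetching a restored-but-deleted topic can recreate that topic.
        FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(
                Pattern.compile("ingest\\..*"),                 // illustrative pattern
                new SimpleStringSchema(),
                props);

        DataStream<String> stream = env.addSource(consumer);
        stream.print(); // stand-in for the real sink
        env.execute("pattern-consumer-sketch");
    }
}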

Thanks much,

Visha.



Re: Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

Vishal Santoshi
I guess adjusting the pattern (blacklisting the topic/s) would work, for example with a negative lookahead; something like the sketch below.
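A minimal sketch (the excluded topic names are made up; the resulting pattern would be passed to the consumer constructor just like the all-inclusive one):

import java.util.regex.Pattern;

public class BlacklistPatternSketch {
    public static void main(String[] args) {
        // Matches every topic name except an explicit blacklist.
        Pattern allButBlacklisted = Pattern.compile(
                "(?!(topic_to_drop_1|topic_to_drop_2|legacy\\..*)$).*");

        System.out.println(allButBlacklisted.matcher("topic_to_drop_1").matches()); // false
        System.out.println(allButBlacklisted.matcher("legacy.clicks").matches());   // false
        System.out.println(allButBlacklisted.matcher("keep_this_topic").matches()); // true
    }
}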




Re: Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

Fabian Hueske-2
Hi Visha,

If I remember correctly, the behavior of the Kafka consumer was changed in Flink 1.8 to account for such situations.
Please check the release notes [1] and the corresponding Jira issue [2].
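Roughly, the idea of the change is the following (a conceptual sketch only, not the actual connector code): on restore, a topic/partition taken from state is kept only if it still matches the configured topics or pattern.

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class RestoreFilterSketch {
    public static void main(String[] args) {
        // Conceptual only: restored offsets, keyed here by "topic-partition".
        Map<String, Long> restoredOffsets = new HashMap<>();
        restoredOffsets.put("ingest.clicks-0", 42L);
        restoredOffsets.put("old.topic-0", 7L);

        // The configured subscription; with a pattern, the filter is the pattern itself.
        Pattern subscription = Pattern.compile("ingest\\..*"); // illustrative

        // Keep a restored entry only if its topic still matches the subscription.
        restoredOffsets.keySet().removeIf(key -> {
            String topic = key.substring(0, key.lastIndexOf('-'));
            return !subscription.matcher(topic).matches();
        });

        System.out.println(restoredOffsets); // {ingest.clicks-0=42}
    }
}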

If this is not the behavior you need, please feel free to create a new Jira issue and start a discussion there.

Thanks, Fabian





Re: Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

Vishal Santoshi
Yep, but 

"Consider this example: if you had a Kafka Consumer that was consuming from topic A, you did a savepoint, then changed your Kafka consumer to instead consume from topic B, and then restarted your job from the savepoint. Before this change, your consumer would now consume from both topic A and B because it was stored in state that the consumer was consuming from topic A. With the change, your consumer would only consume from topic B after restore because it now filters the topics that are stored in state using the configured topics."

The issue is that our pattern is all inclusive, so "configured topics" means every topic that matches it, i.e. all of our 1500-plus topics; we use Flink to push the data in all topics to HDFS. We are now in the process of pruning our topics (and, as you can imagine, they do not follow a specific naming pattern), which means removing some 500 of them. We had assumed that removing the topics from the cluster would be an indication to Flink that, even though it has state associated with those topics and irrespective of the pattern, the topics should no longer be consumed, very much akin to auto discovery of partitions (and thus topics) but in reverse.
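By auto discovery I mean the periodic discovery the pattern-based consumer already does once the discovery interval is set. A minimal sketch, with illustrative property values:

import java.util.Properties;

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase;

public class DiscoveryConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // illustrative
        props.setProperty("group.id", "hdfs-sink");            // illustrative

        // Additive direction only: with this interval set, the pattern-based
        // consumer periodically re-evaluates the pattern and starts reading any
        // new matching topics/partitions. There is no symmetric behavior that
        // drops topics which have been deleted from the cluster.
        props.setProperty(
                FlinkKafkaConsumerBase.KEY_PARTITION_DISCOVERY_INTERVAL_MILLIS,
                "60000"); // e.g. check every minute

        // These props would then go to the pattern-based consumer shown earlier.
        System.out.println(props);
    }
}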

Does that make sense?



Re: Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

Vishal Santoshi
I am sure there was a ticket open that allowed for clean manipulation of state (that would have saved us a whole lot)...
