multi-tenancy without a kafka partition per tenant

Constantinos Papadopoulos

We have a multi-tenancy scenario where:

  • the source will be Kafka, and a Kafka partition could contain data from multiple tenants
  • our sink will send data to a different DB instance, depending on the tenant


Is there a way to prevent slowness in one tenant from slowing other tenants, without assigning Kafka partitions to tenants?


My understanding is that the answer is "no", but I'm curious whether I'm missing a cool way to accomplish this. 


In the absence of such a way, I believe that slowness in one tenant’s DB instance will cause backpressure all the way back to the source (the Kafka partition). Flink will then slow its reading from that partition, which also impacts the other tenants whose data resides in the same partition.
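
To make the backpressure effect described above concrete, here is a minimal sketch in plain Python. It does not use Flink or Kafka; it is a tick-based toy model, and the names `simulate`, `write_cost` and `buffer_size` are assumptions made for illustration. One partition holds interleaved records from tenants A and B, a bounded buffer sits between the source and a single sink, and the source stops reading whenever the buffer is full.

```python
from collections import deque


def simulate(records, write_cost, buffer_size=2):
    """Toy model: one partition -> bounded buffer -> one sink.

    records:     tenant ids in partition order (hypothetical data).
    write_cost:  ticks the sink needs to write one record, per tenant.
    Returns the tick at which each tenant's last record was written.
    """
    buffer = deque()
    next_offset = 0      # next record to read from the partition
    current = None       # tenant of the record the sink is writing
    busy_until = 0       # tick at which the current write completes
    finished = {}
    tick = 0
    while next_offset < len(records) or buffer or current is not None:
        # The sink completes its current write, if it is due.
        if current is not None and tick >= busy_until:
            finished[current] = tick
            current = None
        # The sink picks up the next buffered record, in order.
        if current is None and buffer:
            current = buffer.popleft()
            busy_until = tick + write_cost[current]
        # Backpressure: the source reads only while the buffer has room.
        if next_offset < len(records) and len(buffer) < buffer_size:
            buffer.append(records[next_offset])
            next_offset += 1
        tick += 1
    return finished


records = ["A", "B"] * 3
healthy = simulate(records, {"A": 1, "B": 1})
degraded = simulate(records, {"A": 1, "B": 10})
print("tenant A done at tick", healthy["A"], "vs", degraded["A"])
```

In this model tenant A's write cost never changes, yet its last record completes later once tenant B's writes become slow, because A's records wait behind B's in the same ordered stream and the source stalls when the buffer fills.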



Re: multi-tenancy without a kafka partition per tenant

vino yang
Hi Constantinos,

I think your analysis is correct. If you have a multi-tenant scenario but the tenants are not distinguished in Kafka, then Flink cannot treat different tenants differently. Differences in data volume between tenants can also easily lead to data hotspots.

A compromise is to have Kafka and Flink handle this together: split your data source by tenant in Kafka, and then let Flink distinguish between the different tenants.

Best,
Vino
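
One way to read the suggestion above is to give each tenant its own stream on the Kafka side. The sketch below is plain Python with no Kafka client, and it only shows the routing idea. The topic naming scheme (`events.<tenant>`), the record layout, and the helper names `topic_for_tenant` and `split_by_tenant` are all hypothetical and do not come from the thread.

```python
from collections import defaultdict


def topic_for_tenant(tenant_id, prefix="events"):
    """Hypothetical naming scheme: one topic per tenant."""
    return f"{prefix}.{tenant_id}"


def split_by_tenant(records):
    """Group records by the per-tenant topic they would be produced to."""
    topics = defaultdict(list)
    for record in records:
        topics[topic_for_tenant(record["tenant"])].append(record)
    return dict(topics)


# Illustrative records only; the "tenant" field is an assumption.
records = [
    {"tenant": "acme", "value": 1},
    {"tenant": "globex", "value": 2},
    {"tenant": "acme", "value": 3},
]
per_tenant = split_by_tenant(records)
print(sorted(per_tenant))
```

With the data separated like this, each tenant's stream can in principle be consumed and written on its own path, so a slow DB instance would hold back only that tenant's stream. The cost is the per-tenant split that the original question hoped to avoid.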

Constantinos Papadopoulos <[hidden email]> wrote on Mon, Oct 21, 2019 at 3:25 PM:

We have a multi-tenancy scenario where:

  • the source will be Kafka, and a Kafka partition could contain data from multiple tenants
  • our sink will send data to a different DB instance, depending on the tenant


Is there a way to prevent slowness in one tenant from slowing other tenants, without assigning Kafka partitions to tenants?


My understanding is that the answer is "no", but I'm curious whether I'm missing a cool way to accomplish this. 


In the absence of such a way, I believe that slowness in one tenant’s DB instance will cause backpressure all the way back to the source (the Kafka partition). Flink will then slow its reading from that partition, which also impacts the other tenants whose data resides in the same partition.