About SerializationSchema/DeserializationSchema's concurrency safety

Chase Zhang
Hi there,

I'm writing to ask about the concurrency safety of SerializationSchema/DeserializationSchema.

I've implemented my own KafkaSerializationSchema/KafkaDeserializationSchema, to which I have added some internal state and an object cache in order to avoid heavy object (memory) allocation. My question is whether the two classes are safe in the following two situations:

1. Multiple sub-tasks are running in the same JVM.
2. One sub-task is outputting to multiple Kafka partitions.

Will a single instance be shared (which would make this unsafe under concurrency), or will each sub-task/thread have its own instance (safe)?
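
For illustration, the kind of schema described in this question might look roughly like the sketch below. It is only an assumption about the shape of the code: `RecordSerializer` is a hypothetical stand-in for the real Kafka schema interface (it is not Flink's API), and `CachingSerializer` with its `scratch` field are made-up names. The point is that the reused `scratch` object is mutated on every call, so the class is only safe if no two threads ever call `serialize` on the same instance at the same time.

```java
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a Kafka serialization schema interface; NOT Flink's actual API.
interface RecordSerializer<T> extends Serializable {
    byte[] serialize(T element);
}

// Hypothetical schema that reuses one scratch object per instance to limit allocation.
class CachingSerializer implements RecordSerializer<String> {
    private static final long serialVersionUID = 1L;

    // Per-instance scratch state, reused across calls.
    // Transient and lazily created, so it is not part of the serialized form.
    private transient StringBuilder scratch;

    @Override
    public byte[] serialize(String element) {
        if (scratch == null) {
            scratch = new StringBuilder(256);
        }
        scratch.setLength(0); // reset the reused builder before each record
        scratch.append("v1|").append(element);
        return scratch.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

If two threads shared one `CachingSerializer`, their writes to `scratch` could interleave and corrupt the output; with one instance per sub-task thread, the reuse is harmless.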
Re: About SerializationSchema/DeserializationSchema's concurrency safety

Fabian Hueske-2
Hi,

Yes, multiple instances of the same De/SerializationSchema can be executed in the same JVM.
Regarding 2, I'm not 100% sure, but I would suspect that one De/SerializationSchema instance handles multiple partitions.
Gordon (in CC) should know this for sure.

Best,
Fabian
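
A small sketch of why "multiple instances in the same JVM" does not by itself imply shared state. It assumes the schema object reaches each parallel sub-task as a copy made via Java serialization; that mechanism is an assumption here, and the example does not use Flink at all. `CountingSchema` and `SchemaCopyDemo` are hypothetical names. Instance fields end up independent per copy, while anything `static` is still shared by all copies in the JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a user-defined schema; not a Flink class.
class CountingSchema implements Serializable {
    private static final long serialVersionUID = 1L;

    // Static state is shared by every instance of this class in the JVM.
    static int sharedCalls = 0;

    // Instance state belongs to one copy only.
    private int localCalls = 0;

    int record() {
        sharedCalls++;
        return ++localCalls;
    }

    int localCalls() {
        return localCalls;
    }
}

class SchemaCopyDemo {
    // Round-trips an object through Java serialization, yielding an independent copy.
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T result = (T) in.readObject();
                return result;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    public static void main(String[] args) {
        CountingSchema template = new CountingSchema();
        CountingSchema subtaskA = copy(template); // copy for one simulated sub-task
        CountingSchema subtaskB = copy(template); // copy for another simulated sub-task

        subtaskA.record();
        subtaskA.record();
        subtaskB.record();

        System.out.println(subtaskA.localCalls());      // prints 2
        System.out.println(subtaskB.localCalls());      // prints 1
        System.out.println(CountingSchema.sharedCalls); // prints 3
    }
}
```

Under that assumption, a cache held in instance fields is not shared between sub-tasks, whereas a cache held in static fields would be and would need its own synchronization.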

(Replying to Chase Zhang's message of Mon, 10 June 2019, 05:25.)