[QUERY] Multiple elastic search sinks for a single flink pipeline

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[QUERY] Multiple elastic search sinks for a single flink pipeline

Vignesh Ramesh

My requirement is to send the data to a different ES sink (based on the data). Ex: If the data contains a particular info send it to sink1 else send it to sink2 etc(basically send it dynamically to any one sink based on the data). I also want to set parallelism separately for ES sink1, ES sink2, Es sink3 etc.

                                ->  Es sink1 (parallelism 4)
Kafka -> Map(Transformations)   ->  ES sink2 (parallelism 2)
                                ->  Es sink3 (parallelism 2)

Is there any simple way to achieve the above in flink ?

My solution: (but not satisfied with it)

I could come up with a solution but there are intermediate kafka topics which i write to (topic1,topic2,topic3) and then have separate pipelines for Essink1,Essink2 and ESsink3. I want to avoid writing to these intermediate kafka topics.

kafka -> Map(Transformations) -> Kafka topics (Insert into topic1,topic2,topic3 based on the data)

Kafka topic1 -> Essink1(parallelism 4)

Kafka topic2 -> Essink2(parallelism 2)

Kafka topic3 -> Essink3(parallelism 2)
Regards,
Vignesh
Reply | Threaded
Open this post in threaded view
|

Re: [QUERY] Multiple elastic search sinks for a single flink pipeline

Chesnay Schepler
Are the number of sinks fixed? If so, then you can just take the output of your map function and apply multiple filters, writing the output of each filter into a sync. You could also use a process function with side-outputs, and apply a source to each output.

On 10/14/2020 6:05 PM, Vignesh Ramesh wrote:

My requirement is to send the data to a different ES sink (based on the data). Ex: If the data contains a particular info send it to sink1 else send it to sink2 etc(basically send it dynamically to any one sink based on the data). I also want to set parallelism separately for ES sink1, ES sink2, Es sink3 etc.

                                ->  Es sink1 (parallelism 4)
Kafka -> Map(Transformations)   ->  ES sink2 (parallelism 2)
                                ->  Es sink3 (parallelism 2)

Is there any simple way to achieve the above in flink ?

My solution: (but not satisfied with it)

I could come up with a solution but there are intermediate kafka topics which i write to (topic1,topic2,topic3) and then have separate pipelines for Essink1,Essink2 and ESsink3. I want to avoid writing to these intermediate kafka topics.

kafka -> Map(Transformations) -> Kafka topics (Insert into topic1,topic2,topic3 based on the data)

Kafka topic1 -> Essink1(parallelism 4)

Kafka topic2 -> Essink2(parallelism 2)

Kafka topic3 -> Essink3(parallelism 2)
Regards,
Vignesh