Hi,
I am working on a use case where my Flink job needs to collect data from thousands of sources.
As an example, I want to collect data from more than 2000 file directories, process (filter, transform) the data, and distribute the processed data streams to 200 different directories.
Are there any caveats I should know about with such a large number of sources, also taking per-operator parallelism into account?
Regards,
Chirag