Large number of sources in Flink Job

Posted by chiggi_dev on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Large-number-of-sources-in-Flink-Job-tp20360.html

Hi,

I am working on a use case where my Flink job needs to collect data from thousands of sources. 

As an example, I want to collect data from more than 2000 file directories, process (filter, transform) the data, and distribute the processed data streams to 200 different directories.

Are there any caveats I should know about with such a large number of sources, also taking per-operator parallelism into account?

Regards,

Chirag