Flink Standalone cluster - production settings


simpleusr
I know this may seem like a silly question, but I am trying to figure out the
optimal setup for our Flink jobs.
We are using a standalone cluster with 5 jobs. Each job has 3 async operators
backed by Executors with thread counts of 20, 20 and 100. The source is Kafka,
and there are Cassandra and REST sinks.
Currently we are using parallelism = 1, so at max load a single job spawns
at least 140 threads. We are also using Netty-based libraries for the Cassandra
and REST calls. (As far as I can see in the thread dump, Flink also uses a
Netty server.)
What we see is that the total thread count adds up to ~500 for a single job.
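
To see where a figure like ~500 comes from, it can help to group the live threads of a JVM by name prefix. The following is only a minimal sketch using plain JDK calls, not anything Flink-specific; the class name ThreadCensus and the grouping rule (strip the trailing number from the thread name) are my own assumptions, and it would have to run inside the process being inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts live JVM threads grouped by name prefix.
final class ThreadCensus {

    // Strip a trailing number (and its separator) so that names such as
    // "pool-1-thread-7" and "pool-1-thread-8" fall into the same group.
    static String groupKey(String threadName) {
        return threadName.replaceAll("[-#\\s]*\\d+$", "");
    }

    // Count how many of the given thread names fall into each group.
    static Map<String, Integer> countByGroup(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            counts.merge(groupKey(name), 1, Integer::sum);
        }
        return counts;
    }

    // Snapshot the names of all live threads in this JVM and group them.
    static Map<String, Integer> liveThreadsByGroup() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByGroup(names);
    }

    public static void main(String[] args) {
        Map<String, Integer> groups = liveThreadsByGroup();
        int total = 0;
        for (Map.Entry<String, Integer> e : groups.entrySet()) {
            System.out.println(e.getValue() + "\t" + e.getKey());
            total += e.getValue();
        }
        System.out.println(total + "\ttotal live threads");
    }
}
```

A thread dump gives the same information; the grouping just makes it easier to tell executor threads apart from Netty and framework threads.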

Suddenly all jobs began to fail in production, and we saw that it was mainly
due to the ulimit on user processes (max user processes). All jobs had started
on one server in the cluster (I do not know why, as it is a cluster with 3
members).
The limit was set to around 1500 on that server. We then set a higher value and
the problems seem to have gone away.
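
The numbers above can be checked with some back-of-the-envelope arithmetic. This is a rough sketch only: the class name ThreadBudget is made up for illustration, the inputs are the approximate figures from this post, and it assumes all jobs run under the same OS user, since on Linux the max user processes limit counts threads per user across all of that user's processes.

```java
// Hypothetical helper: rough thread budget for jobs sharing one host and user.
final class ThreadBudget {

    // Sum of the configured executor pool sizes of one job.
    static int sum(int... poolSizes) {
        int total = 0;
        for (int size : poolSizes) {
            total += size;
        }
        return total;
    }

    // Total threads when every job lands on the same host.
    static long threadsOnHost(int jobs, int threadsPerJob) {
        return (long) jobs * threadsPerJob;
    }

    // True if the combined thread count is above the per-user limit.
    static boolean exceedsLimit(int jobs, int threadsPerJob, int limit) {
        return threadsOnHost(jobs, threadsPerJob) > limit;
    }

    public static void main(String[] args) {
        int executorThreadsPerJob = sum(20, 20, 100); // pool sizes from the post
        int observedThreadsPerJob = 500;              // approximate, from thread dumps
        int jobs = 5;
        int limit = 1500;                             // approximate ulimit on that server

        System.out.println("executor threads per job: " + executorThreadsPerJob);
        System.out.println("threads on one host:      " + threadsOnHost(jobs, observedThreadsPerJob));
        System.out.println("exceeds limit:            " + exceedsLimit(jobs, observedThreadsPerJob, limit));
    }
}
```

With these inputs the executor pools alone account for 140 threads per job, and 5 jobs at ~500 threads each come to about 2500 threads on one host, well above a limit of around 1500.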

Can you recommend optimal production settings for a standalone cluster? Or
should there be a max limit on threads spawned by a single job?

Regards




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/