Hi,
I have a job that uses the State Processor API to load data from checkpoints on Google Cloud Storage, do some processing, and then write the result to Google Cloud Storage. The total data size is about 30-50 GB, and the job can take more than 2 hours to finish. In the flame graph generated from the job, most of the time is spent in pthread_cond_wait, pthread_cond_timedwait, and epoll_wait, so the job looks IO-bound. I have found very few articles on State Processor performance. Since each run takes a long time and Flink has many parameters to tune, I wonder whether anyone has experience improving performance in a case like this? Thanks for any comments.
Best wishes,
Chen-Che