same parallelism with different taskmanager and slots, skew occurs

Posted by varuy322 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/same-parallelism-with-different-taskmanager-and-slots-skew-occurs-tp25281.html

Hi there,

I recently ran a streaming benchmark with Flink 1.5.2 in standalone mode on a
cluster of 4 machines (1 master and 3 workers), and I get different results
in the two setups below:
(1). I set the parallelism to 96 for the source, the sink and the middle
operators, and start 3 taskmanagers with 32 slots each. Everything goes
well.
(2). I change (1) to start 6 taskmanagers, 2 on each worker, with 16 slots
each. The job also runs. In this setup, however, I find that the subtasks on
each worker process the same amount of data, but one worker processes
several times more than the other workers, so it seems that data skew
occurs. How could this happen?
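
To make the comparison concrete, here is a minimal Python sketch of the slot arithmetic for the two setups. It is only an illustration: the class and field names are made up for this post and are not Flink API. The slots per taskmanager correspond to taskmanager.numberOfTaskSlots in flink-conf.yaml, and I am assuming one taskmanager per worker in setup (1).

```python
from dataclasses import dataclass

# Hypothetical helper for illustration only; none of these names come from Flink.
@dataclass(frozen=True)
class Layout:
    workers: int
    taskmanagers_per_worker: int
    slots_per_taskmanager: int  # taskmanager.numberOfTaskSlots in flink-conf.yaml

    @property
    def taskmanagers(self) -> int:
        return self.workers * self.taskmanagers_per_worker

    @property
    def slots_per_worker(self) -> int:
        return self.taskmanagers_per_worker * self.slots_per_taskmanager

    @property
    def total_slots(self) -> int:
        return self.workers * self.slots_per_worker


PARALLELISM = 96  # same for source, sink and middle operators

# Setup (1): assumed 1 taskmanager per worker, 32 slots each.
setup_1 = Layout(workers=3, taskmanagers_per_worker=1, slots_per_taskmanager=32)
# Setup (2): 2 taskmanagers per worker, 16 slots each.
setup_2 = Layout(workers=3, taskmanagers_per_worker=2, slots_per_taskmanager=16)

for name, layout in (("setup (1)", setup_1), ("setup (2)", setup_2)):
    print(
        f"{name}: {layout.taskmanagers} taskmanagers, "
        f"{layout.slots_per_worker} slots per worker, "
        f"{layout.total_slots} slots in total, "
        f"enough for parallelism {PARALLELISM}: {layout.total_slots >= PARALLELISM}"
    )
```

Both setups give 32 slots per worker and 96 slots in total, so on paper each worker should host the same number of subtasks in either case.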

Could someone explain how, for the same parallelism, performance differs
between running several taskmanagers per worker with fewer slots each and
running one taskmanager per worker with more slots?
Thanks a lot!

Best Regards
Rui



-----
stay hungry, stay foolish.
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/