Re: same parallelism with different taskmanager and slots, skew occurs
Posted by Till Rohrmann
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/same-parallelism-with-different-taskmanager-and-slots-skew-occurs-tp25281p25339.html
Hi,
could you tell me how exactly you started the cluster and with which parameters (configured memory, maybe vcores, etc.)?
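For example, depending on your Flink version and deployment mode, I mean settings along these lines (the values below are purely illustrative, not taken from your setup):

    # flink-conf.yaml
    taskmanager.numberOfTaskSlots: 32
    taskmanager.heap.size: 32768m
    jobmanager.heap.size: 2048m
    parallelism.default: 96

    # or the equivalent flags when starting a YARN session
    ./bin/yarn-session.sh -n 3 -s 32 -tm 32768 -jm 2048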
Cheers,
Till
Hi Till,
Thanks a lot for your reply. I got your point; sorry for not making my issue clear.
I generated the data with the streaming benchmark's event generator, as in:
https://github.com/dataArtisans/databricks-benchmark/blob/master/src/main/scala/com/databricks/benchmark/flink/EventGenerator.scala
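Roughly, the job is wired up like the minimal sketch below; the SimpleGenerator here is a hypothetical stand-in for the benchmark's EventGenerator, not its real code:

    import org.apache.flink.streaming.api.functions.source.SourceFunction
    import org.apache.flink.streaming.api.scala._

    // Hypothetical stand-in for the benchmark's EventGenerator:
    // each parallel source instance emits an increasing counter as fast as it can.
    class SimpleGenerator extends SourceFunction[Long] {
      @volatile private var running = true

      override def run(ctx: SourceFunction.SourceContext[Long]): Unit = {
        var i = 0L
        while (running) {
          ctx.collect(i)
          i += 1
        }
      }

      override def cancel(): Unit = running = false
    }

    object GeneratorJobSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        env.setParallelism(96) // same total parallelism in both tests

        env
          .addSource(new SimpleGenerator)
          .map(_ % 1000)               // stand-in for the real transformations
          .print()

        env.execute("generator benchmark sketch")
      }
    }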
What I want to say is: keep the parallelism the same, say 96, and only change the number of TaskManagers and slots per TaskManager. In the first test I configured 3 TaskManagers with 32 slots each; no data skew occurred, the three machines received the same amount of data and each partition processed approximately the same amount. In the second test I configured 6 TaskManagers with 16 slots each; each partition again processed the same amount of data, but one machine processed more data than the other two.
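If it helps, assuming a standalone deployment, the difference between the two tests would look roughly like this (illustrative sketch only; listing a worker twice in conf/slaves is just one way to start two TaskManagers on the same machine):

    # Test 1: 3 TaskManagers (one per worker), 32 slots each -> 96 slots total
    # conf/slaves: worker1, worker2, worker3
    taskmanager.numberOfTaskSlots: 32

    # Test 2: 6 TaskManagers (two per worker), 16 slots each -> 96 slots total
    # conf/slaves: worker1, worker1, worker2, worker2, worker3, worker3
    taskmanager.numberOfTaskSlots: 16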
I wonder whether the TaskManager JVMs compete with each other when several of them run on one machine? Also, how does the streaming benchmark handle backpressure? I tested on a cluster of 4 nodes, one master and three workers, each node with an Intel Xeon E5-2699 v4 @ 2.20GHz/3.60GHz, 256 GB memory, 88 cores, and a 10 Gbps network, and I could not find the bottleneck. It confuses me!
Best Regards & Thanks
Rui
-----
stay hungry, stay foolish.