Hi, I'm confused about slots communication in same taskmanager. Assume only one job which running on per-job cluster with parallalism = 6. Each taskmanager with 3 slot. There are 6 slot: slot 1-1, slot 1-2, slot 1-3, slot 2-1, slot 2-2 , slot 2-3 Assume the job has 'KeyBy' operator, thus, eash 'keyby' in each slot will distribute data into all downstream opeators in every slots. I know that Tasks in the same JVM share TCP connections (via multiplexing) and heartbeat messages That means: slot 1-1 , slot 1-2, slot 1-3 share TCP connections communicate with slot 2-1, slot 2-2, slot 2-3 My question is: how about communication type between slot 1-1 and slot 1-2? by thread-to-thread? or by network?
|
Hi, If tasks end up in the same TaskManager, they us LocalInputChannel(s), which does not go through network, but reads directly from local partitions. I am also pulling in @Piotr who might give you some more
insights, or correct me if I am wrong. Best, Dawid On 26/01/2021 08:18, 耿延杰 wrote:
signature.asc (849 bytes) Download Attachment |
Hi, Yes Dawid is correct. Communications between two tasks on the same TaskManager are not going through the network, but via "local" channel (`LocalInputChannel`). It's still serialising and deserializing the data, but there are no network overheads, and local channels have only half of the memory consumption of the "remote" channels. FYI, if you look at the network metrics, there are a couple of metrics that make distinction between local/remote data (like `numBytesInLocal` counter). > slot 1-1 , slot 1-2, slot 1-3 share TCP connections communicate with slot 2-1, slot 2-2, slot 2-3 Yes, there is only a single TCP connection shared between a pair of TaskManagers. Piotrek wt., 26 sty 2021 o 12:43 Dawid Wysakowicz <[hidden email]> napisał(a):
|
Free forum by Nabble | Edit this page |