Re: All but one TMs connect when JM has more than 16G of memory
Posted by Robert Schmidtke
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/All-but-one-TMs-connect-when-JM-has-more-than-16G-of-memory-tp2974p2977.html
Hi Robert,
thanks for your reply. It got me digging into my setup, and I discovered that one TM was scheduled next to the JM. The documentation suggests that -yn 7 specifies the number of TMs (of which I wanted 7), and I thought an additional container would be used for the JM (my YARN cluster has 8 containers). With this setup the memory on that shared node added up to 56G plus 1M (40G for the TM and 16G plus 1M for the JM), but I had set a hard maximum of 56G in my yarn-site.xml, which is why the request could not be fulfilled.

It is interesting to note that when I set both yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb to 56G, I get a proper error when requesting 56G plus 1M; but when I set yarn.nodemanager.resource.memory-mb to 56G and yarn.scheduler.maximum-allocation-mb to 54G, I don't get an error, only the aforementioned endless loop. Note that I have yarn.nodemanager.vmem-check-enabled set to false. So this is probably a YARN issue, or my bad configuration.
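For illustration, here is a minimal sketch of the configuration described above. The megabyte values (57344 = 56G, 55296 = 54G, 40960 = 40G, 16385 = 16G plus 1M) and the -ytm/-yjm options of the per-job yarn-cluster mode are my assumptions to match the figures in this thread, not taken verbatim from my actual files:

    <!-- yarn-site.xml (illustrative values) -->
    <property>
      <!-- total memory YARN may allocate on one node: 56G -->
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>57344</value>
    </property>
    <property>
      <!-- largest single container request; with 56G here an oversized
           request fails fast, with 54G (55296) it loops endlessly -->
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>57344</value>
    </property>
    <property>
      <!-- virtual memory check disabled in my setup -->
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>

    # submitting with 7 TMs at 40G each and a JM of 16G plus 1M
    ./bin/flink run -m yarn-cluster -yn 7 -ytm 40960 -yjm 16385 ...

With these numbers, the node that hosts both the JM and one TM needs 40960 + 16385 = 57345 MB, which is 1 MB over the 56G node cap, so YARN can never satisfy the last container request.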
I'm in a rush now (to get to the Flink meetup), so I will check the documentation later to see how to deploy the TMs and the JM on separate machines each, since that is not what is happening at the moment but it is what I would like to have. Thanks again and see you in an hour.
Cheers
Robert