Hi, We noticed that we couldn't parallelize our flink docker containers, and this looks like an issue that others have experienced. In our environment we were not setting any hostname in the flink configuration. This worked for a single node, but the taskmanagers would hit an exception similar to the one others have reported:
In our AWS environment we are only running one container per EC2 instance, and each instance has a "unique-dns-address" associated with it. The unique-dns-address is similar to ip-XXX-XX-XX-XXX.aws-region-X . So that we don't have to do any additional DNS configuration, it would be convenient to use this dns address for the taskmanagers to talk to each other. I tested that I could reach each taskmanager at its unique-dns-address by using telnet against one of the taskmanager ports, and I was able to connect. This made me think that setting taskmanager.hostname to the address would solve my issue. However, when I tried to set taskmanager.hostname : unique-dns-address in flink-conf.yaml I ended up with a java.net.BindException: Cannot assign requested address . I'm not entirely sure why this happened. But I looked around and found another list message that mentioned https://doc.akka.io/docs/akka/ So I set akka.remote.netty.tcp.hostname . I realize this is a complicated issue which varies for each environment, but I am asking for advice on what else I should try in order to tackle it. Furthermore, if I'm on the right track, what taskmanager service port should correspond to akka.remote.netty.tcp.port ?
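A java.net.BindException: Cannot assign requested address generally means that the process tried to bind a socket to an address that is not assigned to any network interface visible to it. One way to check this from inside the container is the rough Python sketch below. It is not part of Flink, the helper name can_bind is made up for illustration, and it only looks at IPv4.

```python
import socket


def can_bind(hostname: str) -> bool:
    """Return True if `hostname` resolves to an IPv4 address that this
    machine (or container) can bind a TCP socket to.

    Hypothetical diagnostic helper, not a Flink or Akka API.
    """
    try:
        # Resolve the name to a single IPv4 address.
        address = socket.gethostbyname(hostname)
    except OSError:
        # The name does not resolve at all.
        return False

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Port 0 lets the OS pick any free port; only the address matters.
            sock.bind((address, 0))
        except OSError:
            # Typically "Cannot assign requested address": the address is
            # not assigned to any local interface.
            return False
    return True
```

Calling can_bind with the unique-dns-address from inside the taskmanager container should show whether that name is bindable there. If it returns False while the same call on the EC2 host returns True, the container's network namespace does not own that address.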
Hi Colin,
Is each instance's "unique-dns-address" equal to the hostname of the instance, or is the hostname something else? If it's different from the hostname, you're correct in assuming you need to configure each node to advertise its unique-dns-address instead. Are the unique-dns-addresses aliases for public or private IPs? I.e., in your example of a unique-dns-address, do the X's map to the private IP of the instance or to some public IP? If I recall correctly, in AWS (at least within a VPC), instances' public IPs are not actually bound to the instance itself and are more like a NAT/DMZ address, meaning you can't actually bind a port to them. This might work differently in EC2-Classic. If you ensure that each node advertises a bindable, resolvable name or IP address (with jobmanager.rpc.address on the jobmanager and taskmanager.hostname on the taskmanager), then they should all be able to discover, address, and communicate with each other with no problems. -- Patrick Lucas On Tue, Nov 21, 2017 at 6:44 AM, Colin Williams <[hidden email]> wrote:
Hi Patrick, "unique-dns-address" is an alias for the private IP. If XXX-XX-XX-XXX is the private IP, then ip-XXX-XX-XX-XXX.aws- What worked, given the above, was setting docker --net=host , after which we saw that "unique-dns-address" was set as the taskmanagers' akka address. Given we are ok with 1 taskmanager container per host, this all worked out. Thanks, Colin Williams On Tue, Nov 21, 2017 at 6:14 AM, Patrick Lucas <[hidden email]> wrote: