Hi,
I have a Flink 1.9.0 cluster deployed on AWS ECS. The cluster is running, but metrics are not showing in the UI.
For the other services (RPC / data) this works, because the connection is initiated from the TM to the JM through a load balancer. It does not work for metrics, where the JM tries to initiate a connection to the TMs.
Currently, Flink uses the taskmanager.host setting as both the bind address and the advertised address. When a TM starts, it binds to the internal Docker IP, which is not reachable from the JM.
Also, the TM's metrics.internal.query-service.port is set to a specific port, which ECS dynamically maps to a random host port.
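To make the setup concrete, here is a rough sketch of the two TM settings involved. The address and port values are placeholders chosen for illustration, not my exact configuration:

```yaml
# flink-conf.yaml on the TaskManager (illustrative values only)

# Used as both the bind address and the address advertised to the JM.
# Placeholder: an internal Docker IP, not routable from the JM.
taskmanager.host: 172.17.0.2

# Fixed container port for the metric query service.
# Placeholder value; ECS maps it to a random host port, which the JM does not know about.
metrics.internal.query-service.port: 6170
```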
It seems that I need separate settings for the bind address/port and the advertised address/port.
Can someone suggest a solution for this issue on AWS ECS?
Would appreciate your help.