Hi, I am running Flink on Amazon EMR. In flink-conf.yaml, I have `metrics.reporter.prom.port: 9249-9250`. Depending on whether the job manager and task manager are running on the same node, the task manager metrics are reported on port 9250 (if running on the same node as the job manager) or on port 9249 (if running on a different node). Is there a way to configure Flink so that the task manager metrics are always reported on port 9250? I saw a post saying that we can "provide each *Manager with a separate configuration." How do I do that? Thanks
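The behavior described in the question is consistent with each process taking the first free port in the configured range. The following is a minimal Python sketch of that selection logic, not Flink's actual code; the function name `bind_first_free_port` is hypothetical and the assumption is that the reporter simply walks the range in ascending order.

```python
import socket


def bind_first_free_port(host, first, last):
    """Walk the range [first, last] in ascending order and bind the first free port.

    Returns the listening socket and the port it was bound to.
    Assumption: this mirrors how a reporter configured with a port range
    picks its port; it is an illustration, not Flink's implementation.
    """
    for port in range(first, last + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock, port
        except OSError:
            # Port already taken on this host, try the next one in the range.
            sock.close()
    raise RuntimeError(f"no free port in range {first}-{last}")
```

Calling this twice on the same host with the range 9249-9250 would give the first caller 9249 and the second 9250, while on a host where it is called only once the caller gets 9249. Under that assumption, the port a task manager ends up with depends on whether a job manager on the same node started first.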
|
Hi Deirdre, If you run multiple Flink component (JM/TM) processes on one physical node, it is recommended to specify a port range to avoid conflicts [1]. I guess this recommendation assumes that all processes share the same Flink binary installation package. If you want the TM reporter to always run on the same port, you can give the TM a separate Flink installation package and set this port explicitly in that package's configuration file. However, you still need to watch out for port conflicts with other components. The issue you referenced was handled by Chesnay, so Chesnay's opinion may be more appropriate. Pinging Chesnay for you. Thanks, vino. Deirdre Kong <[hidden email]> wrote on Fri, Aug 31, 2018 at 5:25 AM:
|
Hi Vino/Chesnay, Thank you for the info. I am actually using YARN for deployment. Flink is installed on AWS EMR, so sometimes the JM and TM processes are deployed in the same container, and sometimes they are deployed in different containers. I need to configure Prometheus to scrape a specific port for the JM and TM reporters, so that even if the JM is bounced, I can get the JM metrics on the same port each time. Can you elaborate on what you mean by specifying a separate Flink installation package for it? Chesnay, do you have any insights on this? Thanks, Deirdre On Thu, Aug 30, 2018 at 7:18 PM vino yang <[hidden email]> wrote:
|
Hi Deirdre, Sorry, I thought you were using a standalone cluster. If you are running on YARN, then the approach I described does not seem to work. Perhaps you could specify a larger port range so that no conflicts occur. I am curious: why do you want to fix the reporter's port? Perhaps Chesnay can give you advice. Thanks, vino. Deirdre Kong <[hidden email]> wrote on Fri, Aug 31, 2018 at 1:26 PM:
|
I don't know how/whether you can provide different flink-conf.yaml files when using YARN. But this is the only way to map specific ports to a specific JM/TM process. On 31.08.2018 08:15, vino yang wrote:
|
@Chesnay, can you elaborate on how to map specific ports to a specific JM/TM process? @Vino, I can only update the Prometheus configuration once. Say I set my port to 9249-9250 in flink-conf.yaml, and configure Prometheus to scrape <JM-IP>:9249 for JM metrics and <TM-IP>:9250 for TM metrics. If the JM and TM are deployed in the same container, then I have no issue. But if YARN deploys them in different containers, then the TM metrics will be exposed on port 9249 instead. Thanks, Deirdre On Thu, Aug 30, 2018 at 11:47 PM Chesnay Schepler <[hidden email]> wrote:
|
Or is there a way to specify on the command line that the JM and TM should run in different containers on YARN? On Thu, Aug 30, 2018 at 11:51 PM Deirdre Kong <[hidden email]> wrote:
|
Hi Deirdre, Usually, we don't recommend running the JM and TM in the same container. @Chesnay, right? I want to confirm: by "container", do you mean node? Thanks, vino. Deirdre Kong <[hidden email]> wrote on Fri, Aug 31, 2018 at 3:03 PM:
|
Hi Vino, Yeah, I mean node. Thanks, Deirdre On Fri, Aug 31, 2018 at 12:13 AM vino yang <[hidden email]> wrote:
|
Hi Deirdre, Flink does not support controlling where YARN containers are placed. This is the responsibility of YARN as the cluster manager. In YARN 3.1.0 it is possible to specify placement constraints for containers, but this won't fully solve your problem either. Imagine that you have a single-node YARN cluster which can fulfil your container requirements. In this case all containers need to be placed on the same node. I think we could extend Vino's proposal to YARN as well: maybe it makes sense to allow overriding certain configuration settings for the TaskManagers when deploying on YARN. That way one could define a fixed port for the JM and a port range for the TMs. With such a distinction you could configure Prometheus to scrape the single JM and the TMs individually. However, Flink does not yet support such a feature. You can open a JIRA issue to track the problem. At the moment you would need to distinguish whether it is a TM or JM based on the reported metrics. Cheers, Till On Fri, Aug 31, 2018 at 9:46 AM Deirdre Kong <[hidden email]> wrote:
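As a rough illustration of distinguishing a JM from a TM based on the reported metrics, the sketch below classifies a scraped endpoint by its metric names. It assumes the Prometheus reporter exposes names beginning with `flink_jobmanager_` or `flink_taskmanager_`; those prefixes and the function name `classify_flink_endpoint` are assumptions to check against your own `/metrics` output, not something confirmed in this thread.

```python
def classify_flink_endpoint(exposition_text):
    """Guess whether scraped metrics come from a JobManager or a TaskManager.

    Assumption: metric names start with 'flink_jobmanager_' or
    'flink_taskmanager_'. Returns 'jobmanager', 'taskmanager', or
    'unknown' (no matching names, or names of both kinds).
    """
    roles = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the label block '{' or at the first space.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith("flink_jobmanager_"):
            roles.add("jobmanager")
        elif name.startswith("flink_taskmanager_"):
            roles.add("taskmanager")
    return roles.pop() if len(roles) == 1 else "unknown"
```

Fed the text fetched from `<node-IP>:9249` or `<node-IP>:9250`, this would tell you which process answered on that port, whichever port it happened to bind.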
|