Hi, I'm running flink jobmanagers/taskmanagers with yarn. I've turned on the JMX reporter in my flink-conf.yaml as follows:

    metrics.reporters: jmx
    metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter

I was wondering: is there a JMX server with the aggregated stats across all jobs / tasks? If so, where is it located? It appears that a JMX server starts for every single taskmanager, and the jobmanagers do not have the data reported from the taskmanagers.

I'm not sure if this is related, but when I try to specify a port for the JMX reporter, like this:

    metrics.reporter.jmx.port: 8789

I get an error where the JMX servers from different taskmanagers fight for that port and fail to start.
Sorry: neglected to include the stack trace for JMX failing to instantiate from a taskmanager:

017-08-05 00:59:09,388 INFO org.apache.flink.runtime.metrics.MetricRegistry - Configuring JMXReporter with {port=8789, class=org.apache.flink.metrics.jmx.JMXReporter}.
2017-08-05 00:59:09,402 ERROR org.apache.flink.runtime.metrics.MetricRegistry - Could not instantiate metrics reporter jmx. Metrics might not be exposed/reported.
java.lang.RuntimeException: Could not start JMX server on any configured port. Ports: 8789
    at org.apache.flink.metrics.jmx.JMXReporter.open(JMXReporter.java:127)
    at org.apache.flink.runtime.metrics.MetricRegistry.<init>(MetricRegistry.java:120)
    at org.apache.flink.runtime.taskmanager.TaskManager$.createTaskManagerComponents(TaskManager.scala:2114)
    at org.apache.flink.runtime.taskmanager.TaskManager$.startTaskManagerComponentsAndActor(TaskManager.scala:1873)
    at org.apache.flink.runtime.taskmanager.TaskManager$.runTaskManager(TaskManager.scala:1769)
    at org.apache.flink.runtime.taskmanager.TaskManager$.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala:1637)
    at org.apache.flink.runtime.taskmanager.TaskManager.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala)
    at org.apache.flink.yarn.YarnTaskManagerRunner$1.call(YarnTaskManagerRunner.java:146)
    at org.apache.flink.yarn.YarnTaskManagerRunner$1.call(YarnTaskManagerRunner.java:142)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:142)
    at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:64)
    at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala)

On Fri, Aug 4, 2017 at 3:51 PM, Ajay Tripathy <[hidden email]> wrote:
Hello,
there is no central place where JMX metrics are aggregated.

You can configure a port range for the reporter to prevent port conflicts on the same machine:

    metrics.reporter.jmx.port: 8789-8790

You can find out which port was used by checking the logs.

Regards,
Chesnay

On 05.08.2017 03:06, Ajay Tripathy wrote:
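
To illustrate why the port range suggested in Chesnay's reply avoids the conflict, here is a minimal Java sketch of the general "try each port in the range until one binds" idea. This is not Flink's JMXReporter code. The class name PortRangeBinder and both method names are hypothetical, and a plain ServerSocket stands in for the JMX server. It assumes the range is written as "start-end" or as a single port.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Flink code base.
public class PortRangeBinder {

    /** Expands a spec such as "8789-8790" (or a single port "8789") into a list of ports. */
    public static List<Integer> parsePorts(String spec) {
        List<Integer> ports = new ArrayList<>();
        String trimmed = spec.trim();
        int dash = trimmed.indexOf('-');
        if (dash < 0) {
            ports.add(Integer.parseInt(trimmed));
        } else {
            int start = Integer.parseInt(trimmed.substring(0, dash).trim());
            int end = Integer.parseInt(trimmed.substring(dash + 1).trim());
            for (int port = start; port <= end; port++) {
                ports.add(port);
            }
        }
        return ports;
    }

    /** Tries the ports in order and returns a socket bound to the first free one. */
    public static ServerSocket bindFirstFree(List<Integer> ports) throws IOException {
        for (int port : ports) {
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                // Port is already taken (e.g. by another process on this host); try the next one.
            }
        }
        throw new IOException("Could not bind to any configured port. Ports: " + ports);
    }

    public static void main(String[] args) {
        List<Integer> ports = parsePorts("8789-8790");
        // Two "taskmanagers" on the same machine: the second one falls through to the next port.
        try (ServerSocket first = bindFirstFree(ports);
             ServerSocket second = bindFirstFree(ports)) {
            System.out.println("first bound to " + first.getLocalPort());
            System.out.println("second bound to " + second.getLocalPort());
        } catch (IOException e) {
            System.out.println("no free port in range: " + e.getMessage());
        }
    }
}
```

With a single configured port, the second bind attempt has nowhere to fall through to, which matches the "Could not start JMX server on any configured port" error in the stack trace. The range needs at least as many ports as there are taskmanagers that can land on one machine.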