Hello Everyone,
I'm new to Flink and I'm trying to upgrade from Flink 1.8 to Flink 1.11 on an EMR cluster. One of the differences I see after upgrading to 1.11 is that I no longer get any metrics. I found out that Flink 1.11 does not ship a jar containing org.apache.flink.metrics.statsd.StatsDReporterFactory in /usr/lib/flink/opt, which was the case for Flink 1.8. Does anyone have a pointer on where to find that jar, or on how to set up metrics in Flink 1.11?

Things I tried:
a) the setup below:
metrics.reporters: stsd
b) downloading the statsd jar from https://mvnrepository.com/artifact/org.apache.flink/flink-metrics-statsd and putting it inside the plugins/statsd directory.

Best,
Diwakar Jha.
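For reference, a minimal flink-conf.yaml sketch of a StatsD reporter setup on 1.11; the host and port are placeholders for the StatsD endpoint, "stsd" is simply the reporter name chosen above, and the exact keys should be checked against the 1.11 metrics documentation:

metrics.reporters: stsd
metrics.reporter.stsd.factory.class: org.apache.flink.metrics.statsd.StatsDReporterFactory
metrics.reporter.stsd.host: localhost
metrics.reporter.stsd.port: 8125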
With Flink 1.11 the reporters were refactored into plugins and are now accessible by default, so you no longer have to bother with copying jars around.
Your configuration appears to be correct, so I suggest taking a look at the log files.
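For example, grepping the JobManager log for the reporter can show whether it was picked up at startup; the path below is only a guess at a typical YARN container log location on EMR and will differ per setup:

grep -i "statsd" /var/log/hadoop-yarn/containers/<applicationId>/<containerId>/jobmanager.log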
On 10/25/2020 9:52 PM, Diwakar Jha wrote:
This is what I see in the Web UI:

23:19:24.263 [flink-akka.actor.default-dispatcher-1865] ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler - Failed to transfer file from TaskExecutor container_1603649952937_0002_01_000004.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$25(TaskExecutor.java:1742) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_252]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_252]
Caused by: org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
... 5 more
23:19:24.275 [flink-akka.actor.default-dispatcher-1865] ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler - Unhandled exception.
org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$25(TaskExecutor.java:1742) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_252]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_252]

I'd appreciate any pointers on this.

On Mon, Oct 26, 2020 at 10:45 AM Chesnay Schepler <[hidden email]> wrote:
Best,
Diwakar Jha.
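One thing that may be worth checking for this error (the path is only a guess at the usual YARN container log directory on EMR, using the application and container IDs that appear in this thread): whether a taskmanager.log file is being written at all on the node running that container, e.g.:

ls /var/log/hadoop-yarn/containers/application_1603649952937_0002/container_1603649952937_0002_01_000004/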
Hey Diwakar,

How are you deploying Flink on EMR? Are you using YARN? If so, you could also use log aggregation to see all the logs at once, from both the JobManager and the TaskManagers (yarn logs -applicationId <Application ID>; see the sketch below).

Could you post (or upload somewhere) all the logs you have from one run? It is much easier for us to debug something with the full logs: they show, for example, the classpath you are using, how you are deploying Flink, etc.

From the information available, my guess is that you have modified your deployment in some way (use of a custom logging version, a custom deployment method, a version mixup with jars from both Flink 1.8 and 1.11, ...).

Best,
Robert

On Tue, Oct 27, 2020 at 12:41 AM Diwakar Jha <[hidden email]> wrote:
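A concrete sketch of that command, using the application ID that appears in the stop attempt below; the output file name is arbitrary, and this assumes YARN log aggregation is enabled on the cluster:

yarn logs -applicationId application_1603649952937_0002 > flink-full-logs.txt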
Hi Robert,

Could you please correct me? I'm not able to stop the app. Also, I already stopped the Flink job.

sh-4.2$ yarn app -stop application_1603649952937_0002
2020-10-27 20:04:25,543 INFO client.RMProxy: Connecting to ResourceManager at ip-10-0-55-50.ec2.internal/10.0.55.50:8032
2020-10-27 20:04:25,717 INFO client.AHSProxy: Connecting to Application History server at ip-10-0-55-50.ec2.internal/10.0.55.50:10200
Exception in thread "main" java.lang.IllegalArgumentException: App admin client class name not specified for type Apache Flink
at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:597)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:126)
sh-4.2$

On Tue, Oct 27, 2020 at 9:34 AM Robert Metzger <[hidden email]> wrote:
Best,
Diwakar Jha.
Hello Everyone,

I'm able to get my Flink UI up and running (the earlier issue was related to the session manager plugin on my local laptop), but I'm not seeing any taskmanager/jobmanager logs in my Flink application. I have attached some yarn application logs taken while it's running, but I am not able to figure out how to stop the application and collect more logs. Could someone please help me figure this out? I'm running Flink 1.11 on an EMR 6.1 cluster.

On Tue, Oct 27, 2020 at 1:06 PM Diwakar Jha <[hidden email]> wrote:
Hello,

I see that in my classpath (below) I have both log4j-1 and log4j-api-2 jars. Is this why I'm not seeing any logs? If so, could someone suggest how to fix it?

export CLASSPATH=":lib/flink-csv-1.11.0.jar:lib/flink-json-1.11.0.jar:lib/flink-shaded-zookeeper-3.4.14.jar:lib/flink-table-blink_2.12-1.11.0.jar:lib/flink-table_2.12-1.11.0.jar:lib/log4j-1.2-api-2.12.1.jar:lib/log4j-api-2.12.1.jar:lib/log4j-core-2.12.1.jar:lib/

export _FLINK_CLASSPATH=":lib/flink-csv-1.11.0.jar:lib/flink-json-1.11.0.jar:lib/flink-shaded-zookeeper-3.4.14.jar:lib/flink-table-blink_2.12-1.11.0.jar:lib/flink-table_2.12-1.11.0.jar:lib/log4j-1.2-api-2.12.1.jar:lib/log4j-api-2.12.1.jar:lib/log4j-core-2.12.1.jar:lib/log4j-slf4j-impl-2.12.1.jar:flink-dist_2.12-1.11.0.jar:flink-conf.yaml:"

thanks.

On Thu, Oct 29, 2020 at 6:21 PM Diwakar Jha <[hidden email]> wrote:
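As a quick sanity check (the install path is taken from the /usr/lib/flink location mentioned elsewhere in this thread, so adjust if yours differs), listing the log4j jars that the Flink installation actually ships can show whether two different log4j versions are mixed together:

ls /usr/lib/flink/lib | grep -i log4j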
Hi,

I wanted to check if anyone can help me with the logs. I have sent several emails but haven't gotten any response. I'm running Flink 1.11 on EMR 6.1 and am trying to upgrade from Flink 1.8 to Flink 1.11. I don't see any logs, though I get this stdout error:

18:29:19.834 [flink-akka.actor.default-dispatcher-28] ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler - Failed to transfer file from TaskExecutor container_1604033334508_0001_01_000004.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.

Thanks!

On Fri, Oct 30, 2020 at 9:04 AM Diwakar Jha <[hidden email]> wrote:
Hey Diwakar,

The logs you are providing still don't contain the full Flink logs. You cannot stop Flink on YARN using "yarn app -stop application_1603649952937_0002". To stop Flink on YARN, use "yarn application -kill <appId>", as sketched below.

On Sat, Oct 31, 2020 at 6:26 PM Diwakar Jha <[hidden email]> wrote:
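For example, with the application ID from the failed stop attempt above (note that -kill terminates the whole YARN application; if a graceful shutdown of the job itself is wanted, cancelling it through the Flink CLI or Web UI first is the usual route):

yarn application -kill application_1603649952937_0002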
Hi Robert,

I'm able to see the taskmanager and jobmanager logs after I changed the log4j.properties file (/usr/lib/flink/conf). It seems to be a problem with the EMR 6.1 distribution: the log4j.properties file that comes with EMR 6.1 is different from the one in the Flink package that I downloaded. I replaced the log4j.properties and it's working. Thanks for helping me debug the issue.

Best,
Diwakar

On Tue, Nov 3, 2020 at 11:36 AM Robert Metzger <[hidden email]> wrote:
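For anyone hitting the same thing, a rough sketch of the fix described above; the unpack location of the downloaded distribution is an assumption, and the application has to be restarted so the JobManager and TaskManagers pick up the new logging config:

# back up the log4j.properties shipped with EMR 6.1
sudo cp /usr/lib/flink/conf/log4j.properties /usr/lib/flink/conf/log4j.properties.emr.bak
# copy the file from the downloaded Flink 1.11 distribution (unpack location is an assumption)
sudo cp ~/flink-1.11.0/conf/log4j.properties /usr/lib/flink/conf/log4j.properties
# then restart the Flink application on YARN so the new logging configuration takes effect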