Hi all,
I have a Flink cluster (1.4.0) built from Flink's Docker image, with 1 job manager and 2 task managers. I'm trying to use logback instead of log4j and, as far as the cluster configuration goes, everything seems alright. Following https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/best_practices.html I removed log4j-1.2.xx.jar and slf4j-log4j12-xxx.jar from lib/ and added logback-classic.jar, logback-core.jar and log4j-over-slf4j.jar. This is a snippet from the logs:

2018-06-26 11:02:47.760 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login
2018-06-26 11:02:47.761 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login commit
2018-06-26 11:02:47.764 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - using local user:UnixPrincipal: flink
2018-06-26 11:02:47.764 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - Using user: "UnixPrincipal: flink" with name flink
2018-06-26 11:02:47.765 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - User entry: "flink"
2018-06-26 11:02:47.765 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - Assuming keytab is managed externally since logged in from subject.
2018-06-26 11:02:47.766 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - UGI loginUser:flink (auth:SIMPLE)
2018-06-26 11:02:47.766 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - Current Hadoop/Kerberos user: flink
2018-06-26 11:02:47.766 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.151-b12
2018-06-26 11:02:47.766 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - Maximum heap size: 981 MiBytes
2018-06-26 11:02:47.766 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - JAVA_HOME: /docker-java-home/jre
2018-06-26 11:02:47.771 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - Hadoop version: 2.8.0
2018-06-26 11:02:47.771 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - JVM Options:
2018-06-26 11:02:47.771 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - -Xms1024m
2018-06-26 11:02:47.771 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - -Xmx1024m
2018-06-26 11:02:47.771 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
2018-06-26 11:02:47.772 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
2018-06-26 11:02:47.772 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - Program Arguments:
2018-06-26 11:02:47.772 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - --configDir
2018-06-26 11:02:47.772 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - /opt/flink/conf
2018-06-26 11:02:47.772 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - --executionMode
2018-06-26 11:02:47.772 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - cluster
2018-06-26 11:02:47.772 [main] INFO org.apache.flink.runtime.jobmanager.JobManager - Classpath:
/opt/flink/lib/flink-python_2.11-1.4.0.jar:/opt/flink/lib/flink-shaded-hadoop2-uber-1.4.0.jar:/opt/flink/lib/log4j-over-slf4j.jar:/opt/flink/lib/logback-classic.jar:/opt/flink/lib/logback-core.jar:/opt/flink/lib/flink-dist_2.11-1.4.0.jar:::

Above you can see that the logback jars are properly on the classpath and that logback-console.xml is being used as the configuration file. This is the content of logback-console.xml:

<configuration>
  <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{60} %X{sourceThread} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="DEBUG">
    <appender-ref ref="console"/>
  </root>
</configuration>

which also explains the DEBUG logs seen above: setting the level to DEBUG is working fine.

Now comes the weirdest part. This is part of the pipeline:

import org.slf4j.Logger;

As you can see, I'm importing the SLF4J logger and logger factory and instantiating the logger. Then I just log at all 5 levels. I can't see ANY of these logs in the job manager / task manager logs. They work fine locally, though:

13:16:11.030 [main] INFO Pipeline - **************************************************** INFO
13:16:11.035 [main] WARN Pipeline - **************************************************** WARN
13:16:16.239 [main] ERROR Pipeline - **************************************************** ERROR
13:16:16.239 [main] DEBUG Pipeline - **************************************************** DEBUG

It's not the log level, as I can clearly see DEBUG and INFO logs from Flink, and logback-console.xml is alright. The logs are just not getting to the cluster for some reason. Then I tried this:

ch.qos.logback.classic.Logger testLog = (ch.qos.logback.classic.Logger) LOGGER;

and got this:

2018-06-26 11:24:27.268 [jobmanager-future-thread-5] DEBUG org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - Error while handling request.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: Could not run the jar.
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:90)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not run the jar.
    ... 9 common frames omitted
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:542)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
    at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:83)
    at org.apache.flink.client.program.ClusterClient.getOptimizedPlan(ClusterClient.java:334)
    at org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:76)
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:69)
    ... 8 common frames omitted
Caused by: java.lang.Exception: ************* logger is not NULL DEBUG 1656864061
    at io.intellisense.MongoPipeline.main(MongoPipeline.java:79)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
    ... 13 common frames omitted

This means the logger was successfully set up and picked up the DEBUG level from conf/logback-console.xml. And again, I was only able to see this error because it came from org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler, not from my pipeline. I'm running out of options; does anyone have any idea?
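For reference, the logging part of the pipeline boils down to roughly the following sketch (the class name and message text are illustrative, not the actual io.intellisense pipeline code):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Pipeline {

    // Instantiate the logger through the SLF4J factory, as described above.
    private static final Logger LOGGER = LoggerFactory.getLogger(Pipeline.class);

    public static void main(String[] args) {
        // Log at all five SLF4J levels. Locally these all show up on the
        // console; on the cluster, none of them appear in the job manager
        // or task manager logs.
        LOGGER.trace("**************************************************** TRACE");
        LOGGER.debug("**************************************************** DEBUG");
        LOGGER.info("**************************************************** INFO");
        LOGGER.warn("**************************************************** WARN");
        LOGGER.error("**************************************************** ERROR");
    }
}
```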