Thank you~
Xintong Song
Hi Arvid,

Thanks for the advice. I removed the quotes and it did create a YARN session on EMR, but I didn't find any JIT log file generated. The config with quotes works on a standalone cluster. I also tried to pass the property dynamically in the yarn-session command:
flink-yarn-session -n 1 -d -nm testSession -yD env.java.opts="-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"
but I get the same result: the session is created, but I cannot find any JIT log file under the container logs.
Thanks
Jacky
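
A side note on the command above, offered as a possibility rather than a confirmed diagnosis: when a value is wrapped in double quotes on the command line, the invoking shell expands ${FLINK_LOG_PREFIX} before Flink ever sees it. If that variable is not set in the client shell, the -XX:LogFile value would be reduced to ".jit". The sketch below only illustrates standard shell quoting. The variable is deliberately unset to mimic that situation, and whether this explains the missing file on EMR is an assumption.

```shell
#!/bin/sh
# Illustration of shell quoting only; nothing here calls Flink or YARN.
# Assumption: FLINK_LOG_PREFIX is not set in the client shell.
unset FLINK_LOG_PREFIX

# Double quotes: the shell substitutes the (empty) variable immediately.
double_quoted="-XX:LogFile=${FLINK_LOG_PREFIX}.jit"

# Single quotes: the text is passed through literally, unexpanded.
single_quoted='-XX:LogFile=${FLINK_LOG_PREFIX}.jit'

echo "double quotes -> $double_quoted"
echo "single quotes -> $single_quoted"
```

Printing the value this way before launching the session is a cheap check of what is actually handed to the client.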
Arvid Heise <[hidden email]> wrote on Tue, May 12, 2020 at 12:57 PM:

Hi Jacky,

I suspect that the quotes are the actual issue. Could you try to remove them? See also [1].

On Tue, May 12, 2020 at 4:03 PM Jacky D <[hidden email]> wrote:

Hi Xintong,

Thanks for the reply. I have attached the lines around the Application Master start command below:

2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory - Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory - Using crypto codec org.apache.hadoop.crypto.JceAesCtrCryptoCodec.
2020-05-11 21:16:16,636 DEBUG org.apache.hadoop.hdfs.DataStreamer - DataStreamer block BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet packet seqno: 0 offsetInBlock: 0 lastPacketInBlock: false lastByteOffsetInBlock: 1697
2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer - DFSClient seqno: 0 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0
2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer - DataStreamer block BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet packet seqno: 1 offsetInBlock: 1697 lastPacketInBlock: true lastByteOffsetInBlock: 1697
2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer - DFSClient seqno: 1 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0
2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer - Closing old block BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315
2020-05-11 21:16:16,641 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #70 org.apache.hadoop.hdfs.protocol.ClientProtocol.complete
2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #70
2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: complete took 2ms
2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #71 org.apache.hadoop.hdfs.protocol.ClientProtocol.setTimes
2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #71
2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: setTimes took 2ms
2020-05-11 21:16:16,647 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #72 org.apache.hadoop.hdfs.protocol.ClientProtocol.setPermission
2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #72
2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: setPermission took 2ms
2020-05-11 21:16:16,654 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor - Application Master start command: $JAVA_HOME/bin/java -Xmx424m "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly" -Dlog.file="<LOG_DIR>/jobmanager.log" -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint 1> <LOG_DIR>/jobmanager.out 2> <LOG_DIR>/jobmanager.err
2020-05-11 21:16:16,654 DEBUG org.apache.hadoop.ipc.Client - stopping client from cache: org.apache.hadoop.ipc.Client@28194a50
2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setApplicationTags.
2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setAttemptFailuresValidityInterval.
2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setKeepContainersAcrossApplicationAttempts.
2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setNodeLabelExpression.

Xintong Song <[hidden email]> wrote on Mon, May 11, 2020 at 10:11 PM:

Hi Jacky,

Could you search for "Application Master start command:" in the debug log and post the result and a few lines before & after that? This is not included in the clip of the attached log file.

Thank you~
Xintong Song
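
To make the quoting concern concrete: in the logged "Application Master start command", all the -XX options sit inside one pair of double quotes. When a shell interprets such a line, a quoted string containing spaces becomes a single argument instead of several, which would be consistent with the JVM rejecting it and printing its usage message. The sketch below shows only the shell's word splitting. count_args is a hypothetical helper introduced for illustration, and no JVM is started.

```shell
#!/bin/sh
# Illustration of shell word splitting only; it does not start a JVM.
# count_args is a hypothetical helper that prints how many arguments it received.
count_args() { echo "$#"; }

# Quoted: one argument containing a space, as in the logged start command.
quoted=$(count_args "-XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation")

# Unquoted: the shell splits on the space and passes two arguments.
unquoted=$(count_args -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation)

echo "quoted: $quoted argument(s)"
echo "unquoted: $unquoted argument(s)"
```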
On Tue, May 12, 2020 at 5:33 AM Jacky D <[hidden email]> wrote:

Hi Robert,

Thanks so much for the quick reply. I changed the log level to DEBUG and attached the log file.

Thanks
Jacky

Robert Metzger <[hidden email]> wrote on Mon, May 11, 2020 at 4:14 PM:

Thanks a lot for posting the full output.

It seems that Flink is passing an invalid list of arguments to the JVM.

Can you
- set the root log level in conf/log4j-yarn-session.properties to DEBUG
- then launch the YARN session
- share the log file of the yarn session on the mailing list?

I'm particularly interested in the line printed here, as it shows the JVM invocation.

On Mon, May 11, 2020 at 9:56 PM Jacky D <[hidden email]> wrote:

Hi Robert,

Yes, I tried to retrieve more log info from the YARN UI. The full logs are shown below. This happens when I try to create a Flink YARN session on EMR with the JITWatch configuration set up.

2020-05-11 19:06:09,552 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli - Error while running the Flink Yarn session.
java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:813)
Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:429)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:610)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:813)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    ... 2 more
Caused by: org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1584459865196_0165 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1584459865196_0165_000001 exited with exitCode: 1
Failing this attempt. Diagnostics: Exception from container-launch.
Container id: container_1584459865196_0165_01_000001
Exit code: 1
Exception message: Usage: java [-options] class [args...]
           (to execute a class)
   or  java [-options] -jar jarfile [args...]
           (to execute a jar file)
where options include:
    -d32          use a 32-bit data model if available
    -d64          use a 64-bit data model if available
    -server       to select the "server" VM
                  The default VM is server,
                  because you are running on a server-class machine.
    -cp <class search path of directories and zip/jar files>
    -classpath <class search path of directories and zip/jar files>
                  A : separated list of directories, JAR archives,
                  and ZIP archives to search for class files.
    -D<name>=<value>
                  set a system property
    -verbose:[class|gc|jni]
                  enable verbose output
    -version      print product version and exit
    -version:<value>
                  Warning: this feature is deprecated and will be removed
                  in a future release.
                  require the specified version to run
    -showversion  print product version and continue
    -jre-restrict-search | -no-jre-restrict-search
                  Warning: this feature is deprecated and will be removed
                  in a future release.
                  include/exclude user private JREs in the version search
    -? -help      print this help message
    -X            print help on non-standard options
    -ea[:<packagename>...|:<classname>]
    -enableassertions[:<packagename>...|:<classname>]
                  enable assertions with specified granularity
    -da[:<packagename>...|:<classname>]
    -disableassertions[:<packagename>...|:<classname>]
                  disable assertions with specified granularity
    -esa | -enablesystemassertions
                  enable system assertions
    -dsa | -disablesystemassertions
                  disable system assertions
    -agentlib:<libname>[=<options>]
                  load native agent library <libname>, e.g. -agentlib:hprof
                  see also, -agentlib:jdwp=help and -agentlib:hprof=help
    -agentpath:<pathname>[=<options>]
                  load native agent library by full pathname
    -javaagent:<jarpath>[=<options>]
                  load Java programming language agent, see java.lang.instrument
    -splash:<imagepath>
                  show splash screen with specified image
See http://www.oracle.com/technetwork/java/javase/documentation/index.html for more details.

Thanks
Jacky

Robert Metzger <[hidden email]> wrote on Mon, May 11, 2020 at 3:42 PM:

Hey Jacky,

The error says "The YARN application unexpectedly switched to state FAILED during deployment."

Have you tried retrieving the YARN application logs?
Do the YARN UI / resource manager logs reveal anything about the reason for the deployment failure?

Best,
Robert

On Mon, May 11, 2020 at 9:34 PM Jacky D <[hidden email]> wrote:

---------- Forwarded message ---------
From: Jacky D <[hidden email]>
Date: Mon, May 11, 2020 at 3:12 PM
Subject: Re: Flink Memory analyze on AWS EMR
To: Khachatryan Roman <[hidden email]>

Hi Roman,

Thanks for the quick response. I tried without the LogFile option, but it failed with the same error. I'm currently using Flink 1.6 (https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/application_profiling.html), so I can only use JITWatch or JMC. I guess those tools are only available on a standalone cluster, as the documentation mentions "Each standalone JobManager, TaskManager, HistoryServer, and ZooKeeper daemon redirects stdout and stderr to a file with a .out filename suffix and writes internal logging to a file with a .log suffix. Java options configured by the user in env.java.opts"?

Thanks
Jacky
--
Arvid Heise | Senior Java Developer
Follow us @VervericaData
--
Join Flink Forward - The Apache Flink Conference
Stream Processing | Event Driven | Real Time
--
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng