Hi,

I am wondering why, even though my Java main() method runs OK and exits with code 0, the YARN container status set by the enclosing Flink execution is FAILED with the diagnostic "Flink YARN Client requested shutdown."?

Command line:

flink run -m yarn-cluster -yn 20 -ytm 8192 -yqu batch1 -ys 8 --class <myMain> <myJar> <myParams>

End of YARN log:

Status of job 6ac47ddc8331ffd0b1fa9a3b5a551f86 (KUBERA-GEO-BRUT2SEGMENT) changed to FINISHED.
10:03:00,618 INFO org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Stopping YARN JobManager with status FAILED and diagnostic Flink YARN Client requested shutdown.
10:03:00,625 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Waiting for application to be successfully unregistered.
10:03:00,874 INFO org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy - Closing proxy : h1r2dn12.bpa.bouyguestelecom.fr:45454
(… more closing proxy …)
10:03:00,877 INFO org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy - Closing proxy : h1r2dn01.bpa.bouyguestelecom.fr:45454
10:03:00,883 INFO org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Stopping JobManager akka://flink/user/jobmanager#1737010364.
10:03:00,895 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
10:03:00,896 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
10:03:00,918 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.

End of log4j log:

2015:09:03 10:03:00 (main) - INFO - com.bouygtel.kuberasdk.main.Application.mainMethod - Fin ok traitement
2015:09:03 10:03:00 (Thread-14) - INFO - Classe Inconnue.Methode Inconnue - Shutting down FlinkYarnCluster from the client shutdown hook
2015:09:03 10:03:00 (Thread-14) - INFO - Classe Inconnue.Methode Inconnue - Sending shutdown request to the Application Master
2015:09:03 10:03:00 (flink-akka.actor.default-dispatcher-2) - INFO - Classe Inconnue.Methode Inconnue - Sending StopYarnSession request to ApplicationMaster.
2015:09:03 10:03:00 (flink-akka.actor.default-dispatcher-2) - INFO - Classe Inconnue.Methode Inconnue - Remote JobManager has been stopped successfully. Stopping local application client
2015:09:03 10:03:00 (flink-akka.actor.default-dispatcher-2) - INFO - Classe Inconnue.Methode Inconnue - Stopped Application client.
2015:09:03 10:03:00 (flink-akka.actor.default-dispatcher-15) - INFO - Classe Inconnue.Methode Inconnue - Shutting down remote daemon.
2015:09:03 10:03:00 (flink-akka.actor.default-dispatcher-15) - INFO - Classe Inconnue.Methode Inconnue - Remote daemon shut down; proceeding with flushing remote transports.
2015:09:03 10:03:00 (flink-akka.actor.default-dispatcher-15) - INFO - Classe Inconnue.Methode Inconnue - Remoting shut down.
2015:09:03 10:03:00 (Thread-14) - INFO - Classe Inconnue.Methode Inconnue - Deleting files in hdfs://h1r1nn01.bpa.bouyguestelecom.fr:8020/user/datcrypt/.flink/application_1441011714087_0730
2015:09:03 10:03:00 (Thread-15) - INFO - Classe Inconnue.Methode Inconnue - Application application_1441011714087_0730 finished with state FINISHED and final state FAILED at 1441267380623
2015:09:03 10:03:00 (Thread-14) - WARN - Classe Inconnue.Methode Inconnue - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2015:09:03 10:03:01 (Thread-14) - INFO - Classe Inconnue.Methode Inconnue - YARN Client is shutting down

Greetings,
Arnaud

The integrity of this message cannot be guaranteed on the Internet. The company that sent this message cannot therefore be held liable for its content nor attachments. Any unauthorized use or dissemination is prohibited. If you are not the intended recipient of this message, then please delete it and notify the sender.
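
The last client log lines report two separate values: the YARN application state (FINISHED, meaning the application has ended) and the final status that the ApplicationMaster reported when it unregistered (FAILED here). The following is a minimal sketch of pulling both fields out of such a line so they can be checked separately, for example in a wrapper script or a test. `YarnStatusLine` is a hypothetical helper written for illustration; it is not part of Flink or Hadoop, and its regular expression assumes the exact message wording shown in the log above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Flink or Hadoop class): extracts the application
// id, the YARN state, and the final status from one client log line.
final class YarnStatusLine {
    // Assumes the wording "Application <id> finished with state X and final state Y".
    private static final Pattern LINE = Pattern.compile(
            "Application (\\S+) finished with state (\\w+) and final state (\\w+)");

    final String applicationId;
    final String state;       // YARN application state, e.g. FINISHED
    final String finalState;  // status reported by the ApplicationMaster, e.g. FAILED

    private YarnStatusLine(String applicationId, String state, String finalState) {
        this.applicationId = applicationId;
        this.state = state;
        this.finalState = finalState;
    }

    // Returns null when the line does not contain the expected message.
    static YarnStatusLine parse(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new YarnStatusLine(m.group(1), m.group(2), m.group(3));
    }

    // True only when the application ended and was also reported as successful.
    boolean succeeded() {
        return "FINISHED".equals(state) && "SUCCEEDED".equals(finalState);
    }

    public static void main(String[] args) {
        String line = "2015:09:03 10:03:00 (Thread-15) - INFO - Classe Inconnue.Methode Inconnue - "
                + "Application application_1441011714087_0730 finished with state FINISHED "
                + "and final state FAILED at 1441267380623";
        YarnStatusLine s = parse(line);
        System.out.println(s.applicationId + " state=" + s.state
                + " finalState=" + s.finalState + " succeeded=" + s.succeeded());
    }
}
```

Applied to the line quoted above, `parse` yields state FINISHED and final state FAILED, so `succeeded()` returns false even though the job itself finished normally.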
Hi Arnaud,

I think that's a bug ;)
I'll file a JIRA to fix it for the next release.

On Thu, Sep 3, 2015 at 10:26 AM, LINZ, Arnaud <[hidden email]> wrote:
Hi Arnaud,

I've looked into the problem but I couldn't reproduce it using Flink 0.9.0, Flink 0.9.1 and the current master snapshot (f332fa5). I always ended up with the final state SUCCEEDED.

Which version of Flink were you using?

Best regards,
Max

On Thu, Sep 3, 2015 at 10:48 AM, Robert Metzger <[hidden email]> wrote:
> Hi Arnaud,
>
> I think that's a bug ;)
> I'll file a JIRA to fix it for the next release.
Hi,

Sorry for the long delay, I missed this mail.
I was using the 0.10 snapshot. I upgraded it today and it seems to work now; I get SUCCEEDED too.

Best regards,
Arnaud

-----Original Message-----
From: Maximilian Michels [mailto:[hidden email]]
Sent: Thursday, October 8, 2015 14:34
To: [hidden email]; LINZ, Arnaud <[hidden email]>
Subject: Re: Flink batch runs OK but Yarn container fails in batch mode with -m yarn-cluster

Hi Arnaud,

I've looked into the problem but I couldn't reproduce it using Flink 0.9.0, Flink 0.9.1 and the current master snapshot (f332fa5). I always ended up with the final state SUCCEEDED.

Which version of Flink were you using?

Best regards,
Max
Hi Arnaud,

No problem. Good to hear it is resolved :)

Best,
Max

On Tue, Oct 20, 2015 at 4:37 PM, LINZ, Arnaud <[hidden email]> wrote:
> Hi,
> Sorry for the long delay, I've missed this mail.
> I was using the 0.10 snapshot. I've upgraded it today and it seems to work now, I do have a SUCCEEDED too.
>
> Best regards,
> Arnaud