Re: Jobmanager was killed when disk less 10% in yarn

Re: Jobmanager was killed when disk less 10% in yarn

Zhijiang(wangzhijiang999)
The log only shows that the SignalHandler handled the kill signal and that the JobManager process exited; the reason for the kill cannot be determined from it.
You may check the container log on the NodeManager to find out why it was killed.

Best,

Zhijiang
------------------------------------------------------------------
From: lining jing <[hidden email]>
Sent: Monday, February 20, 2017, 10:13
To: user <[hidden email]>
Subject: Jobmanager was killed when disk less 10% in yarn

Hi,

I use YARN to manage resources. Recently, when free disk space dropped below 10%, the JobManager was killed. I want to know whether the disk is the cause.


Log:


2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard root cache directory /tmp/flink-web-dfa2b369-44ea-4e35-8011-672a1e627a10
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,137 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard jar upload directory /tmp/flink-web-upload-d6edb5ea-5894-489b-89f7-f2972fc9433d
2017-02-19 03:20:37,138 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:54513




Re: Jobmanager was killed when disk less 10% in yarn

lining jing
I have looked at the log but did not find any information about the cause. All I have is some information about the machine that runs this node: its free disk space was below 10%.

2017-02-20 14:03 GMT+08:00 wangzhijiang999 <[hidden email]>:
The log only shows that the SignalHandler handled the kill signal and that the JobManager process exited; the reason for the kill cannot be determined from it.
You may check the container log on the NodeManager to find out why it was killed.

Best,

Zhijiang





Re: Jobmanager was killed when disk less 10% in yarn

Stephan Ewen
Zhijiang is right, it is not possible to tell this from these logs.

The Yarn logs probably hold the cause for this.
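
As a rough illustration of the suggestion above, here is a minimal Python sketch for scanning a NodeManager or container log once it has been copied locally (for aggregated container logs, `yarn logs -applicationId <application ID>` can fetch them). The function names `find_suspect_lines` and `scan_file`, the file path, and the keywords are all hypothetical choices for this example; nothing in the thread says which messages YARN actually wrote in this case.

```python
from typing import Iterable, List


def find_suspect_lines(lines: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Return the log lines that mention any of the keywords (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line.lower() for keyword in lowered)
    ]


def scan_file(path: str, keywords: Iterable[str]) -> List[str]:
    """Read a log file and return the lines that match any keyword.

    The path is whatever log file you copied from the NodeManager host;
    undecodable bytes are replaced rather than raising an error.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_suspect_lines(handle, list(keywords))
```

A call such as `scan_file("nodemanager.log", ["<JobManager container ID>", "disk", "unhealthy"])` would list candidate lines to read by hand. Searching for disk-related lines is only a guess prompted by the original question: the NodeManager does have a disk health checker (the `yarn.nodemanager.disk-health-checker.*` settings), but the thread does not confirm that it was the cause here.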

On Mon, Feb 20, 2017 at 9:21 AM, lining jing <[hidden email]> wrote:
I have looked at the log but did not find any information about the cause. All I have is some information about the machine that runs this node: its free disk space was below 10%.
