Logs are not easy to read through webUI

Xinyu Zhang
Hi all

We run Flink 1.4 on YARN.

When a streaming job runs for a long time, the web UI can no longer show its logs. This may be because the log file has grown too large.

However, if we use the DailyRollingFileAppender in log4j.properties to roll the logs daily, we can never see yesterday's log in the web UI.

Are there any ideas for making the logs easier to read?

Maybe we should add an interface that supports reading logs by time interval. Besides, when we fetch the TaskManager logs through the web UI, the JobManager could redirect to a URL on the TaskManager from which users can get the logs directly (just like for MR tasks), rather than downloading the logs from the TaskManager and then sending them on to the user.

Thanks!

Xinyu Zhang



Re: Logs are not easy to read through webUI

vino yang
Hi Xinyu,

This is indeed a problem. When the volume of logs is large, it can even cause the UI to stall for a long time. The same is true for YARN, and there is really no good solution at the moment.
Thank you for your suggestion. By "periodic reading", do you mean a full read or an incremental one? If it is a full read, I personally would not recommend it, because it increases the load on the JM and the client; it would be more appropriate to let the user trigger it manually. Maybe we can consider loading only a small part of the log at a time, for example through incremental reads or paged display.
Regarding the second question, Flink works this way because the TM does not provide any web service for displaying information, and the web UI is currently bundled with the JM.
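
The incremental, paged loading suggested above can be sketched as follows. This is only an illustration of the idea in Python, not Flink code: the function name `read_log_page`, the byte-offset parameter, and the 64 KiB default page size are all assumptions made for the example. Each call returns one page of the file plus the offset at which the next call should continue, so a client never has to pull the whole log at once.

```python
import os


def read_log_page(path, offset=0, page_size=64 * 1024):
    """Return (text, next_offset, at_end) for one page of a log file.

    Hypothetical helper: `offset` is a byte position in the file and
    `page_size` is the maximum number of bytes to read in one call.
    """
    file_size = os.path.getsize(path)
    # Clamp the requested offset into the valid range of the file.
    offset = max(0, min(offset, file_size))

    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read(page_size)

    next_offset = offset + len(chunk)

    # If more data follows, cut the page at its last newline so that it
    # ends on a line boundary. A single line longer than `page_size` has
    # no newline in the chunk and is returned as-is.
    if next_offset < file_size:
        last_newline = chunk.rfind(b"\n")
        if last_newline != -1:
            chunk = chunk[: last_newline + 1]
            next_offset = offset + len(chunk)

    text = chunk.decode("utf-8", errors="replace")
    return text, next_offset, next_offset >= file_size
```

A UI could call this with `offset=0` first and pass the returned `next_offset` back on each "load more" request. Calling it again with the last offset after the end has been reached picks up only the lines appended since, which gives an incremental read.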

Thanks, vino.

In reply to Xinyu Zhang, 2018-07-30 16:33 GMT+08:00.

Logs are not easy to read through webUI

Xinyu Zhang
Thanks for your reply. "Periodic reading" means reading all logs in a given time interval. For example, since my logs are split daily, I could get all of yesterday's logs through a parameter like '2018-07-29/2018-07-30'.
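
To make the interval parameter concrete, here is a minimal sketch of how such a parameter could be mapped to daily log files. It rests on two assumptions: that rolled files carry a date suffix such as `jobmanager.log.2018-07-29`, and that the interval is half-open, so that '2018-07-29/2018-07-30' covers only July 29. The function name `rolled_log_names` and the `base_name` parameter are hypothetical and not part of any Flink API.

```python
from datetime import date, timedelta


def rolled_log_names(interval, base_name="jobmanager.log"):
    """Map a 'YYYY-MM-DD/YYYY-MM-DD' interval to daily rolled file names.

    Assumption: the start date is inclusive and the end date is exclusive.
    """
    start_text, end_text = interval.split("/")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    if end <= start:
        raise ValueError("interval end must be after interval start")

    names = []
    day = start
    while day < end:
        # One rolled file per day, named with the ISO date as suffix.
        names.append(f"{base_name}.{day.isoformat()}")
        day += timedelta(days=1)
    return names
```

With these assumptions, `rolled_log_names("2018-07-29/2018-07-30")` yields the single file name `jobmanager.log.2018-07-29`, and a wider interval yields one name per day.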

A TM that provides its own web service for displaying information would lessen the burden on the JobManager, especially when the Flink cluster has many TaskManagers.


In reply to vino yang, Monday, 2018-07-30.

Re: Logs are not easy to read through webUI

vino yang
Hi Xinyu,

Thank you for clarifying what you mean by "periodic reading". I think it would be a good idea for Flink to consider developing an API for reading logs.

As for letting the TM serve its own logs, your idea is good from a performance perspective.
However, Flink has never provided any web service on the TM; all requests are proxied through the JM.
Changing this only to improve log-read performance would require major changes to the architecture and would make the TM take on more responsibilities.

Thanks, vino.


In reply to Xinyu Zhang, 2018-07-30 17:34 GMT+08:00.

Re: Logs are not easy to read through webUI

Xinyu Zhang
Hi vino

Yes, it's only from the perspective of the performance of reading logs or metrics. If the logs with date suffixes (e.g. jobmanager.log.2018-07-29) never change, maybe the blob store can cache some of them to improve performance.
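
The caching idea can be illustrated with a small sketch. It does not use the Flink blob store or any Flink API; a plain in-memory dictionary stands in for the cache, and the class name `RolledLogCache` and the file-name pattern are assumptions for the example. Only files whose names end in a date suffix are cached, because the active log file keeps changing.

```python
import re
from pathlib import Path

# Assumed naming scheme for rolled files, e.g. jobmanager.log.2018-07-29
ROLLED_LOG = re.compile(r"\.log\.\d{4}-\d{2}-\d{2}$")


class RolledLogCache:
    """Cache the contents of date-suffixed log files (hypothetical helper)."""

    def __init__(self):
        # Unbounded dict for brevity; a real cache would need a size limit.
        self._cache = {}

    def read(self, path):
        path = Path(path)
        if not ROLLED_LOG.search(path.name):
            # The active log is still being written to, so always re-read it.
            return path.read_bytes()
        key = str(path)
        if key not in self._cache:
            # Rolled files are assumed immutable, so one read is enough.
            self._cache[key] = path.read_bytes()
        return self._cache[key]
```

The sketch relies entirely on the assumption stated in the post, namely that a dated log file never changes after it has been rolled.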

BTW, please consider developing an API for reading logs. I think many Flink users run into this problem.

Thanks!

Xinyu Zhang

In reply to vino yang, Monday, 2018-07-30.

Re: Logs are not easy to read through webUI

vino yang
Hi Xinyu,

Thanks for your suggestion. I will CC this suggestion to some PMC members of Flink.

Thanks, vino.

In reply to Xinyu Zhang, 2018-07-30 18:03 GMT+08:00.

Re: Logs are not easy to read through webUI

Till Rohrmann
Hi Xinyu,

Thanks for starting this discussion. I think you should open a JIRA issue for this feature. I can see the benefit of such a feature when the DailyRollingFileAppender is enabled.

Cheers,
Till

In reply to vino yang, Mon, Jul 30, 2018 at 1:47 PM.