Hello, a question about Dashboard in Flink

Philip Lee
Hello,

According to http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Apache-Flink-Web-Dashboard-Completed-Job-history-td4067.html, I cannot retrieve the job history from the Dashboard after turning off the JM.

But as Fabian mentioned there:
"However, you can query all stats that are displayed by the dashboard via a REST API [1] while the JM is running and save them yourself. This way you can analyze the data also after the JM was stopped"
Could you explain this sentence in detail?

I want to evaluate the timeline view of each function after a job is done.

Thanks,
Phil

Re: Hello, a question about Dashboard in Flink

Fabian Hueske-2
You can start a job and then periodically request and store information about the running job and its vertices using the corresponding REST calls [1].
The data will be in JSON format.
After the job has finished, you can stop requesting data.
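
A minimal sketch of such a polling loop, not taken from the thread: it requests /jobs/<jobid> at a fixed interval and appends each JSON response, with a timestamp, as one line of a file. The names poll_job, fetch_json, out_path and is_done are hypothetical, and the base URL (for example http://localhost:8081) depends on where the JobManager's web interface is reachable.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_job(base_url, job_id, out_path, interval_s=5.0, max_polls=1000,
             fetch=fetch_json, is_done=lambda snapshot: False):
    """Append one timestamped snapshot of /jobs/<jobid> per line to out_path.

    Stops when is_done(snapshot) returns True or after max_polls requests.
    Returns the number of snapshots written.
    """
    url = "{}/jobs/{}".format(base_url.rstrip("/"), job_id)
    polls = 0
    with open(out_path, "a") as out:
        while polls < max_polls:
            snapshot = fetch(url)
            record = {"polled_at": time.time(), "data": snapshot}
            out.write(json.dumps(record) + "\n")
            polls += 1
            if is_done(snapshot):
                break
            time.sleep(interval_s)
    return polls
```

The stop condition is passed in as is_done because the exact field that marks a finished job should be checked against a real response; something like lambda s: s.get("state") == "FINISHED" is an assumption, not a documented guarantee. If the JobManager is shut down while polling, fetch raises an exception and the snapshots written so far remain in the file.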

Next, you parse the JSON, extract the information you need, and pass it to a plotting library.
As I said, it is not possible to pass this data back into Flink's dashboard; you have to process and plot it yourself.
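
As an illustration of the extraction step, here is a hedged sketch that turns one stored job-details document into rows suitable for a timeline plot. The function name timeline_rows is hypothetical, and the field names "vertices", "name", "start-time" and "end-time" (epoch milliseconds, negative when not yet set) are assumptions about the response layout that should be verified against the JSON your Flink version actually returns.

```python
def timeline_rows(job_details):
    """Return (name, start, end, duration) tuples sorted by start time.

    Vertices without a valid start or end time are skipped.
    Field names are assumed, not confirmed by the thread.
    """
    rows = []
    for vertex in job_details.get("vertices", []):
        start = vertex.get("start-time")
        end = vertex.get("end-time")
        if start is None or end is None or start < 0 or end < 0:
            continue
        rows.append((vertex.get("name", "?"), start, end, end - start))
    rows.sort(key=lambda row: row[1])
    return rows
```

Each row can then be drawn as one horizontal bar from start to end in the plotting library of your choice, which gives a per-operator timeline similar to the one in the dashboard.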

Best, Fabian

In reply to Philip Lee's message of 2016-01-25 16:15 GMT+01:00.

Re: Hello, a question about Dashboard in Flink

Philip Lee
Thanks, 

Is there any way to measure shuffle data (read and write) in Flink or the Dashboard?

I did not find a network usage metric in it.

Best,
Phil

In reply to Fabian Hueske's message of Mon, Jan 25, 2016 at 5:06 PM.

Re: Hello, a question about Dashboard in Flink

Fabian Hueske-2
The REST interface also provides metrics about the number of records and the size of the input and output of all tasks.
See:
- /jobs/<jobid>/vertices/<vertexid>
- /jobs/<jobid>/vertices/<vertexid>/subtasks/<subtasknum>/attempts/<attempt>
in https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/monitoring_rest_api.html#details-of-a-running-or-completed-job

However, not all of this data goes over the network, because some tasks can be locally connected.
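
A small sketch of how the per-vertex numbers could be totalled once a response from /jobs/<jobid>/vertices/<vertexid> has been fetched and parsed. The helper names vertex_url and sum_subtask_metrics are hypothetical, and the layout (a "subtasks" list whose entries carry a "metrics" object with keys such as "read-bytes" and "write-bytes") is an assumption to check against the actual JSON.

```python
def vertex_url(base_url, job_id, vertex_id):
    """Build the URL for the vertex details endpoint named above."""
    return "{}/jobs/{}/vertices/{}".format(
        base_url.rstrip("/"), job_id, vertex_id)


def sum_subtask_metrics(vertex_details,
                        keys=("read-bytes", "write-bytes",
                              "read-records", "write-records")):
    """Sum the given metric keys over all subtasks of one vertex.

    Missing subtasks, metrics, or keys count as zero.
    The key names are assumed, not confirmed by the thread.
    """
    totals = {key: 0 for key in keys}
    for subtask in vertex_details.get("subtasks", []):
        metrics = subtask.get("metrics", {})
        for key in keys:
            totals[key] += metrics.get(key, 0)
    return totals
```

Because locally connected tasks are included in these counters, the totals are best read as an upper bound on the shuffled data rather than as exact network traffic.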

Best, Fabian

In reply to Philip Lee's message of 2016-01-29 8:50 GMT+01:00.

Re: Hello, a question about Dashboard in Flink

Philip Lee
Great,

Did you mean the difference between a narrow shuffle and a global shuffle?

I use Flink version 0.9, but I could not access the REST interface when using an SSH tunnel to the remote server.

Is this a problem with the version?

Best,
Phil



In reply to Fabian Hueske's message of Fri, Jan 29, 2016 at 9:46 AM.

Re: Hello, a question about Dashboard in Flink

Stephan Ewen
Hi!

The REST monitoring interface and the extended web dashboard were added in version 0.10.

Greetings,
Stephan


In reply to Philip Lee's message of Fri, Jan 29, 2016 at 9:55 AM.