Flink stress testing and metrics


Ladhari Sadok (2017-11-22 09:49)
Hi all,

I want to stress test my Flink application: events are generated with a ParallelSourceFunction, and I then measure latency, throughput, CPU usage, and memory leaks.

While testing, I noticed that:
  • CPU usage never exceeds 30-33%
  • latency is always shown as "NaNd NaNh" in the dashboard, even though I have set executionConfig.setLatencyTrackingInterval(1);


Can someone help me find the best way to smoke test Flink?

Note: I'm using Flink 1.3 and the Flink Web UI to visualize the metrics.
My PC has 12 GB of RAM and an 8-core CPU.

Regards,
Sadok

Re: Flink stress testing and metrics

Timo Walther (2017-11-22 14:52 GMT+01:00)
Hi Sadok,

It would be helpful if you could tell us a bit more about your job. For example, with a skewed key distribution where keys are only sent to one third of your parallel operator instances, the job cannot use your CPU's full capacity.
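
To illustrate the skew point, here is a minimal plain-Java sketch. It is not Flink code: it assumes a simplified assignment (`hashCode` modulo the parallelism), whereas Flink's real key-to-subtask mapping goes through key groups and a different hash. The key names are hypothetical. The effect is the same in both cases: with only three distinct keys, at most three of eight subtasks ever receive records.

```java
import java.util.Arrays;

// Simplified illustration only; Flink's actual key assignment differs in detail.
class KeySkewDemo {

    /** Counts how many records each of `parallelism` subtasks would receive. */
    static int[] recordsPerSubtask(String[] keys, int parallelism) {
        int[] counts = new int[parallelism];
        for (String key : keys) {
            // floorMod keeps the index non-negative even for negative hash codes.
            int subtask = Math.floorMod(key.hashCode(), parallelism);
            counts[subtask]++;
        }
        return counts;
    }

    /** Number of subtasks that received at least one record. */
    static long busySubtasks(int[] counts) {
        return Arrays.stream(counts).filter(c -> c > 0).count();
    }

    public static void main(String[] args) {
        // Hypothetical stress-test input: 9,000 records but only three distinct keys.
        String[] keys = new String[9000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = "zone-" + (i % 3);
        }
        int[] counts = recordsPerSubtask(keys, 8);
        System.out.println(Arrays.toString(counts));
        System.out.println("busy subtasks: " + busySubtasks(counts) + " of 8");
    }
}
```

If the per-subtask record counts in the Web UI look like this, generating more distinct keys in the test data is the first thing to try.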

The latency tracking interval is in milliseconds. Can you check whether 1000 fixes your problem? I could not find an open issue describing it. More information about your environment might help: How are you executing your Flink application? Are you using a parallelism of 8?

Regards,
Timo



Re: Flink stress testing and metrics

Ladhari Sadok (2017-11-22 16:13)
Thanks for your answer, Timo.

I tried setLatencyTrackingInterval(1000) but got the same result (latency: NaN).

My Flink job implements a geofencing pattern:
  •  [Latitude, Longitude] <IN | OUT> Location ? Send Notification : None

In my stress test I'm using data that always triggers a notification (the condition always matches), so that I can measure the latency of my implementation.
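
For readers unfamiliar with the pattern, a minimal sketch of such a check in plain Java follows. The thread does not say how the fence is defined, so this assumes a circular fence (center plus radius), a haversine distance, and that a notification is sent when the point is inside; the class and method names are hypothetical.

```java
import java.util.Optional;

// Hypothetical circular geofence; the job's actual fence definition is not given in the thread.
class GeofenceDemo {
    static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Great-circle distance in meters between two latitude/longitude points (degrees). */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /** Returns a notification if the point is inside the fence, otherwise empty ("None"). */
    static Optional<String> check(double lat, double lon,
                                  double centerLat, double centerLon, double radiusM) {
        boolean inside = haversineMeters(lat, lon, centerLat, centerLon) <= radiusM;
        return inside ? Optional.of("IN: send notification") : Optional.empty();
    }

    public static void main(String[] args) {
        // Arbitrary fence: 500 m around an arbitrary point.
        System.out.println(check(36.8065, 10.1815, 36.8065, 10.1815, 500.0));
        System.out.println(check(37.8065, 10.1815, 36.8065, 10.1815, 500.0));
    }
}
```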

I'm running with a parallelism of 8. All tasks are working and notifications are generated correctly, but the latency metric does not work (see the attached screenshots). All other metrics work.

Please help me find the best way to do the stress testing correctly.

Regards,

Sadok



Attachments: tasks.png (39K), tasks_latency_metric.png (23K)

Re: Flink stress testing and metrics

Timo Walther (2017-11-22 18:11 GMT+01:00)
At first glance I would say that your data size is very small. Flink can process millions of records on a single machine. It might be that the records are produced too quickly to be used for latency measurement.

Is your data generator never-ending?



Re: Flink stress testing and metrics

Ladhari Sadok (2017-11-22 22:18)
With no latency, the dashboard should normally show 0 ms, not NaN. My real records are 1 KB, but for now I'm using 200 bytes; I will try the real size later.

The data generator is an infinite for loop.

Thanks.
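
A generator of this kind can be sketched in plain Java as follows. This is not the poster's code and does not use Flink classes: the `Consumer<byte[]>` is a stand-in for Flink's source context, and `run()`/`cancel()` only mirror the general shape of a source function. The 200-byte payload size comes from the message above; everything else is an assumption.

```java
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Plain-Java stand-in for a never-ending source; names are hypothetical.
class LoadGenerator {
    private final int payloadBytes;
    private volatile boolean running = true;

    LoadGenerator(int payloadBytes) {
        if (payloadBytes < Long.BYTES) {
            throw new IllegalArgumentException("payload must hold an 8-byte sequence number");
        }
        this.payloadBytes = payloadBytes;
    }

    /** Emits fixed-size payloads until cancel() is called. */
    void run(Consumer<byte[]> out) {
        long sequence = 0;
        while (running) {
            byte[] payload = new byte[payloadBytes];
            // The first 8 bytes carry a sequence number so events are distinguishable.
            ByteBuffer.wrap(payload).putLong(sequence++);
            out.accept(payload);
        }
    }

    /** Stops the loop; safe to call from another thread or from the consumer. */
    void cancel() {
        running = false;
    }

    public static void main(String[] args) {
        LoadGenerator generator = new LoadGenerator(200);
        long[] emitted = {0};
        long start = System.nanoTime();
        generator.run(payload -> {
            if (++emitted[0] == 1_000_000) {
                generator.cancel();
            }
        });
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d events in %.3f s (%.0f events/s)%n",
                emitted[0], seconds, emitted[0] / seconds);
    }
}
```

The `volatile` flag matters: without it, a cancel issued from another thread might never become visible to the loop.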


Re: Flink stress testing and metrics

Timo Walther (2017-11-23 09:26 GMT+01:00)
Yes, I agree that this looks like a bug. You can open an issue for it, ideally with a small reproducible example so that others have a chance to fix it.



Re: Flink stress testing and metrics

Ladhari Sadok (2017-11-27 09:47)
Thanks for your answer, Timo.
Can anyone else confirm the bug?


Re: Flink stress testing and metrics

Aljoscha Krettek (2017-11-27 13:17 GMT+01:00)
Hi,

This is a known issue: the latency metrics are reported in a format that the web dashboard does not understand. This is the Jira issue for fixing it: https://issues.apache.org/jira/browse/FLINK-7608

Best,
Aljoscha 


Re: Flink stress testing and metrics

Ladhari Sadok (2017-11-27 16:03)
Thanks Aljoscha. As far as I can see, the issue is not fixed yet (it is In Progress). Can you suggest another way to visualize the latency, or to export it to a file?

I just want to get the latency in some form (file, graph, ...) to get an idea of its magnitude.

Regards.
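
One workaround that does not depend on the dashboard is to measure latency by hand: stamp each event with its creation time in the source and compute the difference where the notification is emitted. The sketch below is plain Java, not a Flink API; `LatencyRecorder` and its methods are hypothetical names, and the approach assumes source and sink share a clock (true on a single PC, not necessarily on a cluster).

```java
import java.util.Arrays;

// Hand-rolled latency collection; not part of Flink.
class LatencyRecorder {
    private long[] samples = new long[1024];
    private int size = 0;

    /** Records one sample: receive time minus the creation time stamped on the event. */
    synchronized void record(long createdAtMillis, long receivedAtMillis) {
        if (size == samples.length) {
            samples = Arrays.copyOf(samples, size * 2);
        }
        samples[size++] = receivedAtMillis - createdAtMillis;
    }

    /** Nearest-rank percentile for p in [1, 100]. */
    synchronized long percentile(int p) {
        if (size == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p < 1 || p > 100) {
            throw new IllegalArgumentException("p must be in [1, 100]");
        }
        long[] sorted = Arrays.copyOf(samples, size);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil((long) p * size / 100.0);
        return sorted[rank - 1];
    }

    /** One-line summary that can be printed or appended to a file. */
    synchronized String summary() {
        return String.format("count=%d p50=%dms p99=%dms max=%dms",
                size, percentile(50), percentile(99), percentile(100));
    }

    public static void main(String[] args) throws InterruptedException {
        LatencyRecorder recorder = new LatencyRecorder();
        for (int i = 0; i < 5; i++) {
            long createdAt = System.currentTimeMillis(); // stamped by the source
            Thread.sleep(2);                             // stand-in for processing time
            recorder.record(createdAt, System.currentTimeMillis());
        }
        System.out.println(recorder.summary());
    }
}
```

Percentiles are reported rather than an average because a few slow events can hide behind a good mean.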


Re: Flink stress testing and metrics

Chesnay Schepler (2017-11-27 17:33 GMT+01:00)
The most reliable way to see the latency metric is to configure a metric reporter.

However, only some reporters work properly with the latency metric (this is about to change with FLINK-7608, though!).

The JMXReporter in particular should work well. The Slf4jReporter should work as well.
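
As a rough sketch of what configuring a reporter looks like, the following flink-conf.yaml fragment enables the JMX reporter. The reporter name `jmx` and the port are arbitrary choices for illustration, and the exact keys should be checked against the metrics documentation of the Flink version in use.

```yaml
# flink-conf.yaml (fragment): register a reporter named "jmx"
metrics.reporters: jmx
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
# Port (or port range) on which the JMX server listens; 8789 is only an example
metrics.reporter.jmx.port: 8789
```

After restarting the cluster, a JMX client such as jconsole can connect to that port and browse the exported metrics, including the latency ones if latency tracking is enabled in the job.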


Re: Flink stress testing and metrics

Ladhari Sadok
It is working with FLINK-7608. Just for my information: how do I set this up with the Slf4jReporter? I could not find an example.
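
The thread ends without an answer to this question. A minimal flink-conf.yaml sketch follows, assuming a Flink version that ships the flink-metrics-slf4j module (it may not be available for 1.3) and that its jar has been copied into Flink's lib directory. The reporter name `slf4j` and the interval are arbitrary; check the keys against the metrics documentation of your version.

```yaml
# flink-conf.yaml (fragment): register a reporter named "slf4j"
metrics.reporters: slf4j
metrics.reporter.slf4j.class: org.apache.flink.metrics.slf4j.Slf4jReporter
# How often the reporter writes all metrics to the log; 60 seconds is only an example
metrics.reporter.slf4j.interval: 60 SECONDS
```

With this in place, the metrics should appear periodically in the JobManager and TaskManager log files, from which the latency values can be extracted.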
