(DEPRECATED) Apache Flink User Mailing List archive.

low performance in running queries

Classic

List

Threaded

15 messages Options

Habib Mostafaei

low performance in running queries

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

flink-conf.yaml (13K) Download Attachment

flink-xxx-standalonesession-0-xxx.log (27K) Download Attachment

Zhenghua Gao

Re: low performance in running queries

The reason might be the parallelism of your task is only 1, that's too low.

See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

Habib Mostafaei

Re: low performance in running queries

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

-- 
Habib Mostafaei, Ph.D.
Postdoctoral researcher
TU Berlin,
FG INET, MAR 4.003
Marchstraße 23, 10587 Berlin

Chris Miller-2

Re: low performance in running queries

I haven't run any benchmarks with Flink or even used it enough to directly help with your question, however I suspect that the following article might be relevant:

http://dsrg.pdos.csail.mit.edu/2016/06/26/scalability-cost/

Given the computation you're performing is trivial, it's possible that the additional overhead of serialisation, interprocess communication, state management etc that distributed systems like Flink require are dominating the runtime here. 2 hours (or even 25 minutes) still seems too long to me however, so hopefully it really is just a configuration issue of some sort. Either way, if you do figure this out or anyone with good knowledge of the article above in relation to Flink is able to give their thoughts, I'd be very interested in hearing more.

Regards,

Chris

------ Original Message ------

From: "Habib Mostafaei" <[hidden email]>

To: "Zhenghua Gao" <[hidden email]>

Cc: "user" <[hidden email]>; "Georgios Smaragdakis" <[hidden email]>; "Niklas Semmler" <[hidden email]>

Sent: 30/10/2019 12:25:28

Subject: Re: low performance in running queries

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib
-- 
Habib Mostafaei, Ph.D.
Postdoctoral researcher
TU Berlin,
FG INET, MAR 4.003
Marchstraße 23, 10587 Berlin

Piotr Nowojski-3

Re: low performance in running queries

Hi,

I would also suggest to just attach a code profiler to the process during those 2 hours and gather some results. It might answer some questions what is taking so long time.

Piotrek

On 30 Oct 2019, at 15:11, Chris Miller <[hidden email]> wrote:
I haven't run any benchmarks with Flink or even used it enough to directly help with your question, however I suspect that the following article might be relevant:

http://dsrg.pdos.csail.mit.edu/2016/06/26/scalability-cost/

Given the computation you're performing is trivial, it's possible that the additional overhead of serialisation, interprocess communication, state management etc that distributed systems like Flink require are dominating the runtime here. 2 hours (or even 25 minutes) still seems too long to me however, so hopefully it really is just a configuration issue of some sort. Either way, if you do figure this out or anyone with good knowledge of the article above in relation to Flink is able to give their thoughts, I'd be very interested in hearing more.

Regards,
Chris

------ Original Message ------
From: "Habib Mostafaei" <[hidden email]>
To: "Zhenghua Gao" <[hidden email]>
Cc: "user" <[hidden email]>; "Georgios Smaragdakis" <[hidden email]>; "Niklas Semmler" <[hidden email]>
Sent: 30/10/2019 12:25:28
Subject: Re: low performance in running queries
Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?
Best,
Habib
On 10/30/2019 1:17 PM, Zhenghua Gao wrote:
The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,
Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:
Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib
-- 
Habib Mostafaei, Ph.D.
Postdoctoral researcher
TU Berlin,
FG INET, MAR 4.003
Marchstraße 23, 10587 Berlin

Zhenghua Gao

Re: low performance in running queries

In reply to this post by Habib Mostafaei

I think more runtime information would help figure out where the problem is.

1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib
-- 
Habib Mostafaei, Ph.D.
Postdoctoral researcher
TU Berlin,
FG INET, MAR 4.003
Marchstraße 23, 10587 Berlin

Habib Mostafaei

Re: low performance in running queries

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,

Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

flink-xxx-client-xxx.log (13K) Download Attachment

flink-xxx-standalonesession-0-xxx.log (27K) Download Attachment

flink-xxx-taskexecutor-0-xxx.log (31K) Download Attachment

Zhenghua Gao

Re: low performance in running queries

2019-10-30 15:59:52,122 INFO  org.apache.flink.runtime.taskmanager.Task                     - Split Reader: Custom File Source -> Flat Map (1/1) (6a17c410c3e36f524bb774d2dffed4a4) switched from DEPLOYING to RUNNING.

2019-10-30 17:45:10,943 INFO  org.apache.flink.runtime.taskmanager.Task                     - Split Reader: Custom File Source -> Flat Map (1/1) (6a17c410c3e36f524bb774d2dffed4a4) switched from RUNNING to FINISHED.

It's surprise that the source task uses 95 mins to read a 2G file.

Could you give me your code snippets and some sample lines of the 2G file?

I will try to reproduce your scenario and dig the root causes.

Best Regards,

Zhenghua Gao

On Thu, Oct 31, 2019 at 9:05 PM Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,

Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

Habib Mostafaei

Re: low performance in running queries

I used streaming WordCount provided by Flink and the file contains text like "This is some text...". I just copied several times.

Best,

Habib

On 11/1/2019 6:03 AM, Zhenghua Gao wrote:

2019-10-30 15:59:52,122 INFO  org.apache.flink.runtime.taskmanager.Task                     - Split Reader: Custom File Source -> Flat Map (1/1) (6a17c410c3e36f524bb774d2dffed4a4) switched from DEPLOYING to RUNNING.
2019-10-30 17:45:10,943 INFO  org.apache.flink.runtime.taskmanager.Task                     - Split Reader: Custom File Source -> Flat Map (1/1) (6a17c410c3e36f524bb774d2dffed4a4) switched from RUNNING to FINISHED.
It's surprise that the source task uses 95 mins to read a 2G file.
Could you give me your code snippets and some sample lines of the 2G file? 
I will try to reproduce your scenario and dig the root causes.
Best Regards,

Zhenghua Gao
On Thu, Oct 31, 2019 at 9:05 PM Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,

Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

Piotr Nowojski-3

Re: low performance in running queries

In reply to this post by Habib Mostafaei

Hi,

> Is there a simple way to get profiling information in Flink?

Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non production clusters, no need to install anything if already using Oracle’s JVM), VisualVM (I think free), YourKit (paid). For each one of them there is a plenty of online support how to use them both for local and remote profiling.

Piotrek

On 31 Oct 2019, at 14:05, Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,
Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,
Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

<flink-xxx-client-xxx.log><flink-xxx-standalonesession-0-xxx.log><flink-xxx-taskexecutor-0-xxx.log>

Habib Mostafaei

Re: low performance in running queries

Hi Piotrek,

Thanks for the list of profilers. I used VisualVM and here is the resource usage for taskManager.

Habib

On 11/1/2019 9:48 AM, Piotr Nowojski wrote:

Hi,

> Is there a simple way to get profiling information in Flink?

Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non production clusters, no need to install anything if already using Oracle’s JVM), VisualVM (I think free), YourKit (paid). For each one of them there is a plenty of online support how to use them both for local and remote profiling.

Piotrek

On 31 Oct 2019, at 14:05, Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,

Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

<flink-xxx-client-xxx.log><flink-xxx-standalonesession-0-xxx.log><flink-xxx-taskexecutor-0-xxx.log>

-- 
Habib Mostafaei, Ph.D.
Postdoctoral researcher
TU Berlin,
FG INET, MAR 4.003
Marchstraße 23, 10587 Berlin

Piotr Nowojski-3

Re: low performance in running queries

Hi,

More important would be the code profiling output. I think VisualVM allows to share the code profiling result as “snapshots”? If you could analyse or share this, it would be helpful.

From the attached screenshot the only thing that is visible is that there are no GC issues, and secondly the application is running only on one (out of 10?) CPU cores. Which hints one obvious way how to improve the performance - scale out. However the WordCount example might not be the best for this, as I’m pretty sure its source is fundamentally not parallel.

Piotrek

On 1 Nov 2019, at 15:57, Habib Mostafaei <[hidden email]> wrote:
Hi Piotrek,
Thanks for the list of profilers. I used VisualVM and here is the resource usage for taskManager.

<imiafpejagonadce.png>
Habib

On 11/1/2019 9:48 AM, Piotr Nowojski wrote:

Hi,

> Is there a simple way to get profiling information in Flink?

Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non production clusters, no need to install anything if already using Oracle’s JVM), VisualVM (I think free), YourKit (paid). For each one of them there is a plenty of online support how to use them both for local and remote profiling.

Piotrek

On 31 Oct 2019, at 14:05, Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,
Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,
Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

<flink-xxx-client-xxx.log><flink-xxx-standalonesession-0-xxx.log><flink-xxx-taskexecutor-0-xxx.log>
-- 
Habib Mostafaei, Ph.D.
Postdoctoral researcher
TU Berlin,
FG INET, MAR 4.003
Marchstraße 23, 10587 Berlin

Zhenghua Gao

Re: low performance in running queries

In reply to this post by Habib Mostafaei

Hi,

I ran the streaming WordCount with a 2GB text file(copied /usr/share/dict/words 400 times) last weekend and didn't reproduce your result(16 minutes in my case).

But i find some clues may help you:

The streaming WordCount job would output all intermedia result in your output file(if specified) or taskmanager.out.

It's large (about 4GB in my case) and causes the disk writes high.

Best Regards,

Zhenghua Gao

On Fri, Nov 1, 2019 at 4:40 PM Habib Mostafaei <[hidden email]> wrote:

I used streaming WordCount provided by Flink and the file contains text like "This is some text...". I just copied several times.
Best,

Habib

On 11/1/2019 6:03 AM, Zhenghua Gao wrote:
2019-10-30 15:59:52,122 INFO  org.apache.flink.runtime.taskmanager.Task                     - Split Reader: Custom File Source -> Flat Map (1/1) (6a17c410c3e36f524bb774d2dffed4a4) switched from DEPLOYING to RUNNING.
2019-10-30 17:45:10,943 INFO  org.apache.flink.runtime.taskmanager.Task                     - Split Reader: Custom File Source -> Flat Map (1/1) (6a17c410c3e36f524bb774d2dffed4a4) switched from RUNNING to FINISHED.
It's surprise that the source task uses 95 mins to read a 2G file.
Could you give me your code snippets and some sample lines of the 2G file? 
I will try to reproduce your scenario and dig the root causes.
Best Regards,

Zhenghua Gao
On Thu, Oct 31, 2019 at 9:05 PM Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,

Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

Habib Mostafaei

Re: low performance in running queries

In reply to this post by Piotr Nowojski-3

Hi,

On 11/1/2019 4:40 PM, Piotr Nowojski wrote:

Hi,

More important would be the code profiling output. I think VisualVM allows to share the code profiling result as “snapshots”? If you could analyse or share this, it would be helpful.

Enclosed is a snapshot of VisualVM.

From the attached screenshot the only thing that is visible is that there are no GC issues, and secondly the application is running only on one (out of 10?) CPU cores. Which hints one obvious way how to improve the performance - scale out. However the WordCount example might not be the best for this, as I’m pretty sure its source is fundamentally not parallel.

Yes, your are right that the source is not parallel. Checking the timeline of execution shows that the source operation is done in less than a second while Map and Reduce operations take long running time.

Habib

Piotrek

On 1 Nov 2019, at 15:57, Habib Mostafaei <[hidden email]> wrote:

Hi Piotrek,

Thanks for the list of profilers. I used VisualVM and here is the resource usage for taskManager.

<imiafpejagonadce.png>

Habib

On 11/1/2019 9:48 AM, Piotr Nowojski wrote:

Hi,

> Is there a simple way to get profiling information in Flink?

Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non production clusters, no need to install anything if already using Oracle’s JVM), VisualVM (I think free), YourKit (paid). For each one of them there is a plenty of online support how to use them both for local and remote profiling.

Piotrek

On 31 Oct 2019, at 14:05, Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?

Best,

Habib

On 10/31/2019 2:54 AM, Zhenghua Gao wrote:

I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working

2) the metrics for each operator

3) the jvm profiling information, etc

Best Regards,

Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:

Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:

The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,

Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:

Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

<flink-xxx-client-xxx.log><flink-xxx-standalonesession-0-xxx.log><flink-xxx-taskexecutor-0-xxx.log>

application-1572869697842.apps (6K) Download Attachment

Piotr Nowojski-3

Re: low performance in running queries

Hi,

Unfortunately your VisualVM snapshot doesn’t contain the profiler output. It should look like this [1].

> Checking the timeline of execution shows that the source operation is done in less than a second while Map and Reduce operations take long running time.

It could well be that the overhead comes for example from the state accesses, especially if you are using RocksDB. Still would be interesting to see the call stack that’s using the most CPU time.

Piotrek

[1] https://i.stack.imgur.com/yTdZ5.png

On 4 Nov 2019, at 14:35, Habib Mostafaei <[hidden email]> wrote:

Hi,
On 11/1/2019 4:40 PM, Piotr Nowojski wrote:
Hi,

More important would be the code profiling output. I think VisualVM allows to share the code profiling result as “snapshots”? If you could analyse or share this, it would be helpful.
Enclosed is a snapshot of VisualVM.

From the attached screenshot the only thing that is visible is that there are no GC issues, and secondly the application is running only on one (out of 10?) CPU cores. Which hints one obvious way how to improve the performance - scale out. However the WordCount example might not be the best for this, as I’m pretty sure its source is fundamentally not parallel.
Yes, your are right that the source is not parallel. Checking the timeline of execution shows that the source operation is done in less than a second while Map and Reduce operations take long running time.
Habib

Piotrek

On 1 Nov 2019, at 15:57, Habib Mostafaei <[hidden email]> wrote:

Hi Piotrek,
Thanks for the list of profilers. I used VisualVM and here is the resource usage for taskManager.
<imiafpejagonadce.png>
Habib

On 11/1/2019 9:48 AM, Piotr Nowojski wrote:
Hi,

> Is there a simple way to get profiling information in Flink?

Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non production clusters, no need to install anything if already using Oracle’s JVM), VisualVM (I think free), YourKit (paid). For each one of them there is a plenty of online support how to use them both for local and remote profiling.

Piotrek

On 31 Oct 2019, at 14:05, Habib Mostafaei <[hidden email]> wrote:

I enclosed all logs from the run and for this run I used parallelism one. However, for other runs I checked and found that all parallel workers were working properly. Is there a simple way to get profiling information in Flink?
Best,
Habib
On 10/31/2019 2:54 AM, Zhenghua Gao wrote:
I think more runtime information would help figure out where the problem is.
1) how many parallelisms actually working
2) the metrics for each operator
3) the jvm profiling information, etc

Best Regards,
Zhenghua Gao

On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <[hidden email]> wrote:
Thanks Gao for the reply. I used the parallelism parameter with different values like 6 and 8 but still the execution time is not comparable with a single threaded python script. What would be the reasonable value for the parallelism?
Best,
Habib
On 10/30/2019 1:17 PM, Zhenghua Gao wrote:
The reason might be the parallelism of your task is only 1, that's too low.
See [1] to specify proper parallelism for your job, and the execution time should be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html

Best Regards,
Zhenghua Gao

On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <[hidden email]> wrote:
Hi all,

I am running Flink on a standalone cluster and getting very long
execution time for the streaming queries like WordCount for a fixed text
file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
have a text file with size of 2GB. When I run the Flink on a standalone
cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
it took around two hours to finish counting this file while a simple
python script can do it in around 7 minutes. Just wondering what is
wrong with my setup. I ran the experiments on a cluster with six
taskManagers, but I still get very long execution time like 25 minutes
or so. I tried to increase the JVM heap size to have lower execution
time but it did not help. I attached the log file and the Flink
configuration file to this email.

Best,

Habib

<flink-xxx-client-xxx.log><flink-xxx-standalonesession-0-xxx.log><flink-xxx-taskexecutor-0-xxx.log>

<application-1572869697842.apps>