An issue with low-throughput on Flink 1.8.3 running Yahoo streaming benchmarks

Shinhyung Yang
Dear Flink Users,

I'm running the original version of the Yahoo streaming benchmarks [1]
on Flink 1.8.3 and getting 60K tuples per second. With Flink 1.1.3 I
got 282K tuples per second, so I would like to ask for your opinions
on where I should look for the cause.

I have been using one node for the JobManager and 10 nodes for the
TaskManagers, one TaskManager per node.

Below are my current settings for the benchmark and Flink 1.8.3:

* 16 vCPUs and 24 GiB for the JobManager node
* 32 vCPUs and 32 GiB for each TaskManager node

# localConf.yaml
kafka.partitions: 5
process.hosts: 1
process.cores: 32

# flink-conf.yaml
jobmanager.heap.size: 5120m
taskmanager.heap.size: 20480m
taskmanager.numberOfTaskSlots: 16
parallelism.default: 1

And the following are the previous settings for the benchmark and Flink 1.1.3:

* 16 vCPUs and 24 GiB for the JobManager node and for each of the 10 TaskManager nodes

# localConf.yaml
kafka.partitions: 5
process.hosts: 1
process.cores: 16

# flink-conf.yaml
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 15360
taskmanager.numberOfTaskSlots: 16
taskmanager.memory.preallocate: false
parallelism.default: 1
taskmanager.network.numberOfBuffers: 6432


Thank you and with best regards,
Shinhyung Yang

[1]: https://github.com/yahoo/streaming-benchmarks

Re: An issue with low-throughput on Flink 1.8.3 running Yahoo streaming benchmarks

vino yang
Hi Shinhyung,

Can you compare the performance of the two Flink versions in the same environment (or at least with the same node and framework configuration)?

I see that both the cluster and the framework configurations differ between the two runs. It would be better to compare them in the same environment so that we can figure out why there is a more than 4x performance difference.
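
As a rough sketch of what differs between the two runs, the settings quoted from the original post can be diffed key by key. The dictionary layout and the helper name `diff_settings` are only illustrative assumptions; the values are copied from the post, and renamed keys (for example `taskmanager.heap.mb` vs. `taskmanager.heap.size`) simply show up as present on one side only.

```python
# Settings copied from the original post; the dict layout is illustrative only.
old = {  # Flink 1.1.3 run (16 vCPUs / 24 GiB nodes)
    "kafka.partitions": 5,
    "process.hosts": 1,
    "process.cores": 16,
    "jobmanager.heap.mb": 1024,
    "taskmanager.heap.mb": 15360,
    "taskmanager.numberOfTaskSlots": 16,
    "taskmanager.memory.preallocate": "false",
    "parallelism.default": 1,
    "taskmanager.network.numberOfBuffers": 6432,
}
new = {  # Flink 1.8.3 run (32 vCPUs / 32 GiB TaskManager nodes)
    "kafka.partitions": 5,
    "process.hosts": 1,
    "process.cores": 32,
    "jobmanager.heap.size": "5120m",
    "taskmanager.heap.size": "20480m",
    "taskmanager.numberOfTaskSlots": 16,
    "parallelism.default": 1,
}


def diff_settings(old, new):
    """Hypothetical helper: map each differing key to (old_value, new_value).

    A key that exists on one side only gets None for the missing side.
    """
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


differences = diff_settings(old, new)
for key, (before, after) in differences.items():
    print(f"{key}: {before} -> {after}")

# Throughput figures reported in the post, in tuples per second.
ratio = 282_000 / 60_000
print(f"throughput ratio (1.1.3 / 1.8.3): {ratio:.1f}x")
```

Every key this prints is a variable that changed together with the Flink version, so each one would need to be held constant before the slowdown can be attributed to Flink itself.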

WDYT?

Best,
Vino

On Mon, Dec 30, 2019 at 1:45 PM, Shinhyung Yang <[hidden email]> wrote: