Weird pattern of syncing Kafka to Elasticsearch

Posted by Kai Fu on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Wired-pattern-of-syncing-Kafka-to-Elasticsearch-tp44242.html

Hi team,

We are using Flink to sync data from Kafka to Elasticsearch, with the Kafka source in upsert mode. We have observed some weird patterns, as shown in the figures below.
1. The job regularly restarts from the checkpoint for unknown reasons;
2. Within each run, performance and CPU utilization degrade over time until the job restarts at some point;
3. Elasticsearch is far from fully loaded, and the job behaves similarly even with a blackhole sink.

We suspect this is due to an unreasonable memory configuration, and we do see some OutOfMemory exceptions in the log. Our host has 61 GB of memory, and the memory configuration is as follows:
"taskmanager.memory.network.fraction": "0.2",
"taskmanager.memory.network.max": "8g",
"taskmanager.memory.network.min": "3g",
"taskmanager.memory.process.size": "40g"
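
To make it easier to reason about these settings, here is a rough sketch of how the 40g process size would be split across memory components. It assumes the Flink 1.10+ TaskManager memory model and that every option not listed above is left at its default (JVM metaspace 256m, JVM overhead fraction 0.1 clamped to 192m..1g, managed fraction 0.4, framework heap and off-heap 128m each, task off-heap 0). Those defaults and the function name are assumptions for illustration, not something taken from our job config.

```python
def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def taskmanager_memory_breakdown_mb(
    process_mb,
    network_fraction,
    network_min_mb,
    network_max_mb,
    # Everything below is an ASSUMED default, not taken from the job config.
    metaspace_mb=256,
    overhead_fraction=0.1,
    overhead_min_mb=192,
    overhead_max_mb=1024,
    managed_fraction=0.4,
    framework_heap_mb=128,
    framework_offheap_mb=128,
    task_offheap_mb=0,
):
    """Estimate the TaskManager memory components in MB (hypothetical helper)."""
    # JVM overhead is a fraction of the whole process size, clamped to min/max.
    overhead_mb = clamp(round(process_mb * overhead_fraction),
                        overhead_min_mb, overhead_max_mb)
    # Total Flink memory is what remains after metaspace and JVM overhead.
    flink_mb = process_mb - metaspace_mb - overhead_mb
    # Network and managed memory are fractions of total Flink memory.
    network_mb = clamp(round(flink_mb * network_fraction),
                       network_min_mb, network_max_mb)
    managed_mb = round(flink_mb * managed_fraction)
    # Task heap gets whatever is left over.
    task_heap_mb = (flink_mb - framework_heap_mb - framework_offheap_mb
                    - task_offheap_mb - managed_mb - network_mb)
    return {
        "jvm_overhead": overhead_mb,
        "jvm_metaspace": metaspace_mb,
        "framework_heap": framework_heap_mb,
        "framework_offheap": framework_offheap_mb,
        "task_offheap": task_offheap_mb,
        "managed": managed_mb,
        "network": network_mb,
        "task_heap": task_heap_mb,
    }


if __name__ == "__main__":
    breakdown = taskmanager_memory_breakdown_mb(
        process_mb=40 * 1024,
        network_fraction=0.2,
        network_min_mb=3 * 1024,
        network_max_mb=8 * 1024,
    )
    for name, mb in breakdown.items():
        print(f"{name:>18}: {mb:6d} MB")
```

Under those assumptions the split is roughly 7.75 GB network, 15.5 GB managed and 15.25 GB task heap, so only about 15 GB of the 40 GB is JVM heap available to operators. If the OutOfMemory errors are heap errors, that split may be worth checking against what the job actually needs.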

We have another job with a heavier stream join workload on the same host type and memory configuration, and it does not show these issues. We also tried a larger host type, but the issue persists. We are unsure whether the memory configuration is reasonable or whether there is a memory leak somewhere. Is there any guidance on this?
 

[Attached figures: four images, each named image.png]

--
Best wishes,
- Kai