Hi,

We have a Flink 1.8.0 cluster deployed in Hadoop distributed mode. I often see the job sit in the CREATED state even though Hadoop has enough resources. We have 4 operators with parallelism 15, 1 operator with parallelism 40, and 2 operators with parallelism 10. At submission time I pass 4 GB of TaskManager memory, 2 GB of JobManager memory, and 2 slots per TaskManager. This request should take only 20 containers and 40 vcores, but I see Flink over-allocating resources: 65 containers and 129 cores. I've attached snapshots for reference.

Right now I'm passing: -yD yarn.heartbeat.container-request-interval=1000 -yD taskmanager.network.memory.fraction=0.045 -yD taskmanager.memory.preallote=true

How do I control resource allocation?
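For reference, the 20-container / 40-vcore estimate above can be reproduced with a little arithmetic. The sketch below is only an illustration: it assumes that all operators share one slot sharing group (so the job needs as many slots as its highest parallelism) and that each slot maps to one vcore. The function names are hypothetical and not part of any Flink API, and the JobManager container is not counted. It does not explain the observed 65 containers.

```python
import math

def slots_needed(parallelisms, slot_sharing=True):
    # Assumption: with a single slot sharing group, the job needs as many
    # slots as its highest operator parallelism; without sharing, every
    # subtask needs its own slot.
    return max(parallelisms) if slot_sharing else sum(parallelisms)

def task_managers_needed(slots, slots_per_tm):
    # Each TaskManager container offers a fixed number of slots.
    return math.ceil(slots / slots_per_tm)

# Parallelisms from the post: 4 operators at 15, 1 at 40, 2 at 10.
parallelisms = [15] * 4 + [40] + [10] * 2
slots_per_tm = 2

shared_slots = slots_needed(parallelisms, slot_sharing=True)
shared_tms = task_managers_needed(shared_slots, slots_per_tm)
shared_vcores = shared_tms * slots_per_tm  # assumes one vcore per slot

unshared_slots = slots_needed(parallelisms, slot_sharing=False)
unshared_tms = task_managers_needed(unshared_slots, slots_per_tm)

print("with slot sharing:", shared_tms, "TaskManagers,", shared_vcores, "vcores")
print("without slot sharing:", unshared_tms, "TaskManagers")
```

Under these assumptions the shared case matches the poster's expectation (20 TaskManager containers, 40 vcores), while the unshared case gives an upper bound based on the total subtask count.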
Hi Arpith,

All tasks being in the CREATED state indicates that no task has been scheduled yet. It is strange if a job gets stuck in this state. Could you share the job manager log so we can check what is happening there?

Thanks,
Zhu

On Mon, Sep 21, 2020 at 3:52 PM, Arpith P <[hidden email]> wrote:
All the job manager logs have been deleted from the cluster. I'll have to work with the infra team to get them back; once I have them, I'll post them here.

Arpith

On Mon, Sep 21, 2020 at 5:50 PM Zhu Zhu <[hidden email]> wrote: