Hello, I have a Flink TM configured with taskmanager.memory.managed.size: 1372m. There is a streaming job using RocksDB for checkpoints, so I assume some of this memory will indeed be used. I was looking at the metrics exposed through the REST interface, and I queried some of them: /taskmanagers/c3c960d79c1eb2341806bfa2b2d66328/metrics?get=Status.JVM.Memory.Heap.Committed,Status.JVM.Memory.NonHeap.Committed,Status.JVM.Memory.Direct.MemoryUsed | jq [ { "id": "Status.JVM.Memory.Heap.Committed", "value": "1652031488" }, { "id": "Status.JVM.Memory.NonHeap.Committed", "value": "234291200" 223 MiB }, { "id": "Status.JVM.Memory.Direct.MemoryUsed", "value": "375015427" 358 MiB }, { "id": "Status.JVM.Memory.Direct.TotalCapacity", "value": "375063552" 358 MiB } ] I presume direct memory is being used by Flink and its networking stack, as well as by the JVM itself. To be sure:
Based on this question:
https://stackoverflow.com/questions/30622818/off-heap-native-heap-direct-memory-and-native-memory, I imagine Flink/RocksDB either allocates memory completely independently of the JVM, or it uses unsafe. Since the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/memory/mem_setup_tm.html#managed-memory)
states that "Managed memory is managed by Flink and is allocated as native memory (off-heap)", I thought this native memory might show up as part of direct memory tracking, but I guess it doesn’t. Regards, Alexis. |
Hi Alexis, First of all, I strongly recommend not to look into the JVM metrics. These metrics are fetched directly from JVM and do not well correspond to Flink's memory configurations. They were introduced a long time ago and are preserved mostly for compatibility. IMO, they bring more confusion than convenience. In Flink-1.12, there is a newly designed TM metrics page in the web ui, which clearly shows how the metrics correspond to Flink's memory configurations (if any). Concerning your questions. 1. Yes, increasing framework/task off-heap memory sizes should increase the direct memory capacity. Increasing the network memory size should also do that. 2. When 'state.backend.rocksdb.memory.managed' is true, RocksDB uses managed memory. Managed memory is not measured by any JVM metrics. It's not managed by JVM, meaning that it's not limited by '-XX:MaxDirectMemorySize' and is not controlled by the garbage collectors. Thank you~ Xintong Song On Tue, Apr 13, 2021 at 7:53 PM Alexis Sarda-Espinosa <[hidden email]> wrote:
|
Hi Xintong, Thanks for the info. Is there any way to access these metrics outside of the UI? I suppose Flink’s reporters might provide them, but will they also be available through the REST interface (or another interface)? Regards, Alexis. From: Xintong Song <[hidden email]> Hi Alexis, First of all, I strongly recommend not to look into the JVM metrics. These metrics are fetched directly from JVM and do not well correspond to Flink's memory configurations. They were introduced a long time ago and are preserved mostly
for compatibility. IMO, they bring more confusion than convenience. In Flink-1.12, there is a newly designed TM metrics page in the web ui, which clearly shows how the metrics correspond to Flink's memory configurations (if any). Concerning your questions. 1. Yes, increasing framework/task off-heap memory sizes should increase the direct memory capacity. Increasing the network memory size should also do that. 2. When 'state.backend.rocksdb.memory.managed' is true, RocksDB uses managed memory. Managed memory is not measured by any JVM metrics. It's not managed by JVM, meaning that it's not limited by '-XX:MaxDirectMemorySize' and is not controlled
by the garbage collectors.
Thank you~ Xintong Song On Tue, Apr 13, 2021 at 7:53 PM Alexis Sarda-Espinosa <[hidden email]> wrote:
|
These metrics should also be available via REST. You can check the original design doc [1] for which metrics the UI is using. Thank you~ Xintong Song On Tue, Apr 13, 2021 at 9:08 PM Alexis Sarda-Espinosa <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |