Hello,
I’ve worked around my problem by not using the HiveServer2 JDBC driver to read the reference table. Apparently, despite all the appropriate options passed to the Statement object, the driver handles RAM poorly: after converting the table to text format and reading it directly from HDFS, everything works without any problem and with plenty of free memory…
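For reference, here is a minimal sketch of that workaround: loading the reference table from a delimited text file on HDFS instead of going through JDBC. The path, the class name and the two-column tab-delimited layout are illustrative assumptions, not details from the original post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class HdfsRefTableLoader {

    /** Loads a key -> value reference map from a tab-delimited text file on HDFS. */
    public static Map<String, String> load(String hdfsPath) throws Exception {
        Map<String, String> refTable = new HashMap<>();
        FileSystem fs = FileSystem.get(new Configuration());
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(new Path(hdfsPath)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Assumes two tab-separated columns: key \t value (hypothetical layout).
                String[] fields = line.split("\t", 2);
                if (fields.length == 2) {
                    refTable.put(fields[0], fields[1]);
                }
            }
        }
        return refTable;
    }
}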
Greetings,
Arnaud
From: LINZ, Arnaud
Sent: Thursday, November 12, 2015 17:48
To: '[hidden email]' <[hidden email]>
Subject: Join Stream with big ref table
Hello,
I have to enrich a stream with a big reference table (11,000,000 rows). I cannot use “join” because I cannot window the stream; so in the “open()”
function of each mapper I read the content of the table and put it into a HashMap stored on the heap.
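For clarity, this is a minimal sketch of the pattern I’m describing (each mapper loads the reference table once in open() and keeps it in a heap HashMap for lookups). The record types, JDBC URL, query and enrichment logic are placeholders, not my actual code.

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

public class EnrichMapper extends RichMapFunction<String, String> {

    private transient Map<String, String> refTable;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Read the whole reference table once per parallel task instance.
        refTable = new HashMap<>();
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://hiveserver:10000/db");
             Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(10_000); // fetch rows in batches rather than buffering the full result
            try (ResultSet rs = stmt.executeQuery("SELECT key, value FROM ref_table")) {
                while (rs.next()) {
                    refTable.put(rs.getString(1), rs.getString(2));
                }
            }
        }
    }

    @Override
    public String map(String event) {
        // Enrich each incoming record with the matching reference entry, if any.
        String enrichment = refTable.get(event);
        return enrichment != null ? event + "," + enrichment : event;
    }
}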
11M rows is quite big, but it should take less than 100 MB of RAM, so it ought to be easy. However, I systematically run into a Java OutOfMemory
error, even with huge 64 GB containers (5 slots per container).
Path, ID: akka.tcp://flink@172.21.125.28:43653/user/taskmanager 4B4D0A725451E933C39E891AAE80B53B
Data Port: 41982
Last Heartbeat: 2015-11-12, 17:46:14
All Slots: 5
Free Slots: 5
CPU Cores: 32
Physical Memory: 126.0 GB
Free Memory: 46.0 GB
Flink Managed Memory: 31.5 GB
I don’t clearly understand why this happens and how to fix it. Any clue?