Flink performs significant type scanning during the pre-flight phase of an application (https://ci.apache.org/projects/flink/flink-docs-stable/dev/types_serialization.html).
Creating sources, operators and sinks causes Flink to scan the data types of the objects used within the topology of a given streaming flow, apparently so that it can optimise jobs based on this information.
Is this scanning configurable? Can I turn it off and force Flink to use Kryo serialisation only, without needing or using any of the scanned information?
I have a very large, deeply nested, auto-generated class in a proprietary library, and Flink seems to get stuck in what looks like an endless loop when scanning it. After running for several hours this ends in out-of-memory errors; the application never actually launches via env.execute(), even if I bump up the heap size significantly. The class has a number of circular references, i.e. the class and its child classes contain references to other classes of the same type. Is this likely to be a problem?
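To make the shape of the problem concrete, here is a minimal sketch of the kind of circular type graph I mean. Node and Child are hypothetical stand-ins for the generated classes (the real ones are far larger), and the walker is a toy reflective scan that I wrote for illustration, not Flink's actual type extraction code. It only terminates because it remembers which types it has already visited.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashSet;
import java.util.Set;

public class CircularTypeWalk {

    // Hypothetical stand-in for a generated class that refers to its own type.
    static class Node {
        Node parent;   // reference to the same type
        Child child;   // reference to a type that points back to Node
        String name;
    }

    // Hypothetical child class that closes the cycle Node -> Child -> Node.
    static class Child {
        Node owner;
        Child sibling;
        int value;
    }

    // Toy reflective walk over declared field types (an illustration only).
    // The visited set is what stops the recursion on cyclic type graphs.
    static void walk(Class<?> type, Set<Class<?>> visited) {
        if (type.isPrimitive() || type.getName().startsWith("java.")) {
            return; // treat primitives and JDK types as leaves
        }
        if (!visited.add(type)) {
            return; // already seen this type, so do not descend again
        }
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // only instance fields declared in the source
            }
            walk(field.getType(), visited);
        }
    }

    // Collects the user-defined types reachable from the given root class.
    static Set<Class<?>> reachableTypes(Class<?> root) {
        Set<Class<?>> visited = new LinkedHashSet<>();
        walk(root, visited);
        return visited;
    }

    public static void main(String[] args) {
        Set<Class<?>> types = reachableTypes(Node.class);
        System.out.println(types.size() + " user types reachable from Node");
    }
}
```

Without the visited check, the same walk over Node would recurse indefinitely. I do not know whether anything similar is happening inside Flink's scan, but the symptoms look like it.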
Many thanks,
John