Hi
I’m looking for advice on the best and simplest way to handle JSON in Flink.

Our system is data driven and based on JSON. As the structure isn’t static, mapping it to POJOs isn’t an option, so I transfer ObjectNode and/or ArrayNode between operators, either in tuples (Tuple2<String, ObjectNode>) or as attributes in POJOs.

Flink doesn’t know about the Jackson types and therefore falls back to Kryo.

I see two options:
1. Write Kryo serializers for all the Jackson types we use and register them.
2. Register the Jackson types as Flink types.

I guess option 2 performs best, but it requires annotating the classes, and I can’t do that for third-party classes. One workaround could be to create my own classes that extend the Jackson ones and use those between operators.

I can’t be the first to face this problem, so I’d like to hear what the community suggests.

Med venlig hilsen / Best regards
Lasse Nedergaard
Hey Lasse,
I've had a similar case, albeit with Avro. I was reading from multiple Kafka topics, which all carried different objects, and did some metadata-driven operations on them. I could not go with any concrete predefined types for them, because there were hundreds of different object types.

My solution was to serialize the object itself manually as byte[] and deserialize it manually in the operator. You can do it the same way using something like objectMapper.writeValueAsBytes and transfer the data as Tuple2<String, byte[]>.

Overall, Flink does not support "dynamic" data types very well.

Regards,
Maciej

On Wed, 24 Feb 2021 at 17:08, Lasse Nedergaard <[hidden email]> wrote:
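The byte[] approach Maciej describes can be sketched roughly as follows. This is a minimal illustration, assuming Jackson on the classpath; the class and method names here are made up for the example, and in a real job toBytes/fromBytes would live in the map or process functions on either side of the shuffle:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class JsonBytes {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Upstream operator: JSON tree -> byte[] before handing the record to Flink.
    public static byte[] toBytes(ObjectNode node) throws java.io.IOException {
        return MAPPER.writeValueAsBytes(node);
    }

    // Downstream operator: byte[] -> JSON tree after receiving the record.
    public static ObjectNode fromBytes(byte[] bytes) throws java.io.IOException {
        return (ObjectNode) MAPPER.readTree(bytes);
    }

    public static void main(String[] args) throws Exception {
        ObjectNode node = MAPPER.createObjectNode();
        node.put("id", "42");
        // Between these two calls the payload would travel as Tuple2<String, byte[]>,
        // which Flink serializes natively without falling back to Kryo.
        byte[] payload = toBytes(node);
        ObjectNode back = fromBytes(payload);
        System.out.println(back.get("id").asText());
    }
}
```

Since byte[] is a native Flink type, this sidesteps Kryo entirely, at the cost of an extra serialize/deserialize per hop.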
Thanks for your feedback.
I’ll go with specific Kryo serializers, as that makes the code easier to use, and if I run into performance problems I can change the data format later.

Med venlig hilsen / Best regards
Lasse Nedergaard

> On 24 Feb 2021, at 17:44, Maciej Obuchowski <[hidden email]> wrote: