Hello,
I am comparing Flink, Spark and some other streaming frameworks to find the best fit for my project.
Currently we have a system that runs on a single server and stores its data off-heap. We now want to make it distributed, with streaming support.
We have designed a rough data flow for this, and some operations in it need to cache data (for some purpose) before streaming it to the next operation.
We prefer caching data in off-heap memory rather than on the heap.
I will stream on-heap tuples and then, after some operation, I want to store that data in an off-heap table.
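To make "off-heap table" concrete, here is a minimal sketch in plain Java. It is only an illustration, not our actual implementation and not a Flink API: the class name OffHeapTable, its methods and the fixed (long key, long value) record layout are all hypothetical, and it uses a direct ByteBuffer as a stand-in for the 'unsafe' library.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not part of Flink): a fixed-capacity table of
// (long key, long value) records stored in off-heap (direct) memory.
final class OffHeapTable {
    private static final int RECORD_BYTES = 2 * Long.BYTES;

    private final ByteBuffer buffer; // direct buffer, allocated outside the Java heap
    private final int capacity;      // maximum number of records
    private int size = 0;            // number of records written so far

    OffHeapTable(int capacity) {
        this.capacity = capacity;
        this.buffer = ByteBuffer.allocateDirect(capacity * RECORD_BYTES);
    }

    // Appends one record and returns its row index.
    int append(long key, long value) {
        if (size == capacity) {
            throw new IllegalStateException("table is full");
        }
        int offset = size * RECORD_BYTES;
        buffer.putLong(offset, key);
        buffer.putLong(offset + Long.BYTES, value);
        return size++;
    }

    long keyAt(int row) {
        checkRow(row);
        return buffer.getLong(row * RECORD_BYTES);
    }

    long valueAt(int row) {
        checkRow(row);
        return buffer.getLong(row * RECORD_BYTES + Long.BYTES);
    }

    int size() {
        return size;
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }

    private void checkRow(int row) {
        if (row < 0 || row >= size) {
            throw new IndexOutOfBoundsException("row " + row);
        }
    }
}
```

In the data flow, an operation would copy fields from each incoming on-heap tuple into such a table, for example `table.append(key, value)`, and read them back later with `keyAt(row)` and `valueAt(row)`.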
I want to know if I can achieve this in Flink.
From the documentation, I understand that I can write data to a file or database through sinks.
So my questions are:
1. Can I write data off-heap (using, say, the 'unsafe' library) through a sink?
2. If yes, do I have to add a sink (and then a source to stream to the next operation) after each operation in my data flow where I want caching?
3. Is there another or better way than (2) to solve my problem?
I hope my problem is understandable. Let me know if not.
- Aakash Agrawal.