http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Using-standalone-single-node-without-HA-in-production-crazy-tp7758.html
Hi,
I am evaluating Flink for use in a stateful streaming application. Some information about the intended use:
- Will run in a Mesos cluster, deployed via Marathon in a Docker container
- Initial throughput of ~100 messages per second (from Kafka)
- Will need to scale to 10x that soon after launch
- State will be much larger than the available memory
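
For a sense of scale, the throughput figures above can be turned into daily message counts. This is only a back-of-the-envelope sketch: it assumes a constant message rate (real Kafka traffic will be burstier), and the function name `messages_per_day` is made up for illustration.

```python
# Rough daily volume at the initial and the 10x rate.
# Assumption: the rate is constant over the whole day.
SECONDS_PER_DAY = 24 * 60 * 60


def messages_per_day(rate_per_second: int) -> int:
    """Return the number of messages seen in one day at a constant rate."""
    return rate_per_second * SECONDS_PER_DAY


initial_rate = 100               # messages per second at launch
scaled_rate = initial_rate * 10  # expected soon after launch

print(f"initial: {messages_per_day(initial_rate):,} messages/day")
print(f"scaled:  {messages_per_day(scaled_rate):,} messages/day")
```
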
To get this out the door quickly, I am considering postponing the YARN / HA cluster setup, on the assumption that the current application can easily fit within a single JVM and handle the throughput. Hopefully, by the time I need more scale, Flink support for Mesos will be available and I can use it to distribute the job across the cluster with minimal code rewrite.
Questions:
1. Is this a viable approach? Any pitfalls to be aware of?
2. What is the correct term for this deployment mode? Single-node standalone? Local?
3. Will the RocksDB state backend work in single-JVM mode?
4. When the single JVM process becomes unhealthy and is restarted by Marathon, will Flink recover appropriately, or is failure recovery a function of HA?
5. How would I migrate the RocksDB state once I move to HA mode? Is there a straightforward path?
Ryan