Using standalone single node without HA in production, crazy?

Posted by Ryan Crumley on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Using-standalone-single-node-without-HA-in-production-crazy-tp7758.html

Hi,

I am evaluating Flink for use in a stateful streaming application. Some information about the intended use:

 - Will run in a Mesos cluster, deployed via Marathon in a Docker container
 - Initial throughput ~ 100 messages per second (from Kafka)
 - Will need to scale to 10x that soon after launch
 - State will be much larger than memory available
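
To put the throughput numbers above in perspective, here is a quick back-of-the-envelope sketch. It is plain Python with nothing Flink-specific in it, the variable names are purely illustrative, and it assumes a constant message rate around the clock, which real traffic will not match exactly:

```python
# Rough daily message volume at launch and at the 10x target.
# Assumption: the rate is constant over the whole day.
SECONDS_PER_DAY = 24 * 60 * 60

initial_rate = 100               # messages per second at launch
scaled_rate = initial_rate * 10  # the 10x target shortly after launch

initial_per_day = initial_rate * SECONDS_PER_DAY
scaled_per_day = scaled_rate * SECONDS_PER_DAY

print(f"launch: {initial_per_day:,} messages/day")
print(f"10x:    {scaled_per_day:,} messages/day")
```

That is roughly 8.6 million messages per day at launch and 86 million per day at the 10x target.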

To get this out the door quickly, I am considering postponing the YARN / HA cluster setup, on the assumption that the current application fits easily within a single JVM and can handle the throughput. Hopefully, by the time I need more scale, Flink support for Mesos will be available and I can use it to distribute the job across the cluster with minimal code rewrite.

Questions:
1. Is this a viable approach? Any pitfalls to be aware of?

2. What is the correct term for this deployment mode? Single node standalone? Local?

3. Will the RocksDB state backend work in single-JVM mode?

4. When the single JVM process becomes unhealthy and is restarted by Marathon, will Flink recover appropriately, or is failure recovery a function of HA?

5. How would I migrate the RocksDB state once I move to HA mode? Is there a straightforward path?

Thanks for your time,

Ryan