I started to answer these questions and then realized I was making an assumption about your environment. Do you have a reliable, persistent file system such as HDFS or S3 at your disposal, or do you truly mean to run on a single node?
If you truly intend to run on a single node only, there is no way to guarantee reliability: you would be exposed to machine and disk failures, etc.
I think the minimal reasonable production setup must use at least 3 physical nodes with the following services running:
1) HDFS or some other reliable filesystem (for persistent state storage)
2) Zookeeper for the Flink HA JobManager setup
The rest is configuration.
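For concreteness, here is a minimal sketch of the relevant flink-conf.yaml entries, using the key names from the 1.0.x HA documentation; the hostnames and paths are placeholders for your own cluster:

    # flink-conf.yaml (hostnames and paths are examples)
    recovery.mode: zookeeper
    recovery.zookeeper.quorum: node1:2181,node2:2181,node3:2181
    recovery.zookeeper.storageDir: hdfs:///flink/recovery
    state.backend: filesystem
    state.backend.fs.checkpointdir: hdfs:///flink/checkpoints

    # conf/masters -- list at least one standby JobManager
    node1:8081
    node2:8081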
With regard to scaling up after your initial deployment: right now, in the latest Flink release (1.0.3), you cannot stop and restart a job with a different parallelism without losing your computed state. This means that if you expect to scale up later and don't want to lose that state, you can over-provision the TaskManagers you do run with many extra slots and submit your job now with the maximum parallelism you expect to ever need. This will all become much simpler in future Flink versions (though not in 1.1), but for now it is a decent approach.
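To illustrate the over-provisioning idea, here is a rough sketch in Java; the parallelism of 32 is a made-up number, and the point is only that the job is submitted up front with the maximum parallelism you expect to ever need:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class OverProvisionedJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Fix the parallelism at the level you expect to scale to,
            // since it cannot be changed later without losing state
            // (as of 1.0.3).
            env.setParallelism(32);

            // Trivial placeholder pipeline; replace with your real job.
            env.fromElements(1, 2, 3).print();

            env.execute("over-provisioned job");
        }
    }

You'd then make sure the TaskManagers together expose at least that many slots, e.g. taskmanager.numberOfTaskSlots: 16 in flink-conf.yaml on each of two TaskManagers.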
In versions after 1.1, Flink will be able to scale parallelism up and down while preserving all previously computed state.
-Jamie