Hi, I wanted to explore docker-flink (using Ceph for the state backend) before opting for a standalone cluster. Have there been any comparative studies on the performance of docker-flink? Would the state be consistent and performant if the Docker containers go down and respawn frequently?
|
Hi Jayant,

Running Flink in a Docker container should not, in itself, have an impact on performance. Docker does not employ virtualization. To put it simply, Docker containers are processes on the host operating system that are isolated from each other using kernel features. See [1] for a more in-depth discussion.

Whether the state of your Flink application remains consistent when containers get restarted depends on many factors, such as whether you have checkpointing and JobManager HA enabled [2][3]. The checkpoint files also still need to be available for job recovery after container restarts.

If you want to use the Docker images published under https://hub.docker.com/_/flink/, you probably want to override the provided flink-conf.yaml by setting the FLINK_CONF_DIR environment variable, so that you can enable a fault-tolerant setup.

Best,
Gary

On Wed, Dec 6, 2017 at 9:43 AM, Jayant Ameta <[hidden email]> wrote:
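To make the fault-tolerance advice above more concrete, here is a minimal Python sketch that writes a flink-conf.yaml into a directory that FLINK_CONF_DIR could point to. Everything specific in it is an assumption for illustration, not something stated in this thread: the Ceph mount path `/mnt/cephfs`, the ZooKeeper host names, and the helper names `render_flink_conf` and `write_conf_dir` are hypothetical. The configuration keys follow the Flink configuration documentation as far as I know, but their exact names differ between Flink versions, so check them against the release you run.

```python
import os
import tempfile

# Hypothetical settings: paths and host names are placeholders, and the key
# names should be verified against the documentation of your Flink version.
SETTINGS = {
    # Keep checkpoints on storage that survives container restarts.
    "state.backend": "filesystem",
    "state.checkpoints.dir": "file:///mnt/cephfs/flink/checkpoints",
    # JobManager high availability backed by ZooKeeper.
    "high-availability": "zookeeper",
    "high-availability.zookeeper.quorum": "zk-1:2181,zk-2:2181,zk-3:2181",
    "high-availability.storageDir": "file:///mnt/cephfs/flink/ha",
}


def render_flink_conf(settings):
    """Render settings as flat 'key: value' lines, the flink-conf.yaml layout."""
    return "".join(f"{key}: {value}\n" for key, value in settings.items())


def write_conf_dir(conf_dir, settings):
    """Write flink-conf.yaml into conf_dir and return the path of the file."""
    os.makedirs(conf_dir, exist_ok=True)
    path = os.path.join(conf_dir, "flink-conf.yaml")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_flink_conf(settings))
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the directory mounted into the container.
    conf_dir = tempfile.mkdtemp(prefix="flink-conf-")
    print(write_conf_dir(conf_dir, SETTINGS))
```

The resulting directory would then be mounted into each container, with FLINK_CONF_DIR set to the mount point. The directory probably also needs the logging configuration files that ship with the image, and checkpointing itself still has to be enabled in the job.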
|
Thank you, Gary. I know that theoretically there shouldn't be any performance issue. I was curious to know whether any other users have tried out docker-flink and whether they have faced or reported any performance hit. I want real-time processing for some of the events, and was looking for existing users' experience with docker-flink.

Jayant Ameta

On Thu, Dec 7, 2017 at 4:37 PM, Gary Yao <[hidden email]> wrote:
|