Performance of docker-flink

Performance of docker-flink

Jayant Ameta
Hi,
I wanted to explore docker-flink (using Ceph for the state backend) before opting for a standalone cluster.

Have there been any comparative studies on the performance of docker-flink? Would the state be consistent and performant if the Docker containers go down and respawn frequently?

Re: Performance of docker-flink

Gary Yao-2
Hi Jayant,

Running Flink in a Docker container should not, in itself, have an impact on
performance. Docker does not employ hardware virtualization. To put it simply,
Docker containers are processes on the host operating system that are isolated
from each other using kernel features such as namespaces and cgroups. See [1]
for a more in-depth discussion.

Whether the state of your Flink application remains consistent when containers
get restarted depends on many factors, such as whether you have checkpointing
and JobManager HA enabled [2][3]. The checkpoint files also need to remain
available for job recovery after container restarts, so they should be stored
outside the containers' own file systems.
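
As a rough illustration of the settings involved, the sketch below renders
flink-conf.yaml entries for ZooKeeper-based JobManager HA and a durable
checkpoint location. The function name, the ZooKeeper hosts, and the Ceph mount
paths are hypothetical placeholders, and the key names follow the Flink 1.x
documentation; they have changed between releases, so check them against the
version in use. Checkpointing itself is typically switched on per job, for
example with StreamExecutionEnvironment.enableCheckpointing(...).

```python
# Sketch only: builds flink-conf.yaml text for a fault-tolerant setup.
# All hosts and paths passed in are assumptions, not values from this thread.

def render_flink_conf(zk_quorum, ha_storage_dir, checkpoint_dir):
    """Return flink-conf.yaml text enabling ZooKeeper HA and durable checkpoints."""
    settings = {
        # JobManager high availability backed by ZooKeeper
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": zk_quorum,
        # Where JobManager metadata is persisted for recovery
        "high-availability.storageDir": ha_storage_dir,
        # Keep checkpoints on storage that outlives any single container
        "state.backend": "filesystem",
        "state.checkpoints.dir": checkpoint_dir,
    }
    return "".join(f"{key}: {value}\n" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical ZooKeeper hosts and a hypothetical shared Ceph mount
    print(render_flink_conf(
        "zk-1:2181,zk-2:2181",
        "file:///mnt/ceph/flink/ha",
        "file:///mnt/ceph/flink/checkpoints",
    ))
```

Both directories must be reachable from every container that may run a
JobManager or TaskManager; otherwise a respawned container cannot find the
state it is supposed to recover.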

If you want to use the Docker images published under
https://hub.docker.com/_/flink/, you probably want to override the provided
flink-conf.yaml by setting the FLINK_CONF_DIR environment variable, so that you
can enable a fault-tolerant setup.
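
One possible way to do this, sketched below, is to mount a directory with your
own configuration into the container and point FLINK_CONF_DIR at it. The helper
name and both directory paths are hypothetical; the sketch only assembles the
docker run arguments and does not start anything. It assumes the image accepts
jobmanager or taskmanager as its command, and that the mounted directory holds
a complete set of configuration files, since Flink's scripts also look there
for the logging configuration.

```python
import shlex


def docker_run_command(host_conf_dir, role="jobmanager",
                       container_conf_dir="/opt/flink-conf"):
    """Build a `docker run` argv that mounts a config dir and sets FLINK_CONF_DIR."""
    return [
        "docker", "run", "--rm",
        # Make the host's configuration directory visible inside the container
        "-v", f"{host_conf_dir}:{container_conf_dir}",
        # Tell Flink's startup scripts to read configuration from that directory
        "-e", f"FLINK_CONF_DIR={container_conf_dir}",
        "flink", role,
    ]


if __name__ == "__main__":
    # Hypothetical host path; print the command instead of executing it
    print(shlex.join(docker_run_command("/srv/flink/conf")))
```

Printing the command rather than running it keeps the sketch free of side
effects; pass the list to subprocess.run if you want to launch the container.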

Best,
Gary


(In reply to Jayant Ameta's message of Wed, Dec 6, 2017 at 9:43 AM.)


Re: Performance of docker-flink

Jayant Ameta
Thank you Gary.
I know that theoretically there shouldn't be any performance issue.
I was curious to know whether any other users have tried out docker-flink and whether they have faced or reported any performance hit. I would want real-time processing for some of the events, and was looking for existing users' experience with docker-flink.


Jayant Ameta

(In reply to Gary Yao's message of Thu, Dec 7, 2017 at 4:37 PM.)