The persistent storage options

Containers are meant to be ephemeral, which is why they scale well for stateless applications. Stateful applications need different treatment: without a persistent storage mechanism, containers are a poor fit for them. By default, data resides within the container, so when the container is removed or replaced, the data is lost. In some situations, this data loss is not a big issue. In others, it is unacceptable, and data persistence is a must. The approach prescribed by Docker is given in the following section.

It is possible to store data within the writable layer of a container, but there are a few downsides:

  • The data won't persist once that container is removed, and it can be difficult to get the data out of the container if another process needs it.
  • A container's writable layer is tightly coupled to the host machine where the container is running. You cannot easily move the data somewhere else.
  • Writing into a container's writable layer requires a storage driver to manage the filesystem. The storage driver provides a union filesystem, using the Linux kernel. This extra abstraction reduces performance as compared to using data volumes, which write directly to the host filesystem.

Docker offers three different ways to mount data into a container from the Docker host: volumes, bind mounts, and tmpfs mounts. Volumes are almost always the right choice, and they are the preferred mechanism for persisting data generated by and used by Docker containers. While bind mounts depend on the directory structure of the host machine, volumes are completely managed by Docker. Volumes have several advantages over bind mounts:

  • Volumes are easier to back up or migrate than bind mounts
  • Volumes are easy to manage by using Docker CLI commands or the Docker API
  • Volumes work on both Linux and Windows containers
  • Volumes can be more safely shared among multiple containers
  • Volume drivers let you store volumes on remote hosts or cloud providers, encrypt the contents of volumes, or add other functionality
  • A new volume's contents can be pre-populated by a container

Volumes are often a better choice than persisting data in a container's writable layer, because using a volume does not increase the size of containers using it, and the volume's contents exist outside the life cycle of a given container. 
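
As a rough illustration of the volume workflow described above, the following Python sketch assembles the two Docker CLI commands involved: one that creates a named volume and one that starts a container with that volume mounted. The volume name app-data, the container name db, the image name example/db:latest, and the target path /var/lib/data are placeholders assumed for this example. The script only prints the commands; it does not run Docker.

```python
import shlex


def volume_commands(volume, container, image, target):
    """Return the argv lists that create a named volume and mount it."""
    create = ["docker", "volume", "create", volume]
    run = [
        "docker", "run", "-d",
        "--name", container,
        # type=volume leaves the storage location on the host to Docker
        "--mount", f"type=volume,source={volume},target={target}",
        image,
    ]
    return [create, run]


# Placeholder names, assumed for illustration only.
commands = volume_commands("app-data", "db", "example/db:latest", "/var/lib/data")
for argv in commands:
    print(shlex.join(argv))
```

Because the volume exists independently of the container, removing the container leaves app-data in place, and a new container can mount it again.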

If a container generates non-persistent state data, then consider using a tmpfs mount to avoid storing the data anywhere permanently, and to increase the container's performance by avoiding writing into the container's writable layer.
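
A tmpfs mount can be requested with the same --mount flag of docker run. The sketch below builds the flag value in Python, under the assumption that the container runs on a Linux host, where tmpfs mounts are available. The helper name tmpfs_mount, the target path /app/cache, and the image name example/app:latest are made up for this example; tmpfs-size (in bytes) and tmpfs-mode (octal file mode) are the optional settings used.

```python
def tmpfs_mount(target, size_bytes=None, mode=None):
    """Build the value passed to `docker run --mount` for a tmpfs mount."""
    parts = ["type=tmpfs", f"target={target}"]
    if size_bytes is not None:
        # Upper bound on the memory the mount may use, in bytes
        parts.append(f"tmpfs-size={size_bytes}")
    if mode is not None:
        # File mode of the mount point, written in octal (for example 1770)
        parts.append(f"tmpfs-mode={mode}")
    return ",".join(parts)


# Hypothetical scratch area of 64 MiB for non-persistent state.
scratch = tmpfs_mount("/app/cache", size_bytes=64 * 1024 * 1024, mode="1770")
print("docker run -d --mount", scratch, "example/app:latest")
```

Anything written under the target path stays in host memory and disappears when the container stops, which suits caches and other state that should not outlive the container.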

The three options differ mainly in where the data lives on the Docker host:

  • Volumes are stored in a part of the host filesystem that is managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem.
  • Bind mounts may be stored anywhere on the host system. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.
  • The tmpfs mounts are stored in the host system's memory only and are never written to the host system's filesystem.
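
The differences between the three mount types show up in what each one needs as a source. The Python sketch below, offered as one possible way to capture these rules rather than as part of Docker itself, builds the value for the --mount flag of docker run. The helper mount_spec and all names and paths in the examples are hypothetical, and the absolute-path check assumes a Linux host.

```python
import posixpath


def mount_spec(kind, target, source=None, readonly=False):
    """Build a `--mount` value for a volume, bind mount, or tmpfs mount."""
    if kind not in ("volume", "bind", "tmpfs"):
        raise ValueError(f"unknown mount type: {kind}")
    if kind == "tmpfs" and source is not None:
        # tmpfs data lives in host memory, so there is nothing to point at
        raise ValueError("tmpfs mounts take no source")
    if kind == "bind" and not (source and posixpath.isabs(source)):
        # A bind mount maps an existing host path, given as an absolute path
        raise ValueError("bind mounts need an absolute host path as source")

    parts = [f"type={kind}"]
    if source is not None:
        # For a volume this is its name; for a bind mount, the host path
        parts.append(f"source={source}")
    parts.append(f"target={target}")
    if readonly:
        parts.append("readonly")
    return ",".join(parts)


# Hypothetical examples, one per mount type.
print(mount_spec("volume", "/var/lib/data", source="app-data"))
print(mount_spec("bind", "/etc/app", source="/srv/app/config", readonly=True))
print(mount_spec("tmpfs", "/app/cache"))
```

The bind mount example is marked readonly so that the container cannot change the host directory, which limits the risk mentioned above of a container modifying important host files.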

Let's discuss each of them in more detail.
