So far, we have discussed how data volumes can be used to share data between the Docker host and containers, as well as between containers. Data sharing through data volumes is a powerful and essential tool in the Docker paradigm. However, it carries a few pitfalls that must be carefully identified and eliminated. In this section, we list a few common issues associated with data sharing and the ways to overcome them.
Earlier in the data volume section, we learned that the Docker engine automatically creates directories based on the VOLUME instruction in Dockerfile as well as the -v option of the docker run subcommand. We also understood that the Docker engine does not automatically delete these auto-generated directories, in order to preserve the state of the application(s) run inside the container. We can force Docker to remove these directories using the -v option of the docker rm subcommand. This process of manual deletion poses two major challenges, enumerated as follows:
VOLUME instruction. Likewise, we might also have our own Docker images with VOLUME inscribed in them. When we launch containers using such Docker images, the Docker engine will auto-generate the prescribed directories. Since we are not aware of the data volume creation, we may not call the docker rm subcommand with the -v option to delete the auto-generated directory.

In the previously mentioned scenarios, once the associated container is removed, there is no direct way to identify the directories whose containers were removed. Here are a few recommendations on how to avoid this pitfall:
Inspect Docker images using the docker inspect subcommand and check whether any data volume is inscribed in the image or not.
Run the docker rm subcommand with the -v option to remove any data volume (directory) created for the container. Even if the data volume is shared by multiple containers, it is still safe to run the docker rm subcommand with the -v option because the directory associated with the data volume will be deleted only when the last container sharing that data volume is removed.

As mentioned earlier, Docker enables us to etch data volumes in a Docker image using the VOLUME instruction during the build time. Nonetheless, data volumes should never be used to store any data during the build time; otherwise, the result is an unwanted effect.
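The two cleanup recommendations listed earlier in this section can be sketched as a pair of small shell helpers. This is a minimal sketch rather than a definitive procedure: the function names image_volumes and remove_with_volumes are hypothetical, and it assumes that the VOLUME entries of an image appear in the .Config.Volumes field reported by docker inspect.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative only.

# Print the data volumes declared in an image.
# Assumption: VOLUME entries show up under .Config.Volumes in the inspect output.
image_volumes() {
    docker inspect --format '{{json .Config.Volumes}}' "$1"
}

# Remove a container along with the data volumes created for it.
remove_with_volumes() {
    docker rm -v "$1"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   image_volumes testvol
#   remove_with_volumes my_container
```

If the first helper reports any mount points for an image, containers launched from that image should be removed with the second helper rather than with a plain docker rm. On hosts where Docker requires elevated privileges, the docker calls would need the same sudo prefix used elsewhere in this section.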
In this section, we will demonstrate the undesirable effect of using the data volume during the build time by crafting a Dockerfile, and then showcase the implication by building this Dockerfile.

The following are the details of Dockerfile:
Use Ubuntu 14.04 as the base image:
    # Use Ubuntu as the base image
    FROM ubuntu:14.04
Create the /MountPointDemo data volume using the VOLUME instruction:
    VOLUME /MountPointDemo
Create a file in the /MountPointDemo data volume using the RUN instruction:
    RUN date > /MountPointDemo/date.txt
Display the file in the /MountPointDemo data volume using the RUN instruction:
    RUN cat /MountPointDemo/date.txt
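Assembled into a single file, the four instructions described above might look like the following sketch. The helper name write_demo_dockerfile is hypothetical, as are all comments in the generated file except the first one; the instructions themselves are taken from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: writes the demonstration Dockerfile into the given
# directory (defaults to the current directory).
write_demo_dockerfile() {
    target_dir="${1:-.}"
    cat > "$target_dir/Dockerfile" <<'EOF'
# Use Ubuntu as the base image
FROM ubuntu:14.04
# Declare /MountPointDemo as a data volume
VOLUME /MountPointDemo
# Write a file into the data volume
RUN date > /MountPointDemo/date.txt
# Read the file back from the data volume
RUN cat /MountPointDemo/date.txt
EOF
}

# Example usage (commented out so that sourcing this file writes nothing):
#   write_demo_dockerfile .
```

Writing the file through a quoted heredoc keeps the shell from expanding anything inside the Dockerfile text.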
Proceed to build an image from this Dockerfile using the docker build subcommand, as shown here:
    $ sudo docker build -t testvol .
    Sending build context to Docker daemon 2.56 kB
    Sending build context to Docker daemon
    Step 0 : FROM ubuntu:14.04
     ---> 9bd07e480c5b
    Step 1 : VOLUME /MountPointDemo
     ---> Using cache
     ---> e8b1799d4969
    Step 2 : RUN date > /MountPointDemo/date.txt
     ---> Using cache
     ---> 8267e251a984
    Step 3 : RUN cat /MountPointDemo/date.txt
     ---> Running in a3e40444de2e
    cat: /MountPointDemo/date.txt: No such file or directory
    2014/12/07 11:32:36 The command [/bin/sh -c cat /MountPointDemo/date.txt] returned a non-zero code: 1
In the preceding output of the docker build subcommand, notice that the build fails at step 3 because it cannot find the file created in step 2. The file that was created in step 2 has vanished by the time the build reaches step 3. This undesirable effect is due to the approach Docker uses to build its images. An understanding of the Docker image-building process unravels the mystery.
In the build process, for every instruction in a Dockerfile, the following steps are followed:
Create a new container by translating the Dockerfile instruction to an equivalent docker run subcommand
Commit the newly created container to an image, which serves as the base for the next instruction

When a container is committed, its filesystem is saved but, deliberately, the data volume's filesystem is not. Therefore, any data stored in the data volume is lost in this process. So never use a data volume as storage during the build process.
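The loss of volume data on commit can be reproduced by hand, outside docker build, with a run followed by a commit. The following is a rough sketch of that idea and not a description of how the builder is implemented internally. The container name voldemo, the image name voldemo-image, and the function name simulate_build_steps are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical names (voldemo, voldemo-image) used only for illustration.
simulate_build_steps() {
    # Step 1: run the instruction in a container that has /MountPointDemo
    # as a data volume.
    docker run --name voldemo -v /MountPointDemo ubuntu:14.04 \
        sh -c 'date > /MountPointDemo/date.txt'

    # Step 2: commit the container; data inside the volume is not part of the commit.
    docker commit voldemo voldemo-image

    # Next instruction: runs in a fresh container from the committed image,
    # where date.txt is expected to be missing.
    docker run --rm voldemo-image cat /MountPointDemo/date.txt

    # Clean up the first container (with its data volume) and the demo image.
    docker rm -v voldemo
    docker rmi voldemo-image
}

# Example usage (commented out so that sourcing this file runs nothing):
#   simulate_build_steps
```

The second docker run is expected to fail in the same way as step 3 of the build shown earlier, because the file was written to the data volume and not to the container's own filesystem.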