The syntax of a Dockerfile

The building blocks of a Dockerfile are a dozen directives, most of which mirror the functionality of the docker run/create flags. Let's take a look at the most essential ones:

FROM <IMAGE>[:TAG|@DIGEST]: This tells the Docker daemon which image the current Dockerfile is based on. It's also the only instruction that has to be in a Dockerfile; you can have a Dockerfile that contains only this line. As with all other image-related commands, the tag defaults to latest if unspecified.
RUN:

 RUN <commands>
 RUN ["executable", "params", "more params"]

The RUN instruction runs a command on top of the current image layer and commits the result as a new layer. The main difference between the two forms is how the command is executed. The first form is called shell form; it executes the commands as /bin/sh -c <commands>. The other form is exec form, which executes the command directly, without a shell.

Using the shell form is similar to writing shell scripts, so chaining multiple commands with shell operators and line continuations, condition tests, or variable substitutions is completely valid. Bear in mind, however, that the commands are processed by sh, not bash.

The exec form is parsed as a JSON array, which means that you have to wrap strings in double quotes and escape the reserved characters. In addition, as the command is not processed by any shell, shell variables in the array will not be evaluated. On the other hand, if no shell exists in the base image, you can still use the exec form to invoke executables.
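The difference between the two forms can be sketched as follows. This is a minimal illustration, assuming the alpine base image; the /tmp/demo path is made up for the example:

```dockerfile
FROM alpine
# Shell form: executed as /bin/sh -c "...", so && and $HOME are handled by sh
RUN mkdir -p /tmp/demo && echo "home is $HOME" > /tmp/demo/shell.txt
# Exec form: no shell is involved, so echo receives the literal text $HOME
RUN ["echo", "home is $HOME"]
# To get substitution with the exec form, invoke a shell explicitly
RUN ["/bin/sh", "-c", "echo home is $HOME"]
```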

CMD:

CMD ["executable", "params", "more params"]
CMD ["param1","param2"]
CMD command param1 param2 ...

The CMD instruction sets the default command for the built image; it doesn't run the command at build time. If arguments are supplied to docker run, they override the CMD configuration. The syntax rules of CMD are almost identical to those of RUN: the first two forms are exec forms, and the third is the shell form, which likewise prepends /bin/sh -c to the command. Another directive, ENTRYPOINT, interacts with CMD: the ENTRYPOINT command and its parameters are placed in front of the CMD values, in any of the three forms, when a container starts. There can be many CMD directives in a Dockerfile, but only the last one takes effect.
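As a small sketch of these rules, assuming the alpine base image and a hypothetical image name cmd-demo:

```dockerfile
FROM alpine
# Ignored: a later CMD exists in the same Dockerfile
CMD ["echo", "first"]
# Only the last CMD takes effect as the default command
CMD ["echo", "hello from CMD"]
```

After building this as cmd-demo, docker run cmd-demo would run the last CMD, whereas docker run cmd-demo date would replace the default command with date.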

ENTRYPOINT:

 ENTRYPOINT ["executable", "param1", "param2"]
 ENTRYPOINT command param1 param2

These two forms are, respectively, the exec form and the shell form, and the syntax rules are the same as for RUN. The entry point is the default executable for an image. This means that when a container spins up, it runs the executable configured by ENTRYPOINT. When ENTRYPOINT is combined with CMD and docker run arguments, writing it in one form or the other leads to very different behavior. Here are the rules regarding their combinations:

If the ENTRYPOINT is in shell form, then the CMD and docker run arguments would be ignored. The runtime command would be as follows:

/bin/sh -c entry_cmd entry_params ...

If the ENTRYPOINT is in exec form and the docker run arguments are specified, then the CMD commands are overridden. The runtime command would be as follows:

entry_cmd entry_params run_arguments

If the ENTRYPOINT is in exec form and only CMD is configured, the runtime command would become the following for the three forms:

entry_cmd entry_params CMD_exec CMD_params
entry_cmd entry_params CMD_params
entry_cmd entry_params /bin/sh -c CMD_cmd CMD_params
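A common way to apply these rules is to fix the executable with an exec-form ENTRYPOINT and supply overridable defaults through CMD. The following is a sketch only, assuming the alpine base image (which ships a BusyBox ping) and a hypothetical image name ping-demo:

```dockerfile
FROM alpine
# Exec-form ENTRYPOINT: the executable and its fixed parameters
ENTRYPOINT ["ping", "-c", "3"]
# CMD in parameters-only form: default arguments appended to ENTRYPOINT
CMD ["localhost"]
```

With this file, docker run ping-demo would start ping -c 3 localhost, while docker run ping-demo example.com would replace the CMD part and start ping -c 3 example.com.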

ENV:

 ENV key value
 ENV key1=value1 key2=value2 ...

The ENV instruction sets environment variables for the subsequent instructions and the built image. The first form sets the key to the string after the first space, including special characters, except the line continuation character. The second form allows us to set multiple variables in a line, separated by spaces. If a value contains spaces, either enclose it in double quotes or escape the space characters. Moreover, a key defined with ENV can be referenced as a variable by later instructions in the same Dockerfile. See the following example to observe the behavior of ENV:

FROM alpine
# first form
ENV k1 wD # aw
# second form, line continuation character also works
ENV k2=v2 k3=v\ 3 \
    k4="v 4"
# ${k2} would be evaluated, so the key is "k_v2" in this case
ENV k_${k2}=$k3 k5=\"K\=da\"
# show the variables
RUN env | grep -Ev '(HOSTNAME|PATH|PWD|HOME|SHLVL)' | sort

The output during the docker build would be as follows:

...
 ---> Running in c5407972c5f5
k1=wD # aw
k2=v2
k3=v 3
k4=v 4
k5="K=da"
k_v2=v 3
...

ARG key[=<default value>]: The ARG instruction lets us pass arguments into the build container as environment variables via the --build-arg flag of docker build. For instance, building the following file with docker build --build-arg FLAGS=--static results in RUN ./build/dev/run --static on the last line:

FROM alpine
ARG TARGET=dev
ARG FLAGS
RUN ./build/$TARGET/run $FLAGS

Unlike ENV, ARG accepts only one argument per line. If ARG is used together with ENV for the same name, the ARG value, wherever it comes from (--build-arg or the default value), is overridden by the ENV value. Because proxy environment variables are used so frequently, they are all supported as arguments by default: HTTP_PROXY, http_proxy, HTTPS_PROXY, https_proxy, FTP_PROXY, ftp_proxy, NO_PROXY, and no_proxy. This means we can pass these build arguments without defining them in the Dockerfile beforehand. One thing worth noting is that the value of an ARG remains in both the shell history on the build machine and the Docker history of the image, so it's wise not to pass sensitive data via ARG.
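The precedence of ENV over ARG can be illustrated with a short sketch, assuming the alpine base image; the values prod and test are made up for the example:

```dockerfile
FROM alpine
ARG TARGET=dev
# An ENV with the same name overrides the ARG for all later instructions
ENV TARGET=prod
# Echoes "building for prod", even when built with --build-arg TARGET=test
RUN echo "building for $TARGET"
```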

LABEL key1=value1 key2=value2 ...: The use of LABEL resembles that of ENV, but a label is only stored in the metadata section of an image and is used by programs on the host rather than programs in a container. For example, if we attach the maintainer of our image in the form LABEL maintainer=<email>, we can filter the annotated images with the -f (--filter) flag in this query: docker images --filter label=maintainer=<email>.
EXPOSE <port> [<port> ...]: This instruction is identical to the --expose flag used with docker run/create, exposing ports in the container created by the resulting image.
USER <name|uid>[:<group|gid>]: The USER instruction switches the user to run the subsequent instructions, including the ones in CMD or ENTRYPOINT. However, it can't work properly if the user doesn't exist in the image. If you want to run instructions using a user that doesn't exist, you have to run adduser before using the USER directive.
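For instance, creating a user before switching to it might look like the following sketch. It assumes the alpine base image, whose BusyBox adduser accepts -D to create a user without a password; the user name appuser is hypothetical:

```dockerfile
FROM alpine
# Create the user first; USER cannot work properly with a nonexistent user
RUN adduser -D appuser
# Subsequent RUN, CMD, and ENTRYPOINT instructions run as appuser
USER appuser
RUN whoami
CMD ["id"]
```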

WORKDIR <path>: This instruction sets the working directory to a certain path. Environment variables set with ENV take effect on the path. The path would be created automatically if it doesn't already exist. It works like cd in a Dockerfile, as it takes both relative and absolute paths and can be used multiple times. If an absolute path is followed by a relative path, the result would be relative to the previous path:

WORKDIR /usr
WORKDIR src
WORKDIR app
RUN pwd
# run docker build
...
---> Running in 73aff3ae46ac
/usr/src/app

COPY:

 COPY [--chown=<user>:<group>] <src> ... <dest>
 COPY [--chown=<user>:<group>] ["<src>", ..., "<dest>"]

This directive copies the source into a file or directory in the build container. Both the source and the destination can be files or directories. The source must be within the context path and not excluded by .dockerignore, as only those files are sent to the Docker daemon. The second form is for cases in which a path contains spaces. The --chown flag lets us set the file owner on the fly without running additional chown steps inside the container; it also accepts numeric user IDs and group IDs.
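The following sketch shows both forms and the --chown flag. It assumes the alpine base image; every file and directory name is hypothetical and would have to exist in the build context:

```dockerfile
FROM alpine
# Copy a single file from the build context
COPY app.conf /etc/app/app.conf
# Copy the contents of a directory and set ownership with numeric UID:GID
COPY --chown=1000:1000 src/ /usr/src/app/
# JSON-array form for a path that contains spaces
COPY ["release notes.txt", "/usr/share/doc/app/"]
```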

ADD:

ADD [--chown=<user>:<group>] <src> ... <dest>
ADD [--chown=<user>:<group>] ["<src>", ..., "<dest>"]

ADD is quite similar to COPY in terms of its functionality: it copies files into an image. The major differences are that ADD supports downloading files from a remote address and extracting archives in a single instruction. As such, <src> can also be a URL or a compressed archive. If <src> is a URL, ADD downloads it and copies it into the image; if <src> is a local file recognized as a compressed archive, it is extracted into the <dest> path.
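Both behaviors can be sketched as follows, assuming the alpine base image. The archive name and the URL are placeholders, and note that a file fetched from a URL is stored as downloaded rather than extracted:

```dockerfile
FROM alpine
# Local tar archive from the build context: extracted into /opt/app/
ADD app.tar.gz /opt/app/
# Remote URL: downloaded to /tmp/data.tar.gz as-is, without extraction
ADD https://example.com/files/data.tar.gz /tmp/
```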

VOLUME:

VOLUME mount_point_1 mount_point_2 ...
VOLUME ["mount point 1", "mount point 2", ...]

The VOLUME instruction creates data volumes at the given mount points. Once a volume has been declared at build time, changes made to it by subsequent directives do not persist. In addition, mounting host directories in a Dockerfile or with docker build isn't possible because of portability concerns: there's no guarantee that the specified path exists on the host. Both syntax forms have the same effect; they differ only in how they are parsed. The second form is a JSON array, so characters such as the backslash (\) and the double quote (") must be escaped.
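The build-time behavior described above can be sketched like this, assuming the alpine base image and a hypothetical /data mount point. Treat it as an illustration of the classic builder; newer builders such as BuildKit may keep changes made after the declaration:

```dockerfile
FROM alpine
# Written before the volume is declared, so it becomes part of the image
RUN mkdir /data && echo "kept" > /data/before.txt
VOLUME /data
# Written after the declaration: not persisted, per the rule described above
RUN echo "lost" > /data/after.txt
```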

ONBUILD [Other directives]: ONBUILD allows you to postpone some instructions to later builds that happen in the derived image. For example, suppose we have the following two Dockerfiles:

--- baseimg.dck ---
FROM alpine
RUN apk add --no-cache git make
WORKDIR /usr/src/app
ONBUILD COPY . /usr/src/app/
ONBUILD RUN git submodule init \
       && git submodule update \
       && make
--- appimg.dck ---
FROM baseimg
EXPOSE 80
CMD ["/usr/src/app/entry"]

The instructions would then be evaluated in the following order when running docker build:

$ docker build -t baseimg -f baseimg.dck .
---
FROM alpine
RUN apk add --no-cache git make
WORKDIR /usr/src/app
---
$ docker build -t appimg -f appimg.dck .
---
COPY . /usr/src/app/
RUN git submodule init \
 && git submodule update \
 && make
EXPOSE 80
CMD ["/usr/src/app/entry"]
