In this post we explain the different Dockerfile instructions you can use when creating your container image. These are the commands you put in the Dockerfile.
The syntax for instructions and their arguments in a Dockerfile is:
# Comment
INSTRUCTION arguments
Instructions are not case-sensitive, but by convention they are written in uppercase to distinguish them from their arguments.
Dockerfile example
So before we get into each instruction, let me show you a Dockerfile example so we know what we are talking about:
FROM ubuntu
RUN apt-get update && apt-get -y install python3
RUN mkdir -p /data/myscript
WORKDIR /data/myscript
# app.py is expected to be in the build context
COPY app.py .
CMD ["python3", "app.py"]
There are a variety of instructions we can put in our Dockerfile. These include FROM, RUN, CMD, ENTRYPOINT, WORKDIR, USER, ONBUILD, LABEL, ARG, SHELL, HEALTHCHECK, EXPOSE, ENV, COPY, ADD and VOLUME. You can see the full list of available instructions in the official Dockerfile reference. We are going to describe the main ones below.
FROM
The FROM instruction initializes a new build stage and sets the Base Image for subsequent instructions. As such, a valid Dockerfile must start with a FROM instruction.
Syntax:
FROM <image> [AS <name>]
FROM <image>[:<tag>] [AS <name>]
FROM <image>[@<digest>] [AS <name>]
- ARG is the only instruction that may precede FROM in the Dockerfile.
- FROM can appear multiple times within a single Dockerfile to create multiple images or use one build stage as a dependency for another. Simply make a note of the last image ID output by the commit before each new FROM instruction. Each FROM instruction clears any state created by previous instructions.
- Optionally a name can be given to a new build stage by adding AS name to the FROM instruction. The name can be used in subsequent FROM and COPY --from=<name|index> instructions to refer to the image built in this stage.
- The tag or digest values are optional. If you omit either of them, the builder assumes a latest tag by default. The builder returns an error if it cannot find the tag value.
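To make the multi-stage idea more concrete, here is a minimal sketch of two build stages, where the second one copies a file out of the first. The ubuntu:22.04 tag, the stage name builder and the /out path are only illustrative choices, not something the FROM syntax requires.

```dockerfile
# Build stage: named "builder" so later stages can refer to it.
FROM ubuntu:22.04 AS builder
RUN mkdir -p /out && echo "hello from the build stage" > /out/message.txt

# Final stage: starts again from a fresh base image. Nothing from
# "builder" is carried over except what COPY --from copies in.
FROM ubuntu:22.04
COPY --from=builder /out/message.txt /message.txt
CMD ["cat", "/message.txt"]
```

Only the last stage ends up in the final image, which is why this pattern is often used to keep build tools out of the image you ship.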
RUN
RUN has 2 forms:
- RUN <command> (shell form; the command is run in a shell, which by default is /bin/sh -c on Linux or cmd /S /C on Windows)
- RUN ["executable", "param1", "param2"] (exec form)
The RUN instruction will execute any commands in a new layer on top of the current image and commit the results. The resulting committed image will be used for the next step in the Dockerfile.
Layering RUN instructions and generating commits conforms to the core concepts of Docker where commits are cheap and containers can be created from any point in an image’s history, much like source control.
The exec form makes it possible to avoid shell string munging, and to RUN commands using a base image that does not contain the specified shell executable.
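The difference between the two forms is easiest to see with a variable. The sketch below assumes an Ubuntu base image (so both /bin/sh and /bin/bash are available); the echo commands are just placeholders.

```dockerfile
FROM ubuntu:22.04
# Shell form: the line is passed to /bin/sh -c, so the shell expands $HOME.
RUN echo "home is $HOME"
# Exec form: no shell is involved, so echo receives the literal text $HOME.
RUN ["echo", "home is $HOME"]
# Exec form that starts a specific shell explicitly to get expansion back.
RUN ["/bin/bash", "-c", "echo home is $HOME"]
```

Note that the exec form is parsed as a JSON array, so it needs double quotes around each element.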
CMD
The CMD Dockerfile instruction specifies the command to run when a container is launched. It is similar to the RUN instruction, but instead of running the command while the image is being built, it sets the command to run when the container starts. This is much like specifying a command when launching a container with the docker run command, for example:
docker run -i -t ubuntu /bin/bash
The equivalent in the Dockerfile is:
CMD ["/bin/bash"]
You can also specify parameters to the command, like so:
CMD ["/bin/bash", "-l"]
Here we’re passing the -l flag to the /bin/bash command.
You’ll note that the command is contained in an array. This tells Docker to run the command as-is, without a shell. You can also specify the CMD instruction without an array, in which case Docker will prepend /bin/sh -c to the command.
This may result in unexpected behavior when the command is executed. As a result, it is recommended that you always use the array syntax.
Lastly, it’s important to understand that we can override the CMD instruction using the docker run command. If we specify a CMD in our Dockerfile and also pass a command on the docker run command line, the command line overrides the Dockerfile’s CMD instruction.
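Here is a small sketch of that override behaviour. The image name cmd-demo is hypothetical, and the docker commands are shown as comments because they are typed in a terminal, not in the Dockerfile.

```dockerfile
FROM ubuntu:22.04
# Default command, written in the recommended array form.
CMD ["/bin/bash", "-l"]

# Build and run:
#   docker build -t cmd-demo .
#   docker run -i -t cmd-demo             (runs /bin/bash -l)
#   docker run -i -t cmd-demo /bin/ls /   (runs /bin/ls / instead of the CMD)
```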
ENTRYPOINT
ENTRYPOINT has two forms:
- ENTRYPOINT ["executable", "param1", "param2"] (exec form, preferred)
- ENTRYPOINT command param1 param2 (shell form)
An ENTRYPOINT allows you to configure a container that will run as an executable.
For example, the following will start nginx with its default content, listening on port 80:
docker run -i -t --rm -p 80:80 nginx
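ENTRYPOINT looks similar to CMD, because it also allows you to specify a command with parameters. The difference is that the ENTRYPOINT command and its parameters are not replaced when the container is run with command-line arguments. With the exec form, those arguments are appended to the ENTRYPOINT and replace any defaults given by CMD. The ENTRYPOINT itself can only be changed with the --entrypoint flag of docker run.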
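A common way to use this is to combine ENTRYPOINT with CMD: ENTRYPOINT fixes the executable and CMD supplies default arguments. The following is a minimal sketch; the image name entrypoint-demo is hypothetical.

```dockerfile
FROM ubuntu:22.04
# The executable that always runs when the container starts.
ENTRYPOINT ["/bin/echo", "Hello"]
# Default arguments appended to the ENTRYPOINT; docker run arguments replace them.
CMD ["world"]

#   docker build -t entrypoint-demo .
#   docker run entrypoint-demo           prints: Hello world
#   docker run entrypoint-demo Docker    prints: Hello Docker
```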
WORKDIR
WORKDIR /path/to/workdir
The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. If the WORKDIR doesn’t exist, it will be created even if it’s not used in any subsequent Dockerfile instruction.
The WORKDIR instruction can be used multiple times in a Dockerfile. If a relative path is provided, it will be relative to the path of the previous WORKDIR instruction. For example:
WORKDIR /a
WORKDIR b
WORKDIR c
RUN pwd
The output of the final pwd command in this Dockerfile is /a/b/c.
USER
USER <user>[:<group>] or
USER <UID>[:<GID>]
The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use when running the image and for any RUN, CMD and ENTRYPOINT instructions that follow it in the Dockerfile.
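As a rough sketch of how USER is typically used, the example below creates an unprivileged account and then switches to it. The account name appuser is hypothetical, and useradd is assumed to be available, as it is in the Ubuntu base image.

```dockerfile
FROM ubuntu:22.04
# Still root at this point, so we can create the account.
RUN useradd --create-home appuser
# Every RUN, CMD and ENTRYPOINT below this line runs as appuser.
USER appuser
RUN whoami
CMD ["id"]
```

Anything that needs root, such as installing packages, has to come before the USER line.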
ONBUILD
ONBUILD [INSTRUCTION]
The ONBUILD instruction adds to the image a trigger instruction to be executed at a later time, when the image is used as the base for another build. The trigger will be executed in the context of the downstream build, as if it had been inserted immediately after the FROM instruction in the downstream Dockerfile.
Almost any build instruction can be registered as a trigger. The exceptions are FROM, MAINTAINER and ONBUILD itself.
This is useful if you are building an image which will be used as a base to build other images, for example an application build environment or a daemon which may be customized with user-specific configuration.
For example, if your image is a reusable Python application builder, it will require application source code to be added in a particular directory, and it might require a build script to be called after that. An example of the ONBUILD instruction is shown below:
ONBUILD ADD . /app/src
ONBUILD RUN /usr/local/bin/python-build --dir /app/src
LABEL
LABEL <key>=<value> <key>=<value> <key>=<value> ...
The LABEL Dockerfile instruction adds metadata to an image. A LABEL is a key-value pair. To include spaces within a LABEL value, use quotes and backslashes as you would in command-line parsing. A few usage examples:
LABEL "com.example.vendor"="ACME Incorporated"
LABEL com.example.label-with-value="foo"
LABEL version="1.0"
ARG
ARG <name>[=<default value>]
The ARG instruction defines a variable that users can pass at build-time to the builder with the docker build command using the --build-arg <varname>=<value> flag. A Dockerfile may include one or more ARG instructions.
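A minimal sketch of a Dockerfile that uses ARG could look like this. The variable names UBUNTU_TAG and build_env and their default values are made up for illustration.

```dockerfile
# An ARG placed before FROM can be used in the FROM line itself.
ARG UBUNTU_TAG=22.04
FROM ubuntu:${UBUNTU_TAG}
# An ARG placed after FROM belongs to this build stage; "dev" is its default.
ARG build_env=dev
RUN echo "building for ${build_env}"

# Override the default at build time:
#   docker build --build-arg build_env=prod .
```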
SHELL
SHELL ["executable", "parameters"]
The SHELL instruction allows the default shell used for the shell form of commands to be overridden. The default shell on Linux is ["/bin/sh", "-c"], and on Windows is ["cmd", "/S", "/C"]. The SHELL instruction must be written in JSON form in a Dockerfile.
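As an illustration, the sketch below switches the shell from /bin/sh to /bin/bash partway through a build. It assumes a base image that ships bash, such as Ubuntu.

```dockerfile
FROM ubuntu:22.04
# Runs with the default shell: /bin/sh -c
RUN echo "default shell"
# Switch the shell used by the shell form of the instructions that follow.
SHELL ["/bin/bash", "-c"]
# Runs with /bin/bash -c, so bash-specific variables are available.
RUN echo "bash version: ${BASH_VERSION}"
```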
HEALTHCHECK
The HEALTHCHECK instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. HEALTHCHECK is quite a powerful instruction, so we will go into further detail in our next post.
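Until then, here is a minimal sketch of what a health check can look like for a web server. It assumes that curl is available inside the image and that the server answers on port 80; the interval, timeout and retry values are arbitrary.

```dockerfile
FROM nginx
# Every 30 seconds, request the home page. If curl fails or takes longer
# than 3 seconds, the check counts as failed; after 3 consecutive
# failures the container is marked unhealthy.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost/ || exit 1
```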
EXPOSE
EXPOSE <port> [<port>/<protocol>...]
The EXPOSE instruction tells Docker that the container listens on the specified network ports at runtime. Default is TCP if the protocol is not specified.
The EXPOSE instruction does not actually publish the port. It functions as a type of documentation between the person who builds the image and the person who runs the container, about which ports are intended to be published. To actually publish the port when running the container, use the -p flag on docker run to publish and map one or more ports, or the -P flag to publish all exposed ports and map them to high-order ports.
By default, EXPOSE assumes TCP. You can also specify UDP:
EXPOSE 80/udp
To expose on both TCP and UDP, include two lines:
EXPOSE 80/tcp
EXPOSE 80/udp
In this case, if you use -P with docker run, the port will be exposed once for TCP and once for UDP. Remember that -P uses an ephemeral high-order port on the host, so the port will not be the same for TCP and UDP.
Regardless of the EXPOSE settings, you can override them at runtime by using the -p flag. For example:
docker run -p 80:80/tcp -p 80:80/udp ...
ENV
ENV <key> <value>
ENV <key>=<value> ...
The ENV instruction sets the environment variable <key> to the value <value>. This value will be in the environment for all subsequent instructions in the build stage and can be replaced inline in many as well.
The ENV instruction has two forms. The first form, ENV <key> <value>, will set a single variable to a value. The entire string after the first space will be treated as the <value>, including whitespace characters. The value will be interpreted for other environment variables, so quote characters will be removed if they are not escaped.
The second form, ENV <key>=<value> ..., allows for multiple variables to be set at one time. Notice that the second form uses the equals sign (=) in the syntax, while the first form does not. Like command line parsing, quotes and backslashes can be used to include spaces within values.
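The sketch below shows both forms side by side and how the variables can be used by later instructions. The variable names and values are invented for the example. Recent versions of the builder may print a warning for the first form, so the second form is generally the safer choice.

```dockerfile
FROM ubuntu:22.04
# Second form: several variables in one instruction; quotes keep the space.
ENV APP_NAME="My App" APP_PORT=8080
# First form: everything after the first space is the value.
ENV APP_HOME /opt/app
# The variables are available to later instructions in the build stage.
WORKDIR ${APP_HOME}
RUN echo "$APP_NAME listens on $APP_PORT"
```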
COPY
COPY has two forms:
- COPY [--chown=<user>:<group>] <src>... <dest>
- COPY [--chown=<user>:<group>] ["<src>",... "<dest>"] (this form is required for paths containing whitespace)
The COPY instruction copies new files or directories from <src> and adds them to the filesystem of the container at the path <dest>.
Multiple <src> resources may be specified but the paths of files and directories will be interpreted as relative to the source of the context of the build.
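The following example shows the COPY syntax. Comments must be on their own line, because a # after an instruction is treated as part of its arguments:
# adds all files starting with "hom"
COPY hom* /mydir/
# ? is replaced with any single character, e.g., "home.txt"
COPY hom?.txt /mydir/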
ADD
ADD has two forms:
- ADD [--chown=<user>:<group>] <src>... <dest>
- ADD [--chown=<user>:<group>] ["<src>",... "<dest>"] (this form is required for paths containing whitespace)
The ADD instruction copies new files, directories or remote file URLs from <src> and adds them to the filesystem of the image at the path <dest>.
We can see an example below:
# adds all files starting with "hom"
ADD hom* /mydir/
# ? is replaced with any single character, e.g., "home.txt"
ADD hom?.txt /mydir/
Multiple <src> resources may be specified but if they are files or directories, their paths are interpreted as relative to the source of the context of the build.
At first glance you may notice that COPY and ADD seem to perform the same operation. However, ADD also supports two additional kinds of source. First, you can use a URL instead of a local file or directory. Second, if the source is a local tar archive, ADD extracts it directly into the destination.
A valid use case for ADD is when you want to extract a local tar file into a specific directory in your Docker image.
If you’re copying in local files to your Docker image, always use COPY because it’s more explicit.
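Put together, the guideline could look like the sketch below. Both config.json and rootfs.tar.gz are hypothetical files that would need to exist in the build context, and the destination paths are only examples.

```dockerfile
FROM ubuntu:22.04
# COPY for plain local files: the file is copied as it is.
COPY config.json /etc/myapp/config.json
# ADD when a local tar archive should be unpacked into the destination directory.
ADD rootfs.tar.gz /opt/myapp/
```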
VOLUME
The VOLUME instruction creates a mount point with the specified name and marks it as holding externally mounted volumes from the native host or other containers. The value can be a JSON array, VOLUME ["/var/log/"], or a plain string with multiple arguments, such as VOLUME /var/log or VOLUME /var/log /var/db.
A volume is a specially designated directory within one or more containers that bypasses the Union File System to provide several useful features for persistent or shared data:
- Volumes can be shared and reused between containers.
- A container doesn’t have to be running to share its volumes.
- Changes to a volume are made directly.
- Changes to a volume will not be included when you update an image.
- Volumes persist until no containers use them.
The syntax for the VOLUME instruction is actually quite simple:
VOLUME ["/data"]
VOLUME is the last Dockerfile instruction that we will describe. We covered the main ones, but there are more instructions that you can check in the Dockerfile instructions reference.