Technology

Understanding the Dockerfile VOLUME Instruction

Docker volumes are used to store persistent data outside your containers. They allow config files, databases, and caches used by your application to outlive individual container instances.

Volumes can be mounted when you start containers with the docker run command’s -v flag. This can either reference a named volume or bind mount a host directory into the container’s filesystem.

It’s also possible to define volumes at image build time by using the VOLUME instruction in your Dockerfiles. This mechanism guarantees that containers started from the image will have persistent storage available. In this article you’ll learn how to use this instruction and the use cases where it makes sense.

Defining Volumes In Dockerfiles

The Dockerfile VOLUME instruction creates a volume mount point at a specified container path. A volume will be mounted from your Docker host’s filesystem each time a container starts.

The Dockerfile in the following example defines a volume at the /opt/app/data container path. New containers will automatically mount a volume to the directory.

FROM ubuntu:22.04
VOLUME /opt/app/data

Build your image so you can test the volume mount:

$ docker build -t volumes-test:latest .

Retrieve the list of existing volumes as a reference:

$ docker volume ls
DRIVER   VOLUME NAME
local    demo-volume

Now start a container using your test image:

$ docker run -it volume-test:latest
[email protected]:/#

Repeat the docker volume ls command to confirm a new volume has been created:

$ docker volume ls
DRIVER   VOLUME NAME
local    3198bf857fdcbb8758c5ec7049f2e31a40b79e329f756a68725d83e46976b7a8
local    demo-volume

Exit out of your test container’s shell so that the container stops:

[email protected]:/# exit
exit

The volume and its data will continue to persist:

$ docker volume ls
DRIVER   VOLUME NAME
local    3198bf857fdcbb8758c5ec7049f2e31a40b79e329f756a68725d83e46976b7a8
local    demo-volume

You can define multiple volumes in one instruction as a space-delimited string or a JSON array. Both of the following forms create and mount two unique volumes when containers start:

VOLUME /opt/app/data /opt/app/config
# OR
VOLUME ["/opt/app/data", "/opt/app/config"]

Populating Initial Volume Content

Volumes are automatically populated with content placed into the mount directory by previous image build steps:

FROM ubuntu:22.04
COPY default-config.yaml /opt/app/config/default-config.yaml
VOLUME /opt/app/config

This Dockerfile defines a volume that will be initialized with the existing default-config.yaml file. The container will be able to read /opt/app/config/default-config.yaml without having to check whether the file exists.

Changes to a volume’s content made after the VOLUME instruction will be discarded. In this example, the default-config.yaml file is still available after containers start because the rm command comes after /opt/app/config is marked as a volume.

FROM ubuntu:22.04
COPY default-config.yaml /opt/app/config/default-config.yaml
VOLUME /opt/app/config
RUN rm /opt/app/config/default-config.yaml

Overriding VOLUME Instructions When Starting a Container

Volumes created by the VOLUME instruction are automatically named with a long unique hash. It’s not possible to change their names so it can be difficult to identify which volumes are actively used by your containers.

You can prevent these volumes appearing by manually defining volumes on your containers with docker run -v as usual. The following command explicitly mounts a named volume to the container’s /opt/app/config directory, making the Dockerfile’s VOLUME instruction redundant.

$ docker run -it -v config:/opt/app/config volumes-test:latest

When Should You Use VOLUME Instructions?

VOLUME instructions can be helpful in situations where you want to enforce that persistence is used, such as in images that package a database server or file store. Using VOLUME instructions makes it easier to start containers without remembering the -v flags to apply.

VOLUME also serves as documentation of the container paths that store persistent data. Including these instructions in your Dockerfile allows anyone to determine where your container keeps its data, even if they’re unfamiliar with your application.

VOLUME Pitfalls

VOLUME isn’t without its drawbacks. Its biggest problem is how it interacts with image builds. Using an image with a VOLUME instruction as your build’s base image will behave unexpectedly if you change content within the volume mount point.

The gotcha from earlier still applies: the effects of commands after the VOLUME instruction will be discarded. As VOLUME will reside in the base image, everything in your own Dockerfile comes after the instruction and you’re unable to modify the directory’s default contents. Behind the scenes, starting the temporary container for the build will create a new volume on your host that gets destroyed at the end of the build. Changes will not be copied back to the output image.

Automatic volume mounting can be problematic in other situations too. Sometimes users might prefer to start a temporary container without any volumes, perhaps for evaluation or debugging purposes. VOLUME removes this possibility as it’s not possible to disable the automatic mounts. This causes many redundant volumes to accumulate on the host if containers that use the instruction are regularly started.

Summary

Dockerfile VOLUME instructions allow volume mounts to be defined at image build time. They guarantee that containers started from the image will have persistent data storage available, even if the user omits the docker run command’s -v flag.

This behavior can be useful for images where persistence is critical or many volumes are needed. However the VOLUME instruction also breaks some user expectations and introduces unique behaviors so it needs to be written with care. Providing a Docker Compose file that automatically creates required volumes is often a better solution.



File source

Tags
Show More

Related Articles

Back to top button
Close