Docker Certified Associate Exam Series (Part-7): Docker Engine Storage & Volumes

In the previous blog of this 7-part series, we discussed Docker Engine Security. This last blog dives into how storage is implemented in Docker and the components involved.

When you install Docker on a host, it automatically creates the directory /var/lib/docker, which is the default location for Docker's files and objects. The directory has subfolders such as aufs, containers, image, volumes, and more. Files related to images are stored in the image subfolder, while those related to containers are stored in the containers subfolder.
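
The storage root can differ from the default if the daemon was configured with a custom data directory, so it can be worth checking rather than assuming. Here is a minimal sketch, assuming the Docker CLI is installed and the daemon is reachable. The function name docker_root is made up for illustration and is not a Docker command.

```shell
# Print the directory where the Docker daemon keeps images, containers, and volumes.
# docker_root is a hypothetical helper name chosen for this example.
docker_root() {
  docker info --format '{{.DockerRootDir}}'
}
```

On a default installation, calling docker_root typically prints /var/lib/docker. Running sudo ls "$(docker_root)" then lists the subfolders mentioned above.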

Before we discuss drivers and volumes, let's first understand Docker's layered architecture.

Layered Architecture

Layered architecture is a software design pattern that divides an application into a set of layers, each with a specific responsibility. Each layer provides services to the layer above it and uses services provided by the layer below it. This separation of concerns makes the application more modular, easier to maintain and test, and more scalable.

Docker builds an image progressively using a layered architecture, with each Dockerfile instruction adding a layer. Consider the following Dockerfile, which is based on the Ubuntu image:

FROM ubuntu
RUN    apt-get update && apt-get -y install python python3-pip
RUN    pip install flask flask-mysql
COPY . /opt/source-code
ENTRYPOINT FLASK_APP=/opt/source-code/app.py flask run

Docker builds this image in five layers:

  1. The first layer is the base Ubuntu OS.
  2. The second layer installs the apt packages.
  3. The third layer installs the Python packages.
  4. The fourth layer copies the source code.
  5. The fifth layer sets the image's ENTRYPOINT.

Each layer stores only the changes from the layer before it, and Docker caches the layers so that later builds can reuse them, which makes image building faster.
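
To see the layers of an image that has already been built, the docker history command lists one row per layer. The sketch below wraps it in a small function. The name show_layers is hypothetical, and the image name you pass in is whatever tag you used when building.

```shell
# Show one row per layer of an image, newest layer first,
# together with the full instruction that created it.
# show_layers is a hypothetical helper; $1 is an image name or ID.
show_layers() {
  docker history --no-trunc "$1"
}
```

For example, show_layers my-flask-app (an assumed tag) would list the layers of the image built from the Dockerfile above, with the ENTRYPOINT row at the top and the base image rows at the bottom.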

To understand the advantage of a layered architecture, let us consider building a second Ubuntu image with the same base OS, application packages, and Python packages as the previous application. The only layers that change are the source code and application ENTRYPOINT. 

FROM ubuntu
RUN    apt-get update && apt-get -y install python python3-pip
RUN    pip install flask flask-mysql
COPY app2.py /opt/source-code/
ENTRYPOINT FLASK_APP=/opt/source-code/app2.py flask run

When building this image, Docker does not have to rebuild the first three layers, because they are already available in the build cache. This saves disk space and makes image building a lot faster. This is especially advantageous when you have to update your application's source code frequently.
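
One simple way to watch the cache at work is to build the same Dockerfile twice in a row. The following is a rough sketch, not a benchmark. The helper name build_twice and its arguments are assumptions made for this example.

```shell
# Build the same Dockerfile twice in a row. The second build has nothing new
# to do, so Docker should reuse the cached layers instead of running the steps again.
# build_twice is a hypothetical helper; $1 is the image tag, $2 the build context.
build_twice() {
  docker build -t "$1" "${2:-.}" &&
  docker build -t "$1" "${2:-.}"
}
```

Running build_twice my-flask-app . should finish the second build noticeably faster. With BuildKit, reused steps are typically marked CACHED in the output, while the legacy builder prints "Using cache".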

The image layers are read-only: once you have built your image, you can't change their contents except by initiating a new build. When you create a container based on this image, Docker adds a writable container layer on top, and all data written by the container goes there. This layer will contain log files, temporary files, and any other files created by the users of a container.

If you modify a file that belongs to an image layer from inside a container, Docker first copies that file into the writable container layer. Any changes you make are then written to this copy. This is known as the copy-on-write mechanism. The image therefore remains the same, while each container keeps its own changes.
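
The copy-on-write idea can be modeled with two ordinary directories. The sketch below is a toy model only: it does not use Docker or a real storage driver, and the names cow_read and cow_append are invented for illustration.

```shell
# Toy model of copy-on-write using two plain directories (no Docker involved).
# "image" stands in for a read-only image layer, "container" for the writable layer.
work=$(mktemp -d)
mkdir "$work/image" "$work/container"
echo 'print("v1")' > "$work/image/app.py"

# Read a file: prefer the container layer, fall back to the image layer.
cow_read() {
  if [ -f "$work/container/$1" ]; then
    cat "$work/container/$1"
  else
    cat "$work/image/$1"
  fi
}

# Append to a file: copy it up from the image layer first, then change only the copy.
cow_append() {
  if [ ! -f "$work/container/$1" ] && [ -f "$work/image/$1" ]; then
    cp "$work/image/$1" "$work/container/$1"
  fi
  echo "$2" >> "$work/container/$1"
}

cow_append app.py 'print("v2")'
cow_read app.py            # shows both lines, served from the container layer
cat "$work/image/app.py"   # still shows only the original line
```

On a real container, docker diff followed by the container name lists the files that were added, changed, or deleted in the container layer.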

Volumes

When a container is removed, every change made to its container layer is discarded with it. To make data persistent, we use volumes. To create a volume, use the command:

docker volume create data_volume

This command creates a subfolder named data_volume in the volumes folder of the /var/lib/docker directory. You can mount this volume into a container using the command:

docker run -v data_volume:/var/lib/mysql mysql

Now all data that the container writes to /var/lib/mysql is stored in this volume, and it persists even after the container exits. If you name a volume that does not exist yet, such as data_volume2, Docker creates it for your container straight from the command line:

docker run -e MYSQL_ROOT_PASSWORD=root -v data_volume2:/var/lib/mysql mysql

This is called volume mounting, where container data is stored in the default Docker directory. You can also store container data at any location on the Docker host through bind mounting. To store data in an external folder, run the command:

docker run -e MYSQL_ROOT_PASSWORD=root -v /data/mysql:/var/lib/mysql mysql

You can also use the newer --mount syntax to specify the bind mount. Unlike -v, --mount does not create the host directory if it is missing, so create it first:

mkdir -p /data/mysql

Then bind it using this command:

docker run -e MYSQL_ROOT_PASSWORD=root --mount type=bind,source=/data/mysql,target=/var/lib/mysql mysql
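
To check how a container's storage was actually wired up, you can ask Docker for the mount details. The two helpers below are a minimal sketch. The function names show_mounts and volume_path are hypothetical, and the container name you pass in is whatever name or ID docker ps reports.

```shell
# Print the mounts of a container as JSON, including each mount's Type
# ("volume" or "bind"), its Source path on the host, and its Destination
# inside the container.
# show_mounts is a hypothetical helper; $1 is a container name or ID.
show_mounts() {
  docker inspect --format '{{json .Mounts}}' "$1"
}

# Print the host directory that backs a named volume.
# volume_path is a hypothetical helper; $1 is the volume name.
volume_path() {
  docker volume inspect --format '{{.Mountpoint}}' "$1"
}
```

On a default installation, volume_path data_volume typically prints /var/lib/docker/volumes/data_volume/_data, while a bind mount shows the host folder you chose, such as /data/mysql, as its Source.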

Storage Drivers

Docker uses storage drivers to maintain the layered architecture and to perform storage operations such as copy-on-write. There are many storage drivers you can use, including AUFS, Btrfs, Device Mapper, overlay, and overlay2, among others. The choice of storage driver usually depends on the underlying operating system. For instance, Ubuntu has traditionally used AUFS, while other operating systems like Fedora or CentOS offer support for Device Mapper.

When possible, overlay2 is the recommended storage driver. When installing Docker for the first time, overlay2 is used by default. Previously, aufs was used by default when available, but this is no longer the case.

To check the current storage driver, use the docker info command and look for the Storage Driver line.
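
If you only need the driver name, for example in a script, the --format option of docker info can extract that single field. This is a small sketch, and the function name storage_driver is made up for the example.

```shell
# Print only the name of the storage driver used by the local Docker daemon.
# storage_driver is a hypothetical helper name chosen for this example.
storage_driver() {
  docker info --format '{{.Driver}}'
}
```

On many current Linux installations this prints overlay2, but the value depends on the host and its Docker configuration.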

Volume Operations

In this section, you will familiarize yourself with additional volume operations. To view the details of a volume in JSON format, use the command:

docker volume inspect data_volume

To list available volumes, run the command:

docker volume ls

To delete a volume that's not in use, run the command:

docker volume remove data_volume

To delete a volume that is in use, you first need to stop and remove the container that uses it, as shown below:

docker stop container-name 
docker rm container-name
docker volume remove data_volume
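
Before removing a volume, it can help to find out which containers still reference it. The sketch below uses the volume filter of docker ps. The helper name containers_using_volume is an assumption made for this example.

```shell
# List the names of all containers (running or stopped) that reference a volume.
# containers_using_volume is a hypothetical helper; $1 is the volume name.
containers_using_volume() {
  docker ps -a --filter "volume=$1" --format '{{.Names}}'
}
```

If containers_using_volume data_volume prints nothing, no container references the volume and it should be safe to remove.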

To remove all unused volumes:

docker volume prune

To mount a volume as read-only, specify the readonly option in the run command:

docker container run --mount source=data_vol1,destination=/var/www/html/index.html,readonly httpd
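
A quick way to confirm that a read-only mount behaves as expected is to try writing to it from a throwaway container. This is only a rough sketch under a few assumptions: the busybox image is used because it is small, /data is an arbitrary mount path, and check_readonly is a made-up helper name.

```shell
# Try to create a file inside a volume that is mounted read-only.
# Assumptions: the busybox image can be pulled, and /data is an arbitrary mount path.
# Any failure of the command, not only a read-only error, is reported as
# "read-only", so treat the result as a rough check.
# check_readonly is a hypothetical helper; $1 is the volume name.
check_readonly() {
  if docker run --rm --mount "source=$1,destination=/data,readonly" busybox touch /data/probe; then
    echo "writable"
  else
    echo "read-only"
  fi
}
```

Calling check_readonly data_vol1 should report read-only, because the touch command inside the container is refused by the read-only file system.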

To learn more about Docker volume operations, check out the Docker volume documentation.

At KodeKloud, we have a comprehensive Docker Certified Associate Exam preparation course. The course explains all Docker concepts included in the certification's curriculum. After each topic, you get interactive quizzes to help you internalize the concepts learned. At the end of the course, we have mock exams that will help familiarize you with the exam format, time management, and question types.

Research Questions & Conclusion

This concludes the Docker Engine Storage & Volumes chapter of the DCA certification exam.

Here is a quick quiz to help you assess your knowledge. Leave your answers in the comments below and tag us back. 

Quick Tip – Questions below may include a mix of DOMC and MCQ types.

1. Which component is responsible for performing all of these operations: maintaining the layered architecture, creating a writable layer, moving files across layers to enable copy-on-write, etc.?

[A] Namespaces

[B] LibContainer

[C] Storage drivers

[D] Control groups

2. Which among the below is a correct command to start a container named webapp with a volume named vol2, mounted to the destination directory /app?

[A] docker run -d --name webapp --mount source=vol2,target=/app httpd

[B] docker run -d --name webapp -v vol2:/app httpd

[C] docker run -d --name webapp --volume vol2:/app httpd

[D] docker run -d --name webapp --storage vol2:/app httpd

3. Which of the following are valid storage drivers supported by Docker?

[A] AUFS

[B] S3

[C] overlay2

[D] Device Mapper

4. By default, all files created inside a container are stored on a writable container layer.

[A] True

[B] False

5. What is the command to create a volume with the name my-vol?

[A] docker volume create my-vol

[B] docker create volume my-vol

[C] docker volume prune

[D] docker volume rm all

6. Which among the below is the correct command to start a container named webapp with the volume vol3, mounted to the destination directory /opt in read-only mode?

[A] docker run -d --name webapp --mount source=vol3,target=/opt,readonly httpd

[B] docker run -d --name webapp -v vol3:/opt:ro httpd

[C] docker run -d --name webapp -v vol3:/opt:readonly httpd

[D] docker run -d --name webapp --volume vol3:/opt:ro httpd

[E] docker run -d --name webapp --mount source=vol3,target=/opt,ro httpd

7. What is the command to remove unused volumes?

[A] docker container rm my-vol

[B] docker volume rm my-vol

[C] docker volume prune

[D] docker volume rm --all

8. The selection of the storage driver depends on the underlying OS being used.

[A] True

[B] False

By following this study guide through this part of the series, you have prepared yourself to handle all Docker Engine Storage questions and practical scenarios, and you are, of course, a step closer to passing the DCA certification test.