Docker vs. containerd

A while ago, Kubernetes announced that it was deprecating Docker (its built-in Docker support, known as dockershim, was deprecated in Kubernetes 1.20 and removed in 1.24). "Deprecated" is basically a tech word for "don't rely on this anymore, it will soon be removed." So how will Kubernetes run containers without Docker? With containerd, of course! Yes, Docker was replaced with something else, called containerd. Or at least, that's the default migration path, from Docker to containerd. Kubernetes cluster administrators can choose something else if they want, such as CRI-O.

But wait, why remove Docker and replace it with this? We already know that Docker can do almost anything we dream of with containers. Why is another tool necessary? Is it better? What is the difference between Docker and containerd? Let's clear this up! Time for a short history lesson.

In the Beginning, There Was Docker…

We already know that Docker exploded in popularity, fairly quickly, just like Kubernetes did a while after. But why? What was so cool about this tool?

Containers existed way before Docker. But Docker made everything easier, with a rather simple approach. Not to say that it was simple for the developers to write this utility. But it was simple for users to manage containers with it. Want to build a container? Docker can build it. Want to download a container image? Docker can do that. Built a cool container and want to upload it to the company's server? Yes, Docker can do that too. Want to connect two containers so they can talk to each other? Yes, Docker can implement networking between multiple containers. Want to modify a container image? Want to execute a command inside a container? Want to look at container logs? Want to freeze a container in time, pause all the programs running there and resume activity later? Of course, Docker can do all of these things too.

So basically Docker is an "everything but the kitchen sink" utility. It gave us a way to do everything we want with containers, with a single tool, without needing to download additional programs. It simplified our user experience. Not to mention that the commands we need to use are pretty intuitive too. Commands such as

docker start

docker stop

are easy to understand at first glance.

So if Docker is so easy to use, what's the deal with containerd? Well, let's dig deeper.

Old Monolithic Docker

Docker was a pretty complex program that went through many, many changes during its lifetime. In the beginning, this was what is called a "monolithic" utility. "Monolithic" in this case means "one inseparable thing". In other words, it was a big program that could do a lot of stuff. But it was just one big program, not two, not 10. But, of course, even a monolithic program has many parts, in the form of application sections, libraries, or simply various pieces of code that deal with different types of activities. Some part of its code was responsible for pulling in container images. Another part was responsible for starting up containers, and so on.

To understand this better, we can think of a game like GTA V. This is a big, monolithic game. But some part of its code is responsible for how cars appear and move on the street. Another part is responsible for the weather. Another part is responsible for how people walk on the street and go through their routines. It's one big monolithic game that contains all of this stuff. We can't launch just the part of the game that is responsible for cars moving on the street. We have to launch the whole thing, containing all of those parts. That's no problem for a game, since we want to access the entire thing, with the complete world it simulates. But with Docker, that's not always what we want. It soon became obvious that it would be useful if we could somehow access only parts of it. Let's see why.

New Modular Docker

Everybody started to use containers, so Docker became more and more complex. When you have a complex system, breaking it up into smaller pieces can simplify things. For example, let's think about a command like this:

docker run --name webserver -p 80:80 -d nginx

This pulls in the "nginx" image and immediately starts a container that runs this Nginx application. This, in turn, gives us access to a web server. People can now connect to it on port 80 and see whatever web page we have there. Now let's try to think about what Docker, as a program, has to do here. First of all, it needs to have some part in its code that can understand our command:

docker run --name webserver -p 80:80 -d nginx

It must somehow "translate" this internally and know what the human wants to achieve here. That's the job of the Docker CLI, "Command Line Interface". After it understands what we want, some other part of its code needs to pull in the "nginx" container image. Next, another part of the code has to start that container and make it accessible on port 80. And this is where we get to the interesting bit, and, finally, understand what the deal with containerd is.
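To make the "translation" step more concrete, here is a minimal sketch in Go of how a command-line tool could turn those arguments into a structured request. This is not Docker's actual code. The runRequest type and the parseRun function are hypothetical names invented for this illustration, and only the three flags from our example command are handled.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// runRequest is a hypothetical structure describing what the user asked for.
// It is not a type taken from Docker's code base.
type runRequest struct {
	Name     string // value of --name
	Publish  string // value of -p, in host:container form
	Detached bool   // true when -d is present
	Image    string // first positional argument
}

// parseRun turns the arguments that follow "docker run" into a runRequest.
func parseRun(args []string) (runRequest, error) {
	fs := flag.NewFlagSet("run", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // errors are returned to the caller, not printed

	var req runRequest
	fs.StringVar(&req.Name, "name", "", "container name")
	fs.StringVar(&req.Publish, "p", "", "port mapping, host:container")
	fs.BoolVar(&req.Detached, "d", false, "run in the background")

	if err := fs.Parse(args); err != nil {
		return runRequest{}, err
	}
	if fs.NArg() < 1 {
		return runRequest{}, errors.New("missing image name")
	}
	req.Image = fs.Arg(0)
	return req, nil
}

func main() {
	req, err := parseRun([]string{"--name", "webserver", "-p", "80:80", "-d", "nginx"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", req)
}
```

Once the intention is captured in a structure like this, it can be handed to another part of the program, or to another program entirely.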

When Docker was monolithic, a single application translated our command, then pulled in the container image, started it, and made it accessible on port 80. Nowadays, that's not true anymore. In a very simplified form, this is what currently happens:

The Docker CLI utility accepts the command. Then it figures out what we want to do. After it understands our intention, it passes this intention to the Docker Daemon. This daemon is a separate program (from Docker CLI) that always runs in the background, waiting for instructions. After the Docker Daemon receives our desired action, it tells another app, called a container runtime, to pull in the container image. This container runtime is called containerd.

So we can now finally understand what containerd is. In tech terms, it is a container runtime. This is a sort of container manager. It takes care of things such as:

  • Downloading container images.
  • Uploading container images.
  • Setting up networking for these containers, so that they can communicate with each other, or the outside world (in practice, this work is often handed off to plugins or to the layer above, such as Docker).
  • Managing data and files stored inside these containers.
  • Starting, stopping, restarting containers.

containerd is called a high-level container runtime. For some actions, it makes use of yet another runtime, called a low-level container runtime. This low-level runtime is called runc. For example, when containerd needs to start a container, it tells runc to do that.

At the end of the day, a container is an application running in an isolated part of the system. For example, if we start a regular calculator app on Windows, this opens up a regular process, not isolated from the rest of the system. It can access any file our user account can access, and do almost anything it wants. But if this calculator ran in a container, it would only see files inside that container. And it would only be able to communicate with other processes inside that container. Its entire world is inside that container. As far as it's concerned, that is the "real system", so it's not able to see anything outside of that space. runc is the one responsible for starting a process in this special, isolated mode.

All of these (Docker CLI, Docker Daemon, containerd, runc) are entirely separate programs. It's pretty impressive how so many programs pass jobs along to each other just to start a container. And we even skipped some small steps along the way, to keep things simpler. But how did we get from monolithic Docker to this collection of entirely separate applications talking to each other?

Well, after years and years of work, Docker developers started to split up different sections of Docker's code. So one part of the code became containerd. Some other part became runc. But, why? First of all, this makes things simpler for developers. Now instead of digging through various files, trying to find the part of code responsible for starting containers, developers can just go directly to runc, which has a separate GitHub page. So, in the past, runc was just some section of Docker's code responsible for this job. But developers slowly extracted that code and made it into an entirely separate utility. But this wasn't just to make development easier.

We can imagine old Docker as some sort of iPhone. It's one inseparable phone, all of its parts tightly glued together. Sure, inside, there is a separate battery, a camera, a processor, and so on. But they're so tightly assembled and integrated with each other that we can't easily disassemble our iPhone and replace our old camera with a new, better camera. But it would be useful if we could do that, wouldn't it? Imagine we're unhappy with our old 12-megapixel camera, and we could just slide it off and replace it with a new 48-megapixel camera, as easily as we replace the batteries in our mouse. That would definitely be a nice thing. Well, we can't do it with an iPhone, but we can do it with the new modular Docker.

So we can now think of the new, modern Docker as an entire car, with all of its parts: the engine, the steering wheel, the pedals, and so on. And if we need the engine, we can easily extract it and move it into another system. Which is exactly what happened when Kubernetes needed such an engine. They basically said, "Hey, we don't need the entire car that is Docker. Let's just pull out its container runtime/engine, containerd, and install that into Kubernetes." If you want, you can read more about why Kubernetes did this in this blog post: "Kubernetes Removed Docker. What Happens Now?".

And this is actually the bigger reason why Docker was split into many smaller components: so that they can be freely moved around and plugged into other systems. This gives server administrators a lot of flexibility to build their Kubernetes infrastructure however they want, with pieces that work best for them, whether that means more performance, more security, or whatever is most important to them. And this is not limited to Kubernetes. Pieces like containerd can be inserted into whatever system we want. In fact, containerd can even be used directly, on our computer, if we want to. But most users won't want to use containerd directly, without going through Docker first. Why is that?

Docker for Humans, Containerd for Programs

Docker was written with human beings in mind. We can imagine it as a sort of translator that tells an entire factory, filled with robots, what the human wants to build or do. Docker CLI is the actual translator, and some of the other pieces in Docker are the robots in the factory. On the left side, we have the human, needing to do something with containers. In the middle, we have Docker CLI plus all of its other components. And, on the right, we have some action performed by Docker, such as building a container, pulling an image, or starting a container. So Docker is a middleman, accepting commands from humans and then producing a result.

Diagram showing how Docker CLI translates user commands and sends instructions to containerd

For example, for a command like

docker run --name webserver -p 80:80 -d nginx

this is how actions flow from one Docker component to the other, until, finally, the container starts:

Diagram showing how commands reach Docker CLI, instructions are sent to containerd, and runc finally starts a container

Again, for simplicity, we left some parts out, like the Docker Daemon. But in a nutshell, this is what happens after someone enters that command:

  1. Docker CLI understands what we want to do, and then sends instructions to containerd.
  2. containerd does its magic, downloads the nginx image if it's not available.
  3. Next, containerd tells runc to start this container.
  4. And we finally get our result: nginx running in a little isolated container box.
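The steps above can be sketched as a tiny Go simulation. Nothing here talks to a real containerd or runc. The highLevelRuntime and lowLevelRuntime types are hypothetical stand-ins, and the "pull" is just a map update, but the shape of the flow is the same: check for the image, fetch it if needed, then delegate the start to the low-level runtime.

```go
package main

import "fmt"

// lowLevelRuntime stands in for runc: it only knows how to start a process
// from an image that is already available locally.
type lowLevelRuntime struct{}

func (lowLevelRuntime) start(image string) string {
	return "runc: started isolated process from " + image
}

// highLevelRuntime stands in for containerd: it keeps track of local images
// and delegates the actual start to the low-level runtime.
type highLevelRuntime struct {
	local map[string]bool // images that were already downloaded
	low   lowLevelRuntime
}

// run returns a trace of the steps taken to get a container running.
func (h *highLevelRuntime) run(image string) []string {
	var trace []string
	if !h.local[image] {
		// Pretend download: a real runtime would contact a registry here.
		trace = append(trace, "containerd: pulled "+image)
		h.local[image] = true
	}
	trace = append(trace, "containerd: asked runc to start "+image)
	trace = append(trace, h.low.start(image))
	return trace
}

func main() {
	rt := &highLevelRuntime{local: map[string]bool{}}

	// First run: the image is missing, so it is pulled first.
	for _, step := range rt.run("nginx") {
		fmt.Println(step)
	}

	// Second run: the image is already local, so the pull step is skipped.
	for _, step := range rt.run("nginx") {
		fmt.Println(step)
	}
}
```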

It's easy to see that the Docker CLI is not necessarily required for this action. This means we don't really need Docker with all of its parts, such as the Docker CLI, Docker Daemon, and some of its other bits and pieces. However, we still need containerd and runc to start a container. So why not tell containerd directly about our intention? If we skip running Docker CLI, and the Docker Daemon, at least, we would use less memory on our system, right? It would be more efficient, that is true. In fact, it's one of the reasons why Kubernetes removed Docker and opted to use containerd directly. But that's a tradeoff that is useful on servers running hundreds of containers. For our personal computer, where we just run a few containers, test things out, this wouldn't make a noticeable difference. But if Kubernetes can skip the middleman that is Docker, and tell containerd directly about what it wants to do, containers can start up a bit faster. And half a second here, half a second there, with hundreds of containers, can add up and show noticeable improvements.

But keep in mind, Kubernetes is a program, and containerd is also a program. And programs can quickly talk to each other, even if the language they speak is complex. containerd was designed from the ground up to let other programs give it instructions. It receives instructions in a specialized language. These instructions are named API calls and they are sent through what is called an API (Application Programming Interface). These interfaces are basically doors through which API calls can be sent by one program, and received by another program at the other end. Of course, the API also establishes what kind of "language" these programs should use. The messages sent in API calls need to follow a certain format, so that the receiving program can understand them.

Diagram showing how client programs send API calls to containerd
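To illustrate what "following a certain format" means, here is a small, hedged Go sketch. The pullRequest message and its JSON encoding are made up for this example. containerd's real API is defined with gRPC and protocol buffers, which are less readable to humans, so JSON is used here only to keep the idea visible: the sender encodes a message in the agreed format, and the receiver rejects anything that does not match it.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// pullRequest is a made-up message format, used only for illustration.
// It is not part of containerd's real API.
type pullRequest struct {
	Action string `json:"action"`
	Image  string `json:"image"`
}

// encode is what a client program would do before sending the call.
func encode(req pullRequest) ([]byte, error) {
	return json.Marshal(req)
}

// handle is what the receiving program would do: decode the message and
// reject anything that does not follow the agreed format.
func handle(msg []byte) (string, error) {
	var req pullRequest
	if err := json.Unmarshal(msg, &req); err != nil {
		return "", fmt.Errorf("malformed message: %w", err)
	}
	if req.Action != "pull" || req.Image == "" {
		return "", errors.New("unsupported or incomplete request")
	}
	return "pulling " + req.Image, nil
}

func main() {
	msg, err := encode(pullRequest{Action: "pull", Image: "docker.io/library/nginx:latest"})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println("sent:", string(msg))

	reply, err := handle(msg)
	fmt.Println("reply:", reply, "error:", err)

	// A message written the way a human would phrase it is not understood.
	_, err = handle([]byte("please pull nginx"))
	fmt.Println("error:", err)
}
```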

It would be tedious for humans to send API calls every time they want to tell containerd to do something. But when developers write programs that should interact with containerd, they implement ways to send the correct API calls. So apps can efficiently communicate with each other through these APIs. Here is an example of a small program connecting to containerd and then sending it an instruction to download a container image:

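package main

import (
        "context"
        "log"

        // These import paths match the containerd 1.x Go client.
        // Newer major versions may use different module paths.
        "github.com/containerd/containerd"
        "github.com/containerd/containerd/namespaces"
)

func main() {
        if err := redisExample(); err != nil {
                log.Fatal(err)
        }
}

func redisExample() error {
        // Connect to the containerd daemon through its Unix socket.
        client, err := containerd.New("/run/containerd/containerd.sock")
        if err != nil {
                return err
        }
        defer client.Close()

        // containerd keeps resources in namespaces; this call works in one named "example".
        ctx := namespaces.WithNamespace(context.Background(), "example")

        // Download the image and unpack it so it is ready to be used.
        image, err := client.Pull(ctx, "docker.io/library/redis:alpine", containerd.WithPullUnpack)
        if err != nil {
                return err
        }
        log.Printf("Successfully pulled %s image\n", image.Name())

        return nil
}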

Source code extracted from this page: https://github.com/containerd/containerd/blob/main/docs/getting-started.md

Do we want to write such stuff just to pull in a container image? Of course not. So Docker, on the other hand, with its Docker CLI, is built to receive instructions from human beings. It's more "human-friendly", letting us do many things, with rather short commands that are easy to write and easy to remember.

Experimenting with containerd

But if we really want to experiment with containerd, we can do that, without needing to make complex API calls. If we do have Docker installed on our system, containerd is already installed too, since Docker needs it. And there are a few utilities we can use to speak to containerd directly. One method is through the ctr utility. For example, to tell containerd to download the nginx image, we would enter a command like this:

sudo ctr images pull docker.io/library/nginx:latest

To see what commands ctr supports, we enter this:

ctr

To get help about a certain subcommand, we just write "ctr subcommand_name" with no further parameters/instructions. For example, if we want to see how we can use the "images" subcommand, we can write:

ctr images

But this ctr command is more of a "shortcut", meant for simple interactions with containerd, in case someone needs to debug things or test some stuff. Imagine we are developers, and just implemented some cool new stuff into containerd. Now we want to test if the container image is downloaded faster with some optimizations we made. Sending an API call to containerd would be tedious. But with ctr, we can bypass having to write and send an API call. We write less, in the form of short commands, and test faster. ctr then does the heavy lifting and sends the correct API calls. So ctr is not really meant to be used as we use the Docker CLI. It may seem similar, but that is not its purpose. Plus, it doesn't really support all the things that we can do with Docker.

There is also a tool called nerdctl. This has to be downloaded and installed separately. nerdctl tries to mimic Docker CLI's syntax. So it's a way to write Docker-like commands, but without actually talking to Docker. Instead, it tells containerd directly about the actions we want to take.

Remember how we'd use this command to start an Nginx container?

docker run --name webserver -p 80:80 -d nginx

This of course goes through all of those steps, telling Docker CLI about what we want to do, which then goes to the Docker Daemon, and then finally reaches containerd at some point. With nerdctl, we can tell containerd directly to start our container, with a command like:

nerdctl run --name webserver -p 80:80 -d nginx

So we skip going through the Docker CLI and the Docker Daemon.

nerdctl has the added benefit that it can give us access to the newest features implemented in containerd. For example, we can work with encrypted container images, a rather new feature in 2022 that may eventually make its way into regular Docker commands too. However, some of these features are still experimental, and not necessarily safe for real-world workloads yet. So nerdctl, like ctr, isn't primarily geared toward end users. It's a tool aimed at developers or system administrators who want an easy way to test or debug containerd's features. That is, they can quickly test things with simple nerdctl commands, rather than complex API instructions, as we mentioned earlier when discussing the ctr utility.

So there we have it! Hopefully, this clears up the mystery about what Docker is, how it's built, and what containerd is.
