Kubernetes Terms: Pods, Containers, Nodes & Clusters
In DevOps, Kubernetes is a container orchestration tool. It is used to deploy & manage containerized applications in an automated way.
So, what does the term “containerized” mean?
A containerized application is an application that has been packaged as one or more containers. You can think of a container as a box that includes everything an application needs to run, such as the application code, libraries, and dependencies.
For example, say we have packaged a Node.js web application as a container. To run the application, the container would include:
- Node.js runtime
- Application code
- Other libraries and dependencies
To understand why we need containers, let’s consider a simple example.
Imagine that you have a web application written in Python. Let’s say you want to share the application with someone else. Or you want to deploy the application in a test or production environment.
To run your application, the other person or environment would need to have a specific version of Python installed. Also, they would need to have installed any dependencies your application requires. This can be difficult to manage, especially if you are dealing with multiple people or environments.
This is where containers come into the picture. Containers are packages that contain everything an application needs to run, including the application code, dependencies, libraries, and runtime environment.
With containers, you can package your entire application and its dependencies into a container image. The container image can then be run on any machine with a container runtime (a piece of software) installed. When a container image is run, the container runtime starts the container & executes the application code. This means that the other person or environment can easily run your application. They don’t have to worry about installing the specific version of Python or other dependencies.
That’s why we need containers. They make it easy to deploy applications on different platforms & environments. The applications run consistently and reliably, no matter what system they run on.
Stateless & Immutable
In Kubernetes, containers are usually treated as stateless & immutable.
- Stateless: A stateless container does not keep any data that must outlive it. It does not maintain information about its previous state or the data it has processed. Anything that has to persist is stored outside the container, for example in a volume or a database. This makes it easier for Kubernetes to create and destroy containers at any time. When a container is destroyed, no state is lost. And when a container is created, there's no old state that needs to be recovered.
- Immutable: In Kubernetes, a container is considered immutable. This means that the contents of the container image, such as the application code, libraries, and system files, are not modified once the image has been built. The only way to update the contents is to build a new image and deploy new containers in place of the old ones. This also makes some things easier for Kubernetes. For example, scaling up when there is high demand for the application. To scale up, Kubernetes just launches multiple instances of the same container, basically clones. You want all clones to work exactly the same way. Immutability ensures they all start from identical contents.
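The replace-instead-of-modify idea can be sketched in a few lines of Python. This is only an illustration, not how Kubernetes is implemented: the `ContainerImage` class, the `scale` helper, and the `shop-web` image name are all hypothetical.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import List

# Hypothetical model of a container image; frozen=True makes instances read-only.
@dataclass(frozen=True)
class ContainerImage:
    name: str
    tag: str

def scale(image: ContainerImage, replicas: int) -> List[ContainerImage]:
    """Return `replicas` references to the same immutable image (the "clones")."""
    return [image for _ in range(replicas)]

v1 = ContainerImage(name="shop-web", tag="1.0")

# Trying to change an existing image in place is rejected.
try:
    v1.tag = "1.1"
    modified_in_place = True
except FrozenInstanceError:
    modified_in_place = False

# An update means building a new image and deploying it in place of the old one.
v2 = replace(v1, tag="1.1")

# Scaling up launches several instances of the same image.
clones = scale(v2, 3)
print(modified_in_place)
print(all(clone == v2 for clone in clones))
```

Because every clone refers to the same unchanged image, there is nothing to keep in sync between them.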
One final but crucial point about containers: in Kubernetes, we don't deploy or manage them directly. Instead, we work with Pods. Kubernetes uses Pods to manage and schedule containers.
So, what are Pods?
Pods are a core building block in Kubernetes. They host and manage the containers that run our applications. Think of them as the house that containers live in.
A Pod can host a single container or multiple containers.
All the containers within a Pod are co-located, meaning they run on the same node (server). They are also co-scheduled, meaning they are scheduled to run on the same node at the same time. This arrangement is super useful when we have applications composed of multiple containers that need to communicate with each other. Or, when we want to share resources among the containers in the Pod.
For example, imagine you have a web application consisting of two containers: a frontend container that serves the user interface and a backend container that handles database queries. If they ran in separate Pods, these two containers could end up on separate nodes. And they would need to communicate over the network using different IP addresses.
However, when we deploy these two containers in the same pod, they share the pod's network namespace and can reach each other over localhost, without going through the cluster network. Basically, they're super close to one another, so communication between them typically has lower latency. Pods also make it easier to manage the application as a single unit. Additionally, both containers can mount the same storage volumes. This makes it easier for them to share and work with the same data.
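To make the shared network and storage more concrete, here is a minimal Python model of a two-container Pod. It is a sketch under assumptions: the class names, container names, ports, IP address, and volume name are invented for illustration, and real Pods are defined through the Kubernetes API rather than as Python objects.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Container:
    name: str
    port: int

@dataclass
class Pod:
    name: str
    ip: str  # one IP address shared by every container in the Pod
    containers: List[Container] = field(default_factory=list)
    volumes: List[str] = field(default_factory=list)  # volumes any container in the Pod can mount

    def _port_of(self, container_name: str) -> int:
        for container in self.containers:
            if container.name == container_name:
                return container.port
        raise KeyError(container_name)

    def local_address(self, container_name: str) -> str:
        """Address a container uses to reach another container in the same Pod."""
        return f"localhost:{self._port_of(container_name)}"

    def pod_address(self, container_name: str) -> str:
        """Address used from outside the Pod: the Pod IP plus the container's port."""
        return f"{self.ip}:{self._port_of(container_name)}"

# Hypothetical two-container Pod (all values are made up).
pod = Pod(
    name="shop",
    ip="10.244.1.5",
    containers=[Container("frontend", 8080), Container("backend", 5000)],
    volumes=["shared-data"],
)

print(pod.local_address("backend"))   # how the frontend reaches the backend
print(pod.pod_address("frontend"))    # how something outside the Pod reaches the frontend
```

The point of the model is that the IP address and the volume list belong to the Pod, not to the individual containers.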
All Pods share one important characteristic: they are impermanent in nature. This means that they can be created and destroyed as needed, and a lost Pod is replaced by a new one rather than repaired. This is how our applications continue to run and remain available even when nodes fail. For example, if a worker node running four pods fails, Kubernetes will create replacement pods on healthy nodes, provided the pods are managed by a controller such as a Deployment.
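The recovery in the four-pod example can be sketched as a small simulation. This is a deliberately simplified model, not Kubernetes code: the `replace_lost_pods` function, the node and pod names, and the "pick the node with the fewest pods" placement rule are all assumptions made for illustration.

```python
from typing import Dict, List

def replace_lost_pods(nodes: Dict[str, List[str]], failed: str, desired: int) -> Dict[str, List[str]]:
    """Drop the failed node, then create replacement pods until `desired` pods are running."""
    healthy = {name: list(pods) for name, pods in nodes.items() if name != failed}
    if not healthy:
        raise RuntimeError("no healthy nodes left")
    running = sum(len(pods) for pods in healthy.values())
    for i in range(desired - running):
        # Simplified placement: choose the healthy node that currently runs the fewest pods.
        target = min(healthy, key=lambda name: len(healthy[name]))
        healthy[target].append(f"replacement-{i + 1}")
    return healthy

# Hypothetical cluster: node-a runs all four pods and then fails.
cluster = {
    "node-a": ["web-1", "web-2", "web-3", "web-4"],
    "node-b": [],
    "node-c": [],
}

after_failure = replace_lost_pods(cluster, failed="node-a", desired=4)
for node, pods in after_failure.items():
    print(node, pods)
```

Note that the pods on the failed node are not moved. New pods are created to bring the running count back to the desired number.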
Now that we understand what Pods are, we need to understand what nodes are, because Pods run on nodes.
Nodes are the physical or virtual machines that are used to run pods. In Kubernetes, there are two distinct types of nodes: master nodes (called control plane nodes in current Kubernetes documentation) and worker nodes.
The master nodes host the control plane, which is responsible for managing the state of a Kubernetes cluster. All of the interconnected master and worker nodes make up a Kubernetes cluster. And this cluster is the platform we use to deploy and host containers.
Master nodes are basically the "brains" behind Kubernetes. But why do we need multiple master nodes? We could have just one, and our cluster would work just fine. But if that node fails, the control plane becomes unavailable. So having multiple master nodes helps with reliability. If one master node fails, the others can still do their job.
The control plane monitors containers and coordinates actions across the cluster. As users, we can communicate with a Kubernetes cluster by sending requests to the control plane.
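As a sketch of what such a request can look like, the Python snippet below builds the URL for listing Pods and picks the running ones out of the JSON response. The path `/api/v1/namespaces/{namespace}/pods` is the Kubernetes API endpoint for listing Pods in a namespace. The default base URL assumes you have started `kubectl proxy` locally (it listens on 127.0.0.1:8001 by default). The helper names and the sample data are made up for illustration.

```python
import json
from typing import List
from urllib.request import urlopen

def pods_url(base_url: str, namespace: str) -> str:
    """URL of the API endpoint that lists the Pods in one namespace."""
    return f"{base_url}/api/v1/namespaces/{namespace}/pods"

def running_pod_names(pod_list: dict) -> List[str]:
    """Extract the names of Pods whose phase is Running from a PodList response."""
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") == "Running"
    ]

def fetch_running_pods(base_url: str = "http://127.0.0.1:8001", namespace: str = "default") -> List[str]:
    """Send the request to a real cluster. Needs a reachable API; not called below."""
    with urlopen(pods_url(base_url, namespace)) as response:
        return running_pod_names(json.load(response))

# Trimmed-down sample in the shape of a PodList response (invented data).
sample_response = {
    "kind": "PodList",
    "items": [
        {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "web-2"}, "status": {"phase": "Pending"}},
        {"metadata": {"name": "web-3"}, "status": {"phase": "Running"}},
    ],
}

print(pods_url("http://127.0.0.1:8001", "default"))
print(running_pod_names(sample_response))
```

In day-to-day work, tools such as `kubectl` send requests like this on your behalf.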
Worker nodes are responsible for running containers. We normally don't interact with the worker nodes directly. Instead, we send instructions to the control plane. The control plane then delegates the task of creating and maintaining containers to the worker nodes.
A Kubernetes cluster is a group of nodes used to run containerized applications. It is composed of master node components and worker node components.
Master Node Components
A master node runs the following components:
- API Server: The API server provides the main entry point for interacting with the cluster. In simple terms, it is a web server that listens for incoming HTTP requests. It exposes the Kubernetes API that external clients use to communicate with the cluster. For example, an external client could use the Kubernetes API to get a list of all the running Pods.
- Scheduler: The scheduler is responsible for finding a node for each new Pod to run on. First, it filters out the nodes that are not eligible to run the Pod. Once it has a list of eligible nodes, it uses a scoring function. The scoring function determines the best node to run the Pod. The node with the highest score is selected as the target for the Pod.
- Controller Manager: One of the main functions of the Kubernetes controller manager is to maintain the desired state of the cluster. Let’s say a user wants 3 replicas (copies) of an application. The controller manager will then ensure that 3 copies of the application are running at all times. It does this by continuously comparing the current state with the desired state. If they don't match, it takes corrective action.
- etcd: etcd is a distributed key-value data store that stores the state of a Kubernetes cluster. It is strongly consistent, so clients read the latest committed data from the store.
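The scheduler's filter-then-score flow described above can be illustrated with a small Python sketch. The real scheduler uses many configurable filtering and scoring rules. Here the only filter is "enough free CPU and memory" and the score is simply the CPU left over after placement. Both are simplifying assumptions, as are the class names, node names, and numbers.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    name: str
    free_cpu_millis: int
    free_memory_mib: int

@dataclass
class PodRequest:
    cpu_millis: int
    memory_mib: int

def filter_nodes(nodes: List[Node], pod: PodRequest) -> List[Node]:
    """Step 1: keep only the nodes that have room for the Pod."""
    return [
        node for node in nodes
        if node.free_cpu_millis >= pod.cpu_millis and node.free_memory_mib >= pod.memory_mib
    ]

def score(node: Node, pod: PodRequest) -> int:
    """Step 2: rate an eligible node (toy rule: more leftover CPU is better)."""
    return node.free_cpu_millis - pod.cpu_millis

def pick_node(nodes: List[Node], pod: PodRequest) -> Optional[Node]:
    """Return the highest-scoring eligible node, or None if no node fits."""
    eligible = filter_nodes(nodes, pod)
    if not eligible:
        return None
    return max(eligible, key=lambda node: score(node, pod))

# Hypothetical cluster and Pod request.
nodes = [
    Node("node-a", free_cpu_millis=500, free_memory_mib=1024),   # too little CPU
    Node("node-b", free_cpu_millis=2000, free_memory_mib=512),   # too little memory
    Node("node-c", free_cpu_millis=1500, free_memory_mib=4096),
    Node("node-d", free_cpu_millis=3000, free_memory_mib=2048),
]
pod = PodRequest(cpu_millis=1000, memory_mib=1024)

chosen = pick_node(nodes, pod)
print([node.name for node in filter_nodes(nodes, pod)])
print(chosen.name if chosen else "no eligible node")
```

Keeping filtering and scoring as separate steps mirrors the description: filtering answers "can this node run the Pod at all?", while scoring answers "which of the remaining nodes is the best fit?".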
Worker Node Components
A worker node has the following components:
- Kubelet: Kubelet is the primary interface between the node and the Kubernetes control plane. It communicates with the API server to receive information about the pods that are assigned to the node. Then it takes action to ensure that pods are running and healthy. This includes tasks such as starting and stopping containers, managing their resource allocation, and monitoring their health.
- Kube-proxy: Kube-proxy maintains network rules on nodes. These rules determine how traffic addressed to a Service (a stable network address for a group of Pods) is forwarded to the Pods behind it. For example, a rule might specify that traffic sent to a Service's IP address and port is distributed across the Pods that back that Service. These rules allow Pods to be reached from inside the cluster and, depending on the Service type, from outside it.
- Container runtime: The container runtime is the software responsible for actually running the containers. It provides the infrastructure and functionality required to create and manage containers. Kubernetes supports container runtimes such as containerd and CRI-O.
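One kind of network rule that kube-proxy maintains maps a Service's stable address to the Pods behind it. The Python sketch below imitates that mapping. It is an illustration only: kube-proxy programs the node's networking rather than running Python, the `ServiceRules` class and all addresses are hypothetical, and round-robin is used here just to keep the output predictable (the actual backend selection depends on how kube-proxy is configured).

```python
from itertools import cycle
from typing import Dict, Iterator, List

class ServiceRules:
    """Toy table mapping a Service address to the Pod addresses that back it."""

    def __init__(self) -> None:
        self._backends: Dict[str, Iterator[str]] = {}

    def set_endpoints(self, service_addr: str, pod_addrs: List[str]) -> None:
        """Install or replace the rule for one Service."""
        if not pod_addrs:
            raise ValueError("a Service rule needs at least one Pod address")
        self._backends[service_addr] = cycle(pod_addrs)

    def route(self, service_addr: str) -> str:
        """Pick the Pod address that the next connection to the Service is sent to."""
        return next(self._backends[service_addr])

rules = ServiceRules()
# Hypothetical Service address and Pod addresses.
rules.set_endpoints("10.96.0.10:80", ["10.244.1.5:8080", "10.244.2.7:8080"])

routed = [rules.route("10.96.0.10:80") for _ in range(3)]
print(routed)
```

Clients only need the Service address. When Pods are replaced, the rule is updated with the new Pod addresses and clients keep using the same Service address.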
Pods, containers, nodes, and clusters form the foundation of Kubernetes. A strong understanding of these terms and concepts is key to understanding how Kubernetes works.
In the container orchestration space, Kubernetes has become the de facto standard. It was first released in 2014, and its adoption has been on the rise ever since. It is now widely used by small and large organizations across the world.
Plus, Kubernetes is open-source and supported by a large and active community. You can find many resources and support, like documentation, tutorials, and forums. This makes it easier to get started with Kubernetes.
To continue learning more about Kubernetes, consider taking a beginner-friendly course from KodeKloud: Kubernetes for the Absolute Beginner.