What Is a Kubernetes DaemonSet and How to Use It?
In Kubernetes, a Deployment is a higher-level abstraction built on top of ReplicaSets. In other words, a Deployment provides a simpler, higher-level interface for managing and scaling applications, while ReplicaSets are the lower-level building blocks that a Deployment uses to achieve this.
When you create a Deployment, you specify the number of replicas (copies) you want to run, the container image to use, and various other details about your Pods. Kubernetes then creates a ReplicaSet and schedules the specified number of replicas of your Pod on the nodes in your cluster.
The ReplicaSet created by the Deployment is responsible for ensuring that the specified number of replicas are running at all times. It is possible that some nodes in the cluster may not have any replicas of the Pod running on them. This is because the ReplicaSet will only ensure that the specified number of replicas are running across the entire cluster, regardless of which nodes they are running on.
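For contrast with the DaemonSet discussed in the rest of this article, here is a minimal sketch of a Deployment manifest. The name "web", the label "app: web", and the "nginx" image are assumptions chosen only for illustration. The point to notice is the "replicas" field: it fixes a cluster-wide count and says nothing about which nodes the Pods land on.

```yaml
# Hypothetical Deployment, shown only to illustrate the "replicas" field.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web              # assumed name
spec:
  replicas: 3            # total Pods across the cluster, not one per node
  selector:
    matchLabels:
      app: web           # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx     # assumed image
```

With three replicas on a three-node cluster, the scheduler may still place two Pods on one node and none on another.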
However, there may be cases where you want to ensure that a specific Pod runs on all nodes in the cluster. For example, you might want to run a logging agent on all nodes to collect log data, or run a node-level service like a network plugin or storage daemon. In such scenarios, you can use Kubernetes DaemonSets.
What is a Kubernetes DaemonSet?
A DaemonSet is a Kubernetes resource that ensures a specified Pod runs on all nodes or a specific subset of nodes in a cluster. DaemonSets are commonly used to deploy special programs that run in the background, performing tasks such as monitoring and logging.
For example, a log collector daemon gathers log data from all the other programs running on a node, while a monitoring agent tracks the node's performance and sends alerts if there are any issues.
For instance, assume you have a Kubernetes cluster with three nodes: node 1, node 2, and node 3, and you want to collect logs from all three. You can use a DaemonSet to ensure a replica of a logging Pod runs on all three nodes.
If you then add a new node, node 4, to the cluster, the DaemonSet will automatically schedule a replica of the logging Pod to run on node 4. Similarly, if you remove node 3 from the cluster, the DaemonSet will automatically terminate the replica of the logging Pod running on node 3.
Why Use a DaemonSet?
Some common use cases of DaemonSets are as follows:
- Logging and monitoring: DaemonSets are often used to ensure that a logging agent or a monitoring tool is running on every node in the cluster. This way, these agents or tools can collect data from each node and provide valuable insights into the health and performance of the cluster.
- Cluster storage: DaemonSets can be used to manage and maintain storage on every node in a cluster. For example, you might use a DaemonSet to run a distributed storage system such as Ceph to ensure that your storage infrastructure is highly available, scalable, and performant.
- Node resource monitoring: DaemonSets can also be used to monitor resource utilization on each node in a cluster. By running a per-node metrics agent, such as the Node Exporter used with Prometheus, you can collect data on CPU usage, memory usage, disk usage, and other resource metrics. This data can then be used to optimize resource allocation and identify performance issues.
Now that we understand what a DaemonSet is and its most common use cases, let's dive deeper into how to use it in your Kubernetes cluster.
Prerequisites
To follow along with the examples in the coming sections, you will need:
- A running Kubernetes cluster with at least two worker nodes.
- The kubectl command-line tool installed on your local computer. kubectl is a tool used to interact with Kubernetes clusters. It's required for deploying and managing resources on Kubernetes.
Note that all the commands and their outputs provided in the examples have been tested on a multi-node (1 master node and 2 worker nodes) Kubernetes cluster created using Minikube. If you have Minikube already installed, you can set up a multi-node cluster by following the instructions provided in this tutorial.
Creating a DaemonSet
A DaemonSet is created by submitting a DaemonSet YAML configuration file to the Kubernetes API server. The example below demonstrates how to run a fluentd logging agent on each node in a cluster.
We start by creating a file, "fluentd.yaml", describing the DaemonSet.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
A DaemonSet configuration file needs to have the following fields:
- apiVersion: The "apiVersion" field refers to the version of the Kubernetes API you're using to create this object.
- kind: The "kind" field specifies the type of object being created (a DaemonSet in this case).
- metadata: The "metadata" field provides information about the object being created, including the name of the object.
- spec: The "spec" field is where you provide the actual details of the DaemonSet, including the Pod template (.spec.template). The Pod template is a template of the Pod that the DaemonSet will be creating replicas of. In addition to the required fields for a normal Pod, a Pod template in a DaemonSet must also specify appropriate labels. These labels are used to identify the Pods that the DaemonSet is responsible for. You also need to specify a Pod selector in the DaemonSet's ".spec.selector" field. This selector is used to match the labels of the Pod template so that the DaemonSet knows which Pods it should be managing.
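To make the relationship between the selector and the Pod template labels concrete, here is a stripped-down sketch containing only the fields described above. The name "node-agent", the label "app: node-agent", and the "busybox" image with its "sleep" command are hypothetical placeholders, not part of the fluentd example.

```yaml
# Minimal DaemonSet skeleton; all names and the image are placeholders.
apiVersion: apps/v1          # API group/version that serves DaemonSet
kind: DaemonSet
metadata:
  name: node-agent           # hypothetical object name
spec:
  selector:
    matchLabels:
      app: node-agent        # (1) the DaemonSet manages Pods with this label
  template:
    metadata:
      labels:
        app: node-agent      # (2) must match (1)
    spec:
      containers:
      - name: agent
        image: busybox       # assumed image, for illustration only
        command: ["sleep", "3600"]
```

If the labels at (1) and (2) do not match, the API server rejects the DaemonSet when you submit it. Note also that there is no "replicas" field: the number of Pods follows from the number of eligible nodes.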
Now that you know how to write a valid DaemonSet configuration, you can use the "kubectl apply" command to submit the DaemonSet to the Kubernetes API:
kubectl apply -f fluentd.yaml
After the "fluentd" DaemonSet has been successfully submitted to the Kubernetes API, you can use the "kubectl describe" command to check its current state:
kubectl describe daemonset fluentd
Note that the screenshot above is a shortened version of the complete output.
The output shows that a "fluentd" Pod was successfully deployed on all three nodes of our cluster. You can confirm this by using the "kubectl get pods" command with the "-o wide" option to display the node that each "fluentd" Pod was assigned to.
kubectl get pods -l app=fluentd -o wide
Limiting DaemonSets to Specific Nodes
DaemonSets are most commonly used to run a Pod across every node in a Kubernetes cluster. However, there may be cases where you want to run a Pod on only a subset of nodes.
For example, if you have a workload that requires fast storage, you would want to deploy that workload only to the nodes that have fast storage available. In cases like these, you can use node labels to tag specific nodes that meet the requirements of the workload.
Add Labels to Nodes
You can add the desired set of labels to a subset of nodes using the "kubectl label" command. The following command adds the label "ssd=true" to the node daemonset-demo-m03:
kubectl label nodes daemonset-demo-m03 ssd="true"
Now you can filter the nodes that have the "ssd" label set to "true" using the "kubectl get nodes" command with the "--selector" flag.
kubectl get nodes --selector ssd="true"
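Once a node carries the label, a DaemonSet can be restricted to it with a "nodeSelector" in the Pod template. The following is a minimal sketch, assuming the "ssd=true" label applied above. The DaemonSet name "fluentd-ssd" and its "app" label are hypothetical and chosen only to avoid clashing with the earlier "fluentd" DaemonSet.

```yaml
# Sketch: a DaemonSet that runs only on nodes labeled ssd=true.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-ssd          # hypothetical name
spec:
  selector:
    matchLabels:
      app: fluentd-ssd
  template:
    metadata:
      labels:
        app: fluentd-ssd
    spec:
      nodeSelector:
        ssd: "true"          # quoted because label values are strings
      containers:
      - name: fluentd
        image: fluent/fluentd
```

With this in place, Pods should be scheduled only on nodes whose labels match the selector. If the label is later removed from a node, the DaemonSet controller removes its Pod from that node.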