Kubernetes Tutorial for Beginners
So you've decided to learn Kubernetes. That's a wise choice: many companies use this platform, so knowing how to work with it can help a lot, career-wise.
But Kubernetes is quite large, complex, and extensible. There are literally thousands of things you can do with it. This makes it quite hard to learn, especially in the beginning. What do you start with? How do you continue after that? This is why we created learning paths at KodeKloud. For example, our Kubernetes learning path will guide you step-by-step, making your learning experience easy. And once you go through that, you'll know all the important things about Kubernetes.
But maybe you just want to get your feet wet. You want a quick intro to Kubernetes. You want to see what it can do. And you want to find out in 30 minutes or less. Well, this blog will help you with that. We'll be discussing the theory about the basic building blocks of Kubernetes. And we'll also go through some practical exercises.
We'll help you understand this software platform -- what it is, how it is used, what it can do.
What is Kubernetes?
The rise of microservices architecture changed the software industry. It solved a lot of problems with the old model: traditional monolithic applications. It helped speed up software delivery and increased the efficiency of technology companies.
However, this architecture also introduced new challenges. These become more visible when deploying applications at scale.
For example, it's hard to operate each service and ensure its availability. It's also hard to manage the network between thousands of small services. And it's no small feat to upgrade so many services either.
So it became increasingly necessary to have a platform that can tackle these problems and provide an easy way to manage microservices. And here's where Kubernetes comes into play.
Kubernetes is a platform for managing containerized microservices at scale. It allows you to have a large number of containers running and makes it easy to control them. You can manage deployment, configuration, networking, storage and any other application requirement.
Want to learn how Kubernetes works? Check out this video.
Kubernetes helps overcome the scalability challenges of microservices, which standalone container engines like Docker can’t solve on their own. However, these tools work hand-in-hand; they’re not replacements for each other by any means. In case you want to learn more than this blog can teach you, check out this Kubernetes course for absolute beginners.
Kubernetes runs as a cluster, where multiple nodes are connected to each other. These nodes are simply machines running a Linux operating system with some Kubernetes components installed. Each node in the cluster has the role of either a master or a worker.
The master is the control plane responsible for the management of the Kubernetes objects. It is where the interaction usually happens between the administrator and the cluster. Objects are resources that Kubernetes creates to handle a specific task related to the application. For example, a Service object enables network connectivity between Pods running inside the cluster.

The worker node is where the actual workloads run. It is the node responsible for running the containers. It communicates with the master to create, terminate, or report the health of containers.
Master node components
There are some Kubernetes components that must be installed on the node to act as a master:
- API server: This component receives the requests sent to the cluster. Requests can be to create a new object or query the state of current objects. It is the entry point for communicating with the cluster for any management tasks.
- etcd: This is a key-value database for the cluster. It contains the desired state configuration of the cluster.
- Scheduler: This component selects the best node to run a specific Pod. It watches for newly created Pods and decides the node it should start on based on some criteria.
- Controller manager: This component is responsible for starting higher-level controllers. These controllers implement control loops to monitor the state of the cluster and keep it in the desired state.
Worker node components
Worker nodes, on the other hand, have different components:
- kubelet: this is the main Kubernetes agent installed on all nodes. It manages containers created by Kubernetes and ensures they are healthy.
- kube-proxy: this component creates some networking rules on each node. These rules enable network communication to pods.
Image source: Kubernetes documentation
Kubernetes cluster installation
Now that we’ve covered the main components of a Kubernetes cluster, let’s discuss some of the installation options for Kubernetes.
1. Minikube: This is a local, single-node Kubernetes installation. It’s suitable for learning and development. Just a quick way to get a mini Kubernetes cluster on your personal computer.
You can find instructions for using minikube here.
2. kubeadm: This is the main tool for bootstrapping and initializing Kubernetes. It is used to manually deploy a Kubernetes cluster on existing infrastructure.
You can find more information here.
3. Automatic cluster installation: This method deploys a Kubernetes cluster using automation tools or scripts. It is a quick way to get a production-grade Kubernetes cluster ready.
One of the most used tools for this method is kubespray.
4. Managed clusters: suitable for people who want to use Kubernetes but don't want to set up and manage the cluster themselves. Cloud platforms usually offer this service. Simply put, they create and manage the cluster for you. You get access to the final product: a usable Kubernetes cluster. You don't have to care about the servers involved, software configuration, security upgrades, and so on. They do all of that for you.
The best-known managed cluster services are Amazon EKS, Google GKE, and Microsoft AKS.
5. Playgrounds: with one click, you can get access to a Kubernetes cluster in your browser. This is a suitable learning environment, where you can enter commands and experiment. You can find such playgrounds on KodeKloud.
Interacting with the cluster
After you have your cluster up and running, it’s time to start working with it.
Kubernetes exposes an HTTP REST API for clients to interact with. It is exposed through the API server component. However, users rarely use this API directly. They usually use a tool called kubectl.
Kubectl is a command-line tool used to run commands against the Kubernetes cluster. It allows you to create Kubernetes objects and monitor the cluster health and configuration.
For information on installing kubectl you can check the Kubernetes documentation.
After installing kubectl you can start interacting with the cluster through kubectl commands.
Kubernetes YAML manifests
The main task when interacting with the cluster is creating Kubernetes objects. Kubernetes objects are created using YAML files called manifests. These files act as a template that describe the object to be created. The YAML file is sent to the cluster using the kubectl command. Then the cluster creates this object with the specifications provided in the YAML file.
Let’s create a simple Pod using a YAML file. A Pod is the smallest unit of workload in Kubernetes. It runs one or more containers inside it which hold the application code. You can think of a Pod as an execution environment for the application.
Here's an example of a Pod definition yaml file:
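Here's a minimal Pod manifest of this shape. The name and label values below are illustrative choices; the image matches the nginx:1.14.2 container discussed in the field breakdown that follows.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
```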
Copy the above code snippet and paste it into a file named nginx.yaml.
The YAML file is divided into 4 sections:
1. apiVersion: Kubernetes defines each resource under a specific apiVersion or group. To create a resource you have to specify its apiVersion in the YAML file.
To get a list of available apiVersions use the command kubectl api-versions.
2. kind: This is the type of resource you want to create. Here we're creating a Pod.
To get a list of available resources use the command kubectl api-resources.
3. metadata: This is information that identifies the resource. You can set a name and labels for the resource here.
4. spec: This is the desired configuration for the resource. You can see here that this Pod is going to run a container from the nginx:1.14.2 image.
Now let's deploy that Pod to the cluster.
To deploy a resource to the cluster we use the command kubectl apply -f followed by the YAML file name.
If we check the Pods on the cluster now we'll see our Pod and its status.
We can get more details about our resources using the kubectl describe command.
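Assuming the manifest is saved as nginx.yaml and kubectl points at a running cluster, the steps above look like this:

```shell
# Deploy the Pod defined in the manifest
kubectl apply -f nginx.yaml

# List the Pods and their status
kubectl get pods

# Show detailed information about the Pod
kubectl describe pod nginx
```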
Now that we've covered the basics of creating an object using YAML files, let's explore some of the most important Kubernetes resources required to run an application.
Kubernetes' main goal is to run containerized applications. So it creates resources to satisfy these applications' requirements.
Let's explain the core Kubernetes resources: ReplicaSets, Deployments, Services, and Ingress.
Running an application on a single Pod is not that great in terms of availability. If this Pod fails the whole application will fail. So we usually want to keep multiple copies/replicas of the application running. This is where the ReplicaSet comes into play.
A ReplicaSet ensures that a specified number of replicas of a Pod are running. It keeps monitoring the current number of replicas and compares it with the desired number. If there's any difference, the ReplicaSet will automatically fix it.
For example, we create a ReplicaSet for a specific Pod. We tell the ReplicaSet that we want 3 replicas of this Pod. The ReplicaSet will then create 3 Pods to match this desired number. If at any given time one of the Pods fails the ReplicaSet will automatically detect this and replace it with a new one.
We can also use ReplicaSets to scale our application. We do this by increasing the desired number of replicas in the ReplicaSet. This will automatically adjust the number of running Pods of the application.
Let's inspect a YAML file for a Replicaset and deploy it to the cluster.
Create a file named frontend.yaml and paste the below code snippet to it.
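Here's one way such a manifest can look; the names, labels, and image below are chosen to be consistent with the kubectl describe output shown later in this section.

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
  labels:
    app: NewApp
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google_samples/gb-frontend:v3
```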
There are some fields here that we need to focus on:
1. replicas: This is the number of Pods that we need to be running at any given time.
2. selector: This field ties the ReplicaSet to its Pods. The ReplicaSet monitors the Pods that have the same label as the one specified in this selector field. Here, this ReplicaSet will monitor Pods with the label tier: frontend.
3. template: Here we specify the spec of the Pods to be created by this ReplicaSet. When we deploy this ReplicaSet, it will create 3 Pods from this template. And each time a Pod is recreated by the ReplicaSet, it will use this template.
As usual, we use the kubectl apply command to deploy our resource.
Checking the ReplicaSets in the cluster we can see the one we just created.
And we can also see the Pods that this ReplicaSet created:
Now let's try to delete one of those Pods and see what happens.
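In command form, the steps above might look like this (the Pod name passed to kubectl delete is whatever name kubectl get pods shows in your cluster):

```shell
kubectl apply -f frontend.yaml   # create the ReplicaSet
kubectl get rs                   # list ReplicaSets
kubectl get pods                 # list the Pods it created

kubectl delete pod <pod-name>    # delete one of the Pods
kubectl get pods                 # a replacement Pod appears
```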
You can see that a new Pod was automatically created by the ReplicaSet. frontend-cglf8 was deleted, but frontend-9vjzz quickly appeared and took its place. The ReplicaSet maintains the desired number of replicas.
You can also check this from the logged events of the ReplicaSet.
```
controlplane ~ ➜  kubectl describe rs frontend
Name:         frontend
Namespace:    default
Selector:     tier=frontend
Labels:       app=NewApp
Annotations:  <none>
Replicas:     3 current / 3 desired
Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  tier=frontend
  Containers:
   php-redis:
    Image:        gcr.io/google_samples/gb-frontend:v3
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age    From                   Message
  ----    ------            ----   ----                   -------
  Normal  SuccessfulCreate  6m11s  replicaset-controller  Created pod: frontend-r89l8
  Normal  SuccessfulCreate  6m10s  replicaset-controller  Created pod: frontend-ll99r
  Normal  SuccessfulCreate  6m10s  replicaset-controller  Created pod: frontend-cglf8
  Normal  SuccessfulCreate  3m2s   replicaset-controller  Created pod: frontend-9vjzz
```
This approach of simply recreating a new Pod to replace the destroyed one is perfect for stateless applications. Stateless applications don't store any data or configuration. They don't need persistent storage to keep their state. In other words, they don't need to "remember" anything. Hence, they can easily be terminated and replaced at any time.
Stateful applications on the other hand need some type of persistent storage to write data. This data can be used by other applications, clients, or processes. An example for this would be a database server or key-value store.
Stateful applications are typically not good candidates for ReplicaSets. They can be managed by another type of Kubernetes resource called a StatefulSet.
That's it for the basics of ReplicaSets. Now let's check another important Kubernetes resource, the Deployment.
Releasing new versions of an application is an important part of its lifecycle. Apps are updated periodically: each version introduces new features or fixes issues. And since Kubernetes hosts our application, it needs a way to manage rolling out new application versions. This is what Deployments do.
Deployments are high-level controllers that manage updates to Pods. This means that you set which version of your application you currently want. And then the Deployment takes care of ensuring this version is released.
Let's explain how this works in more detail.
When you create a Deployment, it automatically creates a ReplicaSet object. The number of replicas in this ReplicaSet is specified in the Deployment YAML file. Also, the template used to create the Pods will be specified in the Deployment YAML file. When this ReplicaSet is created it will start the desired number of Pods from the template.
Now this ReplicaSet and its Pods are managed by the Deployment. The Deployment will keep monitoring the Pod template for changes.
Now let's say you want to release a newer version of the application. You build your new container image with your new code. Then all that you have to do is replace the image specification in the Pod template, inside the deployment YAML file. When the Deployment detects this change, it will understand that new Pods need to be created with this newer version.
The Deployment will then create a new ReplicaSet. New Pods will now be created under this ReplicaSet and the old Pods will be terminated. The termination of old Pods and creation of new Pods is controlled by a Strategy.
Let's create a file, name it nginx-deployment.yaml, and add this content to it:
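A Deployment manifest along these lines fits the description that follows; the name nginx-deployment and the app: nginx labels are illustrative choices.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
```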
Here, the Deployment will create a ReplicaSet with 3 Pods. Each Pod runs a single container from the image nginx:1.14.2.
Let's deploy this to the cluster.
You can check the status of rolling out the new Pods with the rollout status command.
Listing the available Deployments we can see our Deployment there.
Now let's check if we have any ReplicaSets created.
We can see the ReplicaSet created by our Deployment here.
If we check the details of our Deployment we can also find the ReplicaSet there.
Let's check the available Pods.
These are the 3 Pods created by our Deployment.
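In command form, the sequence above is (assuming the Deployment is named nginx-deployment):

```shell
kubectl apply -f nginx-deployment.yaml             # deploy to the cluster
kubectl rollout status deployment/nginx-deployment # watch the rollout
kubectl get deployments                            # list Deployments
kubectl get rs                                     # the ReplicaSet it created
kubectl describe deployment nginx-deployment       # details, including the ReplicaSet
kubectl get pods                                   # the 3 Pods
```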
Now let's try to update the image in the Deployment to nginx:1.16.1.
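One way to do this, assuming the container in the Deployment is named nginx, is kubectl set image; editing the image field in the YAML file and re-applying it works too.

```shell
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
```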
If we check the available ReplicaSets now we can find a new one created.
This new ReplicaSet is now running the new version of the Nginx image inside its Pods. You can also see that the Pods in the old ReplicaSet are being terminated. This is how a new version of an application replaces an old version.
You can see the new image and new ReplicaSet in the Deployment details now.
The newly created Pods should be running now.
You can also check the history for a Deployment using the rollout history command.
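The checks above, as commands:

```shell
kubectl get rs                                      # old and new ReplicaSets
kubectl get pods                                    # Pods from the new ReplicaSet
kubectl describe deployment nginx-deployment        # new image and ReplicaSet
kubectl rollout history deployment/nginx-deployment # revision history
```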
Services and Ingress
After we've deployed our application we need to enable network access to it. When Pods run on Kubernetes they are automatically assigned an IP address. However, this IP doesn't persist when Pods are terminated and recreated. The IP changes. So we need an object to provide a stable IP for clients to connect to.
This is where the Service comes into play.
A Service is a Kubernetes resource that provides a stable IP to connect to. Traffic received on this IP is then forwarded to backend Pods. This way, Services enable reliable network connectivity to applications.
Network connectivity to Pods can be internal or external. Internal connections happen within the cluster, between Pods. External connections are received from clients outside the cluster. And for each type of connection there is a different type of Service.
Ingress, on the other hand, allows for more intelligent external access. It can make more complex routing decisions, using the hostname, path, or other application content encoded in the network traffic. This enables more flexibility in the routing decisions.
You can find more details about Services and Ingress here.
Now let's create a simple Service, of the type called ClusterIP. We'll create a file called frontend-service.yaml and add this content to it:
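A minimal manifest of this shape could look as follows; the port numbers and the tier: frontend selector label are illustrative choices.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: ClusterIP
  selector:
    tier: frontend
  ports:
  - port: 80
    targetPort: 80
```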
Here you can see the kind of the resource is Service. And the type of Service is ClusterIP.
targetPort is the port that the backend Pods listen for incoming connections on. The service will forward traffic to this port. Ultimately, network traffic will reach the application running inside the Pod.
port is the port that the Service itself will receive traffic on.

selector is what ties the Service to its backend Pods. The Service checks for Pods with matching labels and forwards the traffic to them.
Let's deploy this yaml file to the cluster.
Listing available Services we can see the frontend Service there.
Let's check the details of the Service.
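These steps, in command form:

```shell
kubectl apply -f frontend-service.yaml   # deploy the Service
kubectl get services                     # list Services
kubectl describe service frontend        # Service details, including Endpoints
```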
The important thing to notice here is the Endpoints. Endpoints are the backend Pods that this Service forwards traffic to. It shows <none> because we didn't deploy any Pods with a label that matches the Service selector.
Now let's create a Deployment for the backend Pods. We'll create a file called frontend-deployment.yaml and add this content to it:
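A Deployment of this shape works; the important part is that the Pod labels match the Service selector (tier: frontend here). The nginx image is an illustrative choice, any container listening on the Service's targetPort would do.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: nginx
        image: nginx:1.16.1
        ports:
        - containerPort: 80
```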
Let's add this Deployment into Kubernetes:
And we can check the details of the Service now:
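Those two steps as commands:

```shell
kubectl apply -f frontend-deployment.yaml  # create the backend Pods
kubectl describe service frontend          # Endpoints now lists the Pod IPs
```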
Notice the difference. Now we can see 3 Endpoints which are the 3 Pods created by the Deployment.
However, if we tried to connect to this Service on the node it would fail.
The reason for this is that ClusterIP Services are internal to the cluster. They can't be reached from outside it.
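To illustrate, trying to reach the Service through the node's address fails, because a ClusterIP Service has no entry point on the node (here <node-ip> is a placeholder for your node's address):

```shell
# Fails: nothing on the node listens on this port for the Service
curl http://<node-ip>:80
```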
For outside traffic to be able to enter through a node, we need a NodePort Service. Let's create a file called nodeport-frontend.yaml and add this content to it:
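A manifest of this shape fits the description below; the Service name and selector label are illustrative, while port 30008 matches the node port used later in this section.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend-nodeport
spec:
  type: NodePort
  selector:
    tier: frontend
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30008
```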
You can see here an additional nodePort field. This is the entry Port on the node. All traffic coming to the node, on that port, will be forwarded to the proper place. In this case, traffic will be forwarded to some Pod on port 80.
Checking the details of the NodePort Service:
Now trying to connect to the NodePort should succeed:
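For example, assuming the Service was named frontend-nodeport and with <node-ip> again standing in for a node's address:

```shell
kubectl describe service frontend-nodeport  # NodePort Service details

# Port 30008 is open on every node; traffic is forwarded to a Pod on port 80
curl http://<node-ip>:30008
```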
We've basically "entered" a node through port 30008. After that, the Service forwarded our network traffic to a Pod, on port 80. And just like that, from the outside world, we reached an internal Kubernetes Pod.
Connecting from a browser:
That's it for the basics of Services. Please check this blog if you want to know how to use the Ingress resource.
We hope this clears some of the concepts of Kubernetes and helps you get started with this amazing technology. If you want an easy way to learn about this platform, and get some experience, check out our Kubernetes learning path. It will help you get your Kubernetes knowledge to the next level.