Kubernetes v1.28 Release

Hey there! Kubernetes has just rolled out its new version, 1.28. It's packed with cool new stuff, better features, and some fixes here and there. But what's really special about this update? It's the story behind it.

They've named this release 'Planternetes'. It's a fun way to show how many different people, from all walks of life, came together to make this version of Kubernetes. Whether they're tech pros, busy parents, or students just diving into the world of tech, everyone pitched in. It's like a big team project where everyone has something valuable to add.

So, with 1.28, it's not just about the techy side of things. It's a shoutout to all the awesome folks who worked hard to make something that's used by so many people around the world. Way to go, team!

Kubernetes presents its second release for 2023, encompassing a total of 45 enhancements. Here's a breakdown:

19 are new or improved features in their preliminary phase. (alpha)
14 have progressed to beta and are enabled by default from this release onward. (beta)
12 have graduated to stable after rigorous testing. (stable)

Kubernetes adheres to a systematic progression for each enhancement, ensuring quality and reliability. For a comprehensive understanding of this methodology, we recommend viewing our informative video titled "Kubernetes Enhancement Proposals".

What’s New - A 20,000-Foot View

Let's delve into a comprehensive overview of the Kubernetes 1.28 release. Each iteration of Kubernetes brings forth a range of enhancements, typically segmented into Features, API modifications, Documentation updates, Deprecations, and addressed Bugs/Regressions. The 1.28 release is no exception, boasting a substantial list of improvements as detailed in the release notes. Here, we spotlight some of the most significant enhancements from this list for a clearer understanding.

API Awareness of Sidecar Containers: Introduced in its alpha phase, this enhancement ensures that the API is more attuned to sidecar containers, optimizing their operations.
Recovery from Non-Graceful Node Shutdown: This feature, now in its General Availability (GA) phase, facilitates efficient recovery when a node shuts down unexpectedly.
CustomResourceDefinition (CRD) Validation Rules: The validation rules for CRD have undergone improvements, ensuring more accurate and efficient custom resource definitions.
Automatic StorageClass Assignment: Kubernetes now retroactively assigns a default StorageClass, streamlining storage operations.
SelfSubjectReview API Promotion: The SelfSubjectReview API has been promoted to General Availability, indicating its stability and readiness for widespread use.
Backdating of Kubeadm CA Certificates: This enhancement allows for the generation of kubeadm CA certificates with backdated timestamps.
Pods with Volumes and User Namespaces: Users can now deploy pods that utilize both volumes and user namespaces, expanding the range of pod configurations.
Kubernetes built with Go 1.20.5: In its recent updates, Kubernetes has been built using Go version 1.20.5. This integration ensures that Kubernetes benefits from the latest optimizations, security patches, and features offered by this specific version of Go.
Kubeadm Config Validate Command: A new command to validate the kubeadm configuration, ensuring that users can verify their configurations with ease.

These enhancements, among others, signify Kubernetes' commitment to continuous improvement and user-centric development.

Here's KodeKloud's video on this release. Take a look!

Highlighting the Foremost Enhancement

The enhancement that has garnered the most attention and discussion within the community is prominently placed at the top of our list.

API awareness of sidecar containers, now in alpha

Why We Need API Awareness for Sidecar Containers

In the world of Kubernetes, think of sidecar containers as helpful buddies to the main app in a pod. They add extra tools like logging or monitoring.


But here's the thing: in the past, Kubernetes saw all containers in a pod as the same, not knowing which one was the main app and which ones were the helpers. This could cause some hiccups when starting or stopping the pod because the order matters for some apps. So, having Kubernetes recognize and treat sidecar containers differently is a smart move to make things run smoother.

Problems We Ran Into Without Sidecar Awareness

Before Kubernetes knew the difference between main apps and their helper sidecars:

  • Sometimes, the helper sidecars started after the main app. This meant we missed out on some early logs or checks.
  • When shutting down, if the main app stopped first, the helper might keep running on its own. This was like leaving the lights on in an empty room - a waste of energy and sometimes causing mistakes.
  • Since there wasn't a clear way to tell Kubernetes which containers were sidecars, people came up with their own fixes. But these could be hit-or-miss and sometimes caused more problems.

The Upgrade with Sidecar Awareness

So, here's the cool update in Kubernetes v1.28:

Now, when you set up a pod, you can clearly tell Kubernetes which containers are the main apps and which ones are the helpful sidecars. What does this mean?

  • Kubernetes will make sure to start the sidecar helpers first. So, by the time the main app kicks off, all the tools it needs (like logging) are ready to go.
  • When it's time to shut everything down, Kubernetes will first stop the main app but let the sidecar finish its job, like sending out any last-minute logs.
  • Best of all, there's now a clear, standard way to set up and manage these sidecars. No more guessing or using tricky workarounds!

In short, this update makes everything smoother and more reliable when using sidecars with your main app.

New Feature for Init Containers: The Restart Policy

Alright, let's break this down in simpler terms:

Kubernetes has a new trick up its sleeve for init containers, called the "restartPolicy". This helps when an init container is also playing the role of a helper, or what we call a sidecar. Here's what it does:

  • When you have init containers set to "Always" for restartPolicy, they start in the order you list them, just like other init containers.
  • The cool part? Instead of waiting for the helper init container to finish its job, Kubernetes gets the main containers of the pod going as soon as the helper starts.
  • A helper container is considered "ready to go" once it passes its startup probe (if one is defined) and its postStart hook has completed. If there's no startup probe, Kubernetes just waits for those starting tasks to finish.
  • For init containers that should run once to completion before the main app starts, simply leave restartPolicy unset; setting it to "Always" is exactly what turns an init container into a sidecar.
  • Here's a bonus: helper containers won't stop a Pod from finishing its job. Once all the main containers are done, the helpers will be stopped too.
  • And one more thing: even if a Pod is set to not restart or only restart on issues, a helper container that's already started will restart if it faces any hiccups or finishes its job. This is true even when the Pod is shutting down.

In short, this new feature makes sure everything starts and stops in the right order, making things run smoother!
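The behavior above can be sketched in a Pod manifest. This is a minimal illustration with made-up names and images; the key detail is the restartPolicy: Always on the init container, which is what marks it as a sidecar (in v1.28 this requires the alpha SidecarContainers feature gate to be enabled):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  initContainers:
    - name: log-shipper            # sidecar: starts before the app and keeps running
      image: fluent/fluent-bit:2.1 # illustrative image and tag
      restartPolicy: Always        # this field turns the init container into a sidecar
  containers:
    - name: app
      image: nginx:1.25
```

With this spec, the kubelet starts log-shipper first, begins app as soon as the sidecar has started, and at shutdown terminates app before the sidecar, giving it a chance to flush any last logs.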

Another Big Update:

Recovery from Non-Graceful Node Shutdown

When Kubernetes Nodes Suddenly Shut Down

Think of Kubernetes as a busy city, and its nodes are like buildings. Sometimes, a building might suddenly close down because of power cuts, broken equipment, or other unexpected problems. When that happens, all the activities inside that building get interrupted, and it can cause confusion or even mess things up.

In the same way, when a Kubernetes node shuts down without warning, everything it was doing can get stuck or mixed up. This can lead to lost data or interruptions in the services it was providing.

Why We Need a Safety Net for Sudden Shutdowns

Picture this: You have an important database running, kind of like a busy cashier at a store. Now, imagine if the lights suddenly go out. The cashier might lose track of sales, money could go missing, or the store might have to close for a while.

That's what happens when a Kubernetes node, where our database is, shuts down out of the blue. Things can get messy, and we might lose data or face interruptions. So, it's super important for Kubernetes to notice these hiccups and quickly set things right.

Previously encountered challenge

Prior to the recent improvements, we encountered a significant challenge when a node unexpectedly went offline. The pods that were actively running on this node would linger in an 'unknown' state for an extended duration. This situation was problematic because it hindered these pods from transitioning to a functional node.

The underlying cause of this issue was the unavailability of the kubelet on the offline node to execute the necessary pod deletions. As a result, the StatefulSet was unable to instantiate a new pod bearing the same name. To compound the problem, if these pods were utilizing volumes, the VolumeAttachments associated with the original, now-inactive node wouldn't be removed. This restriction meant that the volumes couldn't be linked to a new, active node, causing potential interruptions to the application's operations.

Enhancement Overview:

To address a specific challenge within our system, we've introduced a solution that allows users to manually designate a Node as out of service. This is achieved by applying a taint with the key node.kubernetes.io/out-of-service to the Node, with either a NoExecute or NoSchedule effect.

When the NodeOutOfServiceVolumeDetach feature is activated within the kube-controller-manager and a Node carries this taint, any pods residing on it that lack a matching toleration will be promptly evicted. This efficient approach accelerates the volume detachment process for terminating pods on that Node, facilitating their swift relocation to an alternative Node.
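As a sketch, the taint on the affected Node would look like this; in practice it is typically applied with kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute, and only after confirming the node is truly down. The node name and taint value here are illustrative:

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-1            # the node that shut down unexpectedly (illustrative name)
spec:
  taints:
    - key: node.kubernetes.io/out-of-service
      value: nodeshutdown   # conventional value; the key and effect are what matter
      effect: NoExecute     # evicts pods that don't tolerate the taint
```

Once the taint is in place, the stuck pods are force-deleted, their volumes are detached, and the workloads can be rescheduled onto a healthy node.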

Stability and Feature Promotion:

We're excited to announce that the "Non-Graceful Node Shutdown" feature has now achieved stable status. With this promotion, the "NodeOutOfServiceVolumeDetach" feature gate in the kube-controller-manager is inherently enabled and remains immutable.

Enhanced Metrics for Insight:

But our enhancements don't stop there. The Kubernetes team has enriched the system with insightful metrics to provide a comprehensive understanding of internal operations. For example, metrics such as force_delete_pods_total and force_delete_pod_errors_total in the Pod GC Controller offer a granular analysis of the reasons behind the forceful deletion of specific pods. Additionally, the attachdetach_controller_forced_detaches metric provides insights into the rationale for the forceful detachment of certain volumes. Essentially, these metrics serve as an intuitive dashboard, shedding light on the reasons behind specific system actions, thereby simplifying the troubleshooting process.

Looking Ahead: Future Enhancements

For those with an eye on the future, there are promising developments in the pipeline. While our current system requires manual intervention to manage nodes that experience unexpected shutdowns, the Kubernetes team is actively exploring strategies to automate this aspect. Envision a scenario where Kubernetes possesses the capability to immediately identify an offline node and seamlessly transition workloads to an operational node, all without any manual intervention. This level of automation is precisely what we anticipate in the upcoming iterations. For a deeper dive into the forthcoming advancements, please refer to the official Kubernetes blog post: Kubernetes 1.28: Non-Graceful Node Shutdown GA - What's Next.

Up next on our list...

Improvements to CustomResourceDefinition (CRD) validation rules

Imagine you're a developer working with Kubernetes, and you want to introduce a new resource type called Database to manage various databases (like MySQL, PostgreSQL, MongoDB) within your cluster. This Database resource would have attributes like type, version, storage size, and connection credentials. While Kubernetes is powerful, it doesn't natively understand or manage "databases."

Below is an example of a CRD for a new resource type called Database.

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                type:
                  type: string
                  enum: ["MySQL", "PostgreSQL", "MongoDB"]
                version:
                  type: string
                storageSize:
                  type: string
                connectionCredentials:
                  type: object
                  properties:
                    username:
                      type: string
                    password:
                      type: string
                      format: password
                    host:
                      type: string
                    port:
                      type: integer
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
      - db

Why it's needed:

CustomResourceDefinitions (CRDs) come to the rescue in such scenarios. They allow users like you to define and manage these custom Database resources, making Kubernetes adaptable to this specific use case. However, to ensure that each Database resource is correctly configured and doesn't disrupt the cluster's operations, it's crucial to have validation mechanisms in place. This is where the challenges begin.

Challenges encountered:

  • To ensure that each Database resource is valid (e.g., has a supported database type or a valid version number), developers previously had to use admission webhooks for validation. While effective, these webhooks added complexity.
  • Implementing, maintaining, and troubleshooting these webhooks was not straightforward. For teams without deep Kubernetes expertise, this could become a significant operational burden.
  • If a Database resource failed validation, the feedback wasn't always clear. Developers could spend unnecessary time debugging, trying to figure out what went wrong and where.

Improvements made:

With Kubernetes 1.28, the process of validating CRDs like our Database resource has been greatly simplified and enhanced:

Instead of relying on external webhooks, developers can now embed validation rules directly within the CRD schema using the Common Expression Language (CEL).

Common Expression Language (CEL) is a lightweight and fast expression language that can be embedded in applications to provide an evaluation environment. With Kubernetes 1.28, you can use CEL expressions in the validation schema of your CRD to enforce more complex rules than what's possible with the OpenAPIV3 schema alone.

Here is an example of a Database CRD that uses CEL for validation:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                type:
                  type: string
                  enum: ["MySQL", "PostgreSQL", "MongoDB"]
                version:
                  type: string
                storageSize:
                  type: string
                connectionCredentials:
                  type: object
                  properties:
                    username:
                      type: string
                    password:
                      type: string
                      format: password
                    host:
                      type: string
                    port:
                      type: integer
          x-kubernetes-validations:
            - rule: "self.spec.storageSize.matches('^[0-9]+(Gi|Mi)$')"
              message: "storageSize must be a value in Gi or Mi, e.g., 10Gi, 512Mi"
              reason: FieldValueInvalid
              fieldPath: ".spec.storageSize"
            - rule: "self.spec.connectionCredentials.port > 0"
              message: "Port must be a positive integer"
              reason: FieldValueInvalid
              fieldPath: ".spec.connectionCredentials.port"
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
      - db

If a Database resource doesn't meet the validation criteria, the new reason and fieldPath fields, which rule authors can now attach to each validation rule in Kubernetes 1.28, provide clear feedback: reason carries a machine-readable category for the failure, and fieldPath points to the exact field that caused the validation error.

For example, consider the following Database custom resource that violates the validation rules specified in the CRD:

apiVersion: example.com/v1
kind: Database
metadata:
  name: mydatabase
spec:
  type: PostgreSQL
  version: "13"
  storageSize: 10G
  connectionCredentials:
    username: admin
    password: secret
    host: mydatabase.example.com
    port: -5432

In this example, the storageSize field has a value of 10G, which does not match the required pattern of ending in Gi or Mi, and the port field has a negative value, which is not allowed.

If you try to create this resource, the API server rejects it outright; the object is never persisted, and the client receives an error similar to the following, identifying each failed rule and the field that triggered it:

The Database "mydatabase" is invalid:
* spec.storageSize: Invalid value: "10G": storageSize must be a value in Gi or Mi, e.g., 10Gi, 512Mi
* spec.connectionCredentials.port: Invalid value: -5432: Port must be a positive integer

This integrated approach reduces the need for external tools and streamlines the entire CRD validation process.

Automatic, retroactive assignment of a default StorageClass graduates to stable

Why it's needed:

In Kubernetes, when users create a PersistentVolumeClaim (PVC) to request storage, they can specify a StorageClass to determine the type and configuration of the storage provisioned. However, not all users are aware of this, or they may simply forget to specify one. Without a default mechanism in place, such a PVC would remain unbound, meaning it would not be associated with any storage, which can lead to application failures or disruptions as well as confusion and operational overhead.

Cluster administrators would need to manually intervene to either bind these PVCs or inform users to specify a StorageClass, leading to additional operational tasks.

Especially for newcomers or those not deeply familiar with Kubernetes storage concepts, the absence of a default mechanism can lead to confusion. They might wonder why their storage requests are not being fulfilled.

Enhancement with technical details:

With the graduation of this feature to stable in Kubernetes v1.28, the system now automatically sets a storageClassName for a PVC if the user doesn't provide a value. This means that if a user creates a PVC without specifying a StorageClass, Kubernetes will automatically assign a default StorageClass to that PVC. Furthermore, the control plane also retroactively sets a StorageClass for any existing PVCs that don't have a storageClassName defined. This ensures that even PVCs created in previous versions of Kubernetes without a specified StorageClass can benefit from this feature.

This behavior is now always active in Kubernetes v1.28 and cannot be disabled, marking the feature's transition to general availability.
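For illustration, here is a hypothetical default StorageClass, marked as the default by the storageclass.kubernetes.io/is-default-class annotation, together with a PVC that omits storageClassName. The names and provisioner are made up; with this feature, the control plane fills in "standard" on the PVC automatically, even for PVCs created before the default class existed:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # marks this class as the default
provisioner: ebs.csi.aws.com                             # illustrative provisioner
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  # storageClassName omitted: Kubernetes sets it to "standard" automatically
```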

For more in-depth details, you can refer to the StorageClass documentation in Kubernetes.

Promotion of the SelfSubjectReview API

In a Kubernetes setup, figuring out who the user is after they've been authenticated can get pretty complicated. You've got different authentication methods like Proxy, OIDC, webhook, or even a mix of these. And each of these methods can mess around with the user's attributes, making it tough to know the final identity used for authorization.

Let's take an example to make it clearer. Imagine you've got a Kubernetes cluster where users log in through a webhook Token. This webhook thing adds certain groups to the user based on their roles in the organization. Now, picture this: John, one of the users, is trying to figure out why he can't access a particular resource in the cluster. He knows his access depends on the groups he’s part of, but he’s not sure which groups the webhook has given him.
In the good ol' days before the SelfSubjectReview API came along, John would have to dig through the webhook authenticator's logs or bug the cluster admin to find out which groups he belongs to. That's not just a hassle but also means messing with sensitive logs or bothering the admin.

But guess what? The SelfSubjectReview API really helps out! Now, John can simply send a request to that API endpoint, and the Kubernetes API server will spill the beans on his user attributes, including the groups he’s in. No more sifting through logs or bugging the admin – he can now troubleshoot his access issue all by himself and quickly too!

For example, John can make a request to the SelfSubjectReview API endpoint like this:

POST /apis/authentication.k8s.io/v1/selfsubjectreviews
{
  "apiVersion": "authentication.k8s.io/v1",
  "kind": "SelfSubjectReview"
}

And the response might look like this:

{
  "apiVersion": "authentication.k8s.io/v1",
  "kind": "SelfSubjectReview",
  "status": {
    "userInfo": {
      "name": "john.doe",
      "uid": "b6c7cfd4-f166-11ec-8ea0-0242ac120002",
      "groups": [
        "viewers",
        "editors"
      ],
      "extra": {
        "provider_id": [
          "token.company.dev"
        ]
      }
    }
  }
}

Alongside the API, the corresponding kubectl command, ‘kubectl auth whoami’, is now generally available.

Backdate generated kubeadm CA certificates

Why Kubeadm CA Certificates?

In a Kubernetes cluster, Kubeadm CA (Certificate Authority) certificates play a vital role in ensuring secure communication between various components. They act as a trusted entity that can verify and authenticate the identity of other components within the cluster. This is essential for maintaining the integrity and confidentiality of the data being exchanged.

The Problem with Clock Desynchronization

In a perfect world, all the clocks in a distributed system like Kubernetes would be perfectly synchronized. However, in reality, slight deviations in time can occur between different parts of the system. This desynchronization might seem trivial, but it can lead to problems with certificate validation.

Certificates have a validity period, defined by a start time and an end time. If a system's clock is out of sync, it might treat a valid certificate as expired or not yet valid, causing components to reject otherwise legitimate connections.

The Solution in Kubernetes 1.28

To address this challenge, Kubernetes 1.28 introduced a change in how kubeadm generates CA certificates. Now, the start time of these certificates is offset by 5 minutes in the past relative to the current system time. This offset acts as a buffer to accommodate potential clock desynchronization, reducing the likelihood of certificate validation issues.

Enabled use of pods with volumes and user namespace

The Kubernetes 1.28 feature of enabling the use of pods with volumes and user namespaces is a significant advancement in the security and isolation aspects of applications running on Kubernetes. This feature is particularly essential in multi-tenant environments, where various users and applications share the same underlying resources. Here's a detailed look at this feature, its background, and why it's needed:

Background

The feature is described in KEP-127, and it aims to support user namespaces within Kubernetes pods. User namespaces are a feature of the Linux kernel that isolates user and group ID ranges, allowing the same user and group IDs to be different inside and outside a container.

There are three key components:

User Namespaces: These provide a way to isolate the user IDs and group IDs, ensuring that a process has no privileges outside its own namespace. This isolation enhances security by preventing a process running inside a container from affecting processes running on the host system.
Pod Volumes: The feature also includes the ability to map user and group IDs in the volumes associated with a pod. This ensures that the files' ownership is consistent with the user namespace, allowing proper access control.
Mapping Configuration: Administrators can configure ID mappings, defining how user and group IDs in a container are mapped to IDs on the host.
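A minimal sketch of a pod opting into a user namespace looks like this. The hostUsers: false field is what requests the namespace, and as of v1.28 such a pod can also use volumes; note this still requires the UserNamespacesSupport feature gate and a container runtime with user-namespace support (the names and image below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
spec:
  hostUsers: false          # run this pod in its own user namespace
  containers:
    - name: app
      image: busybox:1.36   # illustrative image
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /data
  volumes:
    - name: scratch         # as of v1.28, volumes work with user namespaces
      emptyDir: {}
```

Inside the container, processes can appear to run as root, while on the host they map to an unprivileged ID range.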

Need for the Feature

There have been some security issues in the past where processes inside containers tried to cause trouble. Here are some examples:

CVE-2019-5736: There was a problem where a container could overwrite a critical file. But with user namespaces, this issue is fully prevented.
Azurescape: This was a big issue where one container could take over another. But again, user namespaces stop this from happening.
CVE-2021-25741 & CVE-2017-1002101 & CVE-2021-30465: In these cases, a process inside a container could act like it's the boss outside. But with user namespaces, this is stopped.
CVE-2016-8867: This was an issue where a process inside a container could get more powers. But user namespaces prevent this.
CVE-2018-15664: There was a problem where a process could read or change files on the main system. But user namespaces stop this from happening.

In the world of Kubernetes, the idea is to use these user namespaces to make sure that processes inside containers have different IDs than those outside. This way, even if a process inside a container thinks it's the boss, if it ever escapes to the outside world, it won't have any special powers. This is a good thing because it means if something goes wrong inside the container, it won't affect the main system.

Kubernetes built with Go 1.20.5

Kubernetes, the popular container orchestration system, is like a well-oiled machine that helps manage and run containers (small units that package up code and all its dependencies so the application runs quickly and reliably). To build this machine, a programming language called Go is used. Now, with the 1.28 release, Kubernetes is built using a newer version of Go, specifically 1.20.5.

Why Does This Matter?

Think of Go as the toolbox used to build and maintain the machine (Kubernetes). When the toolbox gets new tools or existing tools are sharpened and improved, the machine can be built and maintained more efficiently.

Performance Improvements: The new version of Go often comes with tweaks and enhancements that make the code run faster and more efficiently.
Bug Fixes: Just like fixing squeaks and rattles in a machine, the new version of Go helps in fixing issues in the code that might have been causing problems. This leads to a more stable and reliable Kubernetes.
New Features: Sometimes, new versions of Go introduce new capabilities that developers can use.

How Does This Affect Me?

If you're using Kubernetes, this update means that you might notice it running more smoothly and efficiently.
If you're a developer working with Kubernetes, you might find new possibilities and improvements that make your work easier. It's like discovering new tools in your toolbox that help you do your job better.

kubeadm ‘config validate’ command

Background and Why It's Needed

Kubeadm - as we all know - is a tool within Kubernetes that helps in bootstrapping a Kubernetes cluster. It relies on configuration files to know how to set up the cluster. These files can be intricate, and a small mistake in them can lead to problems down the line.

Catching those mistakes before the cluster is built is exactly what the "kubeadm config validate" command does for Kubernetes configurations.

Introducing the "kubeadm config validate" Command

The new "kubeadm config validate" command is like a pre-flight check for your Kubernetes configuration files. Here's what it does:

Catches Errors Early: By validating the configuration files before applying them, it helps catch mistakes early in the process.
Saves Time and Frustration: Finding and fixing a configuration error after it has caused an issue can be time-consuming and frustrating. This validation step helps avoid that by pointing out the problems before they cause trouble.
Improves Confidence: Knowing that your configuration files have been validated and are error-free can give you confidence that the setup process will go smoothly.

How to Use It

Using the command is simple. You just run "kubeadm config validate", passing the path to your configuration file with the --config flag, and it will check the file for any errors or inconsistencies. Here is an example:

Consider a kubeadm configuration file with a mistake:

kubeadm-config.yaml

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.0
controlPlaneEndpoint: "control-plane.example.com:6443"
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.244.0.0/12" # mistake: overlaps with podSubnet

In this example, the serviceSubnet overlaps with the podSubnet, which is not allowed.
Run the kubeadm config validate command with the path to your configuration file:

kubeadm config validate --config kubeadm-config.yaml

The output will show the error:

[ERROR overlapping]: PodSubnet (10.244.0.0/16) overlaps with ServiceSubnet(10.244.0.0/12) 
To see the stack trace of this error execute with --v=5 or higher


Conclusion

From here, you can check out the following resources to gain a thorough understanding of the Kubernetes 1.28 release.

It is packed with new features, enhancements, and deprecations. To explore the exhaustive list of changes, check out the release notes page:

https://github.com/kubernetes/sig-release/blob/master/releases/release-1.28/release-notes/release-notes-draft.md

Some features have also been deprecated or removed, which you can likewise find at the link above.