Unmesh Patil:
Folks, I’m getting an error while initializing the controlplane node in the “Test cluster installation using kubeadm” lab. When I run kubeadm init with the given parameters, it fails with this error:
> [etcd] Creating static Pod manifest for local etcd in “/etc/kubernetes/manifests”
> [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory “/etc/kubernetes/manifests”. This can take up to 4m0s
> [kubelet-check] Initial timeout of 40s passed.
> [kubelet-check] It seems like the kubelet isn’t running or healthy.
> [kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
> [kubelet-check] It seems like the kubelet isn’t running or healthy.
> [kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
> [kubelet-check] It seems like the kubelet isn’t running or healthy.
> [kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
> [kubelet-check] It seems like the kubelet isn’t running or healthy.
> [kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
> [kubelet-check] It seems like the kubelet isn’t running or healthy.
> [kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
The kubelet service seems to be fine:
> ● kubelet.service - kubelet: The Kubernetes Node Agent
> Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
> Drop-In: /etc/systemd/system/kubelet.service.d
> └─10-kubeadm.conf
> Active: active (running) since Wed 2021-08-11 12:36:05 UTC; 1s ago
> Docs: https://kubernetes.io/docs/home/
> Main PID: 39654 (kubelet)
> Tasks: 20 (limit: 7372)
> CGroup: /system.slice/kubelet.service
> └─39654 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/ku
Any ideas?
James:
I literally hit the same thing. I think this may be a new problem now that it’s running v1.22?
Anyway the problem is the cgroup driver docker is using. It’s not set to “systemd” which is what kubelet’s expecting.
Just set native.cgroupdriver=systemd under exec-opts in /etc/docker/daemon.json, restart Docker, and you should be good to go.
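A minimal sketch of that daemon.json change, in case it helps. The helper name `with_systemd_cgroup_driver` and the sample input are made up for illustration; only the `exec-opts` key and the `native.cgroupdriver=systemd` value come from the Docker config itself. It works on the parsed JSON rather than touching /etc/docker/daemon.json directly, so you can try it safely first.

```python
import json


def with_systemd_cgroup_driver(config: dict) -> dict:
    """Return a copy of a Docker daemon config with the cgroup driver set to systemd."""
    updated = dict(config)
    # Keep unrelated exec-opts, but drop any existing cgroup driver setting.
    opts = [
        opt for opt in updated.get("exec-opts", [])
        if not opt.startswith("native.cgroupdriver=")
    ]
    opts.append("native.cgroupdriver=systemd")
    updated["exec-opts"] = opts
    return updated


# Hypothetical existing daemon.json content, used only as sample input.
existing = json.loads('{"log-driver": "json-file"}')
print(json.dumps(with_systemd_cgroup_driver(existing), indent=2))
```

The printed JSON is what would go into /etc/docker/daemon.json. Docker only reads that file at startup, so the daemon has to be restarted before kubeadm init is retried.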
James:
I’m also finding that the “Check” at Step 5 isn’t completing. It thinks the master node isn’t initialised, even though it is: I’ve copied the kube config into ~/.kube and I can run kubectl commands successfully.
James:
btw, you also need to do the docker change on node01 as well…
James:
Last note on this one. It asks you to install the flannel CNI, but there’s no kubernetes.io documentation that you could look up in the exam to do this. I suggest changing it to weave. @Mumshad Mannambeth @Anant Trivedi
Parvesh Kumar:
Hi, I have also hit the same problem many times.
Parvesh Kumar:
I modified daemon.json to:
Parvesh Kumar:
{
  "exec-opts": [
    "native.cgroupdriver=systemd"
  ],
  "bip": "172.12.0.1/24",
  "registry-mirrors": [
    "http://docker-registry-mirror.kodekloud.com"
  ]
}
Parvesh Kumar:
The issue is still there:
Parvesh Kumar:
root@controlplane:~# kubeadm init --pod-network-cidr 10.244.0.0/16 --apiserver-advertise-address=10.37.92.3
[init] Using Kubernetes version: v1.22.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using ‘kubeadm config images pull’
[certs] Using certificateDir folder “/etc/kubernetes/pki”
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing “sa” key
[kubeconfig] Using kubeconfig folder “/etc/kubernetes”
[kubeconfig] Using existing kubeconfig file: “/etc/kubernetes/admin.conf”
[kubeconfig] Using existing kubeconfig file: “/etc/kubernetes/kubelet.conf”
[kubeconfig] Using existing kubeconfig file: “/etc/kubernetes/controller-manager.conf”
[kubeconfig] Using existing kubeconfig file: “/etc/kubernetes/scheduler.conf”
[kubelet-start] Writing kubelet environment file with flags to file “/var/lib/kubelet/kubeadm-flags.env”
[kubelet-start] Writing kubelet configuration to file “/var/lib/kubelet/config.yaml”
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder “/etc/kubernetes/manifests”
[control-plane] Creating static Pod manifest for “kube-apiserver”
[control-plane] Creating static Pod manifest for “kube-controller-manager”
[control-plane] Creating static Pod manifest for “kube-scheduler”
[etcd] Creating static Pod manifest for local etcd in “/etc/kubernetes/manifests”
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory “/etc/kubernetes/manifests”. This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn’t running or healthy.
[kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn’t running or healthy.
[kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn’t running or healthy.
[kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn’t running or healthy.
[kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn’t running or healthy.
[kubelet-check] The HTTP call equal to ‘curl -sSL http://localhost:10248/healthz’ failed with error: Get “http://localhost:10248/healthz”: dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn’t initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Parvesh Kumar:
After restarting Docker, it worked.
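One way to confirm the restart actually picked up the new driver is to look at the "Cgroup Driver" line in the `docker info` output. Below is a rough sketch that pulls that value out of the text. The function name `cgroup_driver_from_docker_info` and the sample text are illustrative assumptions, not captured from the lab.

```python
from typing import Optional


def cgroup_driver_from_docker_info(info_text: str) -> Optional[str]:
    """Return the value of the 'Cgroup Driver' line from `docker info` text, if present."""
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "cgroup driver":
            return value.strip()
    return None


# Illustrative excerpt only; real `docker info` output has many more lines.
sample = """\
 Logging Driver: json-file
 Cgroup Driver: systemd
"""
print(cgroup_driver_from_docker_info(sample))
```

If this reports cgroupfs instead of systemd, the daemon.json change has not taken effect yet and kubeadm init will most likely keep timing out on the kubelet health check.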
Unmesh Patil:
Thanks @James, I’ll try this out shortly. I’ve been trying to complete that lab since this morning, but I keep running into connectivity issues with the lab. I’ll try again in a while.