Hello KodeKloud fam,
Please help me to understand the problem here. The question reads:
Solve this question on: ssh cluster4-controlplane
Identify and fix the issue that occurs while running kubectl commands on cluster4?
Okay, so cluster4 is jammed lol. I’ll take you through the steps I took as best I can, and I hope someone can help me retrace them. I have tried to follow the solution provided by the examiner, but even that does not seem to work.
So the first step I took was obviously the k get pods command…lol. Then I got an error that looked like:
The connection to the server cluster4-controlplane:6443 was refused - did you specify the right host or port?
I did a quick check to make sure kubelet.service was running (it was) and decided my kube-apiserver might be down, so I drilled down to the container level with the crictl ps -a command. Once I had the container ID, I ran crictl logs <container ID>. In the logs, I saw something like: command 'etcd-server' does not exist. Long story short, I changed that flag to --etcd-servers in the kube-apiserver manifest.
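For anyone following along, here is a rough sketch of the checks described above. The function names are made up for illustration, the manifest path is the kubeadm default (an assumption about this lab), and I'm assuming crictl lists the newest container first.

```shell
# Print every etcd-related flag in a kube-apiserver static pod manifest,
# so a misspelled flag such as --etcd-server is easy to spot.
# The default path is the kubeadm one; pass another file as $1 if needed.
etcd_flags() {
  manifest="${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}"
  grep -E -- '--etcd-[a-z-]+' "$manifest" | sed -E 's/^[[:space:]]*-[[:space:]]*//'
}

# Show the last log lines of the kube-apiserver container, running or exited.
# Assumes the first ID printed by crictl is the most recent container.
apiserver_logs() {
  id="$(crictl ps -a --name kube-apiserver -q | head -n 1)"
  [ -n "$id" ] && crictl logs --tail 30 "$id"
}
```

On the control plane node, running etcd_flags with no arguments and then apiserver_logs should show whether the flag names look right and what the newest container complains about.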
Now, every time I run a kubectl command, I get the same error:
The connection to the server cluster4-controlplane:6443 was refused - did you specify the right host or port?
I have tried to read the logs to understand it but it’s all “foreign language” to me…lol
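One way to make the logs less overwhelming is to filter them down to the lines that usually matter. This is only a sketch: flag_suspects is a hypothetical helper, and the patterns are guesses based on the errors quoted in this post, not a complete list.

```shell
# Keep only log lines that tend to point at a startup, flag, or etcd/TLS problem.
# Reads from stdin; the pattern list is an assumption, extend it as needed.
flag_suspects() {
  grep -E -i 'x509|tls:|connection refused|does not exist|unknown flag|no such file'
}

# Usage on the node (container ID left as a placeholder):
#   crictl logs <container ID> 2>&1 | flag_suspects
```

The 2>&1 is there because the container's error output may arrive on stderr rather than stdout.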
Hopefully someone can take a look at it and actually explain what they see. I tried to go with the provided solution which says:
Step 5: Validate CA Certificate Path
If the logs now show a TLS error like:
transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority
This means the API server is reaching etcd, but the certificate authority (CA) being used is incorrect.
Open the manifest again:
sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml
Look for this line:
- --etcd-cafile=/etc/kubernetes/pki/ca.crt
This is wrong. The API server should use the etcd-specific CA certificate.
Fix it:
- --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
Save and exit.
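After an edit like this, it may help to confirm that every etcd file the manifest points at actually exists. A minimal sketch, assuming the kubeadm default manifest path and that the pki directory is mounted into the pod at the same path as on the host; check_etcd_files is a made-up name.

```shell
# For each --etcd-cafile/--etcd-certfile/--etcd-keyfile flag in the manifest,
# report whether the referenced path exists on the host.
check_etcd_files() {
  manifest="${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}"
  grep -E -- '--etcd-(cafile|certfile|keyfile)=' "$manifest" |
    sed -E 's/^[[:space:]]*-[[:space:]]*//' |
    while IFS='=' read -r flag path; do
      if [ -f "$path" ]; then
        echo "ok      $flag $path"
      else
        echo "MISSING $flag $path"
      fi
    done
}
```

If all files are present, openssl verify -CAfile /etc/kubernetes/pki/etcd/ca.crt /etc/kubernetes/pki/apiserver-etcd-client.crt can show whether the client certificate is signed by the etcd CA (the client certificate path here is the kubeadm default, so it is an assumption for this lab).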
First of all, I cannot find the error that the solution describes (I understand the wording may differ). And even after I change the --etcd-cafile option, I still get the same connection refused error.
Thank you all so much.