ETCD Restore is not working - Need help

Hi,
It seems there is no concrete, step-by-step procedure for an ETCD restore, and I am stuck. Below is my scenario.

  1. ETCD and kube-apiserver both run as static pods, so the service start/stop approach described in KodeKloud does not apply.

  2. I took the backup of ETCD
    sudo ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup-1 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --endpoints=https://127.0.0.1:2379 --key=/etc/kubernetes/pki/etcd/server.key
    Snapshot saved at /tmp/etcd-backup-1

  3. To simulate a failure/disaster scenario, I removed the etcd directory (the one set as data-dir in the ETCD.yaml file).
    sudo rm -rf /var/lib/etcd

  4. After deleting the etcd directory, I was unable to list the pods (as expected).
    kubectl get pods --all-namespaces

  5. I restored the etcd backup into the same data-dir (because I had deleted it earlier). The idea was to avoid having to open and modify the ETCD.yaml file.
    sudo ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup-1 --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://127.0.0.1:2380 --initial-cluster=k8s-control=https://127.0.0.1:2380 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --endpoints=https://127.0.0.1:2379 --key=/etc/kubernetes/pki/etcd/server.key --name=k8s-control

  6. After the restore, I can see that the “member” directory has been created again under /var/lib/etcd (which I removed earlier).

  7. However, “kubectl get pods --all-namespaces” still does not list the pods; it gives the error below. I confirmed that kubelet.service is up and running, and I also restarted it.

The connection to the server 10.0.1.101:6443 was refused - did you specify the right host or port?

  8. I searched the Internet and forum discussions, but found nothing that is crystal clear. I tried a couple of suggestions found online, but nothing worked, e.g.:

sudo -i
swapoff -a
systemctl restart kubelet.service
systemctl daemon-reload
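
For reference, the checks that usually narrow down a "connection refused" in this setup can be sketched as below. This is a minimal sketch, not a confirmed diagnosis: it assumes kubeadm default paths (/etc/kubernetes/manifests/etcd.yaml), a CRI runtime with crictl installed, and a systemd-managed kubelet. The function names are hypothetical helpers.

```shell
#!/bin/sh
# Sketch only: paths assume kubeadm defaults; crictl availability is an assumption.
MANIFEST_DIR=/etc/kubernetes/manifests

# Print the data directory that the etcd static pod manifest points at.
etcd_data_dir() {
  manifest="${1:-$MANIFEST_DIR/etcd.yaml}"
  # Matches a line such as "    - --data-dir=/var/lib/etcd".
  sed -n 's/^[[:space:]]*-[[:space:]]*--data-dir=//p' "$manifest"
}

# List etcd and kube-apiserver containers, including exited ones.
control_plane_containers() {
  crictl ps -a | grep -E 'etcd|kube-apiserver'
}

# Show recent kubelet log lines, which often explain a crashing static pod.
kubelet_recent_logs() {
  journalctl -u kubelet.service --no-pager -n 50
}
```

Run as root, `etcd_data_dir` shows whether the manifest still points at /var/lib/etcd, and `control_plane_containers` shows whether the etcd and kube-apiserver containers are running or repeatedly exiting.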

I can see that many people are facing this problem and also failing this scenario in the CKA exam. Please give some direction.

Regards
Arun Seetha

Hello Arun,

Can you follow the steps in the linked GitHub repository, mmumshad/kubernetes-cka-practice-test-solution-etcd-backup-and-restore? It is the solution to the practice test for backing up and restoring an ETCD cluster.
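
As a rough illustration of how a restore is commonly handled when etcd runs as a static pod, here is a generic sketch. It is not a copy of the steps in the linked repository. The paths come from the question or from kubeadm defaults, and PARKED_DIR, WAIT_SECONDS, DRY_RUN and the function names are hypothetical. The idea is to stop the static pods by moving their manifests away, restore the snapshot into the data directory, then move the manifests back.

```shell
#!/bin/sh
# Sketch only: a generic static-pod restore flow, run as root.
# All paths are assumptions based on the question and kubeadm defaults.
MANIFEST_DIR=/etc/kubernetes/manifests
PARKED_DIR=/etc/kubernetes/manifests-parked   # hypothetical holding directory
SNAPSHOT=/tmp/etcd-backup-1
DATA_DIR=/var/lib/etcd

# run executes a command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restore_etcd_static_pod() {
  # 1. Stop the static pods: the kubelet removes a pod once its manifest is gone.
  run mkdir -p "$PARKED_DIR" &&
  run mv "$MANIFEST_DIR/etcd.yaml" "$MANIFEST_DIR/kube-apiserver.yaml" "$PARKED_DIR/" &&
  # 2. Give the kubelet time to stop the containers (the delay is a guess).
  run sleep "${WAIT_SECONDS:-20}" &&
  # 3. Clear the old data directory so the restore starts from a clean path.
  run rm -rf "$DATA_DIR" &&
  # 4. Restore reads the local snapshot file, so no endpoint is contacted.
  run env ETCDCTL_API=3 etcdctl snapshot restore "$SNAPSHOT" --data-dir="$DATA_DIR" &&
  # 5. Put the manifests back so the kubelet recreates both pods.
  run mv "$PARKED_DIR/etcd.yaml" "$PARKED_DIR/kube-apiserver.yaml" "$MANIFEST_DIR/"
}

# Preview the commands without changing anything:
#   DRY_RUN=1; restore_etcd_static_pod
```

Setting DRY_RUN=1 first prints each command instead of running it, which is a safe way to review the sequence before trying it on a real control-plane node.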

Thanks,
KodeKloud Support