ETCD Restore is not working

Hi team,
I have followed exactly the same steps described in the “Solution video of ETCD backup and restore”; however, the etcd pod always stays in the Pending state after the restore.

Below are the restore command and the changes made in the YAML file. Please advise.

```
root@controlplane:~# etcdctl snapshot restore /opt/snapshot-pre-boot.db --data-dir=/var/lib/etcd-restore-5db --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --initial-cluster-token="etcd-cluster-5" --endpoints=127.0.0.1:2379 --name="controlplane" --initial-cluster="controlplane=https://127.0.0.1:2380" --initial-advertise-peer-urls="https://127.0.0.1:2380"
2021-12-30 18:59:11.836950 I | mvcc: restore compact to 834
2021-12-30 18:59:11.843737 I | etcdserver/membership: added member ab1610bd7ecd0ab2 [https://127.0.0.1:2380] to cluster fa6b3ae47f8348ab
```
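One pitfall worth checking when pasting a command like this: some editors replace straight quotes with curly ones, and the shell passes curly quotes through literally, so etcd ends up with quote characters embedded in the token and member name. A small sketch that only assembles the command as a string for review, using the paths and names from this lab (it does not invoke etcdctl):

```shell
# Build the restore command as a string for review; this sketch does not
# run etcdctl. Paths and names are the ones used in this lab.
SNAPSHOT=/opt/snapshot-pre-boot.db
DATA_DIR=/var/lib/etcd-restore-5db
CMD="ETCDCTL_API=3 etcdctl snapshot restore $SNAPSHOT"
CMD="$CMD --data-dir=$DATA_DIR"
CMD="$CMD --initial-cluster-token=etcd-cluster-5"
CMD="$CMD --name=controlplane"
CMD="$CMD --initial-cluster=controlplane=https://127.0.0.1:2380"
CMD="$CMD --initial-advertise-peer-urls=https://127.0.0.1:2380"
echo "$CMD"
```

Quotes around the flag values are optional here, since none of the values contain spaces; what matters is that no curly quote characters survive a copy-paste.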
```
root@controlplane:~# vim /etc/kubernetes/manifests/etcd.yaml
root@controlplane:~# cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://10.52.119.3:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.52.119.3:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd-restore-5db
    - --initial-cluster-token=etcd-cluster-5
    - --initial-advertise-peer-urls=https://10.52.119.3:2380
    - --initial-cluster=controlplane=https://10.52.119.3:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://10.52.119.3:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://10.52.119.3:2380
    - --name=controlplane
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.4.13-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        ephemeral-storage: 100Mi
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd-restore-5db
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd-restore-5db
      type: DirectoryOrCreate
    name: etcd-data
status: {}
```
```
root@controlplane:~# kubectl get all -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
default       pod/blue-746c87566d-b47pw                  1/1     Running   0          45m
default       pod/blue-746c87566d-fhpmg                  1/1     Running   0          45m
default       pod/blue-746c87566d-kvn9v                  1/1     Running   0          45m
default       pod/red-75f847bf79-gxjvs                   1/1     Running   0          45m
default       pod/red-75f847bf79-qhzzn                   1/1     Running   0          45m
kube-system   pod/coredns-74ff55c5b-ds9vg                1/1     Running   0          50m
kube-system   pod/coredns-74ff55c5b-l9xqs                1/1     Running   0          50m
kube-system   pod/etcd-controlplane                      0/1     Pending   0          2s
kube-system   pod/kube-apiserver-controlplane            1/1     Running   0          50m
kube-system   pod/kube-controller-manager-controlplane   0/1     Running   5          50m
kube-system   pod/kube-flannel-ds-r27q7                  1/1     Running   0          50m
kube-system   pod/kube-proxy-l9f8m                       1/1     Running   0          50m
kube-system   pod/kube-scheduler-controlplane            0/1     Running   5          50m
```

Regards
Arun

Please try the command below, without --cacert, --cert, and --key.

```
ETCDCTL_API=3 etcdctl --data-dir <data-dir-location> snapshot restore snapshotdb
```

Check the full steps at GitHub: mmumshad/kubernetes-cka-practice-test-solution-etcd-backup-and-restore (the solution to the practice test for backing up and restoring an etcd cluster).
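After the restore itself, the remaining step in that solution is pointing the static pod manifest at the new data directory in both places that reference it. A sandboxed sketch of that edit, using a throwaway copy instead of the real /etc/kubernetes/manifests/etcd.yaml, and the restore path from this thread:

```shell
# Sandboxed sketch: edit a stand-in for /etc/kubernetes/manifests/etcd.yaml.
# The fragment mimics the two lines that reference the data directory:
# the etcd --data-dir flag and the hostPath volume.
F=$(mktemp)
cat > "$F" <<'EOF'
    - --data-dir=/var/lib/etcd
      path: /var/lib/etcd
EOF
# Point both references at the restored directory.
sed -i 's#/var/lib/etcd#/var/lib/etcd-restore-5db#g' "$F"
cat "$F"
```

On the real node, saving that change is enough: kubelet notices the manifest change and recreates the etcd static pod with the restored data directory.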

Hi team,
Thanks for the support; however, in my case it didn’t work even when following the steps in the GitHub repo.

In the end, I got it working in the KodeKloud lab as follows.

1. Took a backup of the kube-apiserver YAML file to a directory other than the usual /etc/kubernetes/manifests/ path.

2. After the etcd restore command, the etcd pod was still not coming up. I copied kube-apiserver.yaml back to /etc/kubernetes/manifests/, and then it started working.
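The steps above can be sketched as shell commands. This sketch is sandboxed with temporary directories so it runs anywhere; on the real node the manifests directory is /etc/kubernetes/manifests, and kubelet reacts to static pod manifests being moved out of and back into it:

```shell
# Sandboxed sketch of the manifest-move trick. MANIFESTS stands in for
# /etc/kubernetes/manifests; BACKUP is any directory outside it.
MANIFESTS=$(mktemp -d)
BACKUP=$(mktemp -d)
touch "$MANIFESTS/kube-apiserver.yaml"

# 1. Move kube-apiserver.yaml out of the manifests dir; on the real node,
#    kubelet then stops the kube-apiserver static pod.
mv "$MANIFESTS/kube-apiserver.yaml" "$BACKUP/"

# 2. (Run the etcd restore and edit etcd.yaml while the API server is down.)

# 3. Move the file back; kubelet recreates the kube-apiserver static pod,
#    which reconnects to the restored etcd.
mv "$BACKUP/kube-apiserver.yaml" "$MANIFESTS/"
ls "$MANIFESTS"
```

This works because static pods are managed directly by kubelet from that directory, not by the API server, so removing the file is effectively a clean way to stop and later recreate the pod.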

Regards
Arun Seetha

I tried the same, but then my kube-apiserver-controlplane pod was also in the Pending state. After restarting both pods, they are now OK (Running).