Deepak Ladwa:
Hi, at times I have seen the etcd pod go into Pending state after a restore, even though the restore was successful and the required data is visible. I tried deleting the pod, but it didn't help. Has anyone faced this issue?
Broc:
Do you see anything in the etcd pod's logs?
Deepak Ladwa:
@Broc 2021-02-08 20:29:07.495101 W | etcdserver: read-only range request "key:"/registry/services/endpoints/kube-system/kube-scheduler" " with result "range_response_count:1 size:580" took too long (177.250912ms) to execute
Deepak Ladwa:
That was from docker logs. The snippet below is from the pod logs.
controlplane $ kubectl -n kube-system logs etcd-controlplane
Error from server (BadRequest): container "etcd" in pod "etcd-controlplane" is not available
controlplane $
Broc:
I see that error too, but things are working for me. I believe it may be some behind-the-scenes handshake timeout error, or it may not be. But if the pod literally shows as "Pending", that should be fixed. Involve the experts and keep posting what you find. I'd love to collect more information about the infra and configs to analyze, but I'm on a tight schedule. Still, digging into the pod and container logs is a good start.
Deepak Ladwa:
@Broc thanks for the info.
I faced this issue of the etcd pod being stuck in Pending state after a restore.
In the etcd.yaml file, I updated the --data-dir path, the volumeMount path, and the volumes hostPath path to the new path where I restored the backup: /var/lib/etcd-from-backup
After that, the pod went from Pending to Running.
(The --data-dir in the container's command may have been what caused the Pending state.)
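For anyone following along, here is a rough sketch of what those three changes could look like. It assumes a kubeadm-style static pod manifest (typically /etc/kubernetes/manifests/etcd.yaml) and a volume named etcd-data; neither detail is confirmed in this thread, so check the names in your own file. Only the lines that change are shown, and everything else in the manifest stays as it is.

```yaml
# Sketch only: the layout and the volume name "etcd-data" are assumptions
# based on a typical kubeadm setup, not taken from the cluster in this thread.
spec:
  containers:
  - command:
    - etcd
    # 1. Point etcd at the restored data directory.
    - --data-dir=/var/lib/etcd-from-backup
    # ... other etcd flags unchanged ...
    name: etcd
    volumeMounts:
    # 2. Mount the volume at the same path that --data-dir uses.
    - mountPath: /var/lib/etcd-from-backup
      name: etcd-data
  volumes:
  # 3. Back the volume with the host directory the snapshot was restored into.
  - hostPath:
      path: /var/lib/etcd-from-backup
      type: DirectoryOrCreate
    name: etcd-data
```

The point is that all three paths agree: the directory etcd is told to use must be the one mounted into the container, and that mount must be backed by the host directory that holds the restored data. Because this is a static pod, the kubelet should recreate it on its own once the manifest file is saved.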