K8sgpt question

Hi,
I faced an issue in my production environment.

  1. A running pod was evicted from its node because a taint was applied to the node dynamically. The pod ran fine after deployment, and then after 2-3 days it was suddenly evicted.
  • Is it possible for a taint to be applied to a node while pods are already running on it? When does that happen?

  • The development team was unable to find the root cause of the pod eviction. Can K8sGPT help troubleshoot the pod eviction issue?

Can’t say if the AI can tell you. But I can tell you a bit about doing it “the old-fashioned way”:

  • Check what the taints are now on the relevant node.
  • Check what taints the pod will tolerate.
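
To make those two checks concrete, here is a rough sketch of the comparison in Python. It assumes you have saved the output of kubectl get node XX -o json and kubectl get pod YY -o json and loaded them as dicts; the sample data, the example.com/maintenance taint key, and the function names are all made up for illustration. Keep in mind that only taints with the NoExecute effect evict pods that are already running.

```python
def tolerates(toleration, taint):
    """Return True if one toleration matches one taint."""
    # A toleration with no effect matches taints of every effect.
    if toleration.get("effect") and toleration["effect"] != taint.get("effect"):
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the default
    key = toleration.get("key")
    if not key:
        # An empty key is only meaningful with Exists and matches every key.
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def untolerated_taints(node, pod):
    """List the node taints that none of the pod's tolerations match."""
    taints = node.get("spec", {}).get("taints") or []
    tolerations = pod.get("spec", {}).get("tolerations") or []
    return [t for t in taints if not any(tolerates(tol, t) for tol in tolerations)]


# Hypothetical sample data shaped like the JSON that kubectl returns.
sample_node = {"spec": {"taints": [
    {"key": "node.kubernetes.io/not-ready", "effect": "NoExecute"},
    {"key": "example.com/maintenance", "value": "true", "effect": "NoExecute"},
]}}
sample_pod = {"spec": {"tolerations": [
    # tolerationSeconds means the pod is still evicted once that time is up.
    {"key": "node.kubernetes.io/not-ready", "operator": "Exists",
     "effect": "NoExecute", "tolerationSeconds": 300},
    {"key": "node.kubernetes.io/unreachable", "operator": "Exists",
     "effect": "NoExecute", "tolerationSeconds": 300},
]}}

for taint in untolerated_taints(sample_node, sample_pod):
    print("not tolerated:", taint["key"], taint["effect"])
```

Any NoExecute taint that shows up in that list is a likely candidate for the eviction.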

Certain kinds of system problems cause Kubernetes to apply taints dynamically (this is done by the node controller rather than the scheduler), so it’s definitely possible to see this. Run kubectl describe node XX (k is a common alias for kubectl) to see whether there is memory or disk pressure on the node, which can cause some of these taints to be put on the node.
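
If you would rather script that check than read the describe output by eye, something like the following Python sketch could work. It assumes the node was dumped with kubectl get node XX -o json and loaded as a dict; the function name and the sample data are hypothetical.

```python
PRESSURE_TYPES = {"MemoryPressure", "DiskPressure", "PIDPressure"}


def problem_conditions(node):
    """Return the node conditions that point at pressure or an unready node."""
    problems = []
    for cond in node.get("status", {}).get("conditions", []):
        ctype, status = cond.get("type"), cond.get("status")
        if ctype in PRESSURE_TYPES and status == "True":
            problems.append(ctype)
        elif ctype == "Ready" and status != "True":
            # Ready can be "False" or "Unknown"; both mean trouble.
            problems.append("NotReady")
    return problems


# Hypothetical sample shaped like the status.conditions list of a node.
sample_node = {"status": {"conditions": [
    {"type": "MemoryPressure", "status": "False"},
    {"type": "DiskPressure", "status": "True"},
    {"type": "Ready", "status": "True"},
]}}

print(problem_conditions(sample_node))
```

The conditions only show the current state, so if the pressure has already cleared, the node and pod events are the next place to look.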