"Everything was working... until it wasn't"

If you’ve ever used Kubernetes in a real-world project, you've probably hit an error that made no sense at first glance — a Pod stuck restarting, a Service not routing traffic, or your app mysteriously vanishing from the internet. The good news? You’re not alone.

This guide walks you through the most common Kubernetes problems developers and DevOps teams face — and more importantly, how to fix them quickly. Whether you're new to Kubernetes or scaling your first production app, consider this your essential cheat sheet.

1. CrashLoopBackOff — The Pod That Won’t Stay Alive

Symptoms: Your Pod starts, crashes, restarts, and repeats the loop.

Why it happens:

  • App crashes on startup due to config/env issues
  • Failing health checks (liveness probe in particular)
  • Wrong entrypoint or command in the container

How to fix it:

  1. Check the container logs (add --previous to read the output of the last crashed container):
kubectl logs <pod-name>
  2. Describe the Pod to inspect events and probe settings:
kubectl describe pod <pod-name>
  3. Disable probes temporarily to test if they’re the issue.

Pro Tip: If your app needs time to boot, raise initialDelaySeconds on the liveness probe, or add a startupProbe so liveness checks only begin once the app is up.
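To make the probe tip concrete, here is a minimal sketch that writes a Pod manifest whose liveness probe waits before its first check. The Pod name probe-demo, the nginx image, and the timing values are placeholders chosen for illustration, not recommendations.

```shell
# Write a hypothetical Pod manifest; names, image, and timings are placeholders.
cat > probe-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: web
      image: nginx:1.27
      ports:
        - containerPort: 80
      livenessProbe:
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 30   # wait 30s after container start before the first check
        periodSeconds: 10         # then probe every 10s
        failureThreshold: 3       # restart only after 3 consecutive failures
EOF

# With cluster access, apply it with:
#   kubectl apply -f probe-demo.yaml
echo "wrote probe-demo.yaml"
```

If the restarts stop once the delay is long enough, the probe timing was the likely culprit rather than the app itself.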

2. Pending Pods — Scheduling Never Happens

Symptoms: Your Pod stays in Pending status forever.

Why it happens:

  • No available node has the required resources
  • Node taints without a matching toleration, or a nodeSelector that no node satisfies
  • Wrong affinity/anti-affinity rules

How to fix it:

  • Describe the Pod:
kubectl describe pod <pod-name>

Look in the Events section for FailedScheduling messages such as Insufficient cpu or Insufficient memory.

  • Check node status:
kubectl get nodes
  • Try reducing resource requests or relaxing node constraints.
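The checks above can be bundled into a small helper. This is only a sketch: why_pending is a hypothetical function name, and it assumes kubectl is already pointed at the right cluster.

```shell
# Hypothetical helper: show why a Pod is Pending and what the nodes can offer.
why_pending() {
  pod="$1"
  ns="${2:-default}"

  # Scheduler events for this Pod (reason FailedScheduling).
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod,reason=FailedScheduling"

  # Allocatable CPU/memory per node (total room for Pods, not what is
  # currently free) plus the keys of any taints on each node.
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory,TAINTS:.spec.taints[*].key'
}

# Usage (needs cluster access):
#   why_pending my-pod my-namespace
```

Comparing the Pod's resource requests against the CPU and MEMORY columns, and its tolerations against the TAINTS column, usually narrows down which constraint to relax.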

3. Service Not Routing to Pods

Symptoms: You created a Service, but can’t reach your app.

Why it happens:

  • Selector labels don’t match Pod labels
  • The Service’s targetPort doesn’t match the port the container listens on
  • Pods are not Ready (failing readiness probes)

How to fix it:

  • Confirm Pod labels and Service selectors match:
kubectl get pods --show-labels
kubectl get svc <svc-name> -o yaml
  • Check Service endpoints:
kubectl get endpoints <svc-name>

If the ENDPOINTS column shows <none>, no Ready Pod matches the selector, so the Service has nothing to route to.
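For reference, here is a sketch of a Deployment and Service whose labels, selector, and ports line up. The name web, the label app: web, the nginx image, and the port numbers are all illustrative assumptions.

```shell
# Write a hypothetical Deployment + Service pair; every name and port is a placeholder.
cat > web-service.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web            # Pod label
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                # must equal the Pod label above
  ports:
    - port: 8080            # port clients use on the Service
      targetPort: 80        # must equal the port the container listens on
EOF

# With cluster access:
#   kubectl apply -f web-service.yaml
#   kubectl get endpoints web
echo "wrote web-service.yaml"
```

With a pair like this applied and the Pods Ready, the endpoints listing should show Pod IPs on port 80 instead of <none>.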

4. ImagePullBackOff / ErrImagePull — Container Won’t Start

Symptoms: The Pod is created and scheduled, but its container never starts because the image can’t be pulled.

Why it happens:

  • Wrong image name or tag
  • Image is private and you didn’t configure credentials

How to fix it:

  • Describe the Pod and read the error message:
kubectl describe pod <pod-name>
  • For private registries, create a pull secret:
kubectl create secret docker-registry regcred \
--docker-server=your-registry.com \
--docker-username=user \
--docker-password=pass

Then reference it under imagePullSecrets in your Pod spec. The Secret must live in the same namespace as the Pod.
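A minimal sketch of that reference, assuming the regcred Secret from the command above. The Pod name and the image path under your-registry.com are hypothetical.

```shell
# Write a hypothetical Pod manifest that pulls from a private registry.
cat > private-image-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: private-image-demo
spec:
  imagePullSecrets:
    - name: regcred        # the pull Secret; must be in the Pod's namespace
  containers:
    - name: app
      image: your-registry.com/team/app:1.0.0   # hypothetical image path
EOF

# With cluster access:
#   kubectl apply -f private-image-pod.yaml
echo "wrote private-image-pod.yaml"
```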

5. Deployment Rollout Stuck

Symptoms: Your Deployment is stuck mid-update: some new Pods fail to start while old ones are never terminated.

Why it happens:

  • Invalid image or startup command
  • New Pods failing readiness probes
  • Not enough resources on nodes

How to fix it:

  • Monitor rollout status:
kubectl rollout status deployment <deployment-name>
  • Get detailed info:
kubectl describe deployment <deployment-name>
  • Check Pod logs to diagnose what’s failing.

Tip: As long as the Deployment still has revision history, you can roll back to the previous revision:

kubectl rollout undo deployment <deployment-name>
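The status and undo commands can be combined into a guard for scripts or CI. This is a sketch under assumptions: rollout_or_undo is a hypothetical helper name, and the 120s default timeout is arbitrary.

```shell
# Hypothetical helper: wait for a rollout and roll back if it does not finish in time.
rollout_or_undo() {
  name="$1"
  timeout="${2:-120s}"

  # rollout status exits non-zero if the rollout has not completed by the timeout.
  if kubectl rollout status "deployment/$name" --timeout="$timeout"; then
    echo "rollout of $name completed"
  else
    echo "rollout of $name did not complete within $timeout; rolling back" >&2
    kubectl rollout undo "deployment/$name"
  fi
}

# Usage (needs cluster access):
#   rollout_or_undo my-deployment 180s
```

An automatic rollback like this trades investigation time for availability, so it fits best where a known-good previous revision exists.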

6. Secrets Visible in YAML — A Security Misstep

Symptoms: Secrets are visible in your manifests or Git history.

Why it happens:

  • You manually base64-encoded credentials and committed them in YAML files (base64 is an encoding, not encryption, so anyone with the file can decode it)

How to fix it:

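  • Create Secrets directly with the CLI so the values never land in a manifest, and rotate any credentials that have already been committed: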
kubectl create secret generic db-creds \
--from-literal=username=admin \
--from-literal=password=supersecret
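Workloads can then read the Secret by reference, which keeps the manifest safe to commit. A minimal sketch, assuming the db-creds Secret above; the Pod name, busybox image, and environment variable names are placeholders.

```shell
# Write a hypothetical Pod manifest that reads db-creds by reference only.
cat > db-client-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: db-client
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: db-creds     # Secret name
              key: username      # key inside the Secret
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-creds
              key: password
EOF

# With cluster access:
#   kubectl apply -f db-client-pod.yaml
echo "wrote db-client-pod.yaml"
```

The file contains only the Secret’s name and keys, never the values themselves.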

7. Port Forwarding Doesn’t Work

Symptoms: You run kubectl port-forward but get connection errors.

Why it happens:

  • App isn’t listening on the forwarded port
  • Entrypoint script completes and container exits

How to fix it:

  • Confirm your app listens on the expected port (minimal images may not include netstat; ss -tuln is a common substitute):
kubectl exec -it <pod-name> -- netstat -tuln
  • Ensure the container process stays in foreground

Pro Tip: Avoid launching the main process in the background (&) from your entrypoint unless something supervises it; once the script exits, the container stops.
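One common way to keep the process in the foreground is to end the entrypoint script with exec. The sketch below writes such a script and runs it locally; the file name entrypoint.sh and the setup step are illustrative assumptions.

```shell
# Write a hypothetical entrypoint script.
cat > entrypoint.sh <<'EOF'
#!/bin/sh
set -e

# One-off setup would go here (hypothetical step).
echo "starting: $*"

# exec replaces this shell with the app, so the app runs in the foreground
# as the container's main process and the container lives as long as it does.
exec "$@"
EOF
chmod +x entrypoint.sh

# Local check: the command passed as arguments runs in the foreground.
sh entrypoint.sh echo "app running"
```

In an image, the script would typically be the ENTRYPOINT with the app’s start command passed as its arguments.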

8. RBAC: Access Denied

Symptoms: You see Forbidden errors when running kubectl commands, or workloads fail when they call the Kubernetes API.

Why it happens:

  • User or service account doesn’t have required permissions

How to fix it:

  • Test permissions:
kubectl auth can-i create pods --as=system:serviceaccount:default:my-sa
  • Create a Role and a RoleBinding to grant access to specific actions in a namespace.
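As a sketch of that last step, the manifest below would let the my-sa service account from the can-i check create Pods in the default namespace. The names pod-creator and pod-creator-binding are hypothetical.

```shell
# Write a hypothetical Role + RoleBinding for the my-sa service account.
cat > pod-creator-rbac.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-creator
  namespace: default
rules:
  - apiGroups: [""]          # "" is the core API group, where Pods live
    resources: ["pods"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-creator-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: my-sa
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-creator
EOF

# With cluster access:
#   kubectl apply -f pod-creator-rbac.yaml
echo "wrote pod-creator-rbac.yaml"
```

After applying it, rerunning the can-i command above is a quick way to confirm the grant took effect.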

Quick Commands for Any Issue

kubectl get events --sort-by=.metadata.creationTimestamp
kubectl get pods -o wide
kubectl describe <resource> <name>
kubectl logs <pod-name>
kubectl get all -n <namespace>

Use these to understand what's going wrong — fast.

Final Thoughts: Embrace the Errors

Kubernetes isn’t just a tool — it’s an ecosystem. You’ll encounter bumps, but each error is a window into how it works.

Start with a solid foundation, keep this troubleshooting list nearby, and most importantly — keep building and debugging.

Want to go further? Explore:

  • KodeKloud Free Kubernetes Labs
  • Master Kubernetes Fundamentals with KodeKloud’s Free Email Course

📩 Share this blog with your DevOps team — or bookmark it for your next on-call shift.