Highlights
- Focus: Master verification & speed - 15-20 tasks in 120 mins = ~6-8 mins each
- Setup: Use aliases (k, $do, $now) only if you've practiced them
- YAML Tip: Generate with --dry-run=client -o yaml; never write from scratch
- Docs Shortcut: Use kubectl explain, not web docs
- Verify Everything: Always get, describe, logs, and check Events
- Probes: Liveness restarts, Readiness controls traffic - know the difference
- Multi-Container: Init containers run first, sidecars run alongside - verify both
- ConfigMaps vs Secrets: ConfigMaps for config, Secrets for sensitive data - verify consumption
- Services: Endpoints tell truth; empty endpoints = selector mismatch
- Time Rule: Max 8 mins per question; verify before moving on
- Golden Trick: Imperative → YAML → Apply → Verify = full marks
- Practice Goal: Score 90%+ in mock exams, finish in <100 mins
- Mindset: Speed + verification = success in CKAD
Welcome! You're about to dive into a comprehensive guide to the commands you can lean on to verify your work during the CKAD exam.
Here's the reality: you have 120 minutes for around 15-20 questions, which means about 6-8 minutes per task. Sounds tight? It is! But master these verification techniques, and you'll be confident in finishing with time to spare.
Think of this guide as your exam companion. Each section is designed to be practical and focused on what actually matters during those 2 hours.
Exam Setup - The Speed Booster (Optional, But Highly Recommended!)
Should you set this up?
Honestly, it's up to you! Some folks love aliases and can't imagine working without them. Others prefer typing full commands every time. Here's my take: if you practice with these shortcuts for 2-3 weeks before the exam, they become second nature and can save you 10-20 minutes total. That's huge! But if you try them for the first time on exam day, they'll just confuse you and slow you down.
Good news about autocomplete:
The exam environment already has kubectl autocomplete enabled! So, when you type kubectl get po and hit TAB, it auto-completes to kubectl get pods. This works out of the box - no setup needed. Pretty sweet, right?
The time-saving aliases:
Let me be real with you - typing kubectl many times during the exam gets old fast. That's why I use k as an alias. Same with $do for --dry-run=client -o yaml - typing that several times is mind-numbing. But here's the deal: practice with these for at least 2 weeks before the exam, or don't use them at all. Half-learned shortcuts will trip you up under pressure.
# Set up aliases
cat >> ~/.bashrc << 'EOF'
alias k='kubectl'
alias kgp='kubectl get pods'
alias kgs='kubectl get svc'
alias kd='kubectl describe'
export do='--dry-run=client -o yaml'
export now='--force --grace-period=0'
EOF
source ~/.bashrc
Why these specific aliases?
- k - Because typing "kubectl" 200 times hurts
- kgp and kgs - Your most-used commands deserve shortcuts
- $do - This magical variable generates YAML templates instantly
- $now - Deletes pods immediately without 30-second grace period
Configure vim for YAML editing:
YAML is whitespace-sensitive (2 spaces for indentation, never tabs). Vim's defaults will drive you crazy if you don't fix them:
# YAML editing configuration
cat >> ~/.vimrc << 'EOF'
set number
set tabstop=2
set shiftwidth=2
set expandtab
EOF
Now here's something CRITICAL: aliases only exist in your interactive shell. The moment a command runs from a script or any other non-interactive context, k is undefined and the command fails - so anything you save into a file should spell out the full kubectl command:
✅ Safe everywhere: kubectl run nginx --image=nginx $do > pod.yaml
❌ Breaks in scripts: k run nginx --image=nginx $do > pod.yaml (the alias isn't expanded there)
The $do variable works in both cases because it's an exported environment variable, but k is a shell alias and only works where you defined it.
Recommendation:
Spend your first 2-3 minutes of the exam setting these up if you've practiced with them. If you haven't practiced, skip them entirely and just use full commands with autocomplete. A familiar workflow beats a faster unfamiliar one every single time.
1. Application Design and Build (20% of the exam)
This section tests your ability to work with container images, choose the right workload resource, implement multi-container patterns, and utilize storage. You need to verify pods are using correct images, workloads are running as expected, containers are communicating properly, and volumes are persisting data.
Container Images
What you're really checking: The correct image (with right tag) is being used, image pull is succeeding, and containers are starting properly. A wrong tag or missing image pull secret will stop everything.
# Check which image the pod is actually using
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].image}'
# Shows exact image with tag - verify it matches requirements
# Check image pull status
kubectl describe pod <pod-name>
# Look for Events: ErrImagePull or ImagePullBackOff
# Verify image pull secrets are configured
kubectl get pod <pod-name> -o jsonpath='{.spec.imagePullSecrets[*].name}'
# Should show secret name if using private registry
# Check image pull policy
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].imagePullPolicy}'
# Always/IfNotPresent/Never - affects caching behavior
# Verify container command and args (overrides image defaults)
kubectl get pod <pod-name> -o yaml | grep -A 5 "command:\|args:"
# Check if startup command is correct
Scenario: "Create a pod with nginx:1.19 image." After creating, verify the exact image version is correct - nginx:latest when you needed nginx:1.19 = zero points.
Troubleshooting image issues:
If ImagePullBackOff:
kubectl describe pod <pod-name> | grep -A 10 Events
# Common errors:
# - "manifest unknown" = image doesn't exist at registry
# - "unauthorized" = missing or wrong image pull secret
# - "connection refused" = registry unreachable
Workload Resources
Critical understanding: Deployments for stateless apps with rolling updates. DaemonSets for one-per-node tasks (logging/monitoring). StatefulSets for stateful apps needing persistent identity. Jobs for run-to-completion tasks. CronJobs for scheduled tasks. Choosing wrong type = wrong answer.
Deployments
# Check deployment status
kubectl get deployment <name>
# READY should match desired (3/3), not less (2/3)
# Watch rollout in real-time
kubectl rollout status deployment/<name>
# Shows update progress: "2 out of 3 new replicas have been updated..."
# Check ReplicaSets (old and new)
kubectl get rs
# After update: new RS has all pods, old RS scaled to 0
# Verify pods are running with correct labels
kubectl get pods -l app=<label> -o wide
# Count should match replicas, STATUS=Running
# Check deployment strategy
kubectl get deployment <name> -o yaml | grep -A 5 strategy
# RollingUpdate: maxSurge and maxUnavailable control update speed
Exam task: "Create deployment with 3 replicas, update image, verify rollout."
Verification workflow:
kubectl create deployment web --image=nginx:1.18 --replicas=3
kubectl get deployment web # READY 3/3
kubectl set image deployment/web nginx=nginx:1.19
kubectl rollout status deployment/web # Watch update
kubectl get pods -l app=web # All running new version
Rolling updates & rollbacks:
# Update image
kubectl set image deployment/web nginx=nginx:1.20
# Check history
kubectl rollout history deployment/web
# Shows revision numbers
# Rollback if needed
kubectl rollout undo deployment/web
# Goes back to previous revision
# Rollback to specific revision
kubectl rollout undo deployment/web --to-revision=2
# Pause/resume rollout
kubectl rollout pause deployment/web
kubectl rollout resume deployment/web
Blue/Green & Canary Strategies:
# Blue/Green: Two deployments, switch service selector
kubectl get deployment # Should see both blue and green
kubectl describe svc web # Check selector points to correct version
kubectl get endpoints web # Verify correct pods receiving traffic
# Canary: Small canary deployment alongside main
kubectl get pods -l version=canary # Canary pods exist
kubectl get pods -l version=main # Main pods exist
kubectl describe svc web # Service routes to both (weighted by count)
DaemonSets
# Check DaemonSet status
kubectl get daemonsets
# DESIRED should equal number of nodes, READY should match
# Verify one pod per node
kubectl get pods -o wide -l <daemonset-label>
# Check NODE column - each node should have one pod
# Check for scheduling issues
kubectl describe ds <name>
# Events show: taints, insufficient resources, node selectors
Exam gotcha: If READY < DESIRED, check node taints. DaemonSet pods need tolerations to schedule on tainted nodes.
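If the task expects DaemonSet pods on tainted nodes, the usual fix is a toleration in the pod template. A minimal sketch - the key and effect below are placeholders; copy the real taint from kubectl describe nodes | grep Taints:
# Illustrative toleration under spec.template.spec of the DaemonSet
tolerations:
- key: node-role.kubernetes.io/control-plane   # example key - use the taint your nodes actually carry
  operator: Exists
  effect: NoSchedule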
StatefulSets
# Check StatefulSet and pod status
kubectl get statefulset <name>
# READY 2/2 means both pods ready
# Verify predictable pod names
kubectl get pods -l app=<label>
# Should see: web-0, web-1, web-2 (ordered names)
# Check PVCs for each pod
kubectl get pvc -l app=<label>
# Each pod should have its own Bound PVC
# Verify ordered startup/update
kubectl get pods -w
# Pods start in order: 0 → 1 → 2
# Updates happen in reverse: 2 → 1 → 0
Critical for StatefulSets: If web-0 isn't Ready, web-1 won't start. Sequential dependency is key.
Jobs
# Check job completion
kubectl get jobs
# COMPLETIONS 1/1 means success
# Check job pods
kubectl get pods -l job-name=<job>
# STATUS should be Completed (not Running, not Failed)
# View job output
kubectl logs job/<job-name>
# Shows what the job did
# Check for failures
kubectl describe job <name>
# Events show: pod failures, backoffLimit exceeded
Scenario: "Create job that runs busybox command." After creation, verify pod completed successfully and check logs for expected output.
CronJobs
# Check CronJob schedule and status
kubectl get cronjob
# Note SCHEDULE (cron format), LAST SCHEDULE, ACTIVE jobs
# Verify CronJob created a job
kubectl get jobs | grep <cronjob-name>
# Should see jobs with timestamp suffixes
# Test without waiting for schedule
kubectl create job --from=cronjob/<name> test-run
kubectl get jobs test-run # Verify it ran
kubectl logs job/test-run # Check output
# Check history limits
kubectl get cronjob <name> -o yaml | grep History
# successfulJobsHistoryLimit and failedJobsHistoryLimit
Time-saver: Always test CronJobs with manual job creation - don't wait for the schedule!
Multi-Container Pod Patterns
The big picture: Init containers run sequentially before app containers (setup tasks). Sidecars run alongside app containers (logging, proxying). Ambassador pattern provides proxy to external services. Adapter pattern transforms output format. You must verify both types work correctly.
Init Containers
# Check pod status during init
kubectl get pods
# STATUS shows: Init:0/2, Init:1/2 (init containers running)
# READY shows: 0/1 (main containers not started yet)
# Verify init container completion
kubectl describe pod <name>
# Look at Init Containers section:
# State: Terminated, Reason: Completed, Exit Code: 0 = success
# Check init container logs
kubectl logs <pod> -c <init-container-name>
# Shows what init container did
# If init failed
kubectl describe pod <name> | grep -A 20 "Init Containers"
# Shows which init container failed and why
Exam verification: After creating pod with init containers, confirm all inits show "Terminated - Completed" before main container starts.
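If you have to write the manifest yourself, a minimal pod with one init container might look like this (pod name, images, and the db-service it waits for are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init            # hypothetical name
spec:
  initContainers:
  - name: wait-for-db
    image: busybox:1.28
    command: ['sh', '-c', 'until nslookup db-service; do sleep 2; done']   # setup task runs to completion first
  containers:
  - name: app
    image: nginx                 # main container starts only after all inits succeed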
Sidecar Containers
# Verify all containers running
kubectl get pods
# READY should show 2/2 (both containers) not 1/2
# Check each container's status
kubectl describe pod <name>
# Shows status for each container separately
# View logs from specific container
kubectl logs <pod> -c <sidecar-container>
# Check sidecar is doing its job
# Verify shared volume between containers
kubectl exec <pod> -c <main-container> -- ls /shared
kubectl exec <pod> -c <sidecar> -- cat /shared/file
# Confirms containers can share data via volume
Common exam task: "Add logging sidecar to application pod." Verify both containers running and sidecar accessing app logs.
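A minimal sketch of that sidecar pattern, assuming the app writes to /var/log/app and the sidecar tails the file (names, images, and paths are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar         # hypothetical name
spec:
  volumes:
  - name: logs
    emptyDir: {}                 # shared scratch volume
  containers:
  - name: app
    image: busybox:1.28
    command: ['sh', '-c', 'while true; do date >> /var/log/app/app.log; sleep 5; done']
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-sidecar
    image: busybox:1.28
    command: ['sh', '-c', 'tail -F /var/log/app/app.log']   # reads what the app writes
    volumeMounts:
    - name: logs
      mountPath: /var/log/app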
Persistent and Ephemeral Volumes
Critical distinction: PersistentVolumes survive pod deletion (databases, user uploads). EmptyDir is ephemeral (scratch space, caching). You must verify data persists where it should and is ephemeral where it should be.
PersistentVolumeClaims
# Check PVC status
kubectl get pvc
# STATUS must be Bound, not Pending
# Verify PVC details
kubectl describe pvc <name>
# Check: StorageClass, Access Modes, Capacity, Volume (which PV it's bound to)
# If stuck in Pending
kubectl describe pvc <name> | grep -A 10 Events
# Common issues: no matching PV, no default StorageClass, provisioner not running
# Check available StorageClasses
kubectl get storageclass
# Verify SC exists and is default if needed
Verification workflow:
# After creating PVC
kubectl get pvc <name> --watch
# Should go Pending → Bound within 10 seconds
# If stays Pending > 10 seconds
kubectl get pv # Check if any PV matches requirements
kubectl get sc # Verify StorageClass exists
kubectl get pods -n kube-system | grep provisioner # Check provisioner running
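For orientation, a minimal PVC plus a pod that mounts it might look like this (names, size, and mount path are illustrative; the sketch assumes a default StorageClass exists):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                 # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: data-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /mnt/data       # where the data lands inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc        # must match the PVC name exactly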
Verify Volume in Pod
# Check volume mount in pod
kubectl describe pod <name>
# Look for: Volumes section (PVC referenced), Mounts section (mount path)
# Test data persistence
kubectl exec <pod> -- sh -c "echo test > /mnt/data/test.txt"
kubectl exec <pod> -- cat /mnt/data/test.txt # Should show "test"
# Delete and recreate pod
kubectl delete pod <pod>
# Wait for new pod
kubectl exec <new-pod> -- cat /mnt/data/test.txt
# Data should still be there = true persistence!
Exam gotcha: If data disappears after pod restart, you're using EmptyDir not PVC.
EmptyDir Volumes
# Check for EmptyDir in pod spec
kubectl get pod <name> -o yaml | grep -A 5 "emptyDir:"
# Verify containers can share EmptyDir
kubectl exec <pod> -c container1 -- sh -c "echo shared > /cache/data"
kubectl exec <pod> -c container2 -- cat /cache/data # Should show "shared"
Use cases: EmptyDir is perfect for:
- Scratch space for computational tasks
- Cache that doesn't need to survive restarts
- Sharing files between init container and main container
2. Application Deployment (20% of the exam)
This section tests deployment strategies, Helm usage, and Kustomize. You need to verify rolling updates work correctly, blue/green or canary deployments route traffic properly, Helm releases are installed successfully, and Kustomize overlays apply correct configurations.
Deployment Strategies
Blue/Green Deployment:
# Verify both deployments exist
kubectl get deployment blue green
# Both should show READY X/X
# Check which version service is routing to
kubectl describe svc app | grep Selector
# Selector should match either blue or green labels
# Verify endpoints
kubectl get endpoints app
# Should only show IPs from active version (blue or green)
# Switch traffic (update service selector)
kubectl patch svc app -p '{"spec":{"selector":{"version":"green"}}}'
kubectl get endpoints app # Now shows green pods only
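Behind that patch is nothing more than the Service selector. A minimal sketch, assuming hypothetical pod labels app: myapp plus version: blue or version: green:
apiVersion: v1
kind: Service
metadata:
  name: app
spec:
  selector:
    app: myapp
    version: blue                # flip to "green" (edit or patch) to switch all traffic
  ports:
  - port: 80
    targetPort: 8080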
Canary Deployment:
# Verify main and canary deployments running
kubectl get deployment main canary
# Both should be running
# Check total pod count (controls traffic split)
kubectl get pods -l app=myapp
# e.g., 9 main pods + 1 canary pod = 10% canary traffic
# Verify service routes to both
kubectl get endpoints myapp
# Should show IPs from both main and canary pods
# If canary succeeds, scale up canary and down main
kubectl scale deployment canary --replicas=10
kubectl scale deployment main --replicas=0
kubectl get endpoints myapp # Now all canary
Exam verification: Always check endpoints to confirm which pods are actually receiving traffic.
Helm Package Manager
# List installed releases
helm list -n <namespace>
# Check STATUS column (should be "deployed" not "failed")
# Check release details
helm status <release-name> -n <namespace>
# Shows: deployment status, NOTES (how to access app)
# Verify Kubernetes resources created
kubectl get all -n <namespace> -l app.kubernetes.io/instance=<release-name>
# Shows pods, services, deployments created by Helm
# View values used
helm get values <release-name> -n <namespace>
# Shows only user-supplied values
helm get values <release-name> -n <namespace> --all
# Shows all values including defaults
# Check specific values were applied
kubectl get deployment <name> -o yaml | grep <expected-value>
Scenario: "Install nginx-ingress using Helm, verify it's running."
Verification workflow:
helm install my-nginx ingress-nginx/ingress-nginx -n ingress-nginx --create-namespace
helm list -n ingress-nginx # STATUS should be "deployed"
kubectl get pods -n ingress-nginx # Pods should be Running
kubectl get svc -n ingress-nginx # Service should have external endpoint
Helm upgrades and rollbacks:
# Upgrade release
helm upgrade <release> <chart> -n <namespace> --set key=newvalue
# Verify upgrade
helm list -n <namespace> # Check REVISION increased
kubectl get pods -n <namespace> # Check new pods running
# If upgrade failed, rollback
helm rollback <release> -n <namespace>
helm rollback <release> <revision> -n <namespace> # Specific revision
# Check rollback succeeded
helm history <release> -n <namespace> # Shows all revisions
Common issues:
# Chart not found
helm repo add <repo-name> <url>
helm repo update
helm search repo <chart-name>
# Values not applied
helm get values <release> -n <namespace> # Verify values
kubectl get deployment <name> -o yaml # Check actual config
Kustomize
# Preview what will be created (BEFORE applying)
kubectl kustomize <directory>
# Shows final YAML after all patches/overlays applied
# Apply kustomization
kubectl apply -k <directory>
# -k flag tells kubectl to use Kustomize
# Verify resources created
kubectl get all -n <namespace>
# Check specific customizations were applied
kubectl get deployment <name> -o jsonpath='{.spec.replicas}'
# Verify replica count matches overlay
kubectl get pod <pod> -o jsonpath='{.spec.containers[0].image}'
# Verify image tag was customized
kubectl get deployment <name> --show-labels
# Verify commonLabels were added
Scenario: "Use Kustomize to deploy app with dev overlay, verify 2 replicas."
Verification workflow:
# First preview
kubectl kustomize overlays/dev
# Check output shows 2 replicas
# Apply
kubectl apply -k overlays/dev
# Verify
kubectl get deployment app -o jsonpath='{.spec.replicas}' # Should show "2"
kubectl get pods -l app=myapp # Should see 2 running pods
Testing overlays:
# Compare different environments
kubectl kustomize overlays/dev > dev.yaml
kubectl kustomize overlays/prod > prod.yaml
diff dev.yaml prod.yaml
# Shows differences between environments
# Verify patches applied
kubectl kustomize overlays/prod | grep -A 5 resources
# Check resource limits, replicas, image tags differ from base
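For reference, a dev overlay's kustomization.yaml could look roughly like this - the paths, names, and values are illustrative, not a required layout:
# overlays/dev/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base                     # reuse the base manifests
commonLabels:
  env: dev
replicas:
- name: app                      # deployment to patch
  count: 2
images:
- name: nginx
  newTag: "1.19"                 # override the image tag for this environment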
3. Application Observability and Maintenance (15% of the exam)
This section tests API deprecations, probes, monitoring tools, logs, and debugging. You need to handle API version changes, verify probes are working, monitor resource usage, collect logs effectively, and debug failing applications.
API Deprecations
# Check available API versions
kubectl api-resources
# Shows current API groups and versions for all resources
# Find deprecated APIs in use
kubectl get deployment -o yaml | grep apiVersion
# Check if using old version like apps/v1beta1
# Verify current API version for resource type
kubectl explain deployment
# Shows: VERSION apps/v1 (current stable version)
# Find all deprecated APIs in cluster
kubectl get all --all-namespaces -o yaml | grep "apiVersion:" | sort | uniq
# Compare against Kubernetes version docs
Common deprecations to know:
- Deployments: apps/v1beta1, apps/v1beta2 → apps/v1
- Ingress: extensions/v1beta1 → networking.k8s.io/v1
- PodSecurityPolicy: removed in 1.25 → use Pod Security Standards
Updating deprecated APIs:
# Export resource with old API
kubectl get deployment <name> -o yaml > deploy.yaml
# Edit file: Change apiVersion to current version
# apps/v1beta2 → apps/v1
# Validate before applying
kubectl apply --dry-run=client -f deploy.yaml
# Shows any validation errors
# Apply updated version
kubectl apply -f deploy.yaml
# Verify it works
kubectl get deployment <name>
kubectl describe deployment <name> # Check events
Probes and Health Checks
Critical understanding: Liveness probes restart containers when they fail (app crashed, deadlocked). Readiness probes control traffic routing (app not ready to serve yet). Startup probes give slow-starting apps time to start before liveness checks begin.
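When a task asks you to add probes, the container spec you end up editing might look roughly like this - path, port, and timings are assumptions to adjust to the question:
# Illustrative probe block inside spec.containers[]
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10              # failing this only removes the pod from endpoints
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30        # give the app time to start before restarts kick in
  periodSeconds: 10              # failing this restarts the container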
Readiness Probes
# Check if pod is Ready
kubectl get pods
# READY 1/1 = passing readiness probe
# READY 0/1 = failing readiness probe (not getting traffic)
# Check why probe is failing
kubectl describe pod <name> | grep -A 10 Readiness
# Shows: probe type, path/port, delay, period, timeout
kubectl describe pod <name> | grep -A 20 Events
# Look for: "Readiness probe failed: Get http://...: connection refused"
# Check service endpoints
kubectl get endpoints <service>
# Pods failing readiness aren't in endpoints list
Scenario: "Pod is Running but service returns no response."
Investigation:
kubectl get pods # Shows 0/1 Ready
kubectl describe pod <name> # Readiness probe failing: HTTP 500
kubectl logs <pod> # App errors during startup
# Fix: Update application or increase initialDelaySeconds
Liveness Probes
# Check restart count
kubectl get pods
# RESTARTS column - if increasing, liveness probe is failing
# See liveness probe failures
kubectl describe pod <name> | grep -A 10 Liveness
# Shows probe configuration
kubectl describe pod <name> | grep -A 20 Events
# Look for: "Liveness probe failed: HTTP probe failed with statuscode: 404"
# Followed by: "Killing" and "Started" events
# Check last container state
kubectl describe pod <name> | grep -A 10 "Last State"
# Shows why container was killed (Reason: Error, Exit Code: 137)
Common liveness probe mistake:
# Probe too aggressive (kills healthy containers)
kubectl describe pod <name> | grep -A 5 Liveness
# Liveness: http-get http://:8080/health delay=0s period=5s timeout=1s
# Problem: No initialDelaySeconds! Probe starts immediately
# Solution: Add initialDelaySeconds to give app time to start
kubectl edit deployment <name>
# Set initialDelaySeconds: 30 # Wait 30s before first check
Testing Probes Manually
# Simulate HTTP probe
kubectl exec <pod> -- wget -qO- http://localhost:8080/health
# Check status code and response
kubectl exec <pod> -- curl -v http://localhost:8080/health
# Verbose output shows headers, status
# Simulate TCP probe
kubectl exec <pod> -- telnet localhost 8080
# Check if port is open
# For exec probes, run the command
kubectl exec <pod> -- cat /tmp/healthy
# Check if file exists (common liveness check)
Built-in CLI Monitoring Tools
# Node resource usage (requires metrics-server)
kubectl top nodes
# Shows CPU and memory usage per node
# Pod resource usage
kubectl top pods -n <namespace>
# Shows CPU and memory per pod
kubectl top pods --all-namespaces --sort-by=cpu
# Find CPU-intensive pods across cluster
kubectl top pods --all-namespaces --sort-by=memory --containers
# Show per-container usage
# Watch pod status in real-time
kubectl get pods -w
# -w flag watches for changes, updates automatically
# Custom columns for specific info
kubectl get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,RESTARTS:.status.containerStatuses[0].restartCount
# Quick view of problem indicators
# Check events for issues
kubectl get events -n <namespace> --sort-by='.lastTimestamp'
# Recent events show what's happening
kubectl get events --field-selector type=Warning
# Show only warnings (problem indicators)
If kubectl top doesn't work: Metrics Server not installed. Not an error, just unavailable. Use describe and logs instead.
Container Logs
# View pod logs
kubectl logs <pod-name>
# Shows stdout/stderr from main container
# Multi-container pod logs
kubectl logs <pod-name> -c <container-name>
# Specify which container
# Follow logs in real-time
kubectl logs -f <pod-name>
# Like tail -f, streams logs continuously
# Previous container logs (CRITICAL for crashes!)
kubectl logs <pod-name> --previous
# Shows logs from crashed/restarted container
# Last N lines
kubectl logs <pod-name> --tail=50
# Shows only last 50 lines
# Since specific time
kubectl logs <pod-name> --since=5m
# Logs from last 5 minutes
# With timestamps
kubectl logs <pod-name> --timestamps
# Adds timestamp to each line
# All containers in pod
kubectl logs <pod-name> --all-containers=true
# Combines logs from all containers
# Logs from label selector
kubectl logs -l app=myapp --all-containers=true
# Logs from all pods matching label
Exam critical command: For CrashLoopBackOff pods, ALWAYS use --previous flag!
# Wrong (shows nothing - new container hasn't started)
kubectl logs <crashing-pod>
# Right (shows why it crashed)
kubectl logs <crashing-pod> --previous
Debugging in Kubernetes
Debug Pod Status Issues
Pending Pods:
# Check why pod won't schedule
kubectl describe pod <pod-name> | grep -A 10 Events
# Look for: "FailedScheduling", "Insufficient cpu/memory"
# Check node resources
kubectl top nodes # If metrics-server available
kubectl describe nodes | grep -A 10 "Allocated resources"
# Check PVC status (if pod uses volumes)
kubectl get pvc
# STATUS should be Bound, not Pending
# Check image pull secrets
kubectl describe pod <pod-name> | grep -A 5 "Image:"
# Events show: "ImagePullBackOff", "ErrImagePull"
CrashLoopBackOff:
# Step 1: Check logs from crashed container
kubectl logs <pod-name> --previous
# Step 2: Check why it's crashing
kubectl describe pod <pod-name> | grep -A 20 Events
# Look for: OOMKilled, Error, liveness probe failures
# Step 3: Check exit code
kubectl describe pod <pod-name> | grep "Exit Code"
# 0 = success, 1 = app error, 137 = OOMKilled, 143 = SIGTERM
Debug Running Pods
# Exec into container
kubectl exec -it <pod-name> -- /bin/sh
# Opens interactive shell inside container
# Check environment variables
kubectl exec <pod-name> -- printenv
# Verify CONFIG vars are set
# Check mounted files
kubectl exec <pod-name> -- ls -la /app/config
# Verify ConfigMap/Secret files exist
# Test network connectivity
kubectl exec <pod-name> -- curl http://other-service:8080
kubectl exec <pod-name> -- ping other-service
Debug Network Issues
# Create debug pod with network tools
kubectl run netshoot --rm -it --image=nicolaka/netshoot -- /bin/bash
# Inside pod:
# curl http://service-name:port
# nslookup service-name
# dig service-name.namespace.svc.cluster.local
# Test service DNS
kubectl exec <pod> -- nslookup <service-name>
# Should return service ClusterIP
# Check service endpoints
kubectl get endpoints <service-name>
# Empty = no pods match service selector
# Verify with: kubectl describe svc <name> | grep Selector
# kubectl get pods --selector=<selector>
# Port forward for direct testing
kubectl port-forward pod/<pod-name> 8080:80
# Test from your machine: curl localhost:8080
Debug Services
# Verify service and endpoints
kubectl get svc,endpoints <service-name>
# Endpoints should list pod IPs
# Check selector matches pods
kubectl describe svc <service-name> | grep Selector
kubectl get pods --selector=<key>=<value>
# Count should match
# Check if pods are Ready
kubectl get pods -l <selector>
# Pods must be Ready to appear in endpoints
# Verify targetPort matches container port
kubectl describe svc <service-name> | grep TargetPort
kubectl describe pod <pod-name> | grep -A 5 "Ports:"
# Must match exactly
4. Application Environment, Configuration and Security (25% of the exam)
This is the biggest section! It covers CRDs/Operators, RBAC, resource management, ConfigMaps, Secrets, ServiceAccounts, and security contexts. You need to verify custom resources work, permissions are correct, resources are within limits, configs are injected properly, and security settings are enforced.
Custom Resources (CRDs) and Operators
# Check if CRD exists
kubectl get crd
kubectl get crd <crd-name>
# Describe CRD
kubectl describe crd <crd-name>
# Shows: versions, scope (Namespaced/Cluster), stored versions
# List custom resources
kubectl get <crd-plural-name>
# Example: kubectl get postgresqls
# Check custom resource status
kubectl get <crd-kind> <name> -o yaml | grep -A 10 status
# Many operators update status with Ready condition
kubectl describe <crd-kind> <name>
# Shows status and events
# Verify operator created resources
kubectl get all -n <namespace>
# Operator should have created deployments/pods/services based on CR
Scenario: "Create a MySQL custom resource using provided CRD."
Verification:
kubectl apply -f mysql-cr.yaml
kubectl get mysql my-database # Verify CR created
kubectl describe mysql my-database # Check status: Ready=True
kubectl get statefulset # Operator should have created StatefulSet
kubectl get pods -l app=mysql # MySQL pods should be running
Authentication, Authorization, and Admission Control
The three A's: Authentication (who are you?), Authorization (what can you do?), Admission Control (is your request valid?). You mainly verify RBAC (authorization) and admission effects (quotas, pod security).
RBAC Verification
# Test permissions (THE MOST IMPORTANT COMMAND!)
kubectl auth can-i <verb> <resource>
# Returns "yes" or "no"
kubectl auth can-i get pods
kubectl auth can-i create deployments
kubectl auth can-i delete secrets -n kube-system
# Test as another user/service account
kubectl auth can-i get pods --as user1
kubectl auth can-i list secrets --as system:serviceaccount:default:myapp-sa -n default
# List all permissions
kubectl auth can-i --list
# Shows everything you can do
# Check Role/ClusterRole permissions
kubectl describe role <role-name> -n <namespace>
kubectl describe clusterrole <clusterrole-name>
# Look at Rules: verbs, resources, apiGroups
# Check RoleBinding/ClusterRoleBinding
kubectl describe rolebinding <name> -n <namespace>
# Check: Subjects (who), RoleRef (which role)
kubectl get rolebinding -n <namespace> -o wide
# Quick view of all bindings
Exam task: "Create Role allowing reading pods/services, bind to user 'dev'."
Verification workflow:
# After creating Role and RoleBinding
kubectl describe role pod-reader -n app
# Verify rules allow get, list, watch on pods and services
kubectl describe rolebinding dev-binding -n app
# Verify subject is user 'dev' and roleRef is 'pod-reader'
# Test permissions
kubectl auth can-i get pods --as dev -n app # Should return "yes"
kubectl auth can-i delete pods --as dev -n app # Should return "no"
kubectl auth can-i get secrets --as dev -n app # Should return "no"
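For reference, the Role and RoleBinding behind that task might look like this (namespace and names follow the scenario above and are otherwise illustrative):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: app
rules:
- apiGroups: [""]                # core API group covers pods and services
  resources: ["pods", "services"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-binding
  namespace: app
subjects:
- kind: User
  name: dev
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io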
Admission Control Verification
ResourceQuota:
# Check quota limits and usage
kubectl describe quota -n <namespace>
# Shows: Resource, Used, Hard
# Example output:
# Name: compute-quota
# Resource Used Hard
# -------- --- ---
# pods 3 10
# cpu 800m 2
# memory 1Gi 4Gi
# If quota exceeded
kubectl apply -f pod.yaml
# Error: exceeded quota: compute-quota, requested: cpu=500m, used: cpu=800m, limited: cpu=2
# Fix: Reduce requests or delete other pods
kubectl get pods --sort-by='.status.phase'
kubectl delete pod <name>
LimitRange:
# Check default limits
kubectl describe limitrange -n <namespace>
# Shows: default, defaultRequest, max, min per container
# Verify defaults were applied to pod
kubectl describe pod <name> | grep -A 10 "Limits\|Requests"
# Check if values match LimitRange defaults
Pod Security Standards:
# Check namespace labels
kubectl get ns <namespace> -o yaml | grep pod-security
# Look for: pod-security.kubernetes.io/enforce: restricted
# If pod is rejected
kubectl apply -f pod.yaml
# Error: pods "test" is forbidden: violates PodSecurity "restricted:latest"
# must set securityContext.runAsNonRoot=true
# Fix: Add required security settings to pod spec
Resource Requests, Limits, and Quotas
# Check pod resource settings
kubectl describe pod <name> | grep -A 10 "Limits\|Requests"
# Shows: cpu and memory requests/limits per container
# Check QoS class
kubectl get pod <name> -o jsonpath='{.status.qosClass}'
# Guaranteed (requests=limits), Burstable (requests<limits), BestEffort (none)
# Check if pod was OOMKilled
kubectl describe pod <name> | grep "OOMKilled"
# Last State: Terminated, Reason: OOMKilled
# Check node resource allocation
kubectl describe nodes | grep -A 10 "Allocated resources"
# Shows CPU/Memory allocated vs capacity
# If pod won't schedule
kubectl describe pod <name> | grep "FailedScheduling"
# 0/3 nodes available: 3 Insufficient cpu
# Solution: Lower requests or add nodes
Exam task: "Create pod with CPU request 100m, memory limit 256Mi."
Verification:
kubectl describe pod <name> | grep -A 5 "Requests:"
# cpu: 100m
kubectl describe pod <name> | grep -A 5 "Limits:"
# memory: 256Mi
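The container spec for that task might contain something like this (only the values the task names; everything else is left to defaults):
# Illustrative resources block inside spec.containers[]
resources:
  requests:
    cpu: 100m
  limits:
    memory: 256Mi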
ConfigMaps
# Verify ConfigMap data
kubectl get configmap <name> -o yaml
# Check data: keys and values
kubectl describe configmap <name>
# Shows keys and sizes (not full values)
# Check ConfigMap consumed as env vars
kubectl describe pod <name> | grep -A 10 "Environment:"
# Shows env vars and their sources
kubectl exec <pod> -- printenv | grep <KEY_NAME>
# Verify actual value in container
# Check ConfigMap mounted as files
kubectl describe pod <name> | grep -A 10 "Mounts:"
# Shows mount path
kubectl exec <pod> -- ls /etc/config
# List files (each key becomes a file)
kubectl exec <pod> -- cat /etc/config/<key>
# Read file content (should match ConfigMap value)
# If pod won't start
kubectl describe pod <name> | grep "ConfigMap"
# Events show: "MountVolume.SetUp failed ... configmap "app-config" not found"
Scenario: "Create ConfigMap with key 'db_host=mysql', consume as env var."
Verification:
kubectl create configmap db-config --from-literal=db_host=mysql
kubectl get configmap db-config -o yaml # Verify data
# After creating pod with envFrom
kubectl exec <pod> -- printenv db_host # Should output "mysql"
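The pod side of that scenario might look like this (pod name and image are illustrative; envFrom pulls in every key of the ConfigMap as an env var):
apiVersion: v1
kind: Pod
metadata:
  name: db-client                # hypothetical name
spec:
  containers:
  - name: app
    image: busybox:1.28
    command: ['sh', '-c', 'sleep 3600']
    envFrom:
    - configMapRef:
        name: db-config          # db_host=mysql becomes an env var in the container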
Secrets
# View secret (base64 encoded)
kubectl get secret <name> -o yaml
# Data is base64 encoded
# Decode secret value
kubectl get secret <name> -o jsonpath='{.data.password}' | base64 -d
# Shows actual password
# Describe secret (safer, doesn't show values)
kubectl describe secret <name>
# Shows: Type, keys, sizes
# Check secret type
kubectl get secret <name> -o jsonpath='{.type}'
# Opaque, kubernetes.io/tls, kubernetes.io/dockerconfigjson
# Verify secret consumed as env var
kubectl exec <pod> -- printenv SECRET_KEY
# Shows actual value (not base64)
# Verify secret mounted as files
kubectl exec <pod> -- ls /etc/secrets
kubectl exec <pod> -- cat /etc/secrets/password
# For TLS secrets
kubectl exec <pod> -- cat /etc/tls/tls.crt | openssl x509 -noout -text
# Verify certificate details
Exam task: "Create secret with password, mount as file in pod."
Verification:
kubectl create secret generic db-secret --from-literal=password=secret123
kubectl get secret db-secret -o jsonpath='{.data.password}' | base64 -d
# Should output: secret123
# After creating pod
kubectl exec <pod> -- cat /etc/secrets/password
# Should output: secret123 (already decoded)
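And the pod side of that task might look like this (mount path and names are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: secret-consumer          # hypothetical name
spec:
  containers:
  - name: app
    image: busybox:1.28
    command: ['sh', '-c', 'sleep 3600']
    volumeMounts:
    - name: secret-vol
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-vol
    secret:
      secretName: db-secret      # each key (password) becomes a file under /etc/secrets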
ServiceAccounts
# Verify ServiceAccount exists
kubectl get serviceaccount <name> -n <namespace>
# Check which SA pod is using
kubectl describe pod <name> | grep "Service Account:"
# Should show your custom SA, not "default"
# Verify SA token is mounted
kubectl describe pod <name> | grep -A 5 "Mounts:"
# Should see: /var/run/secrets/kubernetes.io/serviceaccount
kubectl exec <pod> -- cat /var/run/secrets/kubernetes.io/serviceaccount/token
# Shows JWT token
# Check SA has needed permissions
kubectl auth can-i get pods --as system:serviceaccount:default:myapp-sa
# Test each permission the app needs
# Verify automountServiceAccountToken
kubectl get pod <name> -o jsonpath='{.spec.automountServiceAccountToken}'
# true or false
Scenario: "Create ServiceAccount 'app-sa', use in pod, verify it can list pods."
Verification workflow:
kubectl create serviceaccount app-sa
kubectl get sa app-sa # Verify exists
# Create Role and RoleBinding (if needed)
# ...
# After creating pod with serviceAccountName: app-sa
kubectl describe pod <name> | grep "Service Account:"
# Should show: app-sa
kubectl auth can-i list pods --as system:serviceaccount:default:app-sa
# Should return "yes" (if proper RBAC configured)
Application Security (SecurityContexts, Capabilities)
# Check pod security context
kubectl describe pod <name> | grep -A 10 "Security Context"
# Pod-level: fsGroup, runAsUser, runAsNonRoot
# Container-level: runAsUser, capabilities, privileged
# Verify running user
kubectl exec <pod> -- id
# Shows: uid=1000 gid=1000 (should match runAsUser)
# Check if running as non-root
kubectl exec <pod> -- id -u
# Should be non-zero (not 0)
# Verify capabilities
kubectl get pod <name> -o yaml | grep -A 5 capabilities
# Shows: add and drop lists
# Test dropped capability (e.g., NET_RAW)
kubectl exec <pod> -- ping 8.8.8.8
# Should fail with "operation not permitted" if NET_RAW dropped
# Verify fsGroup on mounted volumes
kubectl exec <pod> -- ls -l /mnt/data
# Group ownership should match fsGroup
# Check if pod violates security standards
kubectl apply -f pod.yaml
# Error: must set securityContext.runAsNonRoot=true
# Error: must not set securityContext.privileged=true
Exam task: "Create pod running as user 1000, drop all capabilities."
Verification:
# After creating pod
kubectl exec <pod> -- id -u
# Should output: 1000
kubectl get pod <name> -o yaml | grep -A 3 capabilities
# Should show:
# drop:
# - ALL
# Test that capabilities are dropped
kubectl exec <pod> -- ping 8.8.8.8
# Should fail (NET_RAW needed for ping)
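The manifest behind that task could look roughly like this (pod name and image are illustrative; busybox with a sleep is used here because it runs fine as a non-root user):
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod               # hypothetical name
spec:
  securityContext:
    runAsUser: 1000
    runAsNonRoot: true
  containers:
  - name: app
    image: busybox:1.28
    command: ['sh', '-c', 'sleep 3600']
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]            # drops NET_RAW too, so ping should fail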
5. Services and Networking (20% of the exam)
This section tests Services, Ingress, and NetworkPolicies. You need to verify services route traffic correctly, ingress rules work, and network policies enforce isolation.
Services
# Check service exists
kubectl get svc <name>
# Note: TYPE, CLUSTER-IP, PORT(S)
# Verify service has endpoints
kubectl get endpoints <name>
# Should show pod IPs, not <none>
# Check service details
kubectl describe svc <name>
# Verify: Type, Selector, Port, TargetPort, Endpoints
# If endpoints are empty
kubectl describe svc <name> | grep Selector
kubectl get pods --selector=<key>=<value>
# Pods must match selector exactly
kubectl get pods -l <selector>
# Check if pods are Ready (failing readiness = no endpoints)
Critical checks:
- Selector matches pod labels (check with --show-labels)
- TargetPort matches container port (describe pod shows container ports)
- Pods are Ready (kubectl get pods shows READY 1/1)
Testing Service Connectivity
# From within cluster
kubectl run test --rm -it --image=busybox:1.28 -- sh
# Inside pod:
wget -qO- <service-name>:<port>
wget -qO- <service-name>.<namespace>.svc.cluster.local:<port>
# Using service IP
wget -qO- <cluster-ip>:<port>
# Port forward from local machine
kubectl port-forward svc/<service-name> 8080:80
# Then: curl localhost:8080
# Check DNS resolution
kubectl exec <pod> -- nslookup <service-name>
# Should return service ClusterIP
Service Types
ClusterIP (default):
# Verify type
kubectl get svc <name>
# TYPE should be ClusterIP
# Test from within cluster only
kubectl run test --rm -it --image=busybox:1.28 -- wget -qO- <service>:80
NodePort:
# Get NodePort number
kubectl get svc <name>
# PORT(S) shows: 80:30007/TCP (30007 is NodePort)
# Test from outside cluster
curl <node-ip>:30007
# Should reach service
# Verify port is in valid range (30000-32767)
kubectl describe svc <name> | grep NodePort
LoadBalancer:
# Check external IP
kubectl get svc <name>
# EXTERNAL-IP shows cloud LB address (or <pending>)
# Test external access
curl <external-ip>:<port>
# If stuck in <pending>
kubectl describe svc <name> | grep -A 10 Events
# May need cloud provider integration
Ingress
# Check Ingress resource
kubectl get ingress
# Shows HOSTS, ADDRESS, PORTS
# Verify Ingress details
kubectl describe ingress <name>
# Check: Rules (host, path, backend service), TLS config
# Verify backend services exist
kubectl describe ingress <name> | grep "Backend:"
kubectl get svc <backend-service>
# Service must exist and have endpoints
# Check ingress controller is running
kubectl get pods -n ingress-nginx
# Or whichever namespace has your ingress controller
# Get ingress address
kubectl get ingress <name> -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
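For reference, a minimal Ingress routing a host and path to a backend Service might look like this (host, class, and service name are illustrative):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress              # hypothetical name
spec:
  ingressClassName: nginx        # assumption - match whichever controller the cluster runs
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web            # must exist and have endpoints
            port:
              number: 80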
Testing Ingress:
# Test with Host header
curl -H "Host: example.com" http://<ingress-ip>/path
# Verbose output for debugging
curl -v -H "Host: example.com" http://<ingress-ip>/path
# Check status code: 200 OK, 404 Not Found, 502 Bad Gateway
# Test TLS (if configured)
curl -k https://<ingress-ip> -H "Host: example.com"
# -k skips certificate verification
Common Ingress issues:
# 404 Not Found
# - Host/path doesn't match Ingress rules
# - Check: kubectl describe ingress <name>
# 502 Bad Gateway
# - Backend service has no endpoints
# - Check: kubectl get endpoints <service>
# 503 Service Unavailable
# - Backend pods not ready
# - Check: kubectl get pods
# If ingress address is empty
kubectl get ingress <name> --watch
# Wait up to 60 seconds for provisioning
NetworkPolicies
# List network policies
kubectl get networkpolicy -A
# Check policy details
kubectl describe networkpolicy <name> -n <namespace>
# Verify: PodSelector, PolicyTypes, Ingress/Egress rules
# Verify policy targets correct pods
kubectl describe netpol <name> | grep "PodSelector"
kubectl get pods --show-labels -n <namespace>
# Labels must match for policy to apply
Critical understanding: NetworkPolicies are whitelists. Once a pod is selected by any policy, only explicitly allowed traffic works. Default is deny-all for that pod.
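A minimal sketch of such a whitelist, assuming hypothetical labels role: backend on the protected pods and access: "true" on the allowed clients:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend      # hypothetical name
spec:
  podSelector:
    matchLabels:
      role: backend              # pods this policy locks down
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - podSelector:
        matchLabels:
          access: "true"         # only pods with this label may connect
    ports:
    - protocol: TCP
      port: 80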
Testing NetworkPolicy
Test from blocked pod (should fail):
# Create test pod without required label
kubectl run test-deny --rm -it --image=busybox:1.28 -- sh
# Inside pod:
wget --spider --timeout=2 <service-name>
# Should timeout (no connection)
# Or fetch the response body
wget --timeout=2 -qO- <service-name>
# Should fail: "download timed out"
Test from allowed pod (should succeed):
# Create test pod WITH required label
kubectl run test-allow --rm -it --labels="access=true" --image=busybox:1.28 -- sh
# Inside pod:
wget --spider --timeout=2 <service-name>
# Should succeed: "remote file exists"
# Or get full response
wget -qO- <service-name>
# Should return data from service
Debugging NetworkPolicy:
# Verify CNI supports NetworkPolicies
kubectl get pods -n kube-system | grep -E 'calico|cilium|weave'
# Flannel does NOT support NetworkPolicies!
# Check if policy selects any pods
kubectl describe netpol <name>
kubectl get pods --selector=<policy-selector> -n <namespace>
# Count should match expected pods
# Test DNS (often forgotten in egress policies)
kubectl exec <pod> -- nslookup kubernetes.default
# If fails, policy may be blocking DNS (port 53)
# Use service IP instead of DNS
kubectl get svc <service> # Get ClusterIP
kubectl exec <pod> -- wget -qO- <cluster-ip>:<port>
Common NetworkPolicy pitfall:
# Policy selects pods but they don't have the label
kubectl describe netpol allow-from-frontend | grep PodSelector
# PodSelector: role=backend
kubectl get pods --show-labels
# No pods have role=backend label!
# Result: Policy does nothing, all traffic still allowed
# Fix: Add labels to pods or fix policy selector
Imperative Commands & Time-Savers
Generate YAML (Most Important!)
CRITICAL: When generating YAML files, always use full kubectl commands, not the k alias. Aliases only work in interactive shell!
# Pod
kubectl run nginx --image=nginx $do > pod.yaml
# Creates basic pod YAML
kubectl run nginx --image=nginx --labels="app=web,env=prod" $do > pod.yaml
# Pod with labels
kubectl run nginx --image=nginx --env="KEY=value" $do > pod.yaml
# Pod with environment variable
# Deployment
kubectl create deployment web --image=nginx --replicas=3 $do > deploy.yaml
# Creates deployment YAML
# Service
kubectl expose pod nginx --port=80 --target-port=8080 $do > svc.yaml
# Expose pod as service
kubectl expose deployment web --type=NodePort --port=80 $do > svc.yaml
# Expose deployment as NodePort
# ConfigMap
kubectl create configmap app-config --from-literal=key=value $do > cm.yaml
kubectl create configmap app-config --from-file=config.txt $do > cm.yaml
# Secret
kubectl create secret generic db-secret --from-literal=password=pass123 $do > secret.yaml
kubectl create secret tls tls-secret --cert=tls.crt --key=tls.key $do > tls.yaml
# Job
kubectl create job test --image=busybox -- echo "Hello" $do > job.yaml
# CronJob
kubectl create cronjob test --image=busybox --schedule="*/5 * * * *" -- echo "Hello" $do > cronjob.yaml
Workflow: Generate → Edit → Apply → Verify
kubectl run web --image=nginx $do > pod.yaml
# Edit pod.yaml (add resources, volumes, etc.)
kubectl apply -f pod.yaml
kubectl get pod web # Verify
kubectl explain (2x faster than docs!)
# Find field structure
kubectl explain pod.spec.containers.livenessProbe
# Shows all fields for liveness probe
kubectl explain deployment.spec.strategy
# Shows RollingUpdate and Recreate options
# Recursive view
kubectl explain pod --recursive
# Shows entire structure in one output
# Drill down
kubectl explain pod
kubectl explain pod.spec
kubectl explain pod.spec.containers
# Progressive exploration
Example: "Add readiness probe with HTTP GET to /health on port 8080."
kubectl explain pod.spec.containers.readinessProbe.httpGet
# Output shows:
# path: <string>
# port: <string>
# httpHeaders: <[]Object>
# Now you know exactly what to write!
Quick Creation (No YAML Needed)
# Create and expose in one line
kubectl create deployment web --image=nginx --replicas=3 && \
kubectl expose deployment web --port=80 --type=NodePort
# Scale quickly
kubectl scale deployment web --replicas=5
# Update image
kubectl set image deployment/web nginx=nginx:1.20
# Create with labels
kubectl run nginx --image=nginx --labels="app=web,tier=frontend"
# Set environment variables
kubectl set env deployment/web DB_HOST=mysql
# Create from multiple literals
kubectl create configmap app --from-literal=key1=val1 --from-literal=key2=val2
# Copy file into pod
kubectl cp /local/file <pod>:/path/in/pod
# Create temporary debug pod
kubectl run debug --rm -it --image=busybox:1.28 -- sh
Exam Workflow
Step-by-step process:
1. Switch context:
kubectl config use-context <context-name>
kubectl config current-context # Always verify!
Why critical: Wrong context = zero points even if solution is perfect.
2. Read question completely - note ALL requirements:
- Pod name, image, replicas, labels, resources, volumes, etc.
- Don't skim! Missing one requirement loses points.
3. Choose approach:
- Simple task (create pod, scale, expose) → Use imperative command
- Complex task (multiple volumes, init containers, scheduling) → Generate YAML
4. Execute solution:
- Copy-paste names/images from question to avoid typos
- Use tab completion for speed
5. Verify immediately:
kubectl get <resource> # Created and basic status
kubectl describe <resource> # Detailed status and events
kubectl logs <pod> # If applicable
6. Confirm all requirements met:
- Go back to question
- Check each requirement with verification commands
7. Move to next question:
- If stuck after 8 minutes, flag and move on
- Come back if time permits
Time Management
- First pass (90 min): Easy & medium questions
- Skip anything taking > 8 minutes
- Get "easy points" first
- 60-70% score possible from easy/medium alone
- Second pass (25 min): Flagged difficult questions
- Now tackle hard ones
- Partial credit better than nothing
- Final review (5 min): Verify critical tasks
- Check contexts were switched correctly
- Spot-check a few answers
- Quick verification pass
- Golden rule: Never spend >8 minutes on one question!
Real-World Troubleshooting Scenarios
Scenario 1: Pod Stuck in Pending
Problem: Pod created but stuck in Pending state.
Investigation:
# Step 1: Check events
kubectl describe pod <pod-name> | grep -A 10 Events
# Look for: FailedScheduling, Insufficient resources
# Step 2: Check node resources
kubectl top nodes # If available
kubectl describe nodes | grep -A 10 "Allocated resources"
# Step 3: Check PVC status (if using volumes)
kubectl get pvc
# STATUS must be Bound
# Step 4: Check node taints and pod tolerations
kubectl describe nodes | grep Taints
kubectl get pod <pod-name> -o yaml | grep -A 5 tolerations
Common causes:
- Insufficient resources → Lower requests or add nodes
- PVC not bound → Fix PVC/StorageClass issues
- Node affinity mismatch → Check node labels
- Taints without tolerations → Add tolerations
Scenario 2: CrashLoopBackOff
Problem: Pod keeps restarting, container crashes repeatedly.
Investigation:
# Step 1: Check previous logs (GOLDEN TICKET!)
kubectl logs <pod-name> --previous
# Shows why it crashed
# Step 2: Check restart count
kubectl get pods
# RESTARTS column shows how many times
# Step 3: Check liveness probe
kubectl describe pod <pod-name> | grep -A 10 Liveness
# Probe too aggressive?
# Step 4: Check resource limits
kubectl describe pod <pod-name> | grep "OOMKilled"
# Out of memory?
Common causes:
- Application crash (missing env vars, can't connect to DB)
- Fix: Add env vars or create dependent services
- Liveness probe too aggressive (no initialDelaySeconds)
- Fix: Add initialDelaySeconds: 30
- OOMKilled (exceeds memory limit)
- Fix: Increase memory limit or optimize app
Scenario 3: Service Not Accessible
Problem: Service created but curl fails, connection timeout.
Investigation:
# Step 1: Check endpoints
kubectl get endpoints <service-name>
# Empty = problem!
# Step 2: Check selector matches pods
kubectl describe svc <service-name> | grep Selector
kubectl get pods --selector=<key>=<value>
# Labels must match exactly
# Step 3: Check if pods are Ready
kubectl get pods -l <selector>
# READY must be 1/1
# Step 4: Test connectivity
kubectl run test --rm -it --image=busybox:1.28 -- wget -qO- <service>:80
Common causes:
- Selector mismatch → Update service selector or pod labels
- Pods not ready → Fix readiness probe or application
- Wrong targetPort → Must match container port
- NetworkPolicy blocking → Check policies
Scenario 4: Ingress Returns 404
Problem: Ingress created but returns 404 Not Found.
Investigation:
# Step 1: Check ingress rules
kubectl describe ingress <name>
# Verify host and path match your request
# Step 2: Check backend service
kubectl get svc <backend-service>
kubectl get endpoints <backend-service>
# Service must exist and have endpoints
# Step 3: Test with exact Host header
curl -v -H "Host: example.com" http://<ingress-ip>/api
# Path and host must match exactly
# Step 4: Check ingress controller
kubectl get pods -n ingress-nginx
# Controller must be running
Common causes:
- Host/path mismatch → Check Ingress rules vs your test
- Backend service wrong → Fix service name in Ingress
- No endpoints → Fix backend pods
- Typo in host → example.com vs exampl.com
Scenario 5: NetworkPolicy Not Working
Problem: Created NetworkPolicy but traffic still allowed/blocked incorrectly.
Investigation:
# Step 1: Verify CNI supports NetworkPolicies
kubectl get pods -n kube-system | grep -E 'calico|cilium|weave'
# Flannel does NOT support NetworkPolicies!
# Step 2: Check policy selects pods
kubectl describe netpol <name>
kubectl get pods --show-labels
# Labels must match
# Step 3: Test from both allowed and denied pods
kubectl run test-deny --rm -it --image=busybox:1.28 -- wget --timeout=2 <service>
kubectl run test-allow --rm -it --labels="access=true" --image=busybox:1.28 -- wget --timeout=2 <service>
# Step 4: Check DNS isn't blocked
kubectl exec <pod> -- nslookup kubernetes.default
# May need to allow DNS (port 53) in egress
Common causes:
- CNI doesn't support policies → Can't fix, just document
- Selector doesn't match pods → Fix labels
- Forgot to allow DNS → Add egress rule for port 53
- Policy too restrictive → Add needed ingress/egress rules
Quick Reference Card
# Context (Always First!)
kubectl config use-context <context>
kubectl config current-context # Verify!
# Quick checks
k get pods
k get svc,endpoints
k get events --sort-by=.metadata.creationTimestamp
# Describe & logs
k describe <resource> <name>
k logs <pod>
k logs <pod> --previous # For crashes
k logs <pod> -c <container> # Multi-container
# Exec into pod
k exec -it <pod> -- /bin/sh
k exec <pod> -- <command>
# Generate YAML
kubectl run <pod> --image=<image> $do > pod.yaml
kubectl create deployment <name> --image=<image> $do > deploy.yaml
kubectl expose deployment <name> --port=<port> $do > svc.yaml
# Verify resources
k get <resource>
k describe <resource> <name>
k get events
k get endpoints <service>
# RBAC
k auth can-i <verb> <resource>
k auth can-i <verb> <resource> --as <user>
# Network debugging
k run test --rm -it --image=busybox:1.28 -- sh
k exec -it <pod> -- nslookup <service>
# Deployment operations
k rollout status deployment/<name>
k rollout undo deployment/<name>
k scale deployment/<name> --replicas=<n>
Success Checklist
You're ready for CKAD when you can consistently:
✓ Complete 15-20 questions in under 100 minutes
Practice with timer. If you can't finish mock exams in 90-100 mins, you're not fast enough.
✓ Score 90%+ on practice exams
Real exam is harder. If you can't score 90% in practice, you might not pass the real thing.
✓ Generate YAML templates instantly
kubectl run nginx --image=nginx $do > pod.yaml should be automatic, no thinking.
✓ Use kubectl explain without hesitation
When you need field syntax, first thought should be kubectl explain, not "let me search docs."
✓ Debug pods in < 2 minutes
See failing pod, diagnose root cause in under 2 minutes using describe, logs, events.
✓ Switch contexts without errors
Copy command, run it, verify with current-context - second nature.
✓ Verify solutions systematically
After creating resource, automatically run verification commands without thinking.
You're NOT ready if:
- You need notes for basic commands
- You forget to switch contexts
- You spend > 10 minutes on questions
- You don't finish practice exams on time
- You score < 85% on practice exams
Final Exam Tips
One week before:
- Do full mock exams daily (Killer.sh, KodeKloud)
- Review this guide
- Focus on weak areas
- Practice aliases until automatic
- Review common failure scenarios
Day before:
- Light review only (don't cram!)
- Practice environment setup 5 times
- Review troubleshooting workflows
- Get 7-8 hours sleep
- Prepare workspace (quiet room, good internet)
Exam day:
- Arrive 15 minutes early
- Set up environment (aliases, vim) - first 5 minutes
- Follow your workflow, don't improvise
- Keep moving forward
- Verify every solution (30 seconds = points)
- Use all available time
During exam:
- Take deep breath every 5 questions
- If stuck, move on immediately
- Remember: 66% to pass, don't need perfection
- Every verified answer = points in bank
Common Mistakes to Avoid
❌ Forgetting to switch context
Lost points even with correct solution. Always verify with current-context!
❌ Not verifying solutions
Created deployment but it's not running. 30 seconds of verification = difference between 0 and full points.
❌ Writing YAML from scratch
Takes 5-8 minutes, high chance of errors. Use --dry-run=client -o yaml instead!
❌ Spending > 8 minutes on one question
Flag and move on. Come back if time permits.
❌ Not using kubectl explain
Wasting time searching docs. kubectl explain gives answer in 30 seconds.
❌ Typing resource names manually
Typos cost time. Copy-paste from question, use tab completion.
❌ Not checking --previous logs
Pod in CrashLoopBackOff but didn't check previous logs. Logs tell you why it crashed!
❌ Ignoring Events section
Events show what failed. kubectl describe Events section is gold!
You've Got This!
Remember:
- CKAD is challenging but passable with proper preparation
- Speed comes from practice
- Verification is not optional - it guarantees points
- Imperative commands save 20-30 minutes
- Systematic troubleshooting beats random guessing
Your preparation checklist:
- ✓ Read this guide completely
- ✓ Practice all commands until automatic
- ✓ Do all KodeKloud mock exams
- ✓ Score consistently >90% on practice
- ✓ Complete practice exams in <100 minutes
- ✓ Memorize verification patterns
- ✓ Practice troubleshooting workflows
Practice with KodeKloud:
🎯 Good luck on your CKAD certification! 🎯
FAQ
Q1: Should I use aliases if I've never practiced them?
No. Only if you've practiced for 1-2 weeks. Otherwise use full kubectl + autocomplete.
Q2: Why does k break in scripts but $do works?
k is a shell alias (interactive only). $do is an environment variable (works everywhere).
Q3: What's the quickest vim tweak?
expandtab, tabstop=2, shiftwidth=2 in .vimrc to avoid tabs in YAML.
Q4: How much time can setup save?
~10-20 minutes if practiced.
Q5: How do I quickly verify a solution worked?
kubectl get <resource> -o wide and check STATUS/READY/ENDPOINTS.
Q6: When should I NOT generate YAML?
When spec is trivial (simple pod/service). Imperative may be faster.
Q7: Can kubectl explain replace docs?
Mostly yes - especially for probe/env/volume fields you forget.
Q8: PVC stuck Pending - quick triage?
get sc (default?), describe pvc (Events), get pv (match?), check provisioner pod.
Q9: CrashLoopBackOff but logs empty?
Use kubectl logs <pod> --previous - shows crashed container's logs.
Q10: Biggest zero-point mistake?
Forgetting to switch context. Always kubectl config use-context then verify!
Q11: When should I write YAML from scratch?
Almost never. Generate with --dry-run=client -o yaml, then edit.
Q12: HPA won't scale - common misses?
Metrics server not running, or pods missing CPU requests.
Q13: Service has no endpoints?
Selector doesn't match pod labels. Check with describe svc and get pods --show-labels.
Q14: Ingress returns 404?
Host/path mismatch, or backend service wrong/missing.
Q15: NetworkPolicy not working?
CNI doesn't support it (Flannel), or pod labels don't match policy selector.