CronJob Troubleshooting

I am practicing for the CKA using Killercoda scenarios. This is one of the scenarios, but I’m unable to fix it. Can someone help with it?

For this question, please set this context (in the exam, the cluster name will be different)

kubectl config use-context kubernetes-admin@kubernetes

The cka-pod pod is exposed internally through a service named cka-service. To monitor cka-pod (accessed through the service), a cronjob named cka-cronjob has been deployed that runs every minute.

The cka-cronjob cronjob is now not working as expected. Fix the issue.

You need to first inspect any pods created by the cronjob.

controlplane $ k get po
NAME                         READY   STATUS             RESTARTS       AGE
cka-cronjob-28955215-gkhh2   0/1     Error              5 (102s ago)   3m10s
cka-cronjob-28955216-7hnhp   0/1     CrashLoopBackOff   4 (48s ago)    2m10s
cka-cronjob-28955217-txwp5   0/1     CrashLoopBackOff   3 (25s ago)    70s
cka-cronjob-28955218-sf78n   0/1     CrashLoopBackOff   1 (8s ago)     10s
cka-pod                      1/1     Running            0              3m49s

Notice that the pods started by the cronjob are in CrashLoopBackOff. Choose one and get its logs.

controlplane $ k logs cka-cronjob-28955216-7hnhp
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: cka-pod

Note the curl error. Error (6) indicates a DNS lookup failure. It is trying to look up the pod by name, and you don’t get a pod’s address simply by name alone, so of course it is failing.
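If you want to see for yourself which names resolve, one option is to run a lookup from a throwaway pod. This is only a sketch: it assumes the busybox:1.28 image can be pulled in the cluster, and the pod name dns-test and the function name check_dns are made up for illustration.

```shell
# Hypothetical helper: resolve a name from inside the cluster using a
# temporary pod that is deleted when the command exits.
# Assumes the busybox:1.28 image is pullable; "dns-test" is an arbitrary name.
check_dns() {
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.28 \
    -- nslookup "$1"
}
```

Running check_dns cka-service should return the service's cluster IP, while check_dns cka-pod should fail in the same way curl did.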

We are also told there is a service in front of the pod, so to fix this, the curl command in the cronjob should target the service not the pod, since the question states that monitoring should be via the service.

Edit the cronjob and fix this so it curls the service not the pod

k edit cj cka-cronjob
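Before editing, it can help to print the container spec so you can see exactly where curl is invoked. A minimal sketch, assuming the curl command lives in the first container of the job template; the function name show_cronjob_container is illustrative only.

```shell
# Hypothetical helper: print the first container of a cronjob's job template
# as JSON, so the curl command and its target are visible.
show_cronjob_container() {
  kubectl get cronjob "$1" \
    -o jsonpath='{.spec.jobTemplate.spec.template.spec.containers[0]}'
}
```

Call it as show_cronjob_container cka-cronjob, then change the curl target from cka-pod to cka-service in the editor.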

Wait for the cronjob to start another pod.
Now the new pod is still erroring, so there is something else to fix!
Check its logs

controlplane $ k logs cka-cronjob-28955223-b8k78
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
curl: (7) Failed to connect to cka-service port 80 after 1029 ms: Could not connect to server

This is a different error. DNS is working now and it is hitting the service, but it cannot make a connection.
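The number in parentheses in each curl message is curl's exit code, and the two seen so far point at different layers. As a small memory aid, here is a sketch of a shell function (the name explain_curl_exit is made up) covering just these two codes:

```shell
# Map the two curl exit codes seen in this thread to what they mean.
# Only codes 6 and 7 are covered; anything else falls through to the default.
explain_curl_exit() {
  case "$1" in
    6) echo "could not resolve host: check the DNS name being requested" ;;
    7) echo "failed to connect: the connection to the host or port did not succeed" ;;
    *) echo "exit code $1: see the curl man page" ;;
  esac
}
```

So moving from (6) to (7) means the name problem is solved and the remaining problem is connectivity behind the service.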

Now inspect the service

controlplane $ k describe service cka-service
Name:                     cka-service
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=cka-pod
Type:                     ClusterIP
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.107.74.133
IPs:                      10.107.74.133
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
Endpoints:                
Session Affinity:         None
Internal Traffic Policy:  Cluster
Events:                   <none>

Note that there is nothing for Endpoints. This means the selector in the service is not matching the labels of any pod. Fix this next.
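The same check can be done more compactly by printing only the selector and the endpoints. A sketch, with an illustrative function name:

```shell
# Hypothetical helper: show the label selector a service uses, followed by
# the endpoints currently backing it (empty when no pod matches).
show_service_backends() {
  kubectl get service "$1" -o jsonpath='{.spec.selector}'
  echo
  kubectl get endpoints "$1"
}
```

Call it as show_service_backends cka-service. It is also handy for re-checking after the fix.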

k get pod cka-pod --show-labels

The pod has no labels.
Check the service selector in the describe output above. Note that it is app=cka-pod.

Now edit the pod to give it the label app: cka-pod.
k edit pod will work here since we are editing metadata only.
Once this edit is complete, the service will have an endpoint and the cronjob will start working.
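As an alternative to k edit, the label can be added with kubectl label, which is quicker to type in the exam. A sketch using the names from this scenario; the function name is illustrative:

```shell
# Hypothetical helper: add the label the service selector expects,
# then list the service's endpoints to confirm the pod was picked up.
label_pod_for_service() {
  kubectl label pod cka-pod app=cka-pod
  kubectl get endpoints cka-service
}
```

After running label_pod_for_service, the endpoints listing should show the pod's IP on port 80 rather than being empty.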

controlplane $ k get po
NAME                         READY   STATUS      RESTARTS   AGE
cka-cronjob-28955233-j524t   0/1     Completed   0          2m3s
cka-cronjob-28955234-kpt44   0/1     Completed   0          63s
cka-cronjob-28955235-l6xvg   0/1     Completed   0          3s
cka-pod                      1/1     Running     0          20m

At this point it still tells me “validation failed” but as far as I am concerned, it is fixed since the cronjob is working. KodeKloud do not have access to how Killercoda’s grader works.

Thanks a ton @Alistair_KodeKloud, I was doing the same steps but verification is failing. So I was curious if I was doing it correctly. Thanks a ton for the detailed explanation.