Kubernetes Cluster deployment in HA

I recreated the cluster with containerd


Hi,
I need help. I have a multi control-plane cluster with the following configuration:

Two HAProxy nodes with Keepalived configured:

ha-proxy1: 172.29.28.46
ha-proxy2: 172.29.28.47
Keepalived floating IP: 172.29.28.48

Control-plane node IPs:

k8-master: 172.29.28.49
k8-master1: 172.29.28.50
k8-master2: 172.29.28.51

worker-node details

worker-node-1: 172.29.28.52

I have an Nginx pod deployed with a Service exposed as NodePort.

I am able to access Nginx using a node IP with the service port,

but unable to access it using the Keepalived floating IP.

I have also curled Nginx from the HAProxy host and got a successful response.

What could be the reason for it?
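For context, the Service described above would look roughly like this (the name, selector, and NodePort value are assumptions, not from the original post):

```yaml
# Sketch of an Nginx Service exposed as NodePort.
# The name, selector, and nodePort value (30080) are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080
```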

You need to check the IP address of the HAProxy server and test whether it can forward requests to the Kubernetes cluster.

Yes, it is able to forward the request; I tested that using telnet and curl.

Testing against the same Nginx pod from the HAProxy host, I was able to get a response from the pod.

You should use curl to access HAProxy at 172.29.28.46 from worker-node-1 or a VM in the same network that can reach HAProxy. If it shows the NGINX page, then we can focus on the Keepalived service.
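For reference, a minimal haproxy.cfg sketch that forwards the service port to the worker node might look like this (the bind port and NodePort value 30080 are assumptions):

```
# Minimal HAProxy sketch: forward a NodePort through the proxy.
# The bind port and NodePort (30080) are assumptions.
frontend nginx_nodeport
    bind *:30080
    mode tcp
    default_backend nginx_nodes

backend nginx_nodes
    mode tcp
    balance roundrobin
    server worker-node-1 172.29.28.52:30080 check
```

With something like this in place on both HAProxy hosts, traffic to the Keepalived VIP on that port should reach the NodePort on the worker node.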

To check the Keepalived service, use systemctl to see if it’s running. After that, check the logs and share them here when you try to access the VIP.
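For reference, a minimal Keepalived sketch advertising the floating IP might look like this (interface name, router ID, and priorities are assumptions; adjust to your environment):

```
# Keepalived sketch advertising the floating IP 172.29.28.48.
# Interface name, router ID, and priority values are assumptions.
vrrp_instance VI_1 {
    state MASTER            # use BACKUP on ha-proxy2
    interface eth0
    virtual_router_id 51
    priority 100            # use a lower value on ha-proxy2
    advert_int 1
    virtual_ipaddress {
        172.29.28.48
    }
}
```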

I achieved the end result by using the NGINX ingress controller,

but I am stuck on a webhook error.

When I create an Ingress for any service to route traffic through the ingress controller, it gives an error.

When I disable the webhook by deleting it (kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission), it works smoothly.
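For context, a typical ingress.yaml like the one being applied might look like this (the host, service name, and port are assumptions, not from the original post):

```yaml
# Sketch of an Ingress routing traffic to the Nginx service.
# Host, service name, and port are assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-web
spec:
  ingressClassName: nginx
  rules:
    - host: nginx.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx
                port:
                  number: 80
```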

This is the error I faced:

root@k8-master:/home/abdullah.naeem/nginx-web# kubectl apply -f ingress.yaml

Error from server (InternalError): error when creating "ingress.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": failed to call webhook: Post "https://ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1/ingresses?timeout=10s": context

Pods on the same node are able to ping each other, but pods on different nodes can't ping each other.

Can you help me with this?

I am using Calico.

The pods cannot communicate across nodes, which is a common issue in Kubernetes. There are many possible reasons, but the most common one is networking.

Troubleshooting Steps:

  1. Check the CNI Plugin – Ensure the cluster’s network plugin (like Calico, Flannel, or Cilium) is properly installed and running.
  2. Verify Pod Network Configuration – Confirm that pods are assigned correct IPs and can reach each other within the cluster.
  3. Inspect Firewall Rules – Ensure there are no firewall rules blocking pod-to-pod communication across nodes.
  4. Validate Node-to-Node Connectivity – Confirm that nodes can reach each other over the pod network.
  5. Check Kube-Proxy & Network Policies – Ensure there are no misconfigured network policies or kube-proxy issues preventing communication.

I have tried all of these:

I reinstalled the CNI (Calico).
Pods have been assigned IPs from the designated CIDR.
The firewall is off/disabled.
Node-to-node connectivity is good.
I deleted all network policies and created a new one that allows all traffic.

What else could be the reason?
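For reference, an allow-all NetworkPolicy like the one described might look like this (the namespace is an assumption):

```yaml
# Sketch of an allow-all NetworkPolicy like the one described.
# The namespace is an assumption.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all
  namespace: default
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - {}
  egress:
    - {}
```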

Important:

Pods are unable to reach the external network;
for example, apt update doesn't work.

Pods on worker-node1 are pingable only from worker-node1.
Pods on worker-node2 are pingable only from worker-node2.

BUT

Pods on worker-node1 are not pingable from worker-node2, and vice versa.

You have reinstalled the CNI. Based on my experience, it’s best to restart the node before testing the network again.

Let me do that as well and restart the nodes.

Hi,

I think the issue got resolved after the reboot!

Thank you so much for the help!
May God bless you!
Have a nice day!

The issue got resolved after changing the encapsulation from VXLAN to IPIP.
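For reference, switching Calico's encapsulation can be done by editing the IPPool; a sketch, assuming the default pool name and CIDR (check yours with `calicoctl get ippool -o yaml`):

```yaml
# Calico IPPool sketch with IPIP encapsulation enabled and VXLAN disabled.
# The pool name and CIDR are assumptions.
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Always
  vxlanMode: Never
  natOutgoing: true
```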

Could you tell me why VXLAN didn't work?

I’m not sure about that. Some networking concepts are really deep, and while I understand them, I can’t explain them clearly. For detailed questions, it’s better to refer to a book on the topic.


Hi,
I created Cluster with mentioned details

  1. 2 HAProxy nodes (with failover handled by Keepalived)
  2. kube-apiserver LB
  3. Nginx-ingress-controller LB
  4. 3 master nodes
  5. 2 Worker-nodes
  6. CNI: Calico with IPIP encapsulation and BGP enabled
  7. CRI: containerd

Major issue faced

  1. Pod-to-pod cross-node communication issue

Troubleshooting done

  1. Used the following encapsulation methods:
    1. IPIP
    2. IPIPCrossSubnet
    3. VXLAN
    4. VXLANCrossSubnet
    5. None
    None of them fixed the cross-node pod communication issue.
  2. Checked:
    1. calico-node logs
    2. calico-apiserver logs
    3. related service status using calicoctl:
      1. node status: node-to-node mesh was established; I also changed it to node-specific and some other modes
      2. ippools
      3. subnets
      4. etc.

Issue

  1. BGP peers were not created

Solution

  1. I had to create them manually with global scope, and then my pod communication worked well.
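For reference, a global Calico BGPPeer (one without a node field or nodeSelector, so it applies to all nodes) might look like this; the peer address and AS number below are placeholders, not from the original post:

```yaml
# Sketch of a global Calico BGPPeer (no node selector, so it applies cluster-wide).
# The peer IP and AS number are placeholders.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: global-peer
spec:
  peerIP: 172.29.28.1   # placeholder upstream router IP
  asNumber: 64512       # placeholder AS number
```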