Certified Kubernetes Administrator Exam Series (Part-8): Networking
In the previous blog of this 10-part series, we discussed Kubernetes Storage. This blog dives into how networking is implemented to facilitate communication between various components of a Kubernetes cluster.
Here are the eight other blogs in the series:
- Certified Kubernetes Administrator Exam Series (Part-1): Core Concepts
- Certified Kubernetes Administrator Exam Series (Part-2): Scheduling
- Certified Kubernetes Administrator Exam Series (Part-3): Logging & Monitoring
- Certified Kubernetes Administrator Exam Series (Part-4): Application Lifecycle Management
- Certified Kubernetes Administrator Exam Series (Part-5): Cluster Maintenance
- Certified Kubernetes Administrator Exam Series (Part-6): Security
- Certified Kubernetes Administrator Exam Series (Part-9): Troubleshooting
- Certified Kubernetes Administrator Exam Series (Part-10): Practice Topics
Introduction
Kubernetes has a modular design, which makes it highly flexible and scalable. Its architecture is divided into several distinct components, each with its own specific role and responsibility. These components work together to manage and orchestrate the various resources within a Kubernetes cluster.
Networking is a fundamental component of Kubernetes, as it enables communication between different cluster components, including PODs and microservices. Without proper networking, it would be impossible for these components to communicate with each other, and users would not be able to access Kubernetes services from external sources.
Series Outline
This section first goes through a number of traditional networking concepts that are crucial to understanding how Kubernetes networks are implemented. These include:
- Switching, routing, and gateways
- Domain Name Service
- Networking Namespaces
- Docker Networking
- Container Networking Interface
It then takes a deep dive into network implementation in Kubernetes, exploring topics such as:
- Cluster Networking
- POD Networking
- CNI in Kubernetes
- CNI Weave
- IPAM Weave
- Service Networking
- DNS in Kubernetes
- Ingress
By the end of this chapter, the candidate will have developed enough knowledge to configure communication within a network of Kubernetes PODs and services.
Try the Kubernetes Pods Lab for free
Prerequisite Concepts
This section covers the basic networking concepts needed to understand how communication is established between various machines in a network. These concepts are covered from the perspective of a System Administrator/Developer and not a network engineer since CKA involves configuring, managing, and maintaining networks already implemented by various Kubernetes solutions.
With the knowledge gained from these classes, the candidate will have a proper reference for the various operations and configurations performed to enable communication between different machines in a Linux environment. This is a prerequisite class, and anyone confident enough to set up networks in Linux machines could skip ahead to the later sections.
What is a Network?
At its most basic, a network is any link between two devices that enables communication and data exchange. These machines could either be virtual machines, workstations, or personal computers. So, how exactly does one machine ‘talk’ to another in a network? Routers and switches are devices used to set up connections between networked computers.
What is switching?
Switching is used to connect machines within the same network. In simple terms, a network switch is a device that connects devices together on a computer network, allowing them to communicate with each other. It operates at the data link layer of the OSI model and uses MAC addresses to forward data packets across the network.
Every machine in a network is allocated an IP address, and the switch maintains a table that maps each of its ports to the MAC address of the host connected to it. The switch receives traffic destined for a host and forwards it out of the appropriate port. Every machine needs a network interface to connect to a switch. Interfaces can be physical or virtual, and examples include eth0, lo, eth0.1, wlan19, radio0, vlan2, br0, gre0, and teq0, among others. To view a host's network interfaces, the ip link command is used.
Assuming there are two machines, A and B, in a network whose address is 192.168.1.0, each machine can be assigned an IP address on its physical Ethernet interface using the commands shown:
ip addr add 192.168.1.10/24 dev eth0
ip addr add 192.168.1.11/24 dev eth0
Machine A will now have the IP address 192.168.1.10, and machine B will have the IP address 192.168.1.11.
Once IP Addresses have been assigned, the machines can seamlessly communicate with each other through the switch. The switch can, however, only receive and forward packets between machines within the same network.
Suppose there is a second network whose address is 192.168.2.0, with machines whose IP addresses are 192.168.2.10 and 192.168.2.11. A switch cannot connect machines on this network to those on the network created above; routing is implemented to connect machines running on different networks.
What is Routing?
Routing is the process of selecting a path for traffic to travel from one network to another. It involves the use of routers, which are devices that connect multiple networks together and direct traffic between them.
A router forwards data to and from machines located on different LANs/WANs using network IP addresses. It maintains the IP configurations of its connected networks in a routing table, which can be checked on a host using the route command. The routing table includes an IP address entry for every network interface the router connects to. For the two networks above, the router will have active ports on the networks 192.168.1.0 and 192.168.2.0.
When working with vast networks, every machine needs to know where to forward traffic destined for networks other than its own.
What is a Gateway?
A Gateway is a network passage, a doorway through which network traffic passes before being routed. A gateway is basically the device that forms the interface between networks. For the two networks above, the gateways will be set on the router at the IP addresses 192.168.1.1 and 192.168.2.1, respectively.
To configure the gateway so that a machine on network 1 (192.168.1.0) can communicate with a machine on network 2 (192.168.2.0), a route is added to the machine's routing table using the command:
ip route add 192.168.2.0/24 via 192.168.1.1
The route command is used to confirm that the gateway has been added. This route configuration has to be performed on every machine that needs to reach the other network.
Suppose each machine in network 2 needs to access a web application hosted on the internet, e.g., Google servers at the IP address 172.217.194.0. The router is connected to that network, and a routing table entry is added on every device accessing the service:
ip route add 172.217.194.0/24 via 192.168.2.1
With so many services to be accessed on the internet, creating a routing table entry for each one can get taxing. A default gateway can be created for any services hosted on machines outside of the existing network using the command:
ip route add default via 192.168.2.1
Which can also be written as:
ip route add 0.0.0.0/0 via 192.168.2.1
Suppose a network has two routers, one to access the internet and another for the internal network. In that case, two separate entries must be configured on the routing table, one for the network and a default gateway for all public hosts.
It is also possible to set up a Linux host as a router.
Consider three hosts: A, B, and C. A and B are connected to a network whose address is 192.168.1.0; they both interface with this network through the physical Ethernet link eth0. B and C are connected to a network with the address 192.168.2.0; B connects to this network through the interface eth1, while C connects through eth0. Assuming the assigned IP addresses for A and C are 192.168.1.5 and 192.168.2.5 respectively, B, since it connects to both networks, will have two IP addresses: 192.168.1.6 and 192.168.2.6.
For Host A to reach Host C, it needs to know that the gateway is through Host B. This is achieved by adding a routing table:
ip route add 192.168.2.0/24 via 192.168.1.6
For Host C to send a response back to Host A, it will do so via Host B, which now acts as a router. This is done by creating an entry on the routing table of Host C, as shown:
ip route add 192.168.1.0/24 via 192.168.2.6
A connection has been made, and this can be verified using the route command.
While a valid connection has been established, a ping from host A to C will still not get a response. This is because, by default, packets are not forwarded across different interfaces. In the case above, packets coming into host B through eth0 cannot be forwarded to any host via eth1, for security reasons.
The value in /proc/sys/net/ipv4/ip_forward controls whether packets can be forwarded across interfaces. By default, it is set to 0, which restricts forwarding; setting it to 1 allows it. This change is temporary and only lasts until the next reboot. To make it permanent, the setting net.ipv4.ip_forward=1 has to be added to the /etc/sysctl.conf file.
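As a minimal sketch (assuming a typical Linux host with sudo access), the forwarding flag can be toggled and then persisted as follows:
echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward   # enable forwarding until the next reboot
sudo sysctl -w net.ipv4.ip_forward=1              # equivalent, using sysctl
echo "net.ipv4.ip_forward=1" | sudo tee -a /etc/sysctl.conf   # persist the setting
sudo sysctl -p                                    # reload /etc/sysctl.conf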
DNS
The ping tool is used to test a host's reachability in a network. This is done by typing ping followed by the machine's IP address. For end users, however, machines are remembered better by name, so it is important to associate each machine's IP address with a recognizable name, such as db for a database. This is achieved by adding an entry to the /etc/hosts file mentioning the intended name and the address assigned to it:
sudo cat /etc/hosts
Then the entry is added:
192.168.1.11 db
One can add entries for as many systems as needed in the hosts file. The process of mapping server names to their IP addresses is known as Name Resolution. In a small network consisting of a few servers, entries in a hosts file are sufficient for name resolution and service discovery.
In a large, dynamic environment consisting of many servers, managing entries in host files can be difficult. Any time a server’s IP address changes, every entry in every machine’s host file has to be modified.
To solve this problem, all host entries are moved to a server for central access and management. This server is known as the DNS Server. This server acts like a telephone directory of the internet, matching referenced domain names with their correct IP addresses across multiple networks. If a hostname is missing on a host’s local file, it will look up entries in the DNS server.
To point a machine in a network to a DNS server, the server is specified in the DNS resolution configuration file at /etc/resolv.conf. Assuming the DNS server's IP address is 192.168.1.100, the entry is made on the host as shown:
sudo cat /etc/resolv.conf
nameserver 192.168.1.100
If any server's IP address changes, only the DNS server needs to be updated. This does not necessarily eliminate the need for hosts files, but it greatly reduces the effort of keeping name mappings up to date. The hosts file can still be used to examine and resolve new hosts during testing.
In this case, the new host's name and IP address are added only to the hosts files of particular machines, so the new host is accessible only from machines with the updated hosts file. If a machine contains both a DNS configuration file and a hosts file, the order in which names are resolved is defined by an entry in the /etc/nsswitch.conf file.
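For illustration, the relevant entry in /etc/nsswitch.conf typically looks like the line below, which tells the resolver to check local files before querying DNS:
hosts: files dns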
The public DNS server 8.8.8.8 can also be added to the resolv.conf file. Google hosts this DNS server, and it resolves the IP addresses of most public websites.
The domain name is typically structured into three parts:
- Top-Level Domain (TLD),
- Authoritative Domain, and
- Hostname.
For a domain name like www.mycompany.com, .com is the top-level domain and identifies the general purpose of the website, mycompany is the authoritative domain that identifies the organization that owns the site, and www is the hostname that indicates the specific machine targeted by the DNS request.
When resolving a query, the DNS server uses a four-layered server architecture to deliver the final IP address to the client. The four layers of the DNS server are:
- Recursive Resolver
- Root Nameserver
- TLD Nameserver
- Authoritative Nameserver
When a client sends a DNS request, the recursive resolver receives it and starts querying other servers to locate the IP address. The root nameserver responds with the address of the TLD nameserver responsible for the domain's top-level domain, e.g., .io, .com, .net, or .edu, among others.
The recursive resolver then queries that TLD nameserver, which returns the authoritative nameserver that will service the request. The recursive resolver then queries the authoritative nameserver, which returns the IP address of the target host. Finally, the recursive resolver returns this IP address to the client (e.g., a browser), which then contacts the host directly and retrieves the data to be displayed.
To see this in action, consider a user wanting to access the apps.google.com web server within an organization. The request first goes to the organization's internal DNS server. Since the internal server does not know the google.com domain, it sends the request out to the internet. A root DNS server receives this request and forwards it to the server responsible for .com domains. The .com DNS server then forwards the request to Google's authoritative nameserver, which looks up the apps service and returns its IP address to the organization's server.
DNS Caching
This is the process of storing DNS search results to speed up the resolution process in the future. This way, the DNS server does not have to repeat the process every time a host needs to be reached.
A company can have multiple subdomains, such as mail.mycompany.com, hr.mycompany.com, web.mycompany.com, and sales.mycompany.com, for different functional units within the organization. To access these hosts via ping, the complete DNS name has to be stated in the command. To ensure these can be resolved by a machine using the hostname, the resolv.conf file is appended with mycompany.com as a search entry.
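A minimal sketch of such a resolv.conf, assuming mycompany.com is the internal domain and 192.168.1.100 is the DNS server, would be:
nameserver 192.168.1.100
search mycompany.com
With this in place, a command like ping web is automatically expanded to web.mycompany.com during resolution.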
DNS records are the files in DNS servers that reference domain information. This information includes IP addresses in a domain and how to process queries from the domain. The most common types of DNS records include:
- A Records: map a domain name to an IPv4 address.
- AAAA Records: map a domain name to an IPv6 address.
- CNAME Records: map one domain name to another.
While ping is the most popular tool for checking that a host resolves and is reachable, other important name-resolution tools include nslookup and dig.
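As a quick illustration (using mycompany.com as a placeholder domain), both tools query the configured nameserver directly and ignore entries in /etc/hosts:
nslookup www.mycompany.com
dig www.mycompany.com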
Network Namespaces
Network namespaces are crucial in setting up container communication since they abstract a logical copy of the underlying host's network stack. They provide isolation for containers, since a container running in one namespace cannot directly communicate with a container in another namespace. All processes in one namespace share the same routing table and network interfaces, allowing for modular communication.
Creating a Linux namespace is quite simple. It is achieved using a command taking the form:
sudo ip netns add <namespace-name>
For instance, to create two namespaces, Red and Blue, the following commands can be used:
sudo ip netns add red
sudo ip netns add blue
To check each namespace’s interface, the command used takes a form similar to:
sudo ip netns exec red ip link
Or,
sudo ip -n red link
These commands only list interfaces within the red namespace, proving that namespaces actually provide isolation. To connect two namespaces, a virtual Ethernet (veth) pair, essentially a virtual 'pipe' or cable, is established.
First, the interfaces of the two namespaces are created and connected using a command similar to:
sudo ip link add veth-red type veth peer name veth-blue
The interfaces are then attached to their respective namespaces using the command:
sudo ip link set veth-red netns red
And,
sudo ip link set veth-blue netns blue
Each namespace is then assigned an IP address:
sudo ip -n red addr add 192.168.15.1/24 dev veth-red
And,
sudo ip -n blue addr add 192.168.15.2/24 dev veth-blue
The links are then brought up using the commands:
sudo ip -n red link set veth-red up
And,
sudo ip -n blue link set veth-blue up
The link is now up, and this can be checked using ping or by accessing the ARP table on each host.
sudo ip netns exec red ping 192.168.15.2
A virtual network is created within the machine to enable communication between different namespaces in a host. This is achieved by creating a Virtual Switch on the host. There are many virtual switch options available, including OpenVSwitch and Linux Bridge, among others.
A Linux bridge can be created as a new interface, v-net-0, on the host:
sudo ip link add v-net-0 type bridge
The link can then be brought up using the command:
sudo ip link set dev v-net-0 up
This network acts as a switch for the namespaces and as an interface for the host. To connect the namespaces to the virtual network, existing connections between the namespaces must first be deleted using the command:
sudo ip -n red link del veth-red
New ‘Virtual Cables’ are then created to connect namespaces to the bridge network. For instance, to connect the red namespace:
sudo ip link add veth-red type veth peer name veth-red-br
For the blue namespace:
sudo ip link add veth-blue type veth peer name veth-blue-br
These act as links between the namespaces and the bridge network. To connect the red namespace interface to the virtual cable:
sudo ip link set veth-red netns red
The other side is connected to the bridge network:
sudo ip link set veth-red-br master v-net-0
The same is repeated for the blue namespace:
sudo ip link set veth-blue netns blue
sudo ip link set veth-blue-br master v-net-0
The IP addresses are then set for the namespaces:
sudo ip -n red addr add 192.168.15.1/24 dev veth-red
sudo ip -n blue addr add 192.168.15.2/24 dev veth-blue
The links are then set up:
sudo ip -n red link set veth-red up
And,
sudo ip -n blue link set veth-blue up
The namespace links have already been set up, but there is no connectivity between the host and the namespaces. The bridge acts as a switch between the different namespaces and is also the host machine’s network interface. To establish connectivity between namespaces and the host machine, the interface is assigned an IP address:
sudo ip addr add 192.168.15.5/24 dev v-net-0
This is an isolated virtual network, and it cannot connect a host to an external network; the only passage to the outside world is through the eth0 interface on the host. Assuming a machine in the blue namespace intends to reach a host 192.168.1.3 on an external LAN whose address is 192.168.1.0, an entry is added to the namespace's routing table, providing an external gateway.
Since the host has an external-facing eth0 port and an interface connecting to the namespaces, it can be used as a gateway. This is enabled using the command:
sudo ip netns exec blue ip route add 192.168.1.0/24 via 192.168.15.5
While this establishes a route from the namespace to the external LAN, a ping request will still not get a response. This is because the external host has no route back to the private 192.168.15.0/24 network. To enable communication between a namespace and the external LAN, Network Address Translation (NAT) is enabled on the host. With NAT, the host sends packets from the namespaces to external networks using its own IP address, so replies return to the host and are forwarded back to the namespace.
To enable NAT on a machine, IP tables are created to use the POSTROUTING chain to MASQUERADE all packets coming from the internal network as its own. This is done using the command:
sudo iptables -t nat -A POSTROUTING -s 192.168.15.0/24 -j MASQUERADE
If the LAN is connected to the internet and the namespace is also intended to connect to the internet, an entry is added specifying the host as the default gateway:
sudo ip netns exec blue ip route add default via 192.168.15.5
For an external host to reach a machine in the namespace, there are two options:
The first option is to add a route to the private network on the external host using ip route. This is not the preferred option, as it exposes the private network and introduces security issues.
The second option is to use a port-forwarding rule, which states that any traffic coming in through a specific port on the host is directed to a specific port on the private network. This can be achieved using the command:
sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 192.168.15.2:80
Traffic arriving on the host's port 80 is now forwarded to port 80 of the machine in the namespace, making it reachable from the external network.
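To confirm that the NAT rules are in place on the host, the nat table can be listed (a quick check, assuming the rules were added as shown above):
sudo iptables -t nat -nvL POSTROUTING
sudo iptables -t nat -nvL PREROUTING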
Docker Networking
Docker offers powerful networking capabilities since Docker containers are platform-agnostic. Docker containers can be connected to each other and to non-Docker workloads, meaning that with proper networking, containers can access services deployed even in non-Docker environments. It is important to understand how networks are established between containers to grasp how communication is set up in Kubernetes clusters.
When a container starts, it is attached to one of three network types:
- None: The container is not assigned any IP address and is completely isolated from the host, external networks, and other containers. The container lacks an external network interface but has a local loopback interface. Containers on the none network can be used for batch jobs, and the --network none option is typically used to disable all networking capabilities.
- Host: The host network creates a direct connection between the host and the containers running on it. This directly exposes the containers to public networks, since the containers and the host share the same network namespace. The host network is best suited to standalone containers, since the container shares the ports available to the host and uses the same network stack as the host.
- Bridge: A bridge is a software-defined network that allows containers to communicate with each other. When the bridge is created, it also creates an internal interface for the host. Namespaces within the same bridge network can communicate with each other, while containers that are not connected to a bridge are completely isolated within a host. Once the bridge link is set up, the host can act as a gateway for the containers to communicate with external networks.
Docker creates a default bridge network on a host and attaches new containers to it unless a different network is specified. This section further explores the bridge network, as it underpins how networking is enabled between containers running on multiple distributed hosts.
Docker Bridge Configuration and Operations
When a bridge network is initiated, its default address is 172.17.0.0. Whenever a new container is attached to the host, it is assigned an IP address in the 172.17.x.x range. Docker refers to this as the bridge network, while the host sees it as a docker0 interface. This can be seen in the output of the docker network ls and ip link commands, respectively.
The bridge network acts as a switch for the namespaces and creates an internal interface on the host, which typically gets the IP address 172.17.0.1. Every time a container is created, Docker creates a namespace for it by default. The containers are then attached to the bridge network using a 'virtual cable' pair, as described in the Network Namespaces section.
Containers that are connected via the bridge can communicate with each other but not with external hosts. To create an external connection, a Port Mapping Network Address Translation rule is added that sets the destination to include the container’s IP address.
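A hedged sketch of such a port mapping, using nginx purely as an example image, maps port 8080 on the host to port 80 inside the container; Docker then adds the corresponding DNAT rule behind the scenes:
docker run -d -p 8080:80 nginx
docker port <container-id>   # shows the published port mapping for the container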
Some commands used for Docker Networking operations include:
| Command | Usage |
| --- | --- |
| docker network create | Creates a network |
| docker network ls | Lists networks running on a host |
| ip netns | Checks namespaces created on a host |
| ip link | Checks interfaces |
| ip addr | Checks IP addresses |
CNI
The Container Network Interface (CNI) is a set of standards and libraries that define how networking should be implemented for containers and network plug-ins. CNI was developed to ensure interoperability between different container runtimes and orchestrators. In all runtimes, the procedure for creating networks follows a similar pattern, and differences only come about due to terminology between different platforms.
CNI outlines a set of rules and responsibilities to be followed by the network plug-in and the container orchestrator, ensuring that a container can be networked on any platform using any network solution. At its core, CNI exposes the interfaces responsible for the addition and removal of containers from networks.
In Kubernetes, CNI is the interface between POD networking and network solution providers. CNI includes several default plug-ins that can be used on any platform, including bridge, vlan, ipvlan, and macvlan, among others. CNI also supports third-party plug-in solutions such as Weave, Flannel, Calico, and Cilium, among others.
Docker, however, does not implement CNI. It implements its own networking standard known as the Container Network Model (CNM). CNI can still be used with Docker containers when third-party plug-ins replace the default Docker driver. Kubernetes achieves this by first attaching the container to the none network and then manually invoking the configured network plug-in.
Cluster Networking
In Kubernetes, containerized workloads are encapsulated in PODs, which run inside machines called nodes. Nodes are of two types: worker and master nodes, and each kind contains various agents that run different services. A cluster typically contains at least one master node and several worker nodes.
Cluster networking allows different components in the cluster nodes to communicate. Different nodes in a cluster are exposed to its virtual private network using interfaces. The different agents then communicate with each other using assigned ports. The complete list of ports needed for cluster networking can be found in the official Kubernetes documentation.
POD Networking
One of the main problems to address in Kubernetes networking is enabling highly coupled container-to-container communications. POD networking is used to solve this. To eliminate the need for intentional network segmentation policies, Kubernetes outlines a few fundamental requirements for any networking solution. These include:
- Each POD gets assigned an IP Address so that direct links do not need to be made between PODs. This also eliminates the need to map container ports to host ports.
- A POD in a node can communicate with PODs on all other nodes without Network Address Translation (NAT)
- Node agents can communicate with all PODs in the node
- PODs running in the host network of a node (for instance, on Linux) can communicate with PODs on all other nodes without NAT
Every POD has a real IP address that it uses to communicate with other PODs. From a POD's standpoint, it has its own network namespace that needs to communicate with other network namespaces. Communication between these namespaces is achieved as described in the Network Namespaces section of this course.
Pod-to-Pod networking can be implemented so that Pods can communicate within the same node or across separate nodes. Every node has a Classless Inter-Domain Routing (CIDR) block with a defined set of IP addresses for all PODs hosted on the node. The PODs communicate through virtual Ethernet devices (veth pairs). A veth pair is a link that connects network interfaces across namespaces: one end of the pair is assigned to the host's root namespace, while the other is attached to the POD's namespace.
In large production environments, routes are not configured on each server, as this would make network setup very complex. All hosts can use a special router as the default gateway. The router’s routing tables contain routing information about all other routers in the network. Each node’s bridge network becomes part of a larger cluster network that allows all PODs to communicate with other PODs in the cluster.
When a container is created, the kubelet agent on the node looks at the CNI configuration specified in its command-line arguments. It then looks in the CNI binaries directory and executes the appropriate plug-in, which attaches the container's network namespace to the POD network according to that configuration.
CNI in Kubernetes
Kubernetes uses CNI plugins to provide network interfaces for cluster components and to clean up POD networks.
The CNI plugin to be used for a specific cluster is specified by passing the --network-plugin=cni option to the kubelet service. The kubelet then sets up a network for every POD using the CNI configuration found in the directory given by --cni-conf-dir. This configuration is written according to the CNI specification and lists the required network plugins, whose binaries must be present in the directory given by --cni-bin-dir.
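On a node, these locations can be inspected directly; the paths below are the conventional defaults and may differ in a given setup:
ls /etc/cni/net.d/   # CNI configuration files read by the kubelet
ls /opt/cni/bin/     # plugin binaries available to the kubelet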
CNI Weave
Weave enables flexible routing between different nodes by creating an overlay mesh network that covers every node in the cluster. The network relies on agents installed on each node to form a complete mesh. These agents continually exchange information about the cluster's topology to keep an up-to-date routing table.
Weave Net relies on two packet-forwarding methods to send traffic to a POD on a different node: fast datapath and sleeve. With fast datapath, Weave relies on the Linux kernel's Open vSwitch datapath module so that packets can be forwarded to other nodes without leaving the kernel. When fast datapath cannot be used, the slower sleeve method is employed, in which the Weave routers forward packets themselves, learning MAC addresses as traffic hops from peer to peer. This process is refined over time, allowing for more intelligent and faster routing of future requests.
If a CNI-enabled Kubernetes cluster already exists, Weave Net can be installed using a single command:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
When Weave is deployed in a cluster, it creates a DaemonSet that ensures a Weave POD runs on every node and comes back up whenever the cluster restarts. Each POD runs a Weave Net agent as a peer and typically encapsulates two containers: weave and weave-npc. The weave-npc container is the controller responsible for implementing Kubernetes network policies.
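To verify the deployment, the DaemonSet and its PODs can be listed; weave-net is the name used by the standard manifest, so adjust it if your installation differs:
kubectl get daemonset weave-net -n kube-system
kubectl get pods -n kube-system -l name=weave-net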
IPAM Weave
The IP Address Manager (IPAM) is a suite of tools that enables the administration of the Domain Name System (DNS) and the Dynamic Host Configuration Protocol (DHCP). DNS and DHCP are the services in a network that enable the assignment and resolution of IP addresses for machines in the network. IPAM automates the discovery of IP addresses and DNS servers, allowing them to be managed from a single platform. IPAM also divides large blocks of the IP allocation range so that containers get unique IP addresses on the overlay mesh network.
Weave Net allocates IP addresses in the 10.32.0.0/12 range by default. To override this, the --ipalloc-range option in the plugin configuration has to be changed. Any time a new mesh network is created, the IP address range is declared; Weave Net then shares the range dynamically across all peers, and addresses are assigned according to cluster needs. IPAM data persists on disk so that IP address information and configuration are available whenever a peer restarts. This data is typically stored in a data volume container named weavedb.
Service Networking
Pods in a node can communicate with each other directly without needing NAT or routing tables. Pod IP addresses are, however, not durable since Pods will be brought up and taken down in response to scaling needs or node crashes.
A Service in Kubernetes is an abstraction layer that defines a logical set of Pods and a policy by which to access them. A Service decouples clients from the actual Pods by providing a stable IP address and DNS name for the set, making it easier for clients to consume the application the Pods provide.
The Pods associated with the service are determined by a selector in the service configuration that points to a label in the Pod configuration files. Simply put, services in Kubernetes provide an abstract way to run several Pods as a network.
The default Kubernetes Service type is ClusterIP. A ClusterIP Service has an internal IP address that is only accessible from within the cluster, exposing an application to other Pods in the same cluster.
The NodePort Service type exposes a specified port on all nodes in the cluster, allowing external traffic to reach the Service directly. Pods connect to the Service through its cluster IP address, and the Service exposes the application on the same port on every node, through which external users can then access it.
Another popular Service type is LoadBalancer, which allows external users to access cluster resources through an external load balancer.
When a Service object is created, Kubernetes assigns it a stable, reliable IP address picked from a pool of available Service IP addresses. Using the internal cluster DNS, Kubernetes also assigns a unique hostname to each Service. Any healthy Pod backing the Service can then be reached using either the hostname or the ClusterIP.
The kube-proxy component manages the connection between Pods and Services. It watches the kube-apiserver and maps Service cluster IPs to healthy Pods by adding and removing Network Address Translation rules on the nodes. These rules can be implemented using various modes, including userspace, IPVS, and iptables (the default). A Service configuration can also remap its listening port by specifying the port through which clients access the application and a targetPort on which the application inside the Pod listens.
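As a minimal sketch (my-app is a hypothetical Deployment name), a Service with distinct port and targetPort values can be created imperatively and then inspected:
kubectl expose deployment my-app --name=my-app-svc --port=80 --target-port=8080
kubectl get svc my-app-svc         # shows the assigned ClusterIP and port
kubectl get endpoints my-app-svc   # lists the healthy Pod IPs behind the Service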
DNS in Kubernetes
This section explores how name resolution is handled for cluster Pods and services.
When a cluster is initiated, an internal DNS server is created by default to hold records for Pods and services. This makes it easy to access the services using DNS names that are simpler and more consistent.
If a Pod needs to reach a Service within its namespace, it can do so just by specifying the Service DNS name. To access a service in a different namespace, the namespace is specified in the DNS query.
The Kubernetes DNS server creates a subdomain for every namespace. To access the test-svc service in the prod namespace from a Pod in the default namespace, the client URL is used as shown:
curl http://test-svc.prod
All Pods and Services are grouped into two subdomains within a namespace: every Service falls under the svc subdomain, while Pods fall under the pod subdomain. All Pods and Services are also exposed through the root domain cluster.local. This means that the complete DNS name for the test-svc service is test-svc.prod.svc.cluster.local, and it can be accessed via the client URL as follows:
curl http://test-svc.prod.svc.cluster.local
By default, Kubernetes does not create DNS records for Pods, although this can be enabled. A Pod's DNS name is obtained by replacing the dots (.) in its IP address with dashes (-). For instance, a Pod with the IP address 172.17.0.1 in the default namespace within the cluster.local domain will have the DNS name:
172-17-0-1.default.pod.cluster.local
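To check this resolution from inside the cluster, a DNS lookup can be run from any Pod (my-pod is a placeholder name, and the Pod image must include nslookup):
kubectl exec -it my-pod -- nslookup test-svc.prod.svc.cluster.local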
CoreDNS in Kubernetes
In Kubernetes versions earlier than v1.12, the kube-dns server performed internal DNS resolution. From v1.12 onwards, Kubernetes includes CoreDNS as the default internal DNS service. CoreDNS is a flexible, fast, and efficient DNS server that can be used in various environments because it relies on chained plugins for specific DNS functions.
The Kubernetes plugin performs DNS-based Kubernetes Service discovery. In earlier Kubernetes versions, the CoreDNS Kubernetes plugin could be configured to replace kube-dns in the cluster.
With the Kubernetes plugin, administrators can configure various options, including the application endpoint URL, TLS certificates, the namespaces to be exposed, labels of objects to be exposed, the time-to-live (TTL) for responses, and fallthrough zones, among others. The complete list of the CoreDNS Kubernetes plugin configuration options can be accessed here. The plugin options and configurations are specified in a Corefile.
CoreDNS is deployed as a Deployment running two Pods in the kube-system namespace. Each Pod runs an instance of the CoreDNS executable, and the Corefile configuration is passed to them through a ConfigMap object.
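To see how this is wired up in a running cluster, the default kube-system objects can be inspected; note that, for backward compatibility, the Service in front of CoreDNS is still named kube-dns:
kubectl -n kube-system get deployment coredns
kubectl -n kube-system get configmap coredns -o yaml
kubectl -n kube-system get service kube-dns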
The default Corefile ConfigMap configuration looks like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
The CoreDNS Pod watches for any new Pods brought up and creates a record for them in a database. The database also includes search entries allowing services to be discovered using hostnames and other identifiable tags.
Ingress
In large, production-grade Kubernetes clusters, connecting users to an application will typically involve multiple services such as ClusterIP, NodePort, Reverse Proxy, and the Load Balancer services, among others. In Kubernetes, Ingress networking involves combining different services in a single object that is configured within the cluster.
The Ingress API supports path-based routing, virtual hosts, and TLS termination, which allow for the setup of a network load balancer that handles multiple backend services. This allows running cluster services to be accessible to users in a public or private cloud.
Just like any Kubernetes implementation, Ingress includes a controller and a resource object. An Ingress resource is created from a definition file, and an appropriate Ingress controller implementation is then chosen for the cluster. Kubernetes supports and maintains three Ingress controllers: AWS, GCE, and Nginx. Other third-party controller projects available for Kubernetes clusters include HAProxy, Apache APISIX, Citrix, Gloo, Skipper, Traefik, and Voyager, among others. The complete list of available controllers can be found here.
Configuring the Ingress Resource
Just like any other Kubernetes resource, Ingress configuration files are defined using the apiVersion, kind, and metadata fields. The apiVersion is networking.k8s.io/v1, and the kind is Ingress. The metadata field includes the Ingress name, which should be a valid DNS subdomain name, while the spec field is used to configure rules for incoming traffic requests and to configure the load balancer and proxy services.
Every Ingress rule contains the following specifications:
- Host (optional): The HTTP traffic rules are applied to the host specified in this field. If no host is specified, the rule is enforced on all HTTP traffic coming in through the specified IP address.
- Paths: A list of paths, each defining an associated backend with a service.name and a service.port.name or service.port.number. For the load balancer to direct traffic to a particular service, the path and host must match those specified in the incoming HTTP request's header.
- Backend: Combines a Service name with a Service port; a backend can also reference another Kubernetes resource instead of a Service. The backend receives the requests that the Ingress has matched against the host and path of the rule.
Ingress can be implemented by different controllers within the same cluster, and these controllers are configured differently. For each Ingress, a class is specified by referencing an IngressClass resource, which points to the specific controller that implements that class.
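The classes available in a cluster can be listed with the command below; the chosen class is then referenced from the Ingress via the spec.ingressClassName field:
kubectl get ingressclass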
Ingress Flavors
Ingress types vary by how they expose cluster services. Kubernetes offers options to expose a single service without needing Ingress, such as NodePort and LoadBalancer Services. Ingress can nonetheless be implemented for a single service by specifying a default backend with no rules. The spec field of the definition file will look similar to:
spec:
  defaultBackend:
    service:
      name: test
      port:
        number: 80
The simple fanout Ingress type forwards traffic from a single IP address to different services based on the HTTP URL specified in the request. In this case, the different services are distinguished using paths. The spec field of this type of Ingress will look similar to:
spec:
  rules:
  - host: darwin.com
    http:
      paths:
      - path: /mail
        pathType: Prefix
        backend:
          service:
            name: service1
            port:
              number: 4200
      - path: /sales
        pathType: Prefix
        backend:
          service:
            name: service2
            port:
              number: 8080
Ingress can be secured by specifying a secret containing a TLS certificate and Private Key. Referencing this secret in the Ingress then tells the Ingress Controller to secure communications between the Load Balancer and the client using Transport Layer Security (TLS).
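A hedged sketch of creating such a secret (the secret name and file paths are placeholders) looks like this; the secret is then referenced under spec.tls in the Ingress definition via its secretName field:
kubectl create secret tls my-tls-secret --cert=path/to/tls.crt --key=path/to/tls.key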
Ingress Controllers come with default configurations and policies that enforce load balancing but do not feature advanced concepts. These features are accessible in Load Balancers defined for various services and deployment environments.
Some of the most common commands used when dealing with Ingress include:
| Task | Command |
| --- | --- |
| Create an Ingress resource | kubectl apply -f <ingress-file-name> |
| View Ingress configuration | kubectl describe ingress <ingress-name> |
| View the state of an added Ingress | kubectl get ingress <ingress-name> |
| Update an Ingress | kubectl replace -f <ingress-file-name> |
| Edit an Ingress configuration | kubectl edit ingress <ingress-name> |
This concludes the Networking section of the CKA certification exam.
You can now proceed to the next part of this series: Certified Kubernetes Administrator Exam Series (Part-9): Troubleshooting
Here is the previous part of the Certified Kubernetes Administrator Exam Series: Certified Kubernetes Administrator Exam Series (Part-7): Storage
Research Questions
Here is a quick quiz with a few questions and sample tasks to help you assess your knowledge. Leave your answers in the comments below and tag us back.
Quick Tip – Questions below may include a mix of sample tasks, DOMC, and MCQ types.
1. If we use Docker as our container runtime, what is the interface/bridge created by Docker on this host?
[A] bridge
[B] docker0
[C] ens3
[D] eth0
2. What is the default port the kube-scheduler is listening on in the control plane node?
[A] 8080
[B] 10259
[C] 6443
[D] 2380
3. 10.X.X.X is the Pod IP address range configured by weave.
[A] True
[B] False
4. What type of proxy is the kube-proxy configured to use?
[A] firewalld
[B] ipvs
[C] iptables
[D] userspace
5. Where is the configuration file located for configuring the CoreDNS service?
[A] /etc/Corefile
[B] /etc/coredns/Corefile
[C] /var/coredns/Corefile
[D] /root/Corefile
[E] /etc/kubedns/Corefile
6. Task: You are requested to add a new path to your ingress to make the food delivery application available to your customers. Make the new application available at /eat with the following specifications:
Ingress: ingress-wear-watch
Path: /eat
Backend Service: food-service
Backend Service Port: 8080
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
  name: ingress-wear-watch
  namespace: app-space
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: food-service
            port:
              number: 8080
        path: /eat
        pathType: Prefix
Summary
This section offers an in-depth exploration of all concepts needed to configure communication for applications running on Kubernetes. KodeKloud’s lectures include prerequisite networking lessons and extensive practical labs to ensure familiarity with all concepts related to networking in Containers and Kubernetes. With the concepts covered in this section, the candidate will be familiar with configuring & managing access to Kubernetes applications and cluster networks.
One of the best ways to prepare for Kubernetes certification exams is to learn and practice what you are learning. This will help you internalize the materials. Consider taking our exam preparation course.