2. Kubernetes Network Structure
After presenting some basic concepts, we can now look at the basic network structure of Kubernetes.
2.1 Some points
I don't think we need to go over every basic concept of k8s here, but one thing is worth making clear again: in k8s, a namespace is a logical space that groups Pods to keep them isolated. This is similar in spirit to the basic usage of a Linux namespace, but while a Linux network namespace gives its members a separate IP, a k8s namespace has no real effect on IP addressing, so we will not consider k8s namespaces further here.
- Each Pod is assigned its own unique IP address
In k8s, each Pod is assigned a unique IP address, so Pods can communicate with each other via these IPs, as the picture above shows.
- Containers inside a Pod share the same IP
In the image above, Pod1 contains two containers, container01 and container02. Since Pod1's IP is 192.168.101.1, both containers share that same IP address, 192.168.101.1.
- A Service provides a stable IP and ties to Pods
A Service provides an abstraction over a group of pods, allowing for load balancing and a stable IP/DNS name for accessing them. The Service abstraction (via iptables) ensures that traffic gets routed correctly.
- iptables works in both inter-node and intra-node traffic
Normally, iptables works at Layer 3 (the network layer), but as we will see, it is involved even in traffic that is ultimately delivered over the Layer 2 (data link layer) bridge: when dealing with a ClusterIP Service, Kubernetes sets up iptables rules to handle the Service-to-pod translation.
- kube-proxy's core responsibility is to manage the iptables (or IPVS) rules that map Service IPs to pod IPs.
It handles load balancing and routing, making it possible for a single Service IP to represent multiple pods. kube-proxy's rules rewrite packet destinations so that traffic reaches the right pod, even though the incoming request targeted the Service IP.
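To make that mapping concrete, here is a minimal Python sketch of what kube-proxy's rules achieve conceptually. It is illustrative only: the real translation happens in kernel iptables/IPVS rules, and the addresses are the example values used later in this article.

```python
import itertools

SERVICE_IP = "10.101.100.102"               # ClusterIP of svc_B (example)
ENDPOINTS = ["192.168.1.3", "192.168.1.4"]  # pod IPs behind the Service

_rr = itertools.cycle(ENDPOINTS)  # simple round-robin endpoint selection

def dnat(packet: dict) -> dict:
    """Mimic the DNAT step: rewrite a Service-bound destination to a pod IP."""
    if packet["dst"] == SERVICE_IP:
        packet = {**packet, "dst": next(_rr)}  # pick one backend pod
    return packet

print(dnat({"src": "192.168.1.1", "dst": SERVICE_IP}))  # -> 192.168.1.3
print(dnat({"src": "192.168.1.1", "dst": SERVICE_IP}))  # -> 192.168.1.4
```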
2.2 Four scenarios
2.2.1 Containers inside a Pod
The first scenario is communication between containers in the same Pod. From the key points listed in the first part, we know that containers in the same Pod share the same IP address, so they can reach each other simply via localhost:PORT.
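As a minimal sketch, the Python below imitates this with two threads standing in for the two containers: one serves HTTP on port 8080 (an arbitrary example port), and the other reaches it via localhost, just as a sidecar container in the same Pod would.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# "Container A": an HTTP server listening on port 8080 (example port).
class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"hello from container A")

server = HTTPServer(("127.0.0.1", 8080), Hello)
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Container B": since containers in a Pod share one network namespace,
# plain localhost is enough to reach container A.
print(urllib.request.urlopen("http://localhost:8080").read())
server.shutdown()
```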
2.2.2 Intra-node communication
As the name suggests, intra-node communication means communication among Pods inside one node.
Since each Pod is assigned its own IP address, Pods can contact each other directly by IP via the bridge, as we discussed in the previous article.
Note that these pod IPs are routable across nodes in the cluster, which is managed by the CNI plugin. Therefore this IP-based approach can be used both within and across nodes, as shown below.
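For instance, from inside one pod you could reach a neighbouring pod directly by its pod IP. The address and port below are made up; this only works inside a cluster whose CNI routes pod IPs, against a pod that actually serves HTTP there.

```python
import urllib.request

# Hypothetical pod IP and port; run this from inside another pod.
print(urllib.request.urlopen("http://192.168.101.2:8080", timeout=2).read())
```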
2.2.2.1 Service
However, in practice it does not always work that way. Normally, Pods are tied to a Service first, and traffic goes through the Service. Why a Service? There are several reasons this is better than contacting Pods directly:
1. Stable IP and DNS:
Pod IPs are ephemeral: Pods in Kubernetes are not permanent. They can be created, deleted, or replaced (e.g., when scaling or restarting). Every time a pod is restarted or rescheduled on a different node, it gets a new IP address.
Without a Service: imagine that, for some reason, one pod goes down and k8s recreates a new one; any application relying on that pod's IP would break, since the IP changes.
Services integrate with Kubernetes DNS. Each Service gets a DNS name that other pods can use to discover it. For example, myapp-svc can be resolved to the Service IP (see the sketch after this list). Without a Service, you'd have to manage the DNS resolution or IP discovery for every pod, which becomes unmanageable as the number of pods or nodes increases.
2. Load balancing:
A Service can route traffic to multiple pods that serve the same application. This is essential for achieving horizontal scaling and load balancing.
Without a Service, you'd need to handle the logic of distributing traffic across multiple pod IPs manually.
The Service provides automatic load balancing between the pods (endpoints) behind it, ensuring even distribution of traffic.
3. Decoupling consumers from implementation:
Services provide a level of abstraction that decouples the consumers of the service (other pods, external clients) from the underlying pods running the application.
Consumers only need to know about the Service's IP or DNS name (e.g., myapp-svc), without worrying about which pods are running the application or their IPs. This allows you to update, scale, or replace pods without impacting how they are accessed.
Therefore, without Services, managing dynamic pods and ensuring reliable access to applications in a Kubernetes cluster would be very difficult, especially in large or scalable deployments.
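To see the DNS point in practice: instead of tracking pod IPs, a pod can simply resolve the Service name. A minimal sketch, assuming a hypothetical Service named myapp-svc in the default namespace; the lookup only succeeds from inside such a cluster.

```python
import socket

# Resolves to the Service's stable ClusterIP, not an ephemeral pod IP.
# Fails outside a cluster, or if no Service named myapp-svc exists.
print(socket.gethostbyname("myapp-svc.default.svc.cluster.local"))
```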
2.2.2.2 Some key points of intra-node traffic:
In this part we will give some critical points. For Pod-to-Pod communication, whether intra-node or inter-node, the traffic flow is similar, so the detailed traffic flow is left to the next part (inter-node traffic). Here we use a ClusterIP Service as the example for explanation.
Pod1 makes a request to a ClusterIP Service (e.g., svc_B).
- Even if Pod1 (the source pod) and Pod2 (the destination pod) are on the same node, Pod1 does not send the request directly to Pod2's IP. Instead, as explained above, to get a stable IP, Pod1 sends the request to the Service IP (ClusterIP), which is tied to Pod2.
iptables comes into play on the same node.
Even for intra-node traffic, iptables rules are set up by Kubernetes to handle Service-to-pod translation (NAT).
When Pod1 sends a request to the ClusterIP, it hits the node's iptables rules. These rules perform destination NAT (DNAT), which translates the ClusterIP into the actual IP address of one of the pods behind the Service (Pod2 in this case).
The iptables rules ensure that the traffic is routed to the correct pod, based on endpoint selection (which pod the Service should route to).
The traffic is routed locally to Pod2:
kube-proxy plays the main role here, maintaining the iptables rules that map the Service IP to pod IPs.
Once the ClusterIP is translated to Pod2's IP address, the traffic is handed off to the Layer 2 bridge and forwarded through the veth pair on the node for local delivery.
So the intra-node traffic flow via a ClusterIP Service looks like this:
1. Pod1 sends traffic to the ClusterIP of the Service.
2. iptables (at Layer 3) intercepts the traffic and uses NAT to translate the ClusterIP into Pod2's IP.
3. The traffic is then forwarded locally within the same node using the Linux bridge (Layer 2).
4. Pod2 receives the traffic on its eth0.
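The toy Python trace below walks through these four steps. It is purely illustrative: the real DNAT and bridging happen in the kernel, and the addresses are this article's example values.

```python
POD1, POD2 = "192.168.1.1", "192.168.1.2"
CLUSTER_IP = "10.101.100.102"             # svc_B's ClusterIP
SERVICE_ENDPOINTS = {CLUSTER_IP: [POD2]}  # mapping maintained by kube-proxy

def intra_node_send(src: str, dst: str) -> None:
    print(f"1. {src} sends traffic to {dst}, out through its eth0/veth pair")
    if dst in SERVICE_ENDPOINTS:          # Layer 3: iptables DNAT
        dst = SERVICE_ENDPOINTS[dst][0]
        print(f"2. iptables DNAT: {CLUSTER_IP} -> {dst}")
    print("3. Linux bridge forwards the frame locally (Layer 2)")
    print(f"4. {dst} receives the traffic on its eth0")

intra_node_send(POD1, CLUSTER_IP)
```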
2.2.3 Inter-node communication
Next we will talk about inter-node Pod communication, where traffic flows from Pod1 to svc_B and then reaches the destination Pod2.
1. Pod1 (IP: 192.168.1.1) sends a request.
- Source IP: 192.168.1.1 (Pod1's IP)
- Destination IP: 10.101.100.102 (svc_B's ClusterIP)
2. The request passes through Pod1's eth0 and enters the veth pair.
- Pod1's eth0 is paired with a veth interface on the node side (veth1).
3. The Kubernetes Service (ClusterIP) comes into play:
- svc_B (10.101.100.102) is a ClusterIP Service tied to Pod2 (192.168.1.2).
- Kubernetes performs load balancing across the pods tied to svc_B. This load balancing is typically done using iptables or IPVS (depending on the configuration of the cluster).
4. NAT (Network Address Translation) using iptables or IPVS:
- If svc_B is tied to more than one Pod, the ClusterIP (10.101.100.102) does not point directly to a single pod.
- Instead, when traffic hits svc_B's ClusterIP (10.101.100.102), iptables (or IPVS) performs DNAT (Destination NAT) to map the ClusterIP to one of the pods behind the Service, for example either Pod3 (192.168.1.3) or Pod4 (192.168.1.4). A round-robin or other load-balancing algorithm decides which pod receives the traffic.
- This is the critical step where load balancing happens (see the sketch at the end of this walk-through).
5. Packet rewriting:
- The destination IP is rewritten from 10.101.100.102 (svc_B's ClusterIP) to 192.168.1.2 (Pod2). If there is more than one Pod, the target is chosen by the load-balancing decision.
- The source IP remains 192.168.1.1 (Pod1's IP), and the destination IP becomes 192.168.1.2 (Pod2).
6. Traffic flows via the node's bridge:
- The packet moves from veth1 (Pod1's veth on the node) across the bridge in the node's network namespace.
- If Pod2 is on the same node (scenario 2: intra-node), the packet stays within the same node and reaches Pod2's eth0.
- If Pod2 is on a different node, the packet leaves the node via the node's eth0 interface and is routed to the appropriate node by the CNI plugin (Calico, Flannel, etc.).
7. Pod2 receives the traffic:
- Once the packet reaches Pod2 (192.168.1.2), the destination IP is the actual IP of the pod that received the traffic.
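One note on the load-balancing step: in kube-proxy's iptables mode, endpoint selection is random rather than strictly round-robin, implemented as a chain of probability rules via the iptables statistic module. Below is a toy Python equivalent of that rule chain, using the example addresses above.

```python
import random
from collections import Counter

ENDPOINTS = ["192.168.1.3", "192.168.1.4"]  # Pod3 and Pod4 behind svc_B

def pick_endpoint() -> str:
    # Rule i matches with probability 1/(n - i); the last rule always
    # matches, so overall each endpoint is chosen uniformly at random.
    n = len(ENDPOINTS)
    for i, ep in enumerate(ENDPOINTS):
        if random.random() < 1.0 / (n - i):
            return ep
    return ENDPOINTS[-1]  # not reached; the last rule has probability 1

print(Counter(pick_endpoint() for _ in range(10_000)))  # roughly 50/50
```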
2.2.4 External request
Imagine a client outside the cluster sending an HTTP request to an application hosted on a pod. How does the request move through the cluster network and reach the pod? Here we will use NodePort rather than LoadBalancer to illustrate the basic traffic flow. As for LoadBalancer and Ingress (Gateway API), we may discuss them later.
1. External Request to Node’s Public IP
- The client sends the request directly to NodeIP:NodePort, which is mapped to the Service handling that application.
2. NodePort to ClusterIP Service
The request now reaches the NodePort on the node, which is configured to map traffic to the ClusterIP Service (a stable internal IP) associated with the destination pod(s).
NAT translation:
Source NAT (SNAT): the source IP is rewritten (typically to the node's IP) so that the reply traffic returns through the same node.
Destination NAT (DNAT): the destination IP is translated from the ClusterIP to the selected pod's IP.
Through this translation, the ClusterIP Service selects one of the backend pods and distributes the incoming traffic to it, as the toy sketch below shows.
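Here is a toy before-and-after view of these two rewrites. All addresses are made up, and the real rewriting is done by kernel iptables rules, not application code.

```python
CLIENT = "203.0.113.7"                       # external client (example)
NODE_IP, NODE_PORT = "198.51.100.10", 30080  # a node's IP + its NodePort
POD_IP, POD_PORT = "192.168.1.2", 8080       # selected backend pod

packet = {"src": CLIENT, "dst": f"{NODE_IP}:{NODE_PORT}"}
print("before NAT:", packet)

packet["dst"] = f"{POD_IP}:{POD_PORT}"  # DNAT: NodePort -> backend pod
packet["src"] = NODE_IP                 # SNAT: reply returns via this node
print("after NAT: ", packet)
```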
3. The selection from Service to Pod (intra-Node or inter-Nodes)
Intra-node traffic: If the selected pod is on the same node, the traffic flows directly from the Service to the pod’s IP within the same node.
Inter-nodes traffic: If the selected pod is on a different node, the Service routes the traffic across nodes via the CNI plugin, which establishes inter-node networking. The traffic is directed to the pod’s IP on the correct node.
4. Reaching the Pod’s eth0 Interface and Reply
Once the request reaches the destination node, it flows through the veth pair connecting the pod to the node’s network namespace.
The packet arrives at the pod's eth0 interface, with the source IP appearing as the node's IP (due to SNAT) and the destination IP as the pod's IP.
Finally, the application can process and respond to the request, sending a response back to the client by reversing the flow.
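From the client's point of view, the whole flow is just an ordinary HTTP request to NodeIP:NodePort. A minimal sketch with placeholder values; NodePorts are allocated from the 30000-32767 range by default.

```python
import urllib.request

NODE_IP, NODE_PORT = "198.51.100.10", 30080  # placeholder node IP and NodePort

resp = urllib.request.urlopen(f"http://{NODE_IP}:{NODE_PORT}/", timeout=5)
print(resp.status, resp.read()[:100])
```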
2.3 Further Reading
Above I have only given a rough explanation of the network model. If you are interested in investigating further, here is a very detailed article:
A Guide to the Kubernetes Networking Model
Next
In the next article we will build a simple example for the scenarios above.