Layer 2 Load Balancing and ARP
In cloud environments (like AWS or GCP), exposing a service to the outside world is as simple as setting its type to LoadBalancer. The cloud provider's integration controller intercepts this request, provisions a physical or virtual load balancer, assigns it a public IP, and routes traffic into your cluster.
In a bare-metal environment, this integration does not exist. If you create a LoadBalancer service, its external IP stays in the Pending state indefinitely, because nothing in the cluster ever assigns one.
To bridge this gap, tools like MetalLB or kube-vip implement network load balancing directly within the cluster using standard networking protocols. The most common implementation for single-subnet homelabs is Layer 2 Load Balancing using ARP (Address Resolution Protocol).
How ARP Normally Works
When a computer (or router) wants to send an IP packet to a specific IP address on the local network (e.g., 192.168.1.200), it cannot just send the packet into the void. At the data link layer (Layer 2 of the OSI model), network switches only understand MAC Addresses, not IP addresses.
To find the MAC address associated with an IP address, the router uses ARP:
- The router sends a broadcast message to the entire local network: "Who has 192.168.1.200? Tell 192.168.1.1 (the router)."
- The device that actually owns 192.168.1.200 replies with a unicast message: "I have 192.168.1.200, and my MAC address is AA:BB:CC:DD:EE:FF."
- The router caches this mapping in its ARP table.
- The router forwards the physical Ethernet frames to AA:BB:CC:DD:EE:FF.
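The exchange above can be modeled in a few lines of Python. This is a toy simulation rather than real networking code: the `Host` and `Lan` classes, the `resolve` function, and the router's MAC address are all made up for illustration, and no packets are actually sent.

```python
class Host:
    """A device on the LAN with one IP, one MAC, and its own ARP cache."""

    def __init__(self, name, ip, mac):
        self.name = name
        self.ip = ip
        self.mac = mac
        self.arp_table = {}  # maps IP address -> MAC address

    def handle_arp_request(self, target_ip):
        # Only the owner of the IP answers; everyone else stays silent.
        return self.mac if target_ip == self.ip else None


class Lan:
    """A single broadcast domain: every host sees every ARP request."""

    def __init__(self, hosts):
        self.hosts = hosts

    def broadcast_who_has(self, target_ip, asker):
        for host in self.hosts:
            if host is asker:
                continue
            mac = host.handle_arp_request(target_ip)
            if mac is not None:
                return mac  # the unicast reply from the owner
        return None  # nobody owns this IP


def resolve(asker, lan, target_ip):
    """Return the MAC for target_ip, using the cache before broadcasting."""
    if target_ip in asker.arp_table:
        return asker.arp_table[target_ip]
    mac = lan.broadcast_who_has(target_ip, asker)
    if mac is not None:
        asker.arp_table[target_ip] = mac  # cache the mapping
    return mac


# Hypothetical router MAC; the other addresses come from the example above.
router = Host("router", "192.168.1.1", "00:00:5E:00:53:01")
server = Host("server", "192.168.1.200", "AA:BB:CC:DD:EE:FF")
lan = Lan([router, server])
print(resolve(router, lan, "192.168.1.200"))
```

After the first call, the router's `arp_table` holds the mapping, so later lookups for the same IP skip the broadcast.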
How Layer 2 Load Balancing Hacks ARP
A bare-metal load balancer (like MetalLB) leverages ARP to "trick" the router into sending traffic to the Kubernetes cluster.
1. The IP Pool
You provide MetalLB with a block of IP addresses that are on the same subnet as your router, but are excluded from the router's DHCP pool (so the router never assigns them to a laptop or phone). For example: 192.168.1.200 - 192.168.1.250.
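Before handing a range to MetalLB, it can help to sanity-check it. The sketch below uses Python's standard `ipaddress` module; the `/24` subnet and the DHCP range are assumptions chosen for illustration (only the pool range comes from the example above), and `expand_range` and `check_pool` are hypothetical helper names.

```python
import ipaddress


def expand_range(first, last):
    """List every IPv4 address from first to last, inclusive."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end < start:
        raise ValueError("range end is before range start")
    return [ipaddress.IPv4Address(n) for n in range(start, end + 1)]


def check_pool(subnet, pool, dhcp):
    """Report pool size, addresses outside the subnet, and DHCP overlaps."""
    network = ipaddress.ip_network(subnet)
    pool_addrs = expand_range(*pool)
    dhcp_addrs = set(expand_range(*dhcp))
    return {
        "size": len(pool_addrs),
        "outside_subnet": [str(a) for a in pool_addrs if a not in network],
        "overlaps_dhcp": [str(a) for a in pool_addrs if a in dhcp_addrs],
    }


report = check_pool(
    subnet="192.168.1.0/24",                  # assumed home subnet
    pool=("192.168.1.200", "192.168.1.250"),  # MetalLB pool from the text
    dhcp=("192.168.1.10", "192.168.1.199"),   # assumed router DHCP range
)
print(report)
```

An empty `outside_subnet` and `overlaps_dhcp` list means the pool is on the router's subnet and cannot collide with a DHCP lease.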
2. The Election (IP Ownership)
When you deploy a LoadBalancer service in Kubernetes:
- MetalLB assigns the service an IP from the pool (e.g., 192.168.1.200).
- The MetalLB "speaker" pods running on each physical node communicate with each other using a leader election protocol.
- They elect exactly one physical node to be the "owner" of 192.168.1.200.
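One way to picture such an election is a deterministic ranking that every speaker can compute on its own and still reach the same answer. The sketch below is a toy illustration of that idea, not MetalLB's actual code; `elect_owner` and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def elect_owner(service_ip, healthy_nodes):
    """Pick one node for service_ip; same inputs always give the same winner."""
    if not healthy_nodes:
        raise ValueError("no healthy nodes available to own the IP")

    def rank(node):
        # Hash the node name together with the IP so that different
        # service IPs tend to land on different nodes.
        return hashlib.sha256(f"{node}#{service_ip}".encode()).hexdigest()

    return min(healthy_nodes, key=rank)


nodes = ["Worker-01", "Worker-02", "Control-Plane-01"]
owner = elect_owner("192.168.1.200", nodes)
print(f"192.168.1.200 is owned by {owner}")
```

Because the result depends only on the set of healthy nodes and the IP, no node needs to be told who won. Each speaker just needs an up-to-date view of which peers are alive.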
3. The Gratuitous ARP Broadcast
The elected node (let's say it's Worker-01 with MAC 11:22:33:44:55:66) immediately sends out a Gratuitous ARP broadcast to the network.
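At the byte level, the frame Worker-01 broadcasts might look like the sketch below. It assumes the request-style encoding, where the sender IP and target IP are both set to the announced address (a gratuitous ARP can also be sent as a reply). `build_gratuitous_arp` is a hypothetical helper that only builds the bytes; actually transmitting them would need a raw socket and root privileges.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac):
    """Convert 'AA:BB:CC:DD:EE:FF' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(mac, ip):
    """Build a 42-byte Ethernet frame announcing that ip lives at mac."""
    sender_mac = mac_to_bytes(mac)
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType.
    ethernet_header = struct.pack(
        "!6s6sH", BROADCAST_MAC, sender_mac, ETHERTYPE_ARP
    )
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # MAC address length
        4,            # IPv4 address length
        1,            # opcode: request
        sender_mac,   # sender MAC: the new owner
        ip_bytes,     # sender IP: the announced address
        b"\x00" * 6,  # target MAC: unused in an announcement
        ip_bytes,     # target IP: same as sender IP
    )
    return ethernet_header + arp_payload


frame = build_gratuitous_arp("11:22:33:44:55:66", "192.168.1.200")
print(len(frame), frame.hex())
```

The broadcast destination is what lets every device on the segment, including the router, see the announcement without having asked for it.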
A Gratuitous ARP is an unprompted announcement. The node essentially yells to the router and all switches:
"Update your caches! The IP address 192.168.1.200 is now located at my MAC address: 11:22:33:44:55:66!"
4. Traffic Flow
Now, when external traffic hits the router destined for the service at 192.168.1.200:
- The router looks at its ARP table and sees MAC 11:22:33:44:55:66.
- It forwards the packets to Worker-01.
- Worker-01 receives the packets.
- kube-proxy (running on Worker-01) intercepts the packets using iptables or IPVS rules, and routes them to the correct backend Pods across the cluster.
Note: Even though Worker-01 is the single entry point for this specific IP, the actual application pods can be running on Worker-02 or Control-Plane-01. kube-proxy handles the internal cross-node routing automatically.
High Availability and Failover
The beauty of this design is its resilience. What happens if the physical machine Worker-01 loses power or crashes?
- The MetalLB speaker pods on the surviving nodes detect that the leader (Worker-01) has stopped sending heartbeats.
- They immediately hold a new election.
- Worker-02 (MAC AA:BB:CC:11:22:33) is elected as the new owner for 192.168.1.200.
- Worker-02 instantly blasts a new Gratuitous ARP broadcast to the network: "Update your caches! 192.168.1.200 is now located at MY MAC address: AA:BB:CC:11:22:33!"
- The physical router updates its ARP table.
- Within a few seconds, traffic seamlessly resumes flowing into the cluster via Worker-02.
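The failover sequence above can be sketched as a small simulation. Everything here is illustrative: the heartbeat timestamps, the 3-second timeout, and the `live_nodes` and `failover` helpers are assumptions, and the "election" is simplified to picking the first surviving node alphabetically.

```python
def live_nodes(last_heartbeat, now, timeout):
    """Return the nodes whose most recent heartbeat is within the timeout."""
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen <= timeout
    )


def failover(owner, last_heartbeat, now, timeout, node_macs, router_arp, vip):
    """Keep the current owner if it is alive; otherwise elect and announce."""
    alive = live_nodes(last_heartbeat, now, timeout)
    if owner in alive:
        return owner  # nothing to do
    if not alive:
        raise RuntimeError("no surviving node can take over the IP")
    new_owner = alive[0]  # placeholder for the real election
    # Effect of the gratuitous ARP: the router overwrites its cached entry.
    router_arp[vip] = node_macs[new_owner]
    return new_owner


node_macs = {
    "Worker-01": "11:22:33:44:55:66",
    "Worker-02": "AA:BB:CC:11:22:33",
}
router_arp = {"192.168.1.200": "11:22:33:44:55:66"}
# Worker-01 was last heard from 10 seconds ago; Worker-02 one second ago.
last_heartbeat = {"Worker-01": 100.0, "Worker-02": 109.0}

owner = failover(
    "Worker-01", last_heartbeat, now=110.0, timeout=3.0,
    node_macs=node_macs, router_arp=router_arp, vip="192.168.1.200",
)
print(owner, router_arp)
```

The time between the crash and the cache update is bounded by the heartbeat timeout plus the election, which is why recovery takes seconds rather than being instantaneous.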
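This gives your bare-metal Kubernetes cluster automatic failover for its service IPs without needing a dedicated physical hardware load balancer appliance. Keep in mind that all traffic for a given IP still enters through one node at a time, so Layer 2 mode provides failover rather than spreading incoming traffic across nodes.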
The strictARP Requirement
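For MetalLB's ARP hacking to work correctly, there is one critical Kubernetes configuration that must be changed when kube-proxy runs in IPVS mode.
In IPVS mode, kube-proxy binds every Service IP, including LoadBalancer IPs such as 192.168.1.200, to a local dummy interface (kube-ipvs0) on every node. With the default kernel ARP settings, any node may then answer an ARP request for the VIP, not just the node MetalLB elected. If one of those replies reaches the router before the MetalLB speaker's reply, the router may send traffic to a node other than the elected owner, which undermines MetalLB's ownership and failover.
To prevent this, set strictARP: true in the ipvs section of the kube-proxy configuration. This setting adjusts the node's arp_ignore and arp_announce kernel parameters so that the node stops answering ARP requests for addresses bound to kube-ipvs0, leaving the Virtual IPs completely under MetalLB's control. Clusters running kube-proxy in iptables mode do not need this change.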
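As a rough sketch, the check can be expressed against a kube-proxy configuration that has already been parsed into a Python dict. In the KubeProxyConfiguration format the relevant fields are `mode` and `ipvs.strictARP`, and the setting matters when `mode` is `ipvs`. The helper names `needs_strict_arp` and `enable_strict_arp` are hypothetical, and loading or writing the real ConfigMap is left out.

```python
import copy


def needs_strict_arp(kube_proxy_config):
    """True if kube-proxy runs in IPVS mode without strictARP enabled."""
    mode = kube_proxy_config.get("mode", "")
    ipvs = kube_proxy_config.get("ipvs") or {}
    strict = ipvs.get("strictARP", False)
    return mode == "ipvs" and not strict


def enable_strict_arp(kube_proxy_config):
    """Return a copy of the config with ipvs.strictARP set to True."""
    updated = copy.deepcopy(kube_proxy_config)
    ipvs = updated.get("ipvs") or {}
    ipvs["strictARP"] = True
    updated["ipvs"] = ipvs
    return updated


# A trimmed-down example config; real ones carry many more fields.
config = {
    "apiVersion": "kubeproxy.config.k8s.io/v1alpha1",
    "kind": "KubeProxyConfiguration",
    "mode": "ipvs",
    "ipvs": {"strictARP": False},
}

if needs_strict_arp(config):
    config = enable_strict_arp(config)
print(config["ipvs"])
```

After changing the real configuration, the kube-proxy pods have to be restarted before the new setting takes effect.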
Layer 4 vs Layer 7 (MetalLB + Ingress Controllers)
A common point of confusion is the relationship between MetalLB and Ingress Controllers (like NGINX or Traefik). Why do you need both?
- MetalLB operates at Layers 3 and 4 (Network/Transport): It only understands IP addresses and TCP/UDP ports. It gets traffic from the router to a node.
- NGINX Ingress operates at Layer 7 (Application): It understands HTTP requests, URLs, and Domain Names.
If you have 10 different web applications in your cluster, you could use MetalLB to give each one a different IP address (e.g., 192.168.1.201, .202, .203). But you only have 51 IPs in your pool (192.168.1.200 through 192.168.1.250, inclusive), and giving every application its own address uses them up quickly.
Instead, the modern pattern is to deploy one Ingress Controller.
- MetalLB assigns a single VIP (e.g., 192.168.1.200) to the NGINX Ingress Controller service.
- You point 10 different DNS records (app1.homelab.local, app2.homelab.local) to that single IP.
- Traffic arrives at NGINX. NGINX inspects the HTTP Host header of the request to figure out which application the user is asking for, and routes the traffic to the correct backend pod.
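The Host-based routing step can be illustrated with a minimal sketch. This is not how NGINX is implemented; it only shows the idea of mapping the Host header to a backend. The `ROUTES` table, the backend names, and `pick_backend` are hypothetical, and the port stripping does not handle IPv6 literals.

```python
# Hypothetical routing table: hostname -> backend service inside the cluster.
ROUTES = {
    "app1.homelab.local": "app1-service:80",
    "app2.homelab.local": "app2-service:80",
}


def pick_backend(raw_request, routes):
    """Return the backend for the request's Host header, or None if unknown."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    # Skip the request line ("GET / HTTP/1.1") and scan the headers.
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
            if ":" in host:
                host = host.rsplit(":", 1)[0]  # drop an explicit port
            return routes.get(host)
    return None


request = (
    b"GET /dashboard HTTP/1.1\r\n"
    b"Host: app2.homelab.local\r\n"
    b"Accept: */*\r\n"
    b"\r\n"
)
print(pick_backend(request, ROUTES))
```

Both hostnames resolve to the same IP, so the Host header is the only thing that tells the two requests apart.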