Operating Systems & Linux Theory

This section covers foundational theoretical knowledge about the Linux kernel, operating system architecture, POSIX standards, and system administration concepts (e.g., LPIC-1 or CompTIA Linux+ notes).

A common hurdle in Linux system administration occurs when an upstream software project assumes one directory structure, but an OS package manager (like Debian's apt) enforces another.

For example, when installing Kubernetes networking plugins (CNI):

  • The upstream kube-flannel installer strictly places its binaries in /opt/cni/bin/.
  • Debian's containerd package is built to look for CNI plugins in /usr/lib/cni/ by default.

When containerd attempts to execute the flannel binary, it fails with a file not found error because of this mismatch.

Rather than altering the global configuration files of core services (which might be overwritten during the next apt upgrade), a common Linux administration pattern is to use symbolic links (symlinks).

ln -s /opt/cni/bin/flannel /usr/lib/cni/flannel

A symlink is a small filesystem entry that stores the path of another file. When containerd opens /usr/lib/cni/flannel, the kernel follows the link and resolves it to the real binary in /opt/cni/bin/, so the Debian-packaged service finds the plugin where it expects it while the upstream project keeps its own layout. Bridging path mismatches like this with ln -s is a core skill for Linux sysadmins.
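The redirection can be observed without touching the real system directories. The following is a minimal sketch that rebuilds both paths under a scratch directory; the stub script and the demo variable are illustrative assumptions, not part of any real flannel install.

```shell
# Scratch-directory demo: nothing here touches the real /opt or /usr.
demo=$(mktemp -d)
mkdir -p "$demo/opt/cni/bin" "$demo/usr/lib/cni"

# Stand-in for the real flannel binary (a stub, for illustration only).
printf '#!/bin/sh\necho "flannel stub"\n' > "$demo/opt/cni/bin/flannel"
chmod +x "$demo/opt/cni/bin/flannel"

# Same pattern as the ln -s command above, relative to the scratch root.
ln -s "$demo/opt/cni/bin/flannel" "$demo/usr/lib/cni/flannel"

# readlink prints the path stored inside the link itself.
readlink "$demo/usr/lib/cni/flannel"

# Opening the link path reaches the target file, so the stub runs.
sh "$demo/usr/lib/cni/flannel"
```

On a real node the same check is readlink /usr/lib/cni/flannel, which should print the path under /opt/cni/bin/.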

Case Study: Containerd & Flannel CNI

In a bare-metal Kubernetes cluster, if a pod is stuck in ContainerCreating with the error failed to find plugin "flannel" in path [/usr/lib/cni], it's because:

  • kube-flannel (upstream) installs binaries to /opt/cni/bin/
  • containerd (Debian package) looks in /usr/lib/cni/

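Instead of editing containerd's global config.toml (a change that can be lost or cause conflicts when the package is upgraded), symlinking the plugins into place (ln -sf /opt/cni/bin/* /usr/lib/cni/) elegantly resolves the conflict. The glob only links the binaries that exist when the command runs, so it has to be repeated if plugins are added later.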
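For a repeatable version of that one-liner, the linking step can be wrapped in a small function. This is only a sketch: link_cni_plugins is a hypothetical helper name, and the two directories are passed as arguments so the function can be tried on scratch paths first.

```shell
# link_cni_plugins SRC DST
# Symlink every executable regular file in SRC into DST.
# Safe to re-run: ln -sf replaces links that already exist.
link_cni_plugins() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for plugin in "$src"/*; do
        # Skip directories, non-executables, and an unmatched glob.
        [ -f "$plugin" ] && [ -x "$plugin" ] || continue
        ln -sf "$plugin" "$dst/$(basename "$plugin")" || return 1
    done
    return 0
}

# On a real node this would be run as root, for example:
#   link_cni_plugins /opt/cni/bin /usr/lib/cni
```

Re-running the function after new plugins appear in the source directory picks them up, which the plain glob does not do on its own.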


Memory Management: The Swap Problem

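Swap space is disk storage that the kernel uses as overflow for physical RAM. The Linux kernel moves inactive memory pages to swap to free up RAM and postpone Out-Of-Memory (OOM) kills. While this works well on general-purpose servers, Kubernetes has traditionally required swap to be disabled on every node: newer releases offer opt-in swap support, but by default the kubelet still refuses to start while swap is enabled.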

Theoretical Conflict

The Kubernetes Scheduler (kube-scheduler) is responsible for placing Pods onto Worker Nodes based on their resource requests. (Limits are enforced later, on the node itself.)

If a Pod requests 512Mi of RAM, the scheduler subtracts that from the node's allocatable memory. It relies on simple, deterministic arithmetic: if Node A has 8Gi of allocatable RAM and runs 10 Pods whose requests total 8Gi, the node is considered completely full.
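The bookkeeping described above can be written out as plain arithmetic. The numbers below are hypothetical, and the fits function is only an illustration of the comparison, not the scheduler's actual code.

```shell
# Hypothetical node and pod figures, all in Mi.
allocatable_mi=8192
requests_mi="512 512 1024 2048"

# Sum the memory requests of the pods already placed on the node.
committed_mi=0
for r in $requests_mi; do
    committed_mi=$((committed_mi + r))
done
free_mi=$((allocatable_mi - committed_mi))
echo "committed=${committed_mi}Mi free=${free_mi}Mi"

# fits REQUEST_MI: succeeds when the request fits in what is left.
fits() {
    [ "$1" -le "$free_mi" ]
}

fits 4096 && echo "a 4096Mi pod fits"
fits 4608 || echo "a 4608Mi pod does not fit"
```

Nothing in this calculation measures how much memory is really in use or where it lives, which is why silent swapping undermines it.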

If the OS is allowed to quietly swap memory to disk:

  1. The scheduler loses its deterministic view of available resources. It might treat memory as available while the node is actually thrashing on a slow disk.
  2. Performance becomes highly unpredictable. A database pod that needs low latency might have its pages silently swapped to disk by the kernel, ruining its performance with no signal that Kubernetes can act on.

To guarantee Quality of Service (QoS), Kubernetes expects the memory a process allocates to be backed by fast, physical RAM. For that reason, running swapoff -a (and keeping swap disabled across reboots) is a standard prerequisite: in its default configuration, the kubelet will not start while swap is active.
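Because swapoff -a only lasts until the next reboot, swap entries in /etc/fstab also need to be disabled. The sketch below shows one way to check for active swap and to comment out fstab swap lines; swap_active and comment_out_swap are hypothetical helper names, and both take a file argument so they can be tried on copies first.

```shell
# swap_active [FILE]
# Succeeds if the swaps table lists at least one active device.
# FILE defaults to /proc/swaps, whose first line is a header.
swap_active() {
    [ "$(wc -l < "${1:-/proc/swaps}")" -gt 1 ]
}

# comment_out_swap FILE
# Print FILE with every uncommented entry whose filesystem type
# (third field) is "swap" prefixed by "#". FILE itself is not modified.
comment_out_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# On a real node (as root), roughly:
#   swapoff -a
#   cp /etc/fstab /etc/fstab.bak
#   comment_out_swap /etc/fstab.bak > /etc/fstab
#   swap_active || echo "swap is off"
```

Writing the filtered output from a backup copy avoids truncating /etc/fstab while it is still being read.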


Systemd and Power Management

Modern Linux distributions (like Debian) use systemd as their init system and service manager. It is responsible for booting the OS, managing daemons, and handling hardware events like power buttons and laptop lids.

systemd-logind

The daemon responsible for user logins, seat management, and hardware power events is systemd-logind. By default on most Linux distributions, logind is configured for a desktop user experience. When the ACPI subsystem reports a lid-close event, logind assumes the user is putting the computer in a bag and suspends the machine (HandleLidSwitch defaults to suspend).

The Headless Server Conflict

When converting old laptops into headless Kubernetes nodes (bare-metal servers), this default behavior becomes a critical failure point. A server must remain online 24/7. Closing the lid to stack the laptops neatly will instantly sever network connections and take the node offline.

Configuration via logind.conf

To convert a laptop into a proper server, we must instruct systemd-logind to ignore the lid switch event. This is controlled via the /etc/systemd/logind.conf configuration file.

Set the HandleLidSwitch directive in the [Login] section of that file:

HandleLidSwitch=ignore

This overrides the default suspend action. When the lid is closed, the firmware will typically still turn off the LCD backlight (saving power), but the OS keeps running uninterrupted. After modifying the configuration, restart the daemon (systemctl restart systemd-logind) to apply the change.
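An alternative to editing logind.conf directly is a drop-in file under /etc/systemd/logind.conf.d/, which keeps the local override separate from the packaged file. The following is a sketch under stated assumptions: write_lid_dropin and the file name 10-ignore-lid.conf are hypothetical choices, and the optional ROOT argument exists only so the function can be tried on a scratch directory.

```shell
# write_lid_dropin [ROOT]
# Create a logind drop-in that ignores the lid switch.
# ROOT defaults to the empty string, i.e. the real filesystem root.
write_lid_dropin() {
    dir="${1:-}/etc/systemd/logind.conf.d"
    mkdir -p "$dir" || return 1
    printf '[Login]\nHandleLidSwitch=ignore\n' > "$dir/10-ignore-lid.conf"
}

# On a real node (as root):
#   write_lid_dropin
#   systemctl restart systemd-logind
```

Settings in the drop-in take precedence over the same keys in the main logind.conf, so the packaged file can stay untouched.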