How To Automate Node Recovery with Ansible
When a bare-metal node suffers a hard power loss, it often boots into emergency mode with a read-only root filesystem (which prevents containerd from starting) or comes back up with a swap partition re-enabled (which causes the kubelet to crash).
Manually SSHing into each broken node to run fsck, remount the drive, edit /etc/fstab, and restart services is slow and error-prone. This guide shows you how to automate the entire recovery process using a single Bash script wrapping Ansible Ad-Hoc commands.
Prerequisites
- Ansible installed on your admin workstation.
- SSH key-based access to the nodes.
- A local .env file containing the sudo passwords for your nodes (e.g., PASS_k8s_worker_01=mysecretpassword).
Step 1: Create the Recovery Script
Create a new file named recover-node.sh and make it executable (chmod +x recover-node.sh).
#!/usr/bin/env bash
#
# Purpose: Automates bare-metal node recovery.
#
# What it does:
# 1. Reads the sudo password dynamically from .env to bypass interactive prompts
# 2. Runs an automatic filesystem check (fsck -y) on the root drive
# 3. Remounts the root drive as Read/Write to fix the read-only lockout
# 4. Removes swap from /etc/fstab and runs swapoff to fix kubelet crash loops
# 5. Restarts containerd and kubelet
set -eo pipefail

if [ -z "$1" ]; then
  echo "Usage: ./recover-node.sh <node-hostname>"
  exit 1
fi
NODE="$1"

# Dynamically parse the password from .env for the requested node.
# Hyphens in the hostname become underscores: k8s-worker-01 -> PASS_k8s_worker_01
ENV_VAR="PASS_${NODE//-/_}"
if ! grep -q "^${ENV_VAR}=" .env; then
  echo "❌ Error: Could not find ${ENV_VAR} in .env file."
  exit 1
fi
# We use cut instead of sourcing to avoid executing arbitrary .env code
BECOME_PASS=$(grep "^${ENV_VAR}=" .env | cut -d '=' -f2-)

echo "============================================================"
echo "🚑 Initiating Node Recovery for: $NODE"
echo "============================================================"

# Define a helper function to run Ansible ad-hoc commands silently using the password
run_ansible() {
  local cmd="$1"
  ansible "$NODE" -i ansible/inventory/hosts.yaml -m shell -a "$cmd" -b -e "ansible_become_pass=$BECOME_PASS" > /dev/null
}

echo "==> [1/4] Attempting filesystem repair (fsck)..."
# Note: If the drive is already mounted R/W, fsck will abort cleanly. We use || true to ignore the expected non-zero exit code.
run_ansible "ROOT_DEV=\$(findmnt -n -o SOURCE /) && fsck -y \$ROOT_DEV || true"

echo "==> [2/4] Remounting root filesystem as Read/Write..."
run_ansible "mount -o remount,rw /"

echo "==> [3/4] Disabling Swap and removing from fstab..."
# This step runs after the remount: sed -i cannot rewrite /etc/fstab while root is read-only,
# and under set -e that failure would abort the script before the repair steps.
run_ansible "swapoff -a || true"
run_ansible "sed -i '/swap/d' /etc/fstab"

echo "==> [4/4] Restarting containerd and kubelet..."
run_ansible "systemctl restart containerd kubelet"

echo "✅ Recovery commands issued for $NODE."
echo "Run 'kubectl get nodes' in a few moments to verify."
Step 2: Understand the Mechanics
The script relies on a handful of techniques that are worth understanding:
Bypassing Ansible Prompts
Normally, to run a sudo command with Ansible, you pass -K (--ask-become-pass), which halts the script and waits for human input. By reading the password from the .env file and passing it directly via -e "ansible_become_pass=$BECOME_PASS", the script runs without any interactive prompt.
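As a minimal sketch of the lookup that feeds ansible_become_pass, the snippet below maps a hostname to its variable name and extracts the value from a throwaway .env file. The file contents and the helper name lookup_become_pass are hypothetical. It uses tr in place of the script's Bash-only ${NODE//-/_} expansion so that it also runs under plain sh.

```shell
# Throwaway .env with made-up values, so the real file is never touched.
demo_env="$(mktemp)"
printf '%s\n' \
  'PASS_k8s_worker_01=s3cret=with=equals' \
  'PASS_k8s_worker_02=other' > "$demo_env"

# Hypothetical helper: print the become password stored for a node.
# $1 = node hostname, $2 = path to the .env file
lookup_become_pass() {
  # k8s-worker-01 -> PASS_k8s_worker_01 (hyphens are not valid in variable names)
  var_name="PASS_$(printf '%s' "$1" | tr '-' '_')"
  # cut -f2- keeps everything after the first '=', so the value may contain '='.
  grep "^${var_name}=" "$2" | cut -d '=' -f2-
}

become_pass="$(lookup_become_pass k8s-worker-01 "$demo_env")"
echo "PASS_k8s_worker_01 resolves to: $become_pass"
rm -f "$demo_env"
```

Because the value is cut out as plain text and never sourced, anything else in the .env file is treated as data, not as shell code.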
Dynamic Root Discovery
Instead of hardcoding /dev/sda1 (which might be wrong if the node has NVMe drives, e.g., /dev/nvme0n1p2), the script uses findmnt -n -o SOURCE / to dynamically ask the Linux kernel exactly which block device is serving the root filesystem before running fsck.
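To see what the discovery step returns on a given machine, a sketch along these lines can be run locally. It assumes findmnt from util-linux is installed. The helper name mount_is_readonly is hypothetical, and the "unknown" fallback exists only so the snippet does not fail where findmnt is unavailable.

```shell
# Hypothetical helper: succeed if a comma-separated mount option list contains "ro".
# Wrapping the list in commas avoids matching options like errors=remount-ro.
mount_is_readonly() {
  case ",$1," in
    *,ro,*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Ask findmnt for the device, filesystem type, and mount options of "/".
root_dev="$(findmnt -n -o SOURCE / 2>/dev/null || true)"
root_fs="$(findmnt -n -o FSTYPE / 2>/dev/null || true)"
root_opts="$(findmnt -n -o OPTIONS / 2>/dev/null || true)"

echo "root device: ${root_dev:-unknown} (${root_fs:-unknown})"
if mount_is_readonly "$root_opts"; then
  echo "root is mounted read-only"
else
  echo "root is mounted read/write (or its options could not be read)"
fi
```

Checking the options first is one way to tell in advance whether the fsck and remount steps are likely to be needed.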
The || true Safety Net
When you run fsck on a filesystem that is already mounted read/write, it refuses to run and exits with a non-zero status. Without || true, that failure would propagate through Ansible to run_ansible, and set -e would abort the script at this step. Appending || true tells the remote shell: "If fsck fails because the drive is already in use, ignore it and keep going." The trade-off is that genuine fsck failures are ignored as well.
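The effect of || true under set -e can be reproduced locally without touching a disk. In this sketch, fake_fsck is a hypothetical stand-in for the remote fsck call. The second variant records the status instead of discarding it; it is an alternative, not what the script above does.

```shell
# Stand-in for the remote fsck call; 8 is fsck's "operational error" status.
fake_fsck() { return 8; }

# Pattern from the script: swallow the failure so set -e does not abort.
out_ignore="$(set -e; fake_fsck || true; echo "continued")"

# Variant: keep going, but remember the status for later reporting.
out_record="$(set -e; rc=0; fake_fsck || rc=$?; echo "continued rc=$rc")"

echo "$out_ignore"
echo "$out_record"
```

Both subshells reach their final echo. Only the second still knows that the command failed.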
Repairing Read-Only Lockdowns (fsck -y)
If a node suffers a sudden power loss, the Ext4 filesystem journal may not flush correctly, resulting in a "dirty" state. To protect your data, the Linux kernel will forcefully mount the drive as Read-Only upon reboot. This instantly breaks containerd and locks out Ansible.
Running fsck -y (filesystem check, answering yes to every repair prompt) replays the journal and repairs inconsistent metadata such as corrupted inodes. Once the filesystem is clean, mount -o remount,rw / switches the drive back to read/write mode without requiring a reboot.
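Because the script discards fsck's exit status, it can help to know what that status would have reported. The sketch below decodes it using the bit values listed in the fsck(8) manual page. The function name describe_fsck_status is hypothetical, and the script above does not call it.

```shell
# Hypothetical helper: translate an fsck exit status (a sum of bit flags) into words.
describe_fsck_status() {
  status="$1"
  if [ "$status" -eq 0 ]; then
    echo "no errors"
    return 0
  fi
  msg=""
  # Bit values and meanings as listed in fsck(8).
  for pair in "1:errors corrected" "2:system should be rebooted" \
              "4:errors left uncorrected" "8:operational error" \
              "16:usage or syntax error" "32:cancelled by user" \
              "128:shared-library error"; do
    bit="${pair%%:*}"
    text="${pair#*:}"
    if [ $(( $status & $bit )) -ne 0 ]; then
      msg="${msg:+$msg, }$text"
    fi
  done
  echo "$msg"
}

describe_fsck_status 0
describe_fsck_status 1
describe_fsck_status 12
```

A status of 1 is the outcome you would hope for after a power loss: problems were found and repaired.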
Killing Rogue Swap Mounts
By default, the kubelet refuses to start while swap is active. Even if you ran swapoff -a during installation, systemd's fstab generator re-activates swap partitions (like /dev/sda3) on every reboot for as long as they are listed in /etc/fstab. When that happens, the kubelet exits on startup and crash-loops.
The script runs swapoff -a and then uses sed to delete every line containing "swap" from /etc/fstab, so the kubelet can start cleanly now and after the next reboot.
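To preview what the sed expression removes, it can be tried on a scratch copy first. The sketch below uses a made-up fstab in a temporary file and also shows a narrower awk filter for comparison. The awk variant is an assumption-labeled alternative that the script does not use.

```shell
# Sample fstab in a temp file (made-up UUIDs); the real /etc/fstab is not touched.
demo_fstab="$(mktemp)"
cat > "$demo_fstab" <<'EOF'
# / was on /dev/sda1 during installation
UUID=1111-aaaa / ext4 errors=remount-ro 0 1
# swap was on /dev/sda3 during installation
UUID=2222-bbbb none swap sw 0 0
EOF

# What the script does: drop every line that mentions "swap", comments included.
after_sed="$(sed '/swap/d' "$demo_fstab")"

# Narrower alternative: keep comments, drop only entries whose type field is "swap".
after_awk="$(awk '$1 ~ /^#/ || $3 != "swap"' "$demo_fstab")"

echo "--- sed '/swap/d' ---"
echo "$after_sed"
echo "--- awk on the type field ---"
echo "$after_awk"
rm -f "$demo_fstab"
```

Both variants remove the active swap entry. The sed version also deletes the comment that mentions swap, which is harmless but worth knowing before running it against a hand-maintained fstab.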
Step 3: Run the Recovery
When a node drops out of the cluster, run the script with the node's hostname as its only argument (./recover-node.sh <node-hostname>).
Within seconds, the read-only and swap locks are removed, containerd is restarted, the kubelet connects to the CRI, and the node flips back to the Ready state in Kubernetes.
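To avoid polling kubectl get nodes by hand, the recovery and the verification can be chained. This is a sketch under assumptions: the wrapper name recover_and_wait, the DRY_RUN switch, and the hostname k8s-worker-01 are hypothetical, and it assumes the Kubernetes node name matches the Ansible inventory hostname.

```shell
# Hypothetical wrapper: recover a node, then block until Kubernetes reports it Ready.
# $1 = node hostname, $2 = optional timeout for kubectl wait (default 180s)
recover_and_wait() {
  node="$1"
  timeout="${2:-180s}"
  # With DRY_RUN set, print the commands instead of running them.
  run="${DRY_RUN:+echo}"
  $run ./recover-node.sh "$node" &&
    $run kubectl wait --for=condition=Ready "node/$node" --timeout="$timeout"
}

# Dry run, so nothing is executed: shows the two commands that would be issued.
DRY_RUN=1 recover_and_wait k8s-worker-01
```

Dropping the DRY_RUN=1 prefix runs the commands for real. kubectl wait then exits non-zero if the node does not become Ready within the timeout.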