commit ac695f506ff7dfa71b07b9c5c95fc114ea4c5919 Author: Michal Rydlikowski Date: Sun Mar 15 23:50:43 2026 +0000 first commit diff --git a/architecture.md b/architecture.md new file mode 100644 index 0000000..0788c2e --- /dev/null +++ b/architecture.md @@ -0,0 +1,246 @@ +# Architecture Decisions + +## Core Principles + +1. Build for homelab first, design for AWS/multi-cloud from the start +2. Labels as the universal abstraction — config attaches to labels, not machines +3. Code is the policy — declarations grant access, no separate policy management +4. Availability over consistency — stale data is acceptable, no data is not +5. No single point of failure — everything works offline with local cache +6. Don't reinvent the wheel — wrap existing tools, build the glue and UX +7. One engine everywhere — CLI, server, and init all use the same code path + +## The Tool: "lab" + +Unified infrastructure lifecycle platform. Full spec in `lab-tool-spec.md`. + +### Component Dependency Map + +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ LAB PLATFORM │ +│ │ +│ ┌─────────────────────────────────────────────────────────────┐ │ +│ │ CORE (no external deps) │ │ +│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────────┐ │ │ +│ │ │ Label │ │ Group │ │ Targeting│ │ Render Engine │ │ │ +│ │ │ Engine │ │ Engine │ │ Engine │ │ (CLI tables, │ │ │ +│ │ │ │ │ │ │ │ │ TUI, diff) │ │ │ +│ │ └──────────┘ └──────────┘ └──────────┘ └───────────────┘ │ │ +│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │ +│ │ │ Profile │ │ State Store │ │ Plugin Registry │ │ │ +│ │ │ Engine │ │ (SQLite + │ │ │ │ │ +│ │ │ (t-shirt │ │ Litestream) │ │ │ │ │ +│ │ │ sizes) │ │ │ │ │ │ │ +│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +│ ▲ depends on core │ +│ ┌────┴────────────────────────────────────────────────────────┐ │ +│ │ LIFECYCLE (depends on: core + providers) │ │ +│ │ 
┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │ +│ │ │ Lifecycle │ │ Artifact │ │ K8s Deployer │ │ │ +│ │ │ Manager │ │ Builder │ │ │ │ │ +│ │ │ (plan/apply/ │ │ (puppet → │ │ │ │ │ +│ │ │ destroy) │ │ container) │ │ │ │ │ +│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +│ ▲ depends on lifecycle │ +│ ┌────┴────────────────────────────────────────────────────────┐ │ +│ │ IDENTITY & SECRETS (depends on: lifecycle) │ │ +│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │ +│ │ │ Identity │ │ Secret Store │ │ Token Issuer │ │ │ +│ │ │ Manager │ │ (privileged │ │ (one-time join │ │ │ +│ │ │ (enroll, │ │ label, local│ │ tokens) │ │ │ +│ │ │ DNS, certs, │ │ cache, git │ │ │ │ │ +│ │ │ SSH keys) │ │ backup) │ │ │ │ │ +│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +│ ▲ depends on identity │ +│ ┌────┴────────────────────────────────────────────────────────┐ │ +│ │ OBSERVABILITY (depends on: core + identity) │ │ +│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │ +│ │ │ Health │ │ Alert │ │ Audit Log │ │ │ +│ │ │ Aggregator │ │ Generator │ │ │ │ │ +│ │ │ │ │ (auto + user │ │ │ │ │ +│ │ │ │ │ defined) │ │ │ │ │ +│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─────────────────────────────────────────────────────────────┐ │ +│ │ INTERFACES (depends on: everything above) │ │ +│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │ │ +│ │ │ gRPC/REST│ │ CLI │ │ TUI │ │ Web UI │ │ │ +│ │ │ API │ │ (cobra) │ │(bubbletea)│ │ (future) │ │ │ +│ │ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────────────┘ + +PROVIDER PLUGINS (external, loaded at 
runtime): + ┌────────────┐ ┌────────────┐ ┌──────────────┐ ┌────────────┐ + │provider-aws│ │provider- │ │provider- │ │provider-k8s│ + │ (Pulumi) │ │xcpng (XO) │ │baremetal │ │ (Pulumi) │ + └────────────┘ └────────────┘ │(Tinkerbell) │ └────────────┘ + └──────────────┘ +HEALTH PLUGINS: IDENTITY PLUGINS: + ┌────────────┐ ┌──────────┐ ┌───────────┐ ┌─────────────┐ + │health- │ │health- │ │id-openvox │ │id-dns │ + │prometheus │ │naemon │ │ │ │ │ + └────────────┘ └──────────┘ └───────────┘ └─────────────┘ + ┌────────────┐ ┌───────────┐ ┌─────────────┐ + │health- │ │id-ssh-ca │ │id-secret │ + │cloudwatch │ │ │ │ │ + └────────────┘ └───────────┘ └─────────────┘ +``` + +### Build Order (what depends on what) + +``` +Phase 1: CORE (can be built and tested independently) + ├── Label Engine + ├── Group Engine (depends on: labels) + ├── Targeting Engine (depends on: labels, groups) + ├── Profile Engine (t-shirt sizes) + ├── Render Engine + ├── State Store (SQLite + Litestream) + ├── Plugin Registry + ├── CLI framework (cobra) + └── gRPC/REST API skeleton + +Phase 2: PROVIDERS (can be built in parallel, each independent) + ├── provider-ssh (simplest, needed for onboarding existing machines) + ├── provider-baremetal (PXE boot — embedded DHCP/TFTP/HTTP server) + ├── provider-portainer (deploy via Portainer API) + ├── provider-k8s (needed for k8s deployments) + ├── provider-aws (Pulumi AWS) + └── provider-xcpng (Pulumi XO / XO REST API) + +Phase 3: LIFECYCLE (depends on: core + at least one provider) + ├── Lifecycle Manager (plan/apply/destroy) + ├── Onboarding (lab onboard — SSH detect + PXE boot + auto-enroll) + ├── Hardware detection (suggest labels from detected CPU/GPU/RAM/disk) + ├── Local mode (lab init --local, engine on user device) + ├── Self-deploy (lab init — deploy to remote target) + ├── Self-migration (lab server migrate) + └── Artifact Builder (puppet → container) + +Phase 4: IDENTITY (depends on: lifecycle) + ├── Token Issuer (one-time join tokens) + ├── OpenVox 
Enroller (cert signing, node classification) + ├── DNS Manager (auto-registration, IP mobility) + ├── SSH CA integration + └── Secret Store (privileged label, local cache, git backup) + +Phase 5: OBSERVABILITY (depends on: core + identity) + ├── Health Aggregator (Prometheus, Naemon, CloudWatch plugins) + ├── Alert Generator (auto + user-defined, targeting engine) + ├── Four-pillar status (sync + puppet + health + identity) + └── Audit log + +Phase 6: UX POLISH + ├── TUI (bubbletea, k9s-style, cross-linked navigation) + ├── lab show / lab targets (visibility commands) + ├── lab render (multi-provider comparison) + └── Web UI (future) +``` + +### Key Concepts + +| Concept | Description | +|---------|-------------| +| **Labels** | Universal abstraction. Config (puppet classes, alerts, secrets, sizes) attaches to labels | +| **Groups** | Composable, nested, with exclusions. Target by label, group, server, environment | +| **Targeting** | Unified query syntax used everywhere: alerts, secrets, puppet, queries | +| **Four Pillars** | Every resource shows: Sync + Puppet + Health + Identity | +| **Profiles** | T-shirt sizing with per-provider mappings, user-owned | +| **Secret Store** | Privileged label holding all secrets, machines get only entitled subset | +| **Code = Policy** | `lab::secret()` in puppet code = usage AND access declaration | +| **Artifact Builder** | Same puppet modules → VM config OR container image | +| **Self-deploy** | Lab deploys itself using same engine as everything else | +| **Visibility** | Two-way: server→everything applied, label→all servers affected | + +## Infrastructure Stack + +| Layer | Homelab | AWS Equivalent | Status | +|-------|---------|----------------|--------| +| Orchestration | k3s | EKS | Decided | +| IaC engine | Pulumi | Pulumi | Decided | +| GitOps | ArgoCD | ArgoCD | Decided | +| Monitoring (k8s) | Prometheus + Grafana | Prometheus + Grafana | Decided | +| Monitoring (infra) | Naemon | N/A (bare metal only) | Decided | +|
Secrets backend | TBD | TBD | Needs investigation | +| DNS | PowerDNS + ExternalDNS | Route53 + ExternalDNS | Decided — see `dns-research.md` | +| TLS / CA | TBD | TBD | Needs investigation | +| SSH CA | TBD | TBD | Needs investigation | +| Storage | Longhorn | EBS CSI | Decided | +| Config mgmt | OpenVox | OpenVox | Decided | +| Bare metal boot | Tinkerbell / iPXE | N/A | Needs investigation | +| State store | SQLite + Litestream | SQLite + Litestream | Leading candidate | +| Container build | Buildah / Docker | Buildah / Docker | Needs investigation | + +## Decisions Made + +| Decision | Choice | Why | Alternatives Considered | +|----------|--------|-----|------------------------| +| IaC engine | Pulumi | Real languages, plan/preview, component packages, XCP-ng provider exists | Terraform (no abstraction), Crossplane (no plan) | +| Config mgmt | OpenVox | Puppet fork, Apache 2.0, existing modules, active community | Puppet (Perforce EULA, 25-node limit) | +| Multi-cloud abstraction | Custom (Lab) | Nothing exists that does labels + plan + bare metal + XCP-ng | Crossplane (no plan), Terraform (re-implement per cloud) | +| Kubernetes | k3s | Puppet-friendly, multi-arch, lightweight, same K8s API as EKS | OpenShift (fights puppet), Talos (no SSH/puppet), MicroK8s (snap-based) | +| Target OS list | Ubuntu, Debian, Fedora, AlmaLinux, XCP-ng, VyOS | Multi-arch, each with different install automation | See `os-install-research.md` | +| State store | NOT etcd | etcd crashes over serving stale data, availability > consistency | Leading: SQLite + Litestream | +| Secret access model | Code = policy | Declarations in code/labels auto-grant access, no manual Vault policies | Manual Vault policy management | +| Secret distribution | Privileged store + local cache | Prevents secret sprawl, machines only get entitled secrets | Peer-to-peer sync (leaks secrets sideways) | +| Resilience model | Offline-capable | Local cache keeps everything running, git backup for DR | Central 
server dependency (FreeIPA burned us) | +| Bootstrap | Self-deploying | lab init uses same engine as lab apply, no special codepath | Separate init provider interface | + +## Evaluated and Rejected + +| Tool | Why Rejected | Details | +|------|-------------|---------| +| **Crossplane** | No plan/preview — dealbreaker for enterprise | `crossplane-evaluation.md` | +| **Foreman** | Obsolete, poor UX, user has used it | Memory: `feedback_foreman.md` | +| **Terraform/OpenTofu** | No multi-platform abstraction | Re-implement per cloud at thousands of nodes | +| **MAAS** | Bare metal only | No cloud VMs, no Puppet integration | +| **OpenShift** | Fights external config mgmt, heavy, limited ARM | See `kubernetes-flavors.md` | +| **Talos** | Immutable OS, no SSH, no puppet | Incompatible with our approach | +| **MicroK8s** | Snap-based | Puppet managing snaps is awkward | +| **HashiCorp Vault** | Not impressed, central-server mindset | Will evaluate alternatives (OpenBao, Infisical, etc.) | +| **etcd** | Consistency over availability | Crashes rather than serving stale data | +| **FreeIPA** | Unstable | Good features (DNS, SSH, CA, secrets) but unreliable | + +## Investigation Queue + +Things we've identified but haven't evaluated yet, in rough priority order: + +| # | Topic | Context | Options to Investigate | +|---|-------|---------|----------------------| +| 1 | Secret backend | Distributed, offline-capable, policy-filtered | OpenBao, Infisical, Conjur, SOPS+age, custom encrypted SQLite | +| 2 | ~~DNS auto-registration~~ | ~~Every managed resource auto-registered~~ | **DECIDED: PowerDNS + ExternalDNS** — see `dns-research.md` | +| 3 | SSH CA | CA-signed host keys, short-lived user certs | Vault SSH engine, OpenVox CA, step-ca, Teleport, Boundary | +| 4 | TLS / Internal CA | Machine certs, auto-renewal | OpenVox CA, Vault PKI, step-ca, cert-manager | +| 5 | Bare metal provisioning | Universal PXE agent + rootfs deploy (NOT native installers) | Wrap Tinkerbell vs build own 
agent — see `os-install-research.md` | +| 6 | State store | Embedded, auto-backup, auto-recover | SQLite+Litestream, bbolt, Badger | +| 7 | Container build | Puppet modules → OCI images | Buildah, Docker, Kaniko | +| 8 | Local cache encryption | Machine-specific key for secret cache | TPM 2.0, kernel keyring, LUKS-bound, secure enclave | +| 9 | Alert rendering | Generate monitoring configs from lab alerts | Prometheus rules, Naemon configs, CloudWatch | +| 10 | Input format | How users define resources and labels | YAML (Compose-like), Pkl, KCL, CUE, TypeScript | +| 11 | Auth (CLI to server) | Secure CLI-to-lab-server communication | mTLS, OIDC, Vault tokens | +| 12 | XCP-ng Pulumi provider | May need Upjet wrapper or direct API | Existing Terraform provider via Upjet, Pulumi XO provider | +| 13 | Multi-tenancy | Team scoping for labels/resources | Namespaces, RBAC, org hierarchy | +| 14 | Image production pipeline | Build rootfs tarballs per OS per arch | mkosi, debootstrap, dnf --installroot, Packer | +| 15 | Tinkerbell evaluation | Hands-on: does wrapping it work, or build our own agent? 
| HookOS + actions vs custom LinuxKit agent | +| 16 | XCP-ng rootfs extraction | How to produce deployable XCP-ng rootfs (not native installer) | Extract from ISO, capture installed system | +| 17 | VyOS rootfs extraction | How to produce deployable VyOS rootfs | VyOS build system, published images, Docker mode | +| 18 | Multi-arch PXE | Different boot chains for x86 BIOS, x86 UEFI, ARM UEFI | Per-arch agent OS builds, iPXE configs | + +## Project Files + +| File | Contents | +|------|----------| +| `lab-tool-spec.md` | Full platform specification (CLI examples, plugin interfaces, secrets, identity, bootstrap) | +| `architecture.md` | This file — decisions, dependencies, investigation queue | +| `hardware.md` | Homelab hardware inventory and node roles | +| `crossplane-evaluation.md` | Crossplane evaluation and rejection rationale | +| `config-format-research.md` | YAML alternatives research (Pkl, KCL, CUE, CDK8s, etc.) | +| `os-install-research.md` | OS install automation, rootfs production, image pipeline, deployment matrix | +| `kubernetes-flavors.md` | k3s chosen, OpenShift/Talos/MicroK8s rejected with rationale | +| `dns-research.md` | PowerDNS + ExternalDNS chosen, domain claims, health-checked DNS | diff --git a/bastion.sh b/bastion.sh new file mode 100755 index 0000000..f7e8a9a --- /dev/null +++ b/bastion.sh @@ -0,0 +1,337 @@ +#!/usr/bin/env bash +# ───────────────────────────────────────────────────────────────────── +# Lab PXE Bastion — ephemeral PXE server for bare-metal provisioning +# +# Turns this machine into a temporary PXE boot server. Target machines +# on the same network can PXE boot and get Fedora installed automatically. 
+# +# Usage: +# sudo bash bastion.sh # interactive, auto-detect everything +# sudo TARGET_HOSTNAME=puppet SSH_PUBKEY=~/.ssh/id_ed25519.pub bash bastion.sh +# +# Requirements: Fedora/RHEL host with dnsmasq, python3, curl +# ───────────────────────────────────────────────────────────────────── +set -euo pipefail + +# ──── Defaults (override via environment) ────────────────────────── +FEDORA_VERSION="${FEDORA_VERSION:-41}" +ARCH="${ARCH:-x86_64}" +HTTP_PORT="${HTTP_PORT:-8080}" +TARGET_HOSTNAME="${TARGET_HOSTNAME:-lab-node}" +TARGET_DISK="${TARGET_DISK:-}" # empty = anaconda auto-picks +SSH_PUBKEY="${SSH_PUBKEY:-}" # path to .pub file, auto-detected +TIMEZONE="${TIMEZONE:-Europe/London}" +LOCALE="${LOCALE:-en_GB.UTF-8}" +BASTION_DIR="${BASTION_DIR:-/tmp/lab-bastion}" + +# ──── Colors ─────────────────────────────────────────────────────── +RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m' +CYAN='\033[0;36m'; BOLD='\033[1m'; NC='\033[0m' + +log() { echo -e "${GREEN}[bastion]${NC} $*"; } +warn() { echo -e "${YELLOW}[bastion]${NC} $*"; } +err() { echo -e "${RED}[bastion]${NC} $*" >&2; } +die() { err "$@"; exit 1; } + +# ──── Preflight ──────────────────────────────────────────────────── +[[ $EUID -eq 0 ]] || die "Must run as root (need DHCP/TFTP ports). Use: sudo bash bastion.sh" + +command -v python3 >/dev/null || die "python3 not found" +command -v curl >/dev/null || die "curl not found" + +# Install dnsmasq if missing +if ! command -v dnsmasq >/dev/null; then + log "Installing dnsmasq..." + if command -v dnf >/dev/null; then + dnf install -y dnsmasq + elif command -v apt-get >/dev/null; then + apt-get install -y dnsmasq + else + die "Cannot install dnsmasq — install it manually" + fi +fi + +# ──── Auto-detect network ───────────────────────────────────────── +IFACE="${IFACE:-$(ip route | awk '/default/ {print $5; exit}')}" +SERVER_IP="$(ip -4 addr show "$IFACE" | awk '/inet / {split($2,a,"/"); print a[1]; exit}')" +NETWORK="$(echo "$SERVER_IP" | awk -F. 
'{print $1"."$2"."$3".0"}')" + +[[ -n "$SERVER_IP" ]] || die "Cannot detect IP on interface $IFACE" +log "Interface: ${BOLD}$IFACE${NC} IP: ${BOLD}$SERVER_IP${NC} Network: ${BOLD}$NETWORK${NC}" + +# ──── Auto-detect SSH pubkey ─────────────────────────────────────── +if [[ -z "$SSH_PUBKEY" ]]; then + # When run via sudo, check the real user's home + REAL_HOME="${HOME}" + if [[ -n "${SUDO_USER:-}" ]]; then + REAL_HOME="$(getent passwd "$SUDO_USER" | cut -d: -f6)" + fi + for keyfile in "$REAL_HOME/.ssh/id_ed25519.pub" "$REAL_HOME/.ssh/id_rsa.pub" "$REAL_HOME/.ssh/id_ecdsa.pub"; do + if [[ -f "$keyfile" ]]; then + SSH_PUBKEY="$keyfile" + break + fi + done +fi + +if [[ -n "$SSH_PUBKEY" && -f "$SSH_PUBKEY" ]]; then + SSH_KEY_CONTENT="$(cat "$SSH_PUBKEY")" + log "SSH key: ${BOLD}$SSH_PUBKEY${NC}" +else + warn "No SSH public key found. Root password will be set to 'changeme'." + warn "Set SSH_PUBKEY=/path/to/key.pub to use key-based auth instead." + SSH_KEY_CONTENT="" +fi + +# ──── Prepare directories ───────────────────────────────────────── +TFTPDIR="$BASTION_DIR/tftp" +HTTPDIR="$BASTION_DIR/http" +mkdir -p "$TFTPDIR" "$HTTPDIR" + +# ──── Cleanup handler ───────────────────────────────────────────── +DNSMASQ_PID="" +HTTP_PID="" +FW_OPENED=false + +cleanup() { + echo "" + log "Shutting down..." + [[ -n "$DNSMASQ_PID" ]] && kill "$DNSMASQ_PID" 2>/dev/null && log "Stopped dnsmasq" + [[ -n "$HTTP_PID" ]] && kill "$HTTP_PID" 2>/dev/null && log "Stopped HTTP server" + + if $FW_OPENED && command -v firewall-cmd >/dev/null; then + log "Removing firewall rules..." + firewall-cmd --quiet --remove-service=dhcp 2>/dev/null || true + firewall-cmd --quiet --remove-service=tftp 2>/dev/null || true + firewall-cmd --quiet --remove-port=${HTTP_PORT}/tcp 2>/dev/null || true + firewall-cmd --quiet --remove-port=4011/udp 2>/dev/null || true + fi + [[ "${RESTART_DNSMASQ:-false}" == true ]] && systemctl start dnsmasq 2>/dev/null && log "Restarted system dnsmasq" + + log "Done. Bastion artifacts remain in $BASTION_DIR" + log "Re-run this script to reprovision.
Remove with: rm -rf $BASTION_DIR" +} +trap cleanup EXIT INT TERM + +# ──── Download artifacts (cached) ───────────────────────────────── +download() { + local url="$1" dest="$2" label="$3" + if [[ -f "$dest" ]]; then + log " ${label} — cached" + return + fi + log " ${label} — downloading..." + curl -# -L -o "$dest" "$url" || die "Failed to download $label from $url" +} + +FEDORA_MIRROR="https://download.fedoraproject.org/pub/fedora/linux/releases/${FEDORA_VERSION}/Everything/${ARCH}/os" + +log "Fetching boot artifacts (Fedora ${FEDORA_VERSION} ${ARCH})..." +download "https://boot.ipxe.org/undionly.kpxe" "$TFTPDIR/undionly.kpxe" "iPXE BIOS" +download "https://boot.ipxe.org/ipxe.efi" "$TFTPDIR/ipxe.efi" "iPXE UEFI" +download "${FEDORA_MIRROR}/images/pxeboot/vmlinuz" "$HTTPDIR/vmlinuz" "Fedora kernel" +download "${FEDORA_MIRROR}/images/pxeboot/initrd.img" "$HTTPDIR/initrd.img" "Fedora initrd" + +# ──── Generate kickstart ────────────────────────────────────────── +log "Generating kickstart for ${BOLD}${TARGET_HOSTNAME}${NC}..." 
+ +# Disk config +if [[ -n "$TARGET_DISK" ]]; then + DISK_CMDS="ignoredisk --only-use=${TARGET_DISK} +clearpart --all --initlabel --drives=${TARGET_DISK} +autopart --type=plain" +else + DISK_CMDS="clearpart --all --initlabel +autopart --type=plain" +fi + +# Auth config +if [[ -n "$SSH_KEY_CONTENT" ]]; then + AUTH_CMDS="rootpw --lock +sshkey --username=root \"${SSH_KEY_CONTENT}\"" +else + AUTH_CMDS='rootpw --plaintext changeme' +fi + +cat > "$HTTPDIR/ks.cfg" << KICKSTART +# Lab Bastion — Fedora ${FEDORA_VERSION} kickstart +# Generated: $(date -Iseconds) +# Target: ${TARGET_HOSTNAME} + +# Install mode +text +reboot + +# Locale +lang ${LOCALE} +keyboard uk +timezone ${TIMEZONE} --utc + +# Network +network --bootproto=dhcp --activate --hostname=${TARGET_HOSTNAME} + +# Auth +${AUTH_CMDS} + +# Disk +${DISK_CMDS} + +# Bootloader +bootloader --append="console=tty0 console=ttyS0,115200n8" + +# Install source +url --mirrorlist=https://mirrors.fedoraproject.org/mirrorlist?repo=fedora-\$releasever&arch=\$basearch + +# Packages — minimal server + essentials +%packages +@core +@server-product +openssh-server +vim-enhanced +tmux +git +curl +python3 +dnf-plugins-core +%end + +# Post-install +%post --log=/root/bastion-post-install.log +#!/bin/bash +set -x + +# Ensure SSH is enabled +systemctl enable --now sshd + +# Allow root SSH with key (password auth disabled) +sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin prohibit-password/' /etc/ssh/sshd_config +sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config + +# Set hostname (write the file directly; hostnamectl is unreliable inside the install chroot) +echo "${TARGET_HOSTNAME}" > /etc/hostname + +# Leave a breadcrumb (escaped so the date is evaluated at install time, not when the kickstart is generated) +echo "Provisioned by lab-bastion on \$(date -Iseconds)" > /etc/lab-provisioned + +# Placeholder: puppet enrollment will go here later +# puppet is not installed yet — this IS the puppet server +echo "# Lab bootstrap node — puppet server setup pending" > /root/README + +%end +KICKSTART + +log "Kickstart written to ${HTTPDIR}/ks.cfg" + +# ──── Generate iPXE boot
script ─────────────────────────────────── +cat > "$HTTPDIR/boot.ipxe" << IPXE +#!ipxe + +echo +echo ======================================= +echo Lab PXE Bastion — Fedora ${FEDORA_VERSION} +echo Target: ${TARGET_HOSTNAME} +echo ======================================= +echo + +kernel http://${SERVER_IP}:${HTTP_PORT}/vmlinuz inst.ks=http://${SERVER_IP}:${HTTP_PORT}/ks.cfg inst.repo=${FEDORA_MIRROR} inst.text +initrd http://${SERVER_IP}:${HTTP_PORT}/initrd.img +boot +IPXE + +# ──── Generate dnsmasq config ───────────────────────────────────── +cat > "$BASTION_DIR/dnsmasq.conf" << DNSMASQ +# Lab PXE Bastion — dnsmasq config +# ProxyDHCP mode: adds PXE options without replacing existing DHCP + +# Disable DNS (we only want DHCP/TFTP) +port=0 + +# Listen on the right interface +interface=${IFACE} +bind-interfaces + +# ProxyDHCP — works alongside existing DHCP (UniFi etc) +dhcp-range=${NETWORK},proxy + +# TFTP for initial PXE boot +enable-tftp +tftp-root=${TFTPDIR} + +# Detect client architecture +dhcp-match=set:bios,option:client-arch,0 +dhcp-match=set:efi64,option:client-arch,7 +dhcp-match=set:efi64,option:client-arch,9 + +# Detect iPXE clients (already chainloaded) +dhcp-userclass=set:ipxe,iPXE + +# First PXE boot → serve iPXE binary via TFTP +dhcp-boot=tag:bios,tag:!ipxe,undionly.kpxe +dhcp-boot=tag:efi64,tag:!ipxe,ipxe.efi + +# iPXE clients → chain to boot script via HTTP +dhcp-boot=tag:ipxe,http://${SERVER_IP}:${HTTP_PORT}/boot.ipxe + +# Verbose logging (see what's happening) +log-dhcp +DNSMASQ + +# ──── Open firewall ─────────────────────────────────────────────── +if command -v firewall-cmd >/dev/null && firewall-cmd --state >/dev/null 2>&1; then + log "Opening firewall ports (DHCP, TFTP, HTTP:${HTTP_PORT})..." 
+ firewall-cmd --quiet --add-service=dhcp + firewall-cmd --quiet --add-service=tftp + firewall-cmd --quiet --add-port=${HTTP_PORT}/tcp + # ProxyDHCP uses port 4011 + firewall-cmd --quiet --add-port=4011/udp 2>/dev/null || true + FW_OPENED=true +fi + +# ──── Stop conflicting services ─────────────────────────────────── +# dnsmasq might be running as a system service +if systemctl is-active --quiet dnsmasq 2>/dev/null; then + warn "System dnsmasq is running — stopping it temporarily" + systemctl stop dnsmasq + RESTART_DNSMASQ=true +fi + +# ──── Start HTTP server ─────────────────────────────────────────── +log "Starting HTTP server on :${HTTP_PORT}..." +(cd "$HTTPDIR" && python3 -m http.server "$HTTP_PORT" --bind 0.0.0.0 >/dev/null 2>&1) & +HTTP_PID=$! +sleep 0.5 + +if ! kill -0 "$HTTP_PID" 2>/dev/null; then + die "HTTP server failed to start — is port ${HTTP_PORT} in use?" +fi + +# ──── Start dnsmasq (proxyDHCP + TFTP) ──────────────────────────── +log "Starting PXE server (proxyDHCP on ${IFACE})..." +echo "" +echo -e "${CYAN}${BOLD}════════════════════════════════════════════════════════${NC}" +echo -e "${CYAN}${BOLD} PXE Bastion ready!${NC}" +echo -e "${CYAN}${BOLD}════════════════════════════════════════════════════════${NC}" +echo "" +echo -e " Network: ${BOLD}${NETWORK}/24${NC} via ${BOLD}${IFACE}${NC}" +echo -e " HTTP: ${BOLD}http://${SERVER_IP}:${HTTP_PORT}/${NC}" +echo -e " OS: ${BOLD}Fedora ${FEDORA_VERSION} (${ARCH})${NC}" +echo -e " Hostname: ${BOLD}${TARGET_HOSTNAME}${NC}" +echo -e " Kickstart: ${BOLD}http://${SERVER_IP}:${HTTP_PORT}/ks.cfg${NC}" +echo "" +echo -e " ${YELLOW}Now PXE-boot the target machine.${NC}" +echo -e " ${YELLOW}Set boot order to Network/PXE in BIOS, or use one-time boot menu.${NC}" +echo "" +echo -e " Press ${BOLD}Ctrl-C${NC} to stop the bastion." 
+echo "" +echo -e "${CYAN}──── dnsmasq log (watch for DHCP/PXE requests) ────${NC}" +echo "" + +# Run dnsmasq in foreground so logs stream to terminal +dnsmasq --no-daemon --conf-file="$BASTION_DIR/dnsmasq.conf" & +DNSMASQ_PID=$! + +# Wait for dnsmasq — if it exits, something went wrong +wait "$DNSMASQ_PID" || { + err "dnsmasq exited unexpectedly. Check if another DHCP/TFTP service is running." + err "Try: ss -ulnp | grep -E ':(67|69|4011) '" + exit 1 +} diff --git a/config-format-research.md b/config-format-research.md new file mode 100644 index 0000000..66b7fb4 --- /dev/null +++ b/config-format-research.md @@ -0,0 +1,121 @@ +# Configuration Format Research + +## Decision: PENDING — exploring alternatives to raw Kubernetes YAML + +## The Problem + +Kubernetes YAML is verbose, repetitive, lacks type safety, and forces users to specify +every layer of concern (intent, team defaults, org standards, k8s boilerplate) in one file. +Helm "solves" this with Go templating, which produces unreadable template spaghetti. + +Docker Compose is the gold standard for UX — 6 lines vs 35 for the same deployment. +The problem was never YAML itself; it was being forced to write too much of it. + +## Core Design Principle + +Users should only define what they care about. Everything else should be inherited from +expert-defined defaults. YAML (or JSON) can exist underneath as: +- Easy, non-binary backup format +- Live editing capability +- Debugging / inspection output + +## Layered Architecture + +``` +Layer 1: User intent "I want an api service running myapp" ← USER WRITES THIS +Layer 2: Team defaults "Our services get health checks, limits" ← Team lead defines +Layer 3: Org standards "All pods need security context, labels" ← Platform team defines +Layer 4: Output Full YAML/JSON for kubectl, backup, debug ← GENERATED +``` + +Docker Compose feels good because it's only Layer 1 — Docker handles the rest. +Kubernetes forces all 4 layers into one file. 
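The layering described above can be sketched as a deep merge in which the most specific layer wins. This is only an illustration under stated assumptions: the key names (`security_context`, `health_check`, `resources`) and the `merge_layers` helper are hypothetical, and whether Layer 1 may override org standards is an open design choice the research notes do not settle.

```python
from copy import deepcopy


def _merge_into(base, override):
    """Recursively copy `override` into `base`; nested dicts are merged key by key."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _merge_into(base[key], value)
        else:
            # Copy so the input layers are never mutated by later merges.
            base[key] = deepcopy(value)


def merge_layers(*layers):
    """Deep-merge layers left to right; later (more specific) layers override earlier ones."""
    result = {}
    for layer in layers:
        _merge_into(result, layer)
    return result


# Layer 3: org standards (hypothetical keys, for illustration only)
org_standards = {
    "security_context": {"run_as_non_root": True},
    "labels": {"managed-by": "lab"},
}
# Layer 2: team defaults
team_defaults = {
    "health_check": {"path": "/healthz", "port": 8080},
    "resources": {"cpu": "250m", "memory": "256Mi"},
}
# Layer 1: the only part the user writes
user_intent = {
    "name": "api",
    "image": "myapp:latest",
    "resources": {"memory": "512Mi"},
}

# Layer 4: the generated output that would be rendered to YAML/JSON
rendered = merge_layers(org_standards, team_defaults, user_intent)
print(rendered)
```

The user supplies three keys and still ends up with a security context, labels, a health check, and a CPU limit. A real implementation would likely also need a way to mark some org-level keys as non-overridable.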
+ +## Evaluated Alternatives + +### Tier 1 — Strong Contenders + +**Pkl (Apple)** +- Best syntax for "amend a template" via `amends` keyword +- Strong static typing, clean readable syntax +- Lowest ceremony for simple cases +- Risk: Apple may abandon it; implementation is JVM-based (native executables are published, so a JVM is not strictly required) +- K8s support: `pkl-k8s` package exists + +**KCL (CNCF Sandbox)** +- Python-like syntax, lowest learning curve of typed options +- Schema defaults, validation, constraints built in +- CNCF backing gives legitimacy +- Risk: primarily driven by Ant Group (Alibaba) + +**CUE** +- Most principled — constraint-based unification, not inheritance +- Used by Timoni (Helm replacement) and KubeVela; Dagger started on CUE but has since moved away from it +- Defaults marked with `*`, types and values on same spectrum +- Risk: steep learning curve, novel paradigm +- Most mature K8s ecosystem of the three + +### Tier 2 — Viable But Weaker Fit + +**CDK8s+ (TypeScript)** +- Full IDE support, strongest type safety +- cdk8s+ has intent-driven APIs ("I want a web service" → generates Deployment+Service) +- Risk: brings software engineering complexity into config, AWS-centric +- Good if team is TypeScript-native + +**Jsonnet (via Tanka)** +- Proven at scale (Grafana uses it across hundreds of services) +- Object mixins via `+` operator for composition +- Risk: weak type safety, no compile-time validation of field names + +### Tier 3 — Not Recommended + +- **Dhall** — strongest type safety but Haskell-like syntax, small/stale community +- **Nickel** — elegant contracts system but tiny K8s ecosystem +- **Starlark** — no type safety, no schema system, just a scripting layer +- **HCL** — great for infra provisioning, wrong fit for k8s manifests + +### Dead Projects +- **Winglang** — shut down April 2025 +- **Klotho** — archived, pivoted to InfraCopilot +- **Acorn** — pivoted to AI agents (Obot) + +## Compose-Like Input Format (Preferred Direction) + +The user prefers Docker Compose brevity.
The tool we build could use a Compose-inspired +input format at Layer 1, generating full k8s manifests + provider-specific resources underneath: + +```yaml +# What the user writes +services: + api: + image: myapp:latest + size: medium + ports: [8080] + env: + DB_HOST: postgres + +# System generates: full k8s Deployment, Service, NetworkPolicy, +# resource limits, security context, health checks, etc. +``` + +YAML is fine for Layer 1 if it's short enough. The problem was never the format — +it was the verbosity. Compose proves short YAML works. + +## Open Questions + +1. Should Layer 1 input be YAML (Compose-like), or a typed language (Pkl/KCL/CUE)? +2. How do team defaults (Layer 2) and org standards (Layer 3) get defined and distributed? +3. Should the render view show the generated YAML diff when changing Layer 1 input? +4. How does this integrate with the Pulumi multi-cloud abstraction layer? +5. Could the input format support both k8s workloads AND infrastructure resources + (VMs, networks, storage) in the same spec? + +## GUI/TUI Space — Underserved Opportunity + +No tool has achieved significant adoption for visually *defining* infrastructure. +Existing tools (K9s, Lens, Rancher) are for monitoring/management, not authoring. + +The ideal: platform engineers define schemas with constraints/defaults, +developers interact with a form/wizard showing only fields they need, +validated config generated underneath. Nobody has built this well yet. diff --git a/crossplane-evaluation.md b/crossplane-evaluation.md new file mode 100644 index 0000000..5b033d7 --- /dev/null +++ b/crossplane-evaluation.md @@ -0,0 +1,106 @@ +# Crossplane Evaluation + +## Decision: NOT ADOPTING + +Crossplane will not be used in this stack. The lack of a plan/preview mechanism is a dealbreaker +for enterprise adoption and safe infrastructure management. 
+ +--- + +## Why We Evaluated It + +The core problem: Terraform/OpenTofu requires re-implementing the same infrastructure concepts +per platform (AWS, XCP-ng, bare metal). At thousands of nodes across multiple platforms, this is +a massive maintenance burden. Crossplane's XRD/Composition model promised a unified API: + +``` +XRD: "VirtualMachine" (universal API) + ├── Composition: AWS → EC2 instance + ├── Composition: XCP-ng → XO VM + └── Composition: bare metal → MAAS / Ansible +``` + +One API, multiple backends — teams request a "VirtualMachine" and the right composition handles it. + +## Strengths + +- **CNCF Graduated** (Nov 2025, v2.2) — Apache 2.0 license, top-tier maturity +- **Continuous drift detection** — automatically reverts manual changes, unlike Terraform's on-demand plan/apply +- **No state file management** — no remote backends, locking issues, or state corruption +- **Kubernetes-native** — works with ArgoCD, Flux, kubectl, RBAC out of the box +- **XRDs/Compositions** — genuine multi-platform abstraction layer, solves the "re-implement per cloud" problem +- **Eventual consistency** — resources with complex dependencies don't get stuck like Terraform's dependency graph +- **Enterprise adoption** — Deutsche Kreditbank, Elastic, Nike, Apple, NASA, Grafana Labs, 60+ orgs +- **Deutsche Kreditbank** replaced Terraform; deployments went from weeks to under one hour + +## Dealbreaker: No Plan/Preview + +The single biggest issue. Terraform's `terraform plan` lets operators see exactly what will change +before applying. Crossplane applies changes immediately upon resource creation/modification. 
+ +- Discussed in the community for 2+ years with no resolution +- A Kubernetes-native solution would be a `Plan` CRD that shows proposed changes before approval +- ArgoCD `sync --dry-run` is a partial workaround but only shows k8s resource diffs, not what the + cloud provider will actually do underneath +- **For regulated environments and SRE teams at scale, change preview is non-negotiable** + +Possible reasons it hasn't been implemented: +- The continuous reconciliation architecture may make point-in-time snapshots fundamentally hard +- Upbound (commercial entity) may be reserving it for their paid platform +- Or simply not prioritised + +## Other Significant Concerns + +### CRD Bloat +- `provider-aws` installs 900+ CRDs — can make API server unresponsive for up to an hour (GitHub #2649) +- Exceeds Kubernetes' recommended ~500 CRD limit +- Mitigated by "Provider Families" (install per-service sub-providers) but requires careful planning + +### Debugging Difficulty +- Errors propagate through layers: Claim → XR → Composition → Managed Resource → Provider → Cloud API +- Multiple sources report debugging compositions is painful +- Pipeline Inspector (alpha in v2.2) is being introduced but not production-ready + +### Chicken-and-Egg Problem +- Crossplane runs inside Kubernetes — cannot provision the cluster it runs on +- Requires a "management cluster" bootstrapped by other means (Terraform, Puppet, etc.) 
+- If the management cluster dies, no drift detection or reconciliation runs +- Recovery: applying YAMLs to a new cluster works if deterministic resource names are used, + otherwise risks creating duplicate cloud resources + +### Cluster Loss / Immutability Concerns +- State lives in etcd, not a versionable state file +- No independent audit trail or easy way to diff historical states +- On new cluster: resources with explicit external names get adopted; auto-named resources get duplicated +- Need etcd backups as insurance, and deterministic naming everywhere + +### Performance at Scale +- ~2000 composites took 6+ minutes to reconcile on k3d (GitHub #2256) +- Reconciliation interval not easily configurable globally (GitHub #5934) + +### YAML Limitations +- No native loops, conditionals, or programming constructs +- Complex compositions require changes in multiple locations + +## XCP-ng Provider Gap + +- No Crossplane provider for XCP-ng exists today +- A mature Terraform provider (`terraform-provider-xenorchestra`) exists, maintained by Vates +- Could be wrapped via Upjet to auto-generate a Crossplane provider — but nobody has done it +- Would be a greenfield open-source project + +## Real Issues Reported + +- API server unresponsiveness with too many CRDs (GitHub #2649) +- CRD scaling issues beyond ~500 CRDs (GitHub #2895) +- GCP SQL resources randomly marked for deletion — dangerous for production databases +- Reconciliation rate limiting at scale (GitHub #2256) + +## Conclusion + +Crossplane solves a real problem (multi-platform abstraction) that we need, but the lack of +plan/preview makes it unsuitable for enterprise-scale production infrastructure management. +The operational concerns (CRD bloat, debugging, cluster dependency) add further risk. + +We need to find an alternative approach to the multi-platform abstraction problem that Crossplane +solves, while retaining plan/preview capabilities. 
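To make the requirement concrete: a plan/preview step computes a change set from desired and observed state without applying anything. The Go sketch below is only an illustration under stated assumptions. `Plan`, `Change`, and the flat name-to-spec maps are hypothetical and are not taken from Crossplane, Terraform, or Lab.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is the kind of change a plan entry proposes (hypothetical type).
type Action string

const (
	Create Action = "create"
	Update Action = "update"
	Delete Action = "delete"
)

// Change is one proposed change. Computing it has no side effects.
type Change struct {
	Name   string
	Action Action
}

// Plan compares desired and actual state, both keyed by resource name with an
// opaque spec string as value, and returns the changes sorted by name.
func Plan(desired, actual map[string]string) []Change {
	var changes []Change
	for name, want := range desired {
		have, exists := actual[name]
		switch {
		case !exists:
			changes = append(changes, Change{Name: name, Action: Create})
		case have != want:
			changes = append(changes, Change{Name: name, Action: Update})
		}
	}
	for name := range actual {
		if _, wanted := desired[name]; !wanted {
			changes = append(changes, Change{Name: name, Action: Delete})
		}
	}
	sort.Slice(changes, func(i, j int) bool { return changes[i].Name < changes[j].Name })
	return changes
}

func main() {
	desired := map[string]string{"vm-a": "medium", "vm-b": "large"}
	actual := map[string]string{"vm-b": "medium", "vm-c": "small"}
	for _, c := range Plan(desired, actual) {
		fmt.Printf("%-6s %s\n", c.Action, c.Name)
	}
}
```

An operator can review the returned list before anything is applied. That review step is what the evaluation above found missing in Crossplane.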
diff --git a/dns-research.md b/dns-research.md new file mode 100644 index 0000000..5c07ac0 --- /dev/null +++ b/dns-research.md @@ -0,0 +1,143 @@ +# DNS Solution Research + +## Decision: PowerDNS Authoritative + ExternalDNS + +### Why PowerDNS + +| Feature | PowerDNS | CoreDNS | BIND9 | Technitium | +|---------|----------|---------|-------|------------| +| REST API | Full | No (needs etcd) | No (nsupdate) | Yes | +| Database backend | PostgreSQL/MySQL/SQLite | etcd | Zone files | Custom | +| Health-aware DNS | Lua records (ifportup, ifurlup) | No | No | No | +| ExternalDNS provider | Yes | Yes (via etcd) | Yes (RFC 2136) | No | +| DNSSEC | Yes | Limited | Best | Yes | +| Split DNS | dnsdist routing | Corefile blocks | Views (best) | APP records | +| Maturity | ISP-grade | K8s-focused | Oldest | Newer | + +PowerDNS wins on: REST API (critical for Lab), health-check-aware Lua records, +database backend for HA, and ExternalDNS integration. + +### Architecture + +``` + Lab Server + (control plane) + │ + │ PowerDNS REST API + ▼ + ┌───────────────┐ + │ PowerDNS │ + │ Authoritative│──── PostgreSQL/SQLite backend + │ Server │ + └───────┬───────┘ + │ + ┌───────────┼───────────┐ + │ │ │ + ▼ ▼ ▼ + Internal DNS ExternalDNS dnsdist + .lab.internal (k8s syncs (split DNS + Services/ routing) + Ingress) +``` + +### How Lab Uses DNS + +#### Auto-registration on onboard +When `lab onboard` completes, Lab calls the PowerDNS API: +- A record: `<name>.lab.internal → <ip>` +- PTR record: `<reversed-ip>.in-addr.arpa → <name>.lab.internal` +- Both created/updated atomically + +#### Domain claims via labels +Labels can claim shared domain names: +```yaml +labels: + mailserver: + dns: + records: + - type: A + name: "{{server.name}}.lab.internal" + claims: + - name: mail.example.com + type: A + health_check: { port: 25 } +``` +All servers with label `mailserver` contribute to `mail.example.com` round-robin. +PowerDNS Lua records remove unhealthy servers automatically.
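One way a claim with `health_check: { port: 25 }` could be rendered is as a PowerDNS `LUA` record that uses the `ifportup` function named in the comparison table. The Go sketch below only builds the record content string. The helper name `IfPortUpRecord` and the documentation-range IPs are assumptions for illustration, and how Lab would submit the record to the PowerDNS API is left out.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// IfPortUpRecord returns the content of a PowerDNS LUA record of type A that
// answers only with the addresses currently accepting connections on port.
// Addresses are validated first so nothing unexpected ends up inside the Lua snippet.
func IfPortUpRecord(port int, ips []string) (string, error) {
	if port < 1 || port > 65535 {
		return "", fmt.Errorf("invalid port %d", port)
	}
	if len(ips) == 0 {
		return "", fmt.Errorf("no addresses given")
	}
	quoted := make([]string, len(ips))
	for i, ip := range ips {
		parsed := net.ParseIP(ip)
		if parsed == nil || parsed.To4() == nil {
			return "", fmt.Errorf("not an IPv4 address: %q", ip)
		}
		quoted[i] = "'" + ip + "'"
	}
	return fmt.Sprintf("A \"ifportup(%d, {%s})\"", port, strings.Join(quoted, ", ")), nil
}

func main() {
	// Hypothetical members of the mailserver label.
	content, err := IfPortUpRecord(25, []string{"192.0.2.10", "192.0.2.11"})
	if err != nil {
		panic(err)
	}
	fmt.Println("mail.example.com. 60 IN LUA", content)
}
```

When a server gains or loses the label, Lab would regenerate the record from the current member list. PowerDNS then evaluates the health check for the listed addresses at query time.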
+ +#### IP mobility +Lab agent on machine reports IP change → Lab server updates PowerDNS API → +A record, PTR, and all claimed domains updated. + +#### K8s integration +ExternalDNS runs in k8s, syncs Service/Ingress records to same PowerDNS instance. +Same DNS server serves both bare metal and k8s records. + +#### Groups claiming domains +Groups can claim domains for all member servers: +```yaml +groups: + production-web: + match: + labels: [web-frontend] + environment: prod + dns: + claims: + - name: www.example.com + type: A + health_check: { url: "https://{{server.ip}}/healthz" } +``` + +### DNS Plugin Interface + +```go +type DNSPlugin interface { + Name() string + + // Record management + CreateRecord(zone, name, recordType string, targets []string, ttl int) error + UpdateRecord(zone, name, recordType string, targets []string, ttl int) error + DeleteRecord(zone, name, recordType string) error + ListRecords(zone string) ([]Record, error) + + // Health-checked records + CreateHealthCheckedRecord(zone, name string, targets []string, check HealthCheck) error + + // Zone management + CreateZone(name string, kind string) error + DeleteZone(name string) error +} +``` + +Built-in: +- `dns-powerdns` — PowerDNS REST API (primary) +- `dns-route53` — AWS Route53 (for cloud deployments) +- `dns-rfc2136` — RFC 2136 dynamic updates (BIND/Knot fallback) + +### Split DNS Setup + +Internal zones (`.lab.internal`) served by PowerDNS authoritatively. +External queries forwarded upstream (8.8.8.8, ISP DNS). 
+ +Options: +- **dnsdist** (PowerDNS ecosystem) routes by source subnet +- **CoreDNS as resolver** — serves internal from PowerDNS, forwards external +- **BIND views** — if we need view-based split on same zone (unlikely) + +### Evaluated and Not Chosen + +| Tool | Why Not | +|------|---------| +| CoreDNS | No REST API, needs etcd intermediary, k8s-focused | +| BIND9 | No REST API, nsupdate is cumbersome for automation | +| Technitium | No ExternalDNS provider, newer/smaller community | +| dnsmasq | Not suitable — caching forwarder, no API, ~1000 client limit | +| Knot DNS | No REST API, better as secondary/downstream | + +### DNS-as-Code (Optional Layer) + +For static DNS infrastructure (SOA, NS, MX, base zone config): +- **octoDNS** (GitHub) or **DNSControl** (Stack Exchange) +- GitOps workflow: PR → review → merge → sync to PowerDNS +- Dynamic records (server A records, claims) managed by Lab directly via API +- Static records managed via DNS-as-code in Git diff --git a/hardware.md b/hardware.md new file mode 100644 index 0000000..8c7fcab --- /dev/null +++ b/hardware.md @@ -0,0 +1,37 @@ +# Homelab Hardware Inventory + +## Compute Nodes + +| Node | CPU Arch | RAM | Role | Cost | +|------|----------|-----|------|------| +| Beelink SER9 MAX | x86_64 | 64GB | k3s worker, ROCm GPU, Longhorn storage | ~£869 | +| Beelink SER9 Pro | x86_64 | 32GB | Bootstrap: Puppet, DNS, UniFi, Vault, Naemon | ~£300 | +| Minisforum MS-R1 | ARM (aarch64) | 64GB | k3s node | ~£500-640 | +| Nvidia DGX Spark | ARM (Grace) | 128GB | CUDA/AI inference | ~£3,700 | +| Mac Studio M1 Max | ARM (aarch64) | 32GB | k3s server #1 (etcd) | ~£775 | + +## Networking + +| Device | Specs | Cost | +|--------|-------|------| +| USW-Flex-XG x2 | 8x 10GbE ports total (4 per switch) | £458 | + +## Summary + +- **Total RAM:** 320GB +- **Architectures:** x86_64, aarch64 (Apple Silicon + ARM + Grace) +- **GPU compute:** ROCm (SER9 MAX), CUDA (DGX Spark) +- **Estimated total:** ~£6,600-6,740 + +## Node Roles + 
+### Bootstrap Node (Beelink SER9 Pro) — Outside k3s +- Puppet (bare metal config management) +- DNS (CoreDNS or PowerDNS) +- UniFi controller +- Vault (secrets management) +- Naemon (bare metal, network, black-box endpoint monitoring) + +### k3s Cluster +- **Server (control plane + etcd):** Mac Studio M1 Max +- **Workers:** Beelink SER9 MAX, Minisforum MS-R1, DGX Spark diff --git a/kubernetes-flavors.md b/kubernetes-flavors.md new file mode 100644 index 0000000..b81e2c4 --- /dev/null +++ b/kubernetes-flavors.md @@ -0,0 +1,120 @@ +# Kubernetes Flavor Decision + +## Decision: k3s (confirmed) + +k3s is the best fit for Lab. OpenShift and most other flavors conflict with +the puppet-managed, multi-arch, lightweight approach. + +## Evaluation + +| Flavor | Puppet-Friendly | ARM | Multi-arch | Enterprise | License | Verdict | +|--------|:-:|:-:|:-:|:-:|---------|---------| +| **k3s** | ✓ binary + config files | ✓ | ✓ | Rancher/SUSE | Apache 2.0 | **CHOSEN** | +| **k0s** | ✓ single binary, config-driven | ✓ | ✓ | Mirantis | Apache 2.0 | Good alternative | +| **kubeadm** | ✓ well-understood bootstrap | ✓ | ✓ | Upstream K8s | Apache 2.0 | Viable but heavier | +| **RKE2** | ✓ config files | ✓ | ✓ | Rancher/SUSE | Apache 2.0 | Heavier k3s | +| **OpenShift** | ✗ operator-driven, fights puppet | ✗ limited | ✗ limited | Red Hat | Proprietary | REJECTED | +| **MicroK8s** | ⚠ snap-based, puppet+snaps awkward | ✓ | ✓ | Canonical | Apache 2.0 | Not great | +| **Talos** | ✗ immutable OS, no SSH, no puppet | ✓ | ✓ | Sidero Labs | MPL 2.0 | Incompatible | + +## Why NOT OpenShift — Deep Analysis + +### OpenShift Does Overlap With Lab + +OpenShift is the closest existing thing to what Lab does. 
The overlap is real: + +| Capability | OpenShift | Lab | +|-----------|-----------|-----| +| Manages nodes end-to-end | Yes (RHCOS + MCO) | Yes (OpenVox + labels) | +| Immutable infrastructure | Yes (rpm-ostree, operator-driven) | Yes (puppet convergence) | +| Fights config drift | Yes (operators reconcile) | Yes (puppet + sync pillar) | +| Built-in monitoring | Yes (Prometheus + Alertmanager bundled) | Yes (health aggregator) | +| Built-in secrets | Yes (etcd-encrypted secrets) | Yes (secret store + local cache) | +| Certificate management | Yes (internal CA, auto-rotation) | Yes (identity layer) | +| Node lifecycle | Yes (MachineSet, MachinePools) | Yes (onboard, labels, providers) | +| Self-managing | Yes (operators update themselves) | Yes (lab manages itself) | + +### Why OpenShift Still Doesn't Fit + +**1. Single OS** — OpenShift control plane = RHCOS only. Can't run on Apple Silicon, + Asahi Linux, or any non-RHCOS system. Lab needs Ubuntu, Debian, Fedora, AlmaLinux, + XCP-ng, VyOS across x86 and ARM. + +**2. K8s only** — OpenShift manages k8s nodes. Lab manages everything: k8s nodes, + standalone VMs, bare metal hypervisors, network appliances, physical servers that + will never run k8s. Not everything is a container. + +**3. Single cluster scope** — OpenShift manages one cluster. Lab manages homelab k3s + + enterprise AWS EKS + XCP-ng hypervisors + bare metal + OVH vRack. Cross-provider, + cross-cluster. + +**4. Fights puppet** — OpenShift has ~30+ operators that each own a piece of the system. + If puppet changes kubelet config, the Machine Config Operator detects "drift" and + reverts it. Two reconciliation loops fighting each other, possibly rebooting nodes + in a loop. You're supposed to change everything via CRDs, not external tools. + +**5. No XCP-ng/hypervisor management** — Can't provision VMs on XCP-ng, manage Xen + hosts, or understand hypervisors that aren't VMware/OpenStack. + +**6. 
Throws away puppet modules** — Company has existing puppet modules. OpenShift's + model is operators, not puppet. Complete rewrite of config management. + +**7. Heavyweight** — Minimum 6 nodes, 88GB RAM just for the platform. k3s uses 512MB. + Our entire homelab is 5 nodes, 320GB RAM. + +**8. ARM limited** — RHCOS on Apple Silicon doesn't exist. ARM support is limited to + AWS Graviton and some server ARM platforms. + +### The Scope Difference + +``` +OpenShift: "I am your platform. Everything runs in me. I control the OS." + Scope: Kubernetes cluster + its nodes + +Lab: "I manage your infrastructure. K8s is one thing I deploy." + Scope: Everything — VMs, bare metal, hypervisors, k8s, + network gear, containers, across any provider +``` + +Lab is closer to what OpenShift + Satellite + RHCOS + ACM (Advanced Cluster Management) +do **together** — but unified, lighter, open source, and not locked to Red Hat's ecosystem. + +## Why k3s + +- **Puppet-friendly** — it's just a binary and config files in `/etc/rancher/k3s/` +- **Ultra-light** — runs on Mac Studio, ARM boxes, small VMs +- **Multi-arch** — native x86 and ARM +- **Same K8s API** as EKS/GKE — portable to cloud +- **Single binary** — trivial to manage with puppet +- **Proven** — CNCF certified, widely used in edge/IoT/homelab + +## k3s via Puppet (OpenVox) + +```puppet +# Label: k8s-server → puppet class +class kubernetes::server { + class { 'k3s::server': + token => lab::secret('k8s/cluster-token'), + cluster_init => true, + tls_san => [$facts['fqdn'], 'k8s.lab.internal'], + } +} + +# Label: k8s-worker → puppet class +class kubernetes::worker { + class { 'k3s::worker': + server_url => 'https://k8s.lab.internal:6443', + token => lab::secret('k8s/cluster-token'), + } +} +``` + +Same puppet classes work on bare metal, XCP-ng VM, EC2 instance, any architecture. 
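The label-to-class mapping noted in the comments above can be pictured as a simple lookup that ignores where the node runs. This Go sketch rests on assumptions: `labelClasses` and `ClassesFor` are hypothetical names, and the real Label Engine may resolve classes differently.

```go
package main

import "fmt"

// labelClasses maps a label to the puppet classes it implies.
// The two entries mirror the Puppet snippet above.
var labelClasses = map[string][]string{
	"k8s-server": {"kubernetes::server"},
	"k8s-worker": {"kubernetes::worker"},
}

// ClassesFor returns the puppet classes for a set of labels, in label order,
// without duplicates. Labels with no mapping contribute nothing.
func ClassesFor(labels []string) []string {
	seen := map[string]bool{}
	var classes []string
	for _, label := range labels {
		for _, class := range labelClasses[label] {
			if !seen[class] {
				seen[class] = true
				classes = append(classes, class)
			}
		}
	}
	return classes
}

func main() {
	// The provider is not an input: a bare-metal node and an EC2 instance
	// carrying the same label resolve to the same classes.
	fmt.Println(ClassesFor([]string{"k8s-worker", "prod"}))
	fmt.Println(ClassesFor([]string{"k8s-server"}))
}
```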
+ +## k0s as Backup Option + +If k3s ever becomes problematic, k0s is the closest alternative: +- Also single binary, config-driven, multi-arch +- `k0sctl` adds cluster management (bootstrap, upgrade, reset) +- Mirantis backing (Lens, Docker EE) +- Worth monitoring but no reason to switch from k3s today diff --git a/lab-tool-spec.md b/lab-tool-spec.md new file mode 100644 index 0000000..cd4c3db --- /dev/null +++ b/lab-tool-spec.md @@ -0,0 +1,1537 @@ +# Lab — Unified Infrastructure Lifecycle Platform + +## What It Is + +A tool that abstracts infrastructure lifecycle across clouds, hypervisors, bare metal, +and Kubernetes — using labels as the universal abstraction and existing tools under the hood. + +**Not reinventing the wheel.** Uses Pulumi, OpenVox, Tinkerbell, Prometheus, Naemon, +existing Puppet modules, cloud APIs — but provides a unified interface over all of them. + +## Architecture + +``` +┌────────────────────────────────────────────────────────────┐ +│ lab-server (control plane) │ +│ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │ +│ │ Provider │ │ Label │ │ Lifecycle│ │ Artifact │ │ +│ │ Registry │ │ Engine │ │ Manager │ │ Builder │ │ +│ └──────────┘ └──────────┘ └──────────┘ └────────────┘ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │ +│ │ OpenVox │ │ Health │ │ K8s │ │ Render │ │ +│ │ Enroller │ │ Aggregator│ │ Deployer │ │ Engine │ │ +│ └──────────┘ └──────────┘ └──────────┘ └────────────┘ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │ +│ │ Identity │ │ DNS │ │ Secret │ │ Token │ │ +│ │ Manager │ │ Manager │ │ Manager │ │ Issuer │ │ +│ └──────────┘ └──────────┘ └──────────┘ └────────────┘ │ +│ │ +│ API (gRPC + REST) │ +└──────────────┬─────────────────────────────────────────────┘ + │ + ┌──────────┴──────────┐ + │ │ +┌───┴───┐ ┌────┴────┐ +│ lab │ │ lab-tui │ +│ (CLI) │ │ (k9s) │ +└───────┘ └─────────┘ +``` + +### Control Plane (lab-server) + +Runs as a service (on bootstrap node, or in k8s).
Hosts: + +- **Provider Registry** — pluggable providers (AWS, XCP-ng, bare metal, GCP, etc.) +- **Label Engine** — resolves labels → puppet classes, sizes, ports, config +- **Lifecycle Manager** — orchestrates provision → enroll → configure → observe +- **Artifact Builder** — puppet classes → container images +- **OpenVox Enroller** — secure cert signing, node classification, environment assignment +- **Health Aggregator** — queries Prometheus, Naemon, cloud health APIs +- **K8s Deployer** — manages workloads on k3s/EKS clusters +- **Render Engine** — side-by-side provider comparison, cost estimates, drift detection +- **Identity Manager** — tracks enrollment state, certs, Vault auth, SSH keys per resource +- **DNS Manager** — auto-registers/updates DNS for every managed resource +- **Secret Manager** — controls which resources can access which secrets (per-label policies) +- **Token Issuer** — generates one-time join tokens at provision time (no hardcoded secrets) + +### CLI (lab) + +kubectl-like interface for browsing and managing resources: + +``` +$ lab get servers +NAME PROVIDER LABELS SIZE SYNC PUPPET HEALTH IDENTITY +api-1 aws app,prod,eu-west medium ✓ sync ✓ ok ✓ ok ✓ enrolled +api-2 aws app,prod,eu-west medium ✓ sync ✓ ok ✓ ok ✓ enrolled +mail-1 xcpng mailserver,prod medium ✓ sync ✓ ok ✓ ok ✓ enrolled +db-1 baremetal postgres,prod large ⚠ drift ✓ ok ✓ ok ✓ enrolled +worker-3 aws k8s-worker,staging large ✓ sync ✗ failed ⚠ 2 alrt ✓ enrolled +gateway-1 baremetal k8s-server,prod small ✓ sync ✓ ok ✓ ok ⚠ cert exp + +$ lab get servers --label mailserver +NAME PROVIDER SIZE SYNC PUPPET HEALTH IDENTITY +mail-1 xcpng medium ✓ sync ✓ ok ✓ ok ✓ enrolled +mail-2 aws medium ✓ sync ✓ ok ✓ ok ✓ enrolled + +$ lab describe server db-1 +Name: db-1 +Provider: baremetal +Labels: [postgres, prod, eu-west] +Size: large (8 cores, 32GB, 500GB NVMe) +Status: DRIFT DETECTED + Expected: size=large, disk=500GB + Actual: size=large, disk=500GB, extra_mount=/data (unmanaged) +Puppet: +
Environment: production + Role: postgres + Classes: [postgresql::server, backup::pgbackrest, node_exporter] + Last run: 2026-03-15 14:22:03 (success) + Next run: 2026-03-15 14:52:03 +Health: + Prometheus: ✓ all targets up + Naemon: ✓ all checks passing + Alerts: none active + +$ lab get labels +LABEL PUPPET CLASSES SERVERS CONTAINERS +mailserver postfix, dovecot, spamassassin 2 1 +k8s-worker kubernetes::worker, containerd 12 0 +postgres postgresql::server, pgbackrest 3 1 +app nginx, app::deploy 4 2 + +$ lab get containers +NAME IMAGE LABEL K8S CLUSTER STATUS +mailserver ghcr.io/org/mailserver:2026.03.15 mailserver homelab running +postgres ghcr.io/org/postgres:2026.03.14 postgres homelab running +app ghcr.io/org/app:2026.03.15 app production running + +$ lab diff server db-1 + size: large + disk: 500GB ++ extra_mount: /data ← unmanaged, not in spec + +$ lab sync server db-1 # reconcile drift +$ lab plan server new-mail-3 --label mailserver --provider aws # preview +$ lab apply server new-mail-3 # create it + +$ lab build --label mailserver # puppet modules → container image +Building mailserver from puppet classes: + ✓ postfix + ✓ dovecot + ✓ spamassassin + ✓ fail2ban +→ ghcr.io/org/mailserver:2026.03.15 + +$ lab render --label mailserver --all-providers +┌──────────────┬──────────────┬──────────┬────────────┐ +│ │ AWS │ XCP-ng │ Bare Metal │ +├──────────────┼──────────────┼──────────┼────────────┤ +│ Compute │ t3.large │ 4c/8GB │ IPMI boot │ +│ Puppet │ postfix,... │ postfix,.│ postfix,...│ +│ Est. Cost │ ~$62/mo │ — │ — │ +└──────────────┴──────────────┴──────────┴────────────┘ +``` + +### TUI (lab-tui) + +k9s-style interactive terminal UI: +- Real-time server list with sync/puppet/health status +- Drill into any server for details +- Watch puppet runs live +- Filter by labels, providers, health status +- Trigger actions (sync, plan, apply, build) + +## Core Concepts + +### Labels — The Universal Abstraction + +Everything is a thing with labels. 
Configuration attaches to labels, not machines. + +```yaml +labels: + mailserver: + puppet_classes: + - postfix + - dovecot + - spamassassin + - fail2ban + ports: [25, 587, 993] + size: medium + alerts: + - smtp_connect # auto-generated: is SMTP responding? + - imap_connect # auto-generated: is IMAP responding? + - mail_queue_length # auto-generated: is mail queue healthy? + secrets: + - mail/tls-cert + - mail/dkim-key + + k8s-worker: + puppet_classes: + - kubernetes::worker + - containerd + - node_exporter + size: large + alerts: + - kubelet_healthy + - node_ready + secrets: + - k8s/join-token +``` + +### Groups — Nested Targeting with Exclusions + +Groups compose labels, other groups, and individual servers into reusable targets. +Groups can nest (subgroups). Exclusions allow fine-grained control. + +```yaml +groups: + # Simple group: all production servers + production: + match: + environment: prod + + # Group by label combination + production-mail: + match: + labels: [mailserver] + environment: prod + + # Nested group with subgroups + eu-infrastructure: + groups: + - eu-west-compute + - eu-west-storage + - eu-west-network + exclude: + servers: [test-box-1] # exclude specific server + labels: [experimental] # exclude servers with this label + + eu-west-compute: + match: + labels: [k8s-worker, k8s-server] + region: eu-west + exclude: + servers: [legacy-node-3] + + # Group targeting everything except a subgroup + all-except-staging: + match: + environment: [prod, dev] + exclude: + environment: staging + + # Custom group by explicit membership + database-tier: + servers: [db-1, db-2, db-3] + groups: [replica-set-eu] +``` + +### Alerts — Auto-Generated and User-Defined + +Alerts attach to labels, groups, servers, or environments — same targeting as everything else. 
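The shared targeting model (labels, explicit servers, environment, plus exclusions) might be evaluated roughly as follows. This is a sketch under assumptions, since the spec does not define the semantics: `labels` is treated as any-of, the other criteria are ANDed, exclusions always win, and the `Target` and `Server` types are hypothetical.

```go
package main

import "fmt"

// Server is a reduced stand-in for a managed resource.
type Server struct {
	Name        string
	Labels      []string
	Environment string
}

// Target mirrors the YAML target/exclude blocks in a flattened form.
type Target struct {
	Labels         []string // any-of (assumption)
	Servers        []string
	Environment    string
	ExcludeServers []string
	ExcludeLabels  []string
}

func contains(list []string, s string) bool {
	for _, item := range list {
		if item == s {
			return true
		}
	}
	return false
}

func overlaps(a, b []string) bool {
	for _, item := range a {
		if contains(b, item) {
			return true
		}
	}
	return false
}

// Matches reports whether the target selects the server. Exclusions are
// checked first. An empty criterion places no restriction.
func (t Target) Matches(s Server) bool {
	if contains(t.ExcludeServers, s.Name) || overlaps(t.ExcludeLabels, s.Labels) {
		return false
	}
	if len(t.Labels) > 0 && !overlaps(t.Labels, s.Labels) {
		return false
	}
	if len(t.Servers) > 0 && !contains(t.Servers, s.Name) {
		return false
	}
	if t.Environment != "" && t.Environment != s.Environment {
		return false
	}
	return true
}

// Select returns the names of all servers the target matches, in input order.
func Select(t Target, servers []Server) []string {
	var names []string
	for _, s := range servers {
		if t.Matches(s) {
			names = append(names, s.Name)
		}
	}
	return names
}

func main() {
	servers := []Server{
		{Name: "worker-5", Labels: []string{"k8s-worker"}, Environment: "prod"},
		{Name: "batch-worker-1", Labels: []string{"k8s-worker"}, Environment: "prod"},
		{Name: "mail-1", Labels: []string{"mailserver"}, Environment: "prod"},
	}
	// Label target with one server excluded.
	target := Target{Labels: []string{"k8s-worker"}, ExcludeServers: []string{"batch-worker-1"}}
	fmt.Println(Select(target, servers))
}
```

Because the same `Target` value can sit on an alert, a secret policy, or a CLI query, a single matcher keeps the three from drifting apart.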
+ +#### Auto-Generated Alerts + +When Lab provisions a resource, it generates baseline alerts based on: +- **Label**: mailserver label → SMTP/IMAP checks +- **Puppet classes**: `postgresql::server` → postgres process, replication lag +- **Ports**: if port 443 is declared → HTTPS health check +- **Size**: resource limits → CPU/memory threshold alerts +- **Identity**: cert expiry alerts auto-generated for all enrolled machines + +#### User-Defined Alerts + +Users can add custom alerts targeting any scope: + +```yaml +alerts: + # Target by label + - name: mail_queue_critical + target: + labels: [mailserver] + condition: mail_queue_length > 1000 + severity: critical + for: 5m + + # Target by group + - name: disk_space_low + target: + groups: [production] + condition: disk_usage_percent > 85 + severity: warning + + # Target by environment + - name: high_cpu + target: + environment: prod + condition: cpu_usage_percent > 90 + for: 10m + severity: warning + + # Target specific servers + - name: gpu_temperature + target: + servers: [dgx-spark, beelink-ser9-max] + condition: gpu_temp_celsius > 80 + severity: critical + + # Target by label but exclude some + - name: memory_pressure + target: + labels: [k8s-worker] + exclude: + servers: [batch-worker-1] # this one is expected to run hot + condition: memory_usage_percent > 90 + severity: warning +``` + +Alerts are rendered to the underlying monitoring system (Prometheus rules, Naemon checks, +CloudWatch alarms) — we don't build an alerting engine, we generate configs for existing ones. +Which monitoring backend to use for each alert type: **needs investigation**. + +### Targeting — Unified Query System + +The same targeting syntax works everywhere: alerts, puppet classes, secrets, and queries. +Target by label, group, server name, environment, region, or any combination with exclusions. 
+ +``` +# CLI targeting syntax +$ lab get servers --label k8s-worker +$ lab get servers --group production +$ lab get servers --environment staging +$ lab get servers --label k8s-worker --environment prod --exclude worker-3 + +# What's applied WHERE (server → everything) +$ lab show server worker-5 +``` + +### Visibility — Show What's Applied Where + +Two directions of querying: "what does this server get?" and "where does this thing apply?" + +#### Server View: Everything applied to a server + +``` +$ lab show server worker-5 + +Server: worker-5 (aws, eu-west-1) +Labels: [k8s-worker, production, eu-west] +Groups: [production, eu-west-compute, eu-infrastructure] +Environment: prod + +Puppet Classes (6): + FROM LABEL k8s-worker: + ├── kubernetes::worker + ├── containerd + └── node_exporter + FROM LABEL production: + ├── base::hardening + └── base::monitoring + FROM LABEL eu-west: + └── base::ntp_eu + +Alerts (8): + FROM LABEL k8s-worker: + ├── kubelet_healthy + └── node_ready + FROM GROUP production: + ├── disk_space_low + └── high_cpu + AUTO-GENERATED: + ├── cpu_threshold (from size: large) + ├── memory_threshold (from size: large) + ├── cert_expiry (from identity) + └── puppet_run_failed (from enrollment) + +Secrets (2): + FROM LABEL k8s-worker: + ├── k8s/join-token (read) + └── tls/node-cert (dynamic) + +Excluded From: + └── alert "memory_pressure" (explicitly excluded) +``` + +#### Label/Group View: Where does this apply? 
+ +``` +$ lab show label mailserver + +Label: mailserver +Applied to: 2 servers + +Servers: + ├── mail-1 (xcpng, prod) ✓ sync ✓ puppet ✓ health ✓ identity + └── mail-2 (aws, prod) ✓ sync ✓ puppet ✓ health ✓ identity + +Provides: + Puppet Classes: postfix, dovecot, spamassassin, fail2ban + Alerts: smtp_connect, imap_connect, mail_queue_length + Secrets: mail/tls-cert, mail/dkim-key + Ports: 25, 587, 993 + Size: medium + +$ lab show group eu-infrastructure + +Group: eu-infrastructure +Contains: 3 subgroups, 47 servers (2 excluded) + +Subgroups: + ├── eu-west-compute (28 servers) + ├── eu-west-storage (12 servers) + └── eu-west-network (9 servers) + +Excluded: + ├── test-box-1 (by name) + └── 1 server with label "experimental" + +Alerts targeting this group: + ├── disk_space_low (warning) + └── network_latency_high (critical) +``` + +#### Alert View: Where does this alert fire? + +``` +$ lab show alert disk_space_low + +Alert: disk_space_low +Severity: warning +Condition: disk_usage_percent > 85 +Target: group "production" +Excludes: none + +Applies to 63 servers: + ├── api-1 (aws) currently: 42% ✓ + ├── api-2 (aws) currently: 38% ✓ + ├── mail-1 (xcpng) currently: 71% ✓ + ├── db-1 (baremetal) currently: 83% ⚠ approaching + └── ... (59 more) + +Rendered to: + ├── Prometheus: rule "disk_space_low" in rules/production.yaml + └── Naemon: service check on 4 bare-metal hosts +``` + +#### Reverse Query: What targets this server? + +``` +$ lab targets server db-1 + +Everything targeting db-1: + Labels: [postgres, production, eu-west] + Groups: [production, database-tier, eu-infrastructure, eu-west-storage] + Environment: prod + + Alerts (11): + ├── postgres_replication_lag (from label: postgres) + ├── postgres_connections (from label: postgres) + ├── disk_space_low (from group: production) + ├── high_cpu (from group: production) + ├── storage_iops (from group: eu-west-storage) + ├── cert_expiry (auto-generated) + └── ... 
(5 more) + + Puppet Classes (9): + ├── postgresql::server (from label: postgres) + ├── backup::pgbackrest (from label: postgres) + └── ... (7 more) + + Secrets (4): + ├── postgres/master-password (from label: postgres) + └── ... (3 more) +``` + +### TUI Visualization (lab-tui) + +The k9s-style TUI should support navigating these relationships interactively: + +``` +┌─ lab-tui ──────────────────────────────────────────────────────────┐ +│ View: Servers > worker-5 [?]Help│ +├────────────────────────────────────────────────────────────────────┤ +│ │ +│ ┌─ Server: worker-5 ──────────────────────────────────────────┐ │ +│ │ Provider: aws Size: large Env: prod │ │ +│ │ Sync: ✓ Puppet: ✓ Health: ✓ Identity: ✓ │ │ +│ └─────────────────────────────────────────────────────────────┘ │ +│ │ +│ [L]abels [A]lerts [P]uppet [S]ecrets [G]roups │ +│ │ +│ Labels ──────────────────── Alerts ────────────────────────── │ +│ ► k8s-worker ● kubelet_healthy ✓ OK │ +│ ► production ● node_ready ✓ OK │ +│ ► eu-west ● disk_space_low ✓ 42% │ +│ ● high_cpu ✓ 12% │ +│ Groups ────────────────── ● cert_expiry ✓ 347d │ +│ ► production │ +│ ► eu-infrastructure Puppet Classes ────────────────── │ +│ ► eu-west-compute ● kubernetes::worker ✓ applied │ +│ ● containerd ✓ applied │ +│ Secrets ───────────────── ● node_exporter ✓ applied │ +│ ● k8s/join-token (read) ● base::hardening ✓ applied │ +│ ● tls/node-cert (dyn) ● base::monitoring ✓ applied │ +│ │ +│ [Enter] drill down [Esc] back [/] search [Tab] switch pane │ +└────────────────────────────────────────────────────────────────────┘ +``` + +Navigation: +- From server → drill into label → see all other servers with that label +- From alert → see all servers it applies to, current values +- From group → see subgroups, expand tree, see members +- From label → see puppet classes, alerts, secrets it provides +- Everything is cross-linked — follow any relationship in either direction + +### Deployment Targets + +Same label → multiple targets: + +| Target | 
What happens | +|--------|-------------| +| VM (any cloud) | Provision VM → enroll OpenVox → apply classes live | +| Bare metal | PXE boot → enroll OpenVox → apply classes live | +| Container | Build image with classes baked in → push to registry | +| ASG | Launch template with OpenVox enrollment → auto-apply | +| K8s pod | Deploy container artifact to cluster | + +### Four-Pillar Status + +Every resource shows four things: + +1. **Sync** — is the actual infrastructure state matching the declared spec? + (instance type, security groups, disks, network — via Pulumi state) +2. **Puppet** — did OpenVox successfully apply all classes? + (last run status, any failures, catalog compilation errors) +3. **Health** — are monitoring checks passing? + (aggregates from Prometheus alerts, Naemon checks, cloud health APIs) +4. **Identity** — is the resource fully enrolled? + (DNS registered, certs valid, Vault authenticated, SSH host key signed) + +### Provider Plugin System + +Extensible provider model — each provider implements an interface: + +```go +type Provider interface { + Name() string + + // Lifecycle + Plan(spec ResourceSpec) (*PlanResult, error) + Apply(spec ResourceSpec) (*Resource, error) + Destroy(id string) error + + // State + Get(id string) (*Resource, error) + List(filters Filters) ([]*Resource, error) + Diff(spec ResourceSpec) (*DiffResult, error) + + // Introspection (like DA's type-writer) + DiscoverResources() ([]*Resource, error) + AvailableSizes() ([]Size, error) + AvailableImages() ([]Image, error) +} +``` + +Built-in providers: +- `provider-aws` — wraps Pulumi AWS +- `provider-xcpng` — wraps Pulumi XO / Xen Orchestra API +- `provider-baremetal` — wraps Tinkerbell / iPXE + IPMI/Redfish +- `provider-k8s` — wraps Pulumi Kubernetes + +Community can add: GCP, Azure, Hetzner, Proxmox, etc. 
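How built-in and community providers might sit behind one lookup is sketched below in Go. It is illustrative only: `Registry`, `fakeProvider`, and the trimmed two-method `Provider` interface are assumptions, and the real interface above has more methods and richer types.

```go
package main

import (
	"fmt"
	"sort"
)

// ResourceSpec and PlanResult are reduced stand-ins for the spec's types.
type ResourceSpec struct {
	Name string
	Size string
}

type PlanResult struct {
	Summary string
}

// Provider is a trimmed subset of the interface above (Name and Plan only).
type Provider interface {
	Name() string
	Plan(spec ResourceSpec) (*PlanResult, error)
}

// Registry holds providers keyed by their name.
type Registry struct {
	providers map[string]Provider
}

func NewRegistry() *Registry {
	return &Registry{providers: map[string]Provider{}}
}

// Register adds a provider and rejects duplicate names.
func (r *Registry) Register(p Provider) error {
	if _, exists := r.providers[p.Name()]; exists {
		return fmt.Errorf("provider %q already registered", p.Name())
	}
	r.providers[p.Name()] = p
	return nil
}

// Get looks a provider up by name.
func (r *Registry) Get(name string) (Provider, error) {
	p, ok := r.providers[name]
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", name)
	}
	return p, nil
}

// Names lists registered providers in sorted order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.providers))
	for name := range r.providers {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// fakeProvider is a placeholder implementation used only for this sketch.
type fakeProvider struct{ name string }

func (f fakeProvider) Name() string { return f.name }

func (f fakeProvider) Plan(spec ResourceSpec) (*PlanResult, error) {
	return &PlanResult{Summary: fmt.Sprintf("%s: would create %s (%s)", f.name, spec.Name, spec.Size)}, nil
}

func main() {
	reg := NewRegistry()
	for _, name := range []string{"provider-aws", "provider-xcpng"} {
		if err := reg.Register(fakeProvider{name: name}); err != nil {
			panic(err)
		}
	}
	p, err := reg.Get("provider-aws")
	if err != nil {
		panic(err)
	}
	plan, _ := p.Plan(ResourceSpec{Name: "new-mail-3", Size: "medium"})
	fmt.Println(plan.Summary)
}
```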
+ +### Health Aggregator Plugin System + +```go +type HealthSource interface { + Name() string + CheckHealth(resource *Resource) (*HealthResult, error) +} +``` + +Built-in sources: +- `health-prometheus` — queries Prometheus alerting rules targeting the resource +- `health-naemon` — queries Naemon host/service checks +- `health-cloudwatch` — queries AWS CloudWatch alarms + +### Profiles — T-Shirt Sizing + +User-owned mappings: + +```yaml +sizes: + medium: + abstract: { cores: 4, memory: 8GB } + providers: + aws: { instance_type: t3.large } + xcpng: { cores: 4, memory: 8192MB } + baremetal: { min_cores: 4, min_memory: 8GB, maas_tag: medium } +``` + +### Artifact Builder + +Puppet modules → container images: + +``` +label "mailserver" + → puppet classes [postfix, dovecot, spamassassin] + → Dockerfile generated: + FROM ubuntu:24.04 + RUN apt-get update && apt-get install -y puppet-agent + COPY modules/ /etc/puppetlabs/code/modules/ + RUN puppet apply -e 'include postfix, dovecot, spamassassin' + # Clean up puppet, leave only configured services + → Image pushed to registry + → Available as k8s deployment or standalone container +``` + +## Tech Stack + +| Component | Technology | Why | +|-----------|-----------|-----| +| Server | Go | Performance, single binary, Pulumi SDK, gRPC native | +| CLI | Go (cobra) | Same binary, kubectl-style | +| TUI | Go (bubbletea) | Same binary, k9s-style | +| API | gRPC + REST (grpc-gateway) | Type-safe, fast, REST fallback | +| IaC engine | Pulumi (Go SDK) | Multi-provider, plan/preview, component packages | +| Config mgmt | OpenVox | Puppet modules, ENC, cert management | +| Bare metal | Tinkerbell or custom iPXE | PXE boot, IPMI/Redfish | +| Container build | Buildah or Docker | OCI images from puppet classes | +| State store | TBD — NOT etcd (see State Storage section) | Resource state, label definitions | +| K8s integration | client-go | Direct k8s API for deployments | + +## Under The Hood — What We DON'T Build + +- Cloud APIs → Pulumi providers handle
this +- Puppet language/runtime → OpenVox handles this +- Container runtime → containerd/Docker handles this +- Monitoring → Prometheus/Naemon handle this +- K8s orchestration → k3s/EKS handles this +- PXE/DHCP/TFTP → Tinkerbell handles this +- Certificate management → OpenVox CA handles this + +**We build the glue, the abstraction, the UX, and the lifecycle orchestration.** + +## Kubernetes Management + +Lab also controls what runs on k8s clusters: + +``` +$ lab get deployments +NAME CLUSTER LABEL REPLICAS IMAGE STATUS +mailserver homelab mailserver 2/2 org/mailserver:03.15 ✓ running +api production app 4/4 org/app:03.15 ✓ running +postgres homelab postgres 1/1 org/postgres:03.14 ✓ running + +$ lab deploy --label app --cluster production --replicas 4 +$ lab scale --label app --cluster production --replicas 6 +``` + +Deployments reference labels — same label that defines puppet classes also defines +the container image, ports, health checks, and k8s resources. + +## Bootstrap, Onboarding, and Self-Deployment + +### Core Idea: Your Device Is The First Coordinator + +You don't need a server to start. Your laptop/workstation runs the full lab engine +locally. You onboard servers from it — including bare metal PXE boot. When ready, +you migrate the coordinator role to one of the servers you've onboarded. + +``` +┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ +│ Phase 0 │ │ Phase 1 │ │ Phase 2 │ │ Phase 3 │ +│ │ │ │ │ │ │ │ +│ lab init │────►│ Onboard │────►│ Move lab │────►│ Onboard │ +│ --local │ │ servers │ │ to a real │ │ remaining │ +│ │ │ from your │ │ server │ │ from the │ +│ Your device│ │ laptop │ │ │ │ server │ +│ = lab │ │ │ │ │ │ │ +└────────────┘ └────────────┘ └────────────┘ └────────────┘ +``` + +### Architecture: CLI = Embedded Server + +The CLI binary contains the full lab-server engine. The difference between modes +is where state lives and whether the engine runs persistently. 
+ +``` +┌──────────────────────────────────────┐ +│ lab (single binary) │ +│ │ +│ ┌─────────────────────────────────┐ │ +│ │ Core Engine │ │ +│ │ (providers, labels, render, │ │ +│ │ lifecycle, identity, secrets, │ │ +│ │ PXE server, everything) │ │ +│ └─────────────────────────────────┘ │ +│ │ +│ Modes: │ +│ ├── $ lab init --local → local mode │ +│ │ State: ~/.lab/state.db │ +│ │ PXE/DHCP: served from laptop │ +│ │ Full engine, no remote server │ +│ │ │ +│ ├── $ lab server → daemon mode │ +│ │ State: /var/lib/lab/state.db │ +│ │ PXE/DHCP: served from this box │ +│ │ Persistent API on port 7443 │ +│ │ │ +│ └── $ lab → client mode │ +│ Talks to remote lab-server │ +│ (or local engine if no server) │ +└──────────────────────────────────────┘ +``` + +### Onboarding Flow + +`lab onboard` is the command to bring a new machine under management. It handles +two scenarios: machines with an OS already installed, and bare metal that needs +network boot + OS installation. + +#### Scenario A: Machine has OS (SSH onboard) + +For machines that already have an OS (like DGX Spark with Ubuntu, or Mac Studio): + +``` +$ lab onboard dgx-spark --provider ssh --host 192.168.1.50 --user admin + +Step 1: Render + ┌──────────────┬────────────────────────┐ + │ Name │ dgx-spark │ + │ Provider │ ssh (existing machine) │ + │ Host │ 192.168.1.50 │ + │ OS │ Ubuntu (detected) │ + │ Arch │ aarch64 (Grace) │ + │ RAM │ 128GB │ + │ GPU │ CUDA (detected) │ + └──────────────┴────────────────────────┘ + + Onboarding will: + + Install lab agent + + Generate one-time enrollment token + + Register in DNS: dgx-spark.lab.internal + + Sign OpenVox certificate + + Assign labels (interactive or --labels flag) + + Proceed? 
[y/N]: y + +Step 2: Detect & assign labels + Detected hardware: + GPU: NVIDIA GB10 Grace Blackwell → suggesting label: cuda + RAM: 128GB → suggesting label: ai-inference + Arch: aarch64 → suggesting label: arm + + Assign labels [cuda,ai-inference,arm]: cuda,ai-inference,dgx-spark + +Step 3: Apply (same engine as lab apply) + → SSH into 192.168.1.50 + → Install lab agent binary + → Generate one-time token + → Lab agent enrolls: + → OpenVox cert signed, classified in environment "production" + → DNS A record: dgx-spark.lab.internal → 192.168.1.50 + → Identity established + → Apply puppet classes from labels: + → cuda: nvidia-drivers, cuda-toolkit + → ai-inference: inference-runtime + → Machine fully managed + +$ lab get servers +NAME PROVIDER LABELS SYNC PUPPET HEALTH IDENTITY +dgx-spark ssh cuda,ai-inference,dgx-spark ✓ ✓ ok ✓ ✓ enrolled +``` + +#### Scenario B: Bare metal (PXE network boot) + +For machines with no OS. Lab (on your laptop or server) becomes a PXE server +on the local network, serves the OS installer, and onboards after installation: + +``` +$ lab onboard beelink-max --provider baremetal \ + --mac AA:BB:CC:DD:EE:FF \ + --image ubuntu-24.04 \ + --labels k8s-worker,rocm,longhorn + +Step 1: Render + ┌──────────────┬────────────────────────┐ + │ Name │ beelink-max │ + │ Provider │ baremetal (PXE boot) │ + │ MAC │ AA:BB:CC:DD:EE:FF │ + │ Image │ ubuntu-24.04 │ + │ Labels │ k8s-worker,rocm,longhorn│ + │ PXE server │ this device (laptop) │ + └──────────────┴────────────────────────┘ + + Onboarding will: + + Start PXE/DHCP/TFTP on local network interface + + Wait for machine with MAC AA:BB:CC:DD:EE:FF to boot + + Serve unattended Ubuntu 24.04 installer + + After install: auto-enroll with one-time token baked into installer + + Assign labels, apply puppet classes + + ⚠ PXE requires: network interface on same L2 segment as target machine + ⚠ DHCP: will respond ONLY to MAC AA:BB:CC:DD:EE:FF (safe for existing networks) + + Proceed? 
[y/N]: y + +Step 2: PXE boot phase + → Starting PXE server on en0 (192.168.1.x) + → DHCP offer scoped to MAC AA:BB:CC:DD:EE:FF only + → Waiting for network boot request... + + ⏳ Power on the Beelink SER9 MAX and set it to boot from network (PXE) + + → Boot request received from AA:BB:CC:DD:EE:FF + → Serving iPXE → kernel + initrd → autoinstall config + → OS installation in progress... + → Installation complete, machine rebooting + +Step 3: Post-install enrollment (same as SSH onboard from here) + → Machine boots with installed OS + → Lab agent runs on first boot (installed during OS setup) + → Uses one-time token (baked into autoinstall config) to enroll: + → OpenVox cert signed + → DNS: beelink-max.lab.internal → 192.168.1.100 + → Identity established + → Apply puppet classes from labels: + → k8s-worker: kubernetes::worker, containerd + → rocm: rocm-drivers + → longhorn: longhorn::node + → Machine fully managed + +$ lab get servers +NAME PROVIDER LABELS SYNC PUPPET HEALTH IDENTITY +dgx-spark ssh cuda,ai-inference ✓ ✓ ok ✓ ✓ enrolled +beelink-max baremetal k8s-worker,rocm,longhorn ✓ ✓ ok ✓ ✓ enrolled +``` + +#### Scenario C: Onboard with IPMI/Redfish (remote power control) + +For bare metal where you have IPMI/BMC access — Lab can power on the machine +and set PXE boot remotely, fully hands-free: + +``` +$ lab onboard beelink-max --provider baremetal \ + --mac AA:BB:CC:DD:EE:FF \ + --ipmi 192.168.1.200 --ipmi-user admin \ + --image ubuntu-24.04 \ + --labels k8s-worker,rocm,longhorn + + → IPMI: setting next boot to PXE + → IPMI: powering on machine + → PXE server waiting for boot request... 
+ → (fully automated from here) +``` + +### Homelab Bootstrap Walkthrough + +The complete flow for setting up the homelab from zero: + +``` +# Phase 0: Local mode on your laptop +$ lab init --local + ✓ Lab engine running locally + ✓ State: ~/.lab/state.db + ✓ Ready to onboard servers + +# Phase 1: Onboard servers that already have an OS +$ lab onboard dgx-spark --provider ssh --host 192.168.1.50 + → Labels: [cuda, ai-inference, dgx-spark] + +$ lab onboard mac-studio --provider ssh --host 192.168.1.51 + → Labels: [k8s-server, etcd, arm] + +# Phase 2: Onboard bare metal (PXE from your laptop) +$ lab onboard beelink-ser9-pro --provider baremetal --mac XX:XX:XX:XX:XX:01 \ + --image ubuntu-24.04 --labels bootstrap,lab-server + → PXE boot from laptop → install OS → enroll + → This will become the permanent lab-server host + +# Phase 3: Move lab-server to a real server +$ lab server migrate --target ssh --host beelink-ser9-pro + → Lab-server deployed on Beelink SER9 Pro + → State migrated from ~/.lab/state.db + → PXE/DHCP now served from Beelink, not your laptop + → CLI config updated: lab talks to beelink-ser9-pro:7443 + +# Phase 4: Onboard remaining servers (PXE from beelink-ser9-pro now) +$ lab onboard beelink-ser9-max --provider baremetal --mac XX:XX:XX:XX:XX:02 \ + --image ubuntu-24.04 --labels k8s-worker,rocm,longhorn + → PXE served by beelink-ser9-pro (not your laptop anymore) + +$ lab onboard minisforum-ms-r1 --provider baremetal --mac XX:XX:XX:XX:XX:03 \ + --image ubuntu-24.04 --labels k8s-worker,arm + +# Phase 5: Set up k8s +$ lab apply cluster homelab --servers mac-studio,beelink-ser9-max,minisforum-ms-r1 + → mac-studio becomes k3s server (etcd) + → beelink-ser9-max joins as worker + → minisforum-ms-r1 joins as worker + → All via puppet classes from labels + +# Phase 6: Optionally move lab-server into k8s +$ lab server migrate --target kubernetes --cluster homelab + → Lab-server now runs as k8s pod + → Still manages everything including the cluster it runs on + 
+# Final state: +$ lab get servers +NAME PROVIDER LABELS SYNC PUPPET HEALTH IDENTITY +dgx-spark ssh cuda,ai-inference ✓ ✓ ok ✓ ✓ enrolled +mac-studio ssh k8s-server,etcd,arm ✓ ✓ ok ✓ ✓ enrolled +beelink-ser9-pro baremetal bootstrap ✓ ✓ ok ✓ ✓ enrolled +beelink-ser9-max baremetal k8s-worker,rocm,longhorn ✓ ✓ ok ✓ ✓ enrolled +minisforum-ms-r1 baremetal k8s-worker,arm ✓ ✓ ok ✓ ✓ enrolled +lab-server kubernetes lab,control-plane ✓ ✓ ok ✓ ✓ enrolled +``` + +### Enterprise Application: XCP-ng Bare Metal Deploy + +Same onboarding flow works for deploying XCP-ng to enterprise bare metal: + +``` +$ lab onboard xen-host-42 --provider baremetal \ + --mac AA:BB:CC:DD:EE:FF \ + --ipmi 10.0.0.142 --ipmi-user admin \ + --image xcpng-8.3 \ + --labels xen-host,production,eu-west + + → IPMI: power on, PXE boot + → Install XCP-ng 8.3 (unattended) + → Enroll, apply puppet classes: + → xen-host: xcpng::host, xcpng::networking, xcpng::storage + → Host registered in Xen Orchestra pool + → Ready to provision VMs on it + +# Now create VMs on the XCP-ng host we just onboarded: +$ lab apply server app-12 --provider xcpng --labels app,production + → VM created on xen-host-42 via Xen Orchestra API + → OS installed, enrolled, puppet applied + → Same flow as AWS EC2, just different provider +``` + +### PXE Server Capabilities + +When running in local or server mode, Lab includes an embedded PXE server: + +- **DHCP**: scoped to specific MACs only (safe for existing networks with DHCP) +- **TFTP**: serves iPXE bootloader +- **HTTP**: serves kernel, initrd, autoinstall configs +- **Autoinstall generation**: creates unattended install configs per-machine with: + - Lab agent pre-installed + - One-time enrollment token baked in + - Network config for the target environment + - Disk layout per label/profile +- **Supported images**: Ubuntu, Debian, RHEL/Rocky, XCP-ng (extensible) + +PXE serving moves with lab-server — if you migrate lab to a new host, +PXE is served from there. 
If lab is on your laptop, PXE is on your laptop. +Same engine, same binary. + +### Hardware Detection During Onboard + +When onboarding via SSH (existing OS), Lab detects hardware and suggests labels: + +``` +$ lab onboard new-server --provider ssh --host 10.0.0.50 + +Detected hardware: + CPU: AMD EPYC 7763 (x86_64, 64 cores) → suggest: compute + RAM: 256 GB → suggest: high-memory + GPU: NVIDIA A100 80GB → suggest: cuda, ai-training + Disk: 2x NVMe 1.92TB, 4x SSD 3.84TB → suggest: storage + NIC: 2x 25GbE, 1x 1GbE IPMI → suggest: high-bandwidth + + Suggested labels: [compute, high-memory, cuda, ai-training, storage, high-bandwidth] + Assign labels [accept/edit]: _ +``` + +For PXE onboard, hardware detection happens after OS installation, and labels +can be auto-confirmed or require interactive approval. + +### No Server? CLI Runs Locally + +If no remote server is configured, every `lab` command runs the engine locally. +This means you can use Lab in permanent local mode for simple setups: + +``` +$ lab get servers # no remote server configured + ⓘ Running locally (~/.lab/state.db) + Tip: run `lab server migrate --target ` to deploy a persistent server + +NAME PROVIDER LABELS SYNC PUPPET HEALTH IDENTITY +... 
+``` + +### Self-Migration + +Migration uses the same plan/apply as everything else: + +``` +$ lab server migrate --target ssh --host beelink-ser9-pro + +Step 1: Plan + ~ migrate lab-server from local (~/.lab) to ssh://beelink-ser9-pro + + deploy lab-server container on beelink-ser9-pro + + copy state.db to remote host + + start PXE/DHCP services on remote host + + stop local PXE/DHCP services + + update CLI config to new endpoint + +Step 2: Apply + → Deploy lab-server on beelink-ser9-pro + → Copy state to remote + → Verify remote is healthy + → Switch CLI config + → Stop local engine + +$ lab server migrate --target kubernetes --cluster homelab + +Step 1: Plan + ~ migrate lab-server from ssh://beelink-ser9-pro to kubernetes://homelab + + k8s Deployment lab-server (1 replica) + + k8s Service lab-server (port 7443) + + PersistentVolumeClaim lab-server-state (10Gi) + + migrate state.db to PVC + + PXE services: move to k8s hostNetwork pod or keep on bootstrap node + + ⚠ Note: PXE/DHCP requires L2 network access. If k8s node is on the same + L2 segment, use hostNetwork. Otherwise, keep PXE on the bootstrap node + and only migrate the API/state to k8s. + +Step 2: Apply + → Deploy to k8s + → Migrate state + → Verify healthy + → Update CLI config + → Tear down old deployment +``` + +### Key Design Principles + +1. **One engine everywhere** — CLI, local mode, server mode, and init all share the same code +2. **Your device is the first coordinator** — no chicken-and-egg, start from nothing +3. **Onboard uses the same pipeline as apply** — render, plan, apply, enroll +4. **PXE is embedded** — no external PXE/DHCP server needed, Lab serves it +5. **Hardware detection suggests labels** — but the user confirms +6. **Migration is just plan/apply for lab-server** — same engine, no special case +7. 
**Enterprise and homelab are the same flow** — onboard XCP-ng bare metal = onboard homelab Beelink + +## Identity and Trust Layer + +Inspired by what FreeIPA did well (auto-DNS, centralized SSH, server-scoped secrets, +internal CA, IP mobility) without what it did badly (instability, hardcoded join secrets). + +Lab controls the full lifecycle — it knows when a machine is born — so it can solve +the enrollment problem properly: generate a one-time join token at provision time, +inject it via cloud-init or iPXE userdata. No hardcoded secrets in images. + +### Provision-to-Enrolled Flow + +``` +$ lab apply server new-worker-5 --label k8s-worker --provider aws + +1. PROVISION → Pulumi creates EC2 instance +2. IDENTITY → Lab generates one-time join token (short-lived, single-use) + → Token injected via cloud-init (or iPXE userdata for bare metal) + → Token is NOT in the image — generated per-instance at provision time +3. ENROLL → Machine boots, uses token to: + → Register with OpenVox (cert signed, node classified) + → Register in DNS (A record + PTR) + → Authenticate with Vault (get identity + policies per label) + → Get SSH CA-signed host key (no more TOFU) +4. CONFIGURE → OpenVox applies classes + → Machine pulls secrets it's allowed to access from Vault + → e.g. k8s join token retrieved from Vault, node joins cluster +5. ENROLLED → Lab marks resource identity as ✓ enrolled +``` + +### What Each Machine Gets on Enrollment + +| Capability | What happens | Tool underneath (TBD — needs investigation) | +|-----------|-------------|----------------------------------------------| +| DNS auto-registration | A + PTR records created/updated automatically | CoreDNS API? ExternalDNS? PowerDNS? needs investigation | +| IP mobility | Machine restarts with new IP → DNS updated automatically | Lab agent on machine reports changes? DHCP hook? needs investigation | +| Server certificate | TLS cert issued for the machine, auto-renewed | OpenVox CA? Vault PKI secrets engine? 
cert-manager? needs investigation | +| SSH host key signing | Host key signed by CA, clients trust CA not individual keys | Vault SSH secrets engine? OpenVox CA? step-ca? needs investigation | +| SSH user access | Users get short-lived SSH certs, centrally managed | Vault SSH + OIDC? Teleport? Boundary? needs investigation | +| Secret access (RBAC) | Machine authenticates with Vault, gets label-scoped policy | Vault AppRole? Vault cert auth? needs investigation | +| K8s join tokens | Retrieved from Vault by entitled machines, used to join cluster | Vault KV + policy per label? needs investigation | +| OpenVox enrollment | Cert signed, environment + role + classes assigned | OpenVox CA + ENC — this one we know | +| One-time join tokens | Generated per-instance at provision, single-use, short-lived | Lab itself generates these — or delegate to Vault? needs investigation | + +**Important: We don't need to build any of these from scratch.** Each row is a capability +that likely has an existing tool we can wrap. Just like we use Pulumi for cloud APIs and +OpenVox for config management, we'll find the right tool for each identity concern. +Each position requires investigation — we'll evaluate options together, one by one. + +### CLI: Identity Information + +``` +$ lab get servers +NAME PROVIDER LABELS SYNC PUPPET HEALTH IDENTITY +worker-5 aws k8s-worker ✓ ✓ ok ✓ ✓ enrolled +worker-6 xcpng k8s-worker ✓ ✓ ok ✓ ✓ enrolled +worker-7 baremetal k8s-worker ✓ ✗ fail ⚠ ⚠ cert expiring +new-box aws k8s-worker ✓ … … ⏳ enrolling + +$ lab describe server worker-5 +... +Identity: + DNS: worker-5.lab.internal (A: 10.0.1.45, PTR: ✓) + OpenVox: ✓ cert signed (expires 2027-03-15) + Vault: ✓ authenticated (policy: k8s-worker) + SSH Host Key: ✓ CA-signed (fingerprint: SHA256:abc...) 
+ Secrets: k8s/join-token, tls/node-cert (2 accessible) + Enrolled: 2026-03-15 14:22:03 (one-time token, consumed) + Last Check-in: 2026-03-15 15:01:12 (38 seconds ago) + +$ lab get secrets --label k8s-worker +SECRET TYPE ACCESSIBLE BY LAST ROTATED +k8s/join-token dynamic k8s-worker (12 srv) 2026-03-15 +tls/cluster-ca static k8s-worker, k8s-server 2026-01-01 +monitoring/api-key static k8s-worker, monitoring 2026-02-28 + +$ lab identity renew worker-5 # force cert/key renewal +$ lab identity revoke worker-5 # revoke all creds, remove from DNS, unenroll +``` + +### Secrets — Code Is The Policy + +**Design principle:** If your code/config declares "I use secret X", that IS the access +grant. No one goes to a separate UI to edit policies. Default is locked — if not +mentioned, no access. If mentioned, access is automatic. + +**The declaration IS the policy:** + +```yaml +labels: + mailserver: + puppet_classes: + - postfix + - dovecot + secrets: + - mail/tls-cert + - mail/dkim-key + - mail/relay-credentials + ports: [25, 587, 993] +``` + +When Lab applies label `mailserver` to a server, it automatically: +1. Grants that server access to `mail/tls-cert`, `mail/dkim-key`, `mail/relay-credentials` +2. Denies access to everything else +3. No separate policy file, no Vault admin, no ticket to security team + +When a puppet class references a secret: + +```puppet +# modules/postfix/manifests/init.pp +class postfix { + $relay_creds = lab::secret('mail/relay-credentials') + + file { '/etc/postfix/sasl_passwd': + content => $relay_creds, + mode => '0600', + } +} +``` + +The `lab::secret()` call is both the usage AND the declaration that this class +needs this secret. Lab scans puppet classes, discovers secret references, +and auto-generates the access policy. If `postfix` class is applied to a server +via a label, that server gets access to `mail/relay-credentials`. Remove the +class → access revoked. 
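
The default-deny rule described above ("if not mentioned, no access") can be sketched as a small policy compiler. This is only an illustration under stated assumptions: the `Label` struct, `compilePolicy`, and `canAccess` are hypothetical names, and the sketch covers secrets declared directly on labels, not the scanning of `lab::secret()` calls inside puppet classes.

```go
package main

import (
	"fmt"
	"sort"
)

// Label mirrors the YAML declaration: a label lists the secrets it uses.
type Label struct {
	Secrets []string
}

// compilePolicy returns the sorted set of secrets a server may read,
// given the labels applied to it. Unknown labels grant nothing.
func compilePolicy(labels map[string]Label, applied []string) []string {
	seen := make(map[string]bool)
	for _, name := range applied {
		for _, s := range labels[name].Secrets {
			seen[s] = true
		}
	}
	allowed := make([]string, 0, len(seen))
	for s := range seen {
		allowed = append(allowed, s)
	}
	sort.Strings(allowed)
	return allowed
}

// canAccess is default-deny: only secrets in the compiled set are readable.
func canAccess(allowed []string, secret string) bool {
	for _, s := range allowed {
		if s == secret {
			return true
		}
	}
	return false
}

func main() {
	labels := map[string]Label{
		"mailserver": {Secrets: []string{"mail/tls-cert", "mail/dkim-key", "mail/relay-credentials"}},
		"k8s-worker": {Secrets: []string{"k8s/join-token"}},
	}

	// A server that only carries the mailserver label.
	allowed := compilePolicy(labels, []string{"mailserver"})
	fmt.Println("allowed:", allowed)
	fmt.Println("mail/dkim-key:", canAccess(allowed, "mail/dkim-key"))
	fmt.Println("k8s/join-token:", canAccess(allowed, "k8s/join-token"))
}
```

Because the allowed set is recomputed from the applied labels each time, removing a label from a server removes its secrets from the result, which matches the "remove the class, access revoked" behavior.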
+ +**Secrets must be equally easy to access from anywhere:** + +| Runtime | How you get a secret | Same underneath | +|---------|---------------------|-----------------| +| Puppet code | `lab::secret('mail/tls-cert')` | Lab agent on machine fetches from secret backend | +| App on VM | `LAB_SECRET_MAIL_TLS_CERT` env var, or `/run/secrets/mail/tls-cert` file | Lab agent provides via env or tmpfs mount | +| App in Kubernetes | Same env var or volume mount | Lab k8s operator syncs to K8s Secret object | +| App in Docker (standalone) | `--env-file` or bind mount from lab agent | Lab agent on host provides | +| Script / cron job | `lab secret get mail/tls-cert` CLI call | Lab CLI authenticated via machine identity | +| cloud-init / bootstrap | Injected at provision time via one-time token | Lab server provides during enrollment | + +**One way to consume secrets, regardless of where you run.** The lab agent (or k8s +operator, or CLI) handles authentication and fetching transparently. The app just +reads an env var or file. 
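
From the application's side, consuming a secret could look like the sketch below. The naming rule (uppercase the path, turn `/` and `-` into `_`, prefix with `LAB_SECRET_`) is inferred from the single `LAB_SECRET_MAIL_TLS_CERT` example in the table and is an assumption, as are the helper names `envVarName`, `secretFilePath`, and `readSecret`.

```go
package main

import (
	"fmt"
	"os"
	"path"
	"strings"
)

// envVarName maps a secret path such as "mail/tls-cert" to an env var name.
// The mapping rule is an assumption based on one example in the table.
func envVarName(secretPath string) string {
	r := strings.NewReplacer("/", "_", "-", "_")
	return "LAB_SECRET_" + strings.ToUpper(r.Replace(secretPath))
}

// secretFilePath maps a secret path to its location under /run/secrets.
func secretFilePath(secretPath string) string {
	return path.Join("/run/secrets", secretPath)
}

// readSecret prefers the env var and falls back to the file.
func readSecret(secretPath string) (string, error) {
	if v, ok := os.LookupEnv(envVarName(secretPath)); ok {
		return v, nil
	}
	b, err := os.ReadFile(secretFilePath(secretPath))
	if err != nil {
		return "", fmt.Errorf("secret %q not provided via env or file: %w", secretPath, err)
	}
	return string(b), nil
}

func main() {
	fmt.Println(envVarName("mail/tls-cert"))
	fmt.Println(secretFilePath("mail/tls-cert"))

	if _, err := readSecret("mail/tls-cert"); err != nil {
		fmt.Println("not available on this machine:", err)
	}
}
```

The application code stays the same on a VM, in a container, or in Kubernetes; only the component that populates the env var or file differs.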
+ +#### How Access Flows + +``` + Label "mailserver" + declares secrets: + - mail/tls-cert + - mail/dkim-key + │ + ▼ + ┌───────────────────────┐ + │ Lab compiles policy │ + │ │ + │ server mail-1: │ + │ CAN access: │ + │ mail/tls-cert │ + │ mail/dkim-key │ + │ CANNOT access: │ + │ k8s/* │ + │ postgres/* │ + │ (everything else)│ + └───────────┬───────────┘ + │ + ▼ + ┌───────────────────────┐ + │ Secret backend │ + │ (TBD — needs │ + │ investigation) │ + │ │ + │ Enforces policy at │ + │ backend level, not │ + │ just in Lab │ + └───────────────────────┘ +``` + +#### Secret Sources + +Secrets themselves can come from multiple places: + +```yaml +secrets: + mail/tls-cert: + type: dynamic # generated/rotated automatically + generator: acme # cert-manager / Let's Encrypt + rotate_every: 90d + + mail/dkim-key: + type: static # manually set, stored encrypted + set_by: admin # who last set it + + mail/relay-credentials: + type: static + set_by: admin + + k8s/join-token: + type: dynamic + generator: kubernetes # fetched from k8s API + rotate_every: 24h + + tls/node-cert: + type: dynamic + generator: ca # issued per-machine from internal CA + per_machine: true # each machine gets its own +``` + +#### CLI for Secrets + +``` +$ lab get secrets +SECRET TYPE USED BY LAST ROTATED +mail/tls-cert dynamic mailserver (2 srv) 2026-03-14 +mail/dkim-key static mailserver (2 srv) 2026-01-15 +mail/relay-credentials static mailserver (2 srv) 2026-02-01 +k8s/join-token dynamic k8s-worker (12 srv) 2026-03-15 +tls/node-cert dynamic * (all enrolled) per-machine + +$ lab secret set mail/relay-credentials + Enter value: **** + ✓ Updated. 
Accessible by: mailserver (2 servers) + ✓ Servers will pick up new value within 60s + +$ lab show secret mail/relay-credentials +Secret: mail/relay-credentials +Type: static +Last set: 2026-03-15 by admin + +Accessible by (derived from code): + Label "mailserver" → puppet class "postfix" → lab::secret('mail/relay-credentials') + ├── mail-1 (xcpng) last fetched: 12m ago + └── mail-2 (aws) last fetched: 12m ago + + No other references found in any applied code. + +$ lab secret audit + ✓ All secrets are referenced by at least one applied class/label + ⚠ Secret "old/api-key" is defined but not referenced by any code — orphaned? + ⚠ Secret "db/password" referenced by class "app::database" but never set — empty! +``` + +#### Secret Architecture — Distributed, Offline-Capable + +**Critical requirement:** Nothing breaks if the central secret server (or any server) +is unreachable. Everything continues to work — including making new pods, deployments, +puppet runs — using local encrypted cache. This is not an edge case, it's a core design. + +**This means secrets are NOT a central server you query.** They're a distributed, +synced, encrypted dataset with offline capability. 
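
The offline requirement above can be sketched as a cache-first read that falls back to a stale value when the store cannot be reached. This is a simplified illustration, not the design itself: encryption at rest is left out, and the `Cache`, `Store`, `downStore`, and `demoCache` names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// entry is one cached secret with the time it was fetched and its TTL.
type entry struct {
	value     string
	fetchedAt time.Time
	ttl       time.Duration
}

// Store is whatever remote backend serves secrets (hypothetical interface).
type Store interface {
	Fetch(path string) (string, error)
}

// Cache is the local per-machine cache. Encryption is omitted in this sketch.
type Cache struct {
	entries map[string]entry
	store   Store
	now     func() time.Time
}

// Get returns the secret and whether the returned value is stale.
// Order: fresh cache, then the store, then a stale cache entry as last resort.
func (c *Cache) Get(path string, ttl time.Duration) (value string, stale bool, err error) {
	e, cached := c.entries[path]
	if cached && c.now().Sub(e.fetchedAt) < e.ttl {
		return e.value, false, nil
	}
	v, fetchErr := c.store.Fetch(path)
	if fetchErr == nil {
		c.entries[path] = entry{value: v, fetchedAt: c.now(), ttl: ttl}
		return v, false, nil
	}
	if cached {
		// Store unreachable: a stale secret is preferred over no secret.
		return e.value, true, nil
	}
	return "", false, fmt.Errorf("secret %q: no cache and store unreachable: %w", path, fetchErr)
}

// downStore simulates an unreachable secret store.
type downStore struct{}

func (downStore) Fetch(string) (string, error) {
	return "", errors.New("store unreachable")
}

// demoCache returns a cache holding one expired entry and an unreachable store.
func demoCache() *Cache {
	old := time.Now().Add(-48 * time.Hour)
	return &Cache{
		entries: map[string]entry{
			"k8s/join-token": {value: "abc", fetchedAt: old, ttl: 24 * time.Hour},
		},
		store: downStore{},
		now:   time.Now,
	}
}

func main() {
	c := demoCache()

	v, stale, err := c.Get("k8s/join-token", 24*time.Hour)
	fmt.Println(v, stale, err)

	_, _, err = c.Get("mail/dkim-key", time.Hour)
	fmt.Println(err)
}
```

Returning the `stale` flag lets the caller decide whether to surface a warning while still starting the workload.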
+ +``` +┌─────────────────────────────────────────────────────────────┐ +│ Secret Distribution Model │ +│ │ +│ NOT this (central server): THIS (distributed sync): │ +│ │ +│ ┌─────────┐ ┌──────┐ ┌──────┐ │ +│ │ Vault │ │ Node │◄─►│ Node │ │ +│ └────┬────┘ └──┬───┘ └──┬───┘ │ +│ ┌────┼────┐ │ ▲ │ │ +│ │ │ │ ▼ │ ▼ │ +│ ┌┴┐ ┌┴┐ ┌┴┐ ┌──────┐ ┌──────┐ │ +│ │N│ │N│ │N│ │ Node │◄─►│ Node │ │ +│ └─┘ └─┘ └─┘ └──┬───┘ └──────┘ │ +│ (all dead if vault │ │ +│ is unreachable) ▼ │ +│ ┌──────────┐ │ +│ │ Git repo │ (encrypted │ +│ │ (backup) │ backup of │ +│ └──────────┘ last resort) │ +└─────────────────────────────────────────────────────────────┘ +``` + +#### How It Works + +**Layer 1: Local Encrypted Cache (on every machine)** +- Every machine that has access to secrets stores them locally, encrypted at rest +- Encrypted with machine-specific key (derived from machine identity/TPM/secure enclave) +- Puppet runs, app starts, pod deployments — all read from local cache +- If cache is fresh → use it, no network call needed +- Cache has TTL per secret, but stale cache is better than no secret + +**Layer 2: Secret Store (privileged nodes that hold all secrets)** +- One or more nodes with the `secret-store` label hold the COMPLETE encrypted dataset +- This is NOT a special server type — it's a label, applied to pods, VMs, or bare metal +- Should have at least 2 replicas for HA +- Machines fetch ONLY the secrets their labels entitle them to from the store +- The store enforces policy — a machine with label `mailserver` gets `mail/*`, nothing else +- Machines NEVER sync with each other directly — they only talk to the store +- This prevents secret sprawl (no machine accumulates secrets it shouldn't have) + +**Layer 3: Git Encrypted Backup (last resort recovery)** +- All secrets (encrypted with a master key) backed up to a Git repo +- If a machine has empty cache AND no peers available → restore from Git backup +- SOPS/age style encryption — secrets encrypted, metadata (paths, 
policies) in plaintext +- Git gives versioning, audit trail, and disaster recovery for free +- The Git repo alone is useless without the decryption key + +**Layer 4: Lab-server (coordinator, NOT single point of failure)** +- Lab-server is the preferred interface to set/rotate secrets (via CLI/API) +- Lab-server does NOT need to be the secret-store (but can be, via label) +- If lab-server is down, machines keep running from local cache +- No new secrets can be distributed while secret-store is down +- But nothing breaks — existing workloads continue uninterrupted +- When secret-store comes back, machines sync and catch up + +**Separation of concerns:** +- `lab-server` = coordination, API, lifecycle management +- `secret-store` label = holds all secrets, serves policy-filtered requests +- These CAN be the same node (apply both labels) or separate nodes +- For homelab: same node is fine. For enterprise: separate for isolation + +#### Recovery Scenarios + +``` +Scenario 1: Lab-server down, secret-store up + → All machines continue working from local cache + → Machines can still fetch/refresh secrets from secret-store + → No new resources can be provisioned (lab-server manages lifecycle) + → But existing workloads are unaffected + +Scenario 2: Secret-store down, lab-server up + → All machines continue working from local cache + → Lab-server can still manage lifecycle (provision, plan, apply) + → No new secrets can be distributed + → No secret rotations until store is back + → Lab-server shows: ⚠ secret-store unreachable + +Scenario 3: Both down + → All machines continue working from local cache + → Nothing new can happen, but nothing breaks + → Recovery priority: restore secret-store first (from Git backup) + +Scenario 4: Machine reboots, cache intact + → Reads from local encrypted cache immediately + → Refreshes from secret-store in background to catch up + → No dependency on lab-server for startup + +Scenario 5: Machine rebuilt, cache empty + → Machine has its 
identity (from enrollment) but no secrets + → Fetches entitled secrets from secret-store (policy-filtered) + → If secret-store unreachable → cannot start (needs secrets) + → Operator can restore secret-store from Git backup to unblock + +Scenario 6: Total disaster, only Git backup survives + → Deploy new node, apply `secret-store` label + → Restore encrypted secrets from Git backup + → Deploy lab-server (lab init) + → New machines enroll and receive their entitled secrets + → System fully recovered + +Scenario 7: New pod in k8s, secret-store unreachable + → K8s node has local secret cache for its entitled secrets + → Lab k8s operator serves pod secrets from node's local cache + → Pod starts with cached secrets + → No interruption to deployments +``` + +#### CLI for Secret Distribution + +``` +$ lab secret status +SECRET DISTRIBUTION STATUS: + Local cache: ✓ 8 secrets cached (of 8 entitled), encrypted, fresh (< 5m old) + Secret store: ✓ connected (2 replicas: store-1, store-2) + Lab-server: ✓ connected + Git backup: ✓ last push 2026-03-15 14:30:00 (47 total secrets) + +$ lab secret status --store +SECRET STORE: + Replicas: 2/2 healthy + store-1 k8s pod ✓ synced 47 secrets (all) + store-2 vm/xcpng ✓ synced 47 secrets (all) + Git backup: ✓ synced 2026-03-15 14:30:00 + Total secrets: 47 + Entitled consumers: + k8s-worker (12 machines) → 3 secrets each + mailserver (2 machines) → 5 secrets each + postgres (3 machines) → 4 secrets each + lab-server (1 machine) → 2 secrets + +$ lab secret cache +LOCAL CACHE: +SECRET CACHED TTL STATUS +mail/tls-cert ✓ 89d left fresh +mail/dkim-key ✓ no expiry fresh +k8s/join-token ✓ 23h left fresh +tls/node-cert ✓ 346d left fresh + +$ lab secret recover --from git + → Fetching encrypted backup from git@github.com:org/lab-secrets.git + → Decrypting with master key... + → Restored 23 secrets + → Syncing with available peers... 
+```
+
+#### Local Cache Security
+
+The local cache must be stored securely — needs investigation:
+- Encrypted at rest with machine-specific key
+- Key derived from: TPM 2.0? Secure enclave? LUKS-bound? needs investigation
+- Memory-mapped, not swappable (mlock)
+- Accessible only by lab agent (file permissions + MAC/SELinux)
+- Wiped on machine decommission (`lab identity revoke`)
+- Possibly use kernel keyring on Linux — needs investigation
+
+#### Secret Backend — NOT Decided
+
+The underlying secret storage/sync mechanism is pluggable:
+
+```go
+type SecretBackend interface {
+    Name() string
+
+    // CRUD
+    Get(path string, identity *MachineIdentity) ([]byte, error)
+    Set(path string, value []byte) error
+    Delete(path string) error
+    List(prefix string) ([]string, error)
+
+    // Policy (auto-generated from code/labels)
+    GrantAccess(path string, identity *MachineIdentity) error
+    RevokeAccess(path string, identity *MachineIdentity) error
+
+    // Dynamic
+    Generate(path string, generator GeneratorConfig) ([]byte, error)
+    Rotate(path string) error
+
+    // Distribution
+    SyncWith(peer PeerInfo) error
+    CacheLocally(secrets []Secret) error
+    RestoreFromBackup(source BackupSource) error
+}
+```
+
+Possible approaches (each needs investigation):
+- **SOPS + age + Git** — simplest, encrypted files in Git, but no peer sync
+- **OpenBao** — Vault fork, has replication, but still central-server mindset
+- **Sealed Secrets / External Secrets Operator** — k8s-native, but not universal
+- **Infisical** — developer-friendly, but SaaS-oriented
+- **Custom: encrypted SQLite + peer sync** — simple, we control the sync protocol
+- **etcd with encryption** — distributed by nature, but might be overkill
+- **CockroachDB** — distributed SQL, encrypted, survives node failures
+- **Consul** — distributed KV with gossip, HashiCorp though
+- **Lab's own sync protocol** — gossip-based, encrypted, purpose-built
+
+The right answer might be a combination:
+- SOPS/age for encryption format (proven, auditable)
+- Custom gossip sync for distribution (lightweight)
+- Git for backup (free versioning and DR)
+- Or wrap an existing distributed KV that already handles sync
+
+**This is the most complex subsystem in Lab and needs careful investigation.**
+
+### Identity Plugin System
+
+Same extensible pattern as providers and health sources:
+
+```go
+type IdentityPlugin interface {
+    Name() string
+
+    // Enrollment
+    Enroll(resource *Resource, token string) (*Identity, error)
+    Revoke(resource *Resource) error
+
+    // Status
+    Status(resource *Resource) (*IdentityStatus, error)
+
+    // Renewal
+    Renew(resource *Resource) error
+}
+```
+
+This allows swapping identity backends without changing the rest of Lab.
+We might start with Vault + OpenVox CA and later add/replace components.
+
+## State Storage — Design Principles
+
+**NOT etcd.** etcd prioritizes consistency over availability — it would rather crash and
+stay down than serve potentially inconsistent data. For Lab, availability wins:
+
+- Losing a few events is better than total outage
+- Should auto-backup and auto-restore on corruption
+- Should degrade gracefully, never crash and refuse to start
+- Stale data is acceptable, no data is not
+
+Requirements:
+- Stores: resource state, label definitions, group membership, alert configs, audit log
+- Must survive lab-server restart
+- Must be migratable (lab-server can move between hosts)
+- Should auto-backup (to Git, S3, or local snapshots)
+- Should auto-recover from corruption without operator intervention
+- Embedded (no external dependency) preferred for simplicity
+
+Candidates (needs investigation):
+- **SQLite** — embedded, simple, proven, WAL mode for concurrent reads, easy to backup (copy file)
+- **bbolt/BoltDB** — embedded KV, used by etcd ironically, simpler than etcd itself
+- **Badger** — embedded KV in Go, LSM-tree, good performance
+- **DuckDB** — embedded analytical DB, might be overkill
+- **PostgreSQL** — if we need multi-server state, but adds external dependency
+- **Litestream** — SQLite + continuous replication to S3/GCS/Azure (interesting combo)
+
+**SQLite + Litestream** is the current leading candidate:
+- SQLite for simplicity and embeddability
+- Litestream for continuous backup to S3/GCS/local without stopping the database
+- Auto-restore: if DB is missing, Litestream restores from latest backup
+- Single file, easy to migrate when lab-server moves
+- But needs investigation to confirm it handles our scale
+
+## Open Questions
+
+1. Name: "lab" is simple but generic. Alternatives?
+2. GitOps integration — should label/profile changes go through Git, or direct API?
+3. Multi-tenancy — how to scope labels/resources per team?
+4. Auth — mTLS between CLI and server? OIDC? Vault-issued tokens?
+5. Input format — TypeScript (DA-style), YAML (Compose-style), or both?
+7. Should `lab init` deploy lab-server as a container (portable) or native binary (simpler)?
diff --git a/os-install-research.md b/os-install-research.md
new file mode 100644
index 0000000..19a4798
--- /dev/null
+++ b/os-install-research.md
@@ -0,0 +1,356 @@
+# OS Installation Research
+
+## Target Operating Systems
+
+All must support unattended network installation and automated OpenVox enrollment.
+All must work across multiple CPU architectures where the OS supports it.
+
+| OS | Install System | Answer Format | Architectures | PXE Difficulty |
+|-----|---------------|--------------|---------------|---------------|
+| Ubuntu 24.04 | autoinstall (cloud-init) | YAML | x86_64, aarch64, RISC-V | Easy |
+| Debian 12 | preseed | preseed.cfg | x86_64, aarch64, many others | Medium |
+| Fedora 41+ | Anaconda/kickstart | .ks file | x86_64, aarch64 | Easy |
+| AlmaLinux 9 | Anaconda/kickstart | .ks file | x86_64, aarch64 | Easy |
+| XCP-ng 8.3 | Custom Python TUI | XML answer file | x86_64 only | HARD |
+| VyOS 1.4 | Custom installer | config.boot | x86_64, aarch64 | Medium |
+
+## XCP-ng Network Install — Known Hard
+
+### Why it's difficult
+- iPXE UEFI is fundamentally broken (open bug, multiboot module corruption)
+- Serial/headless install hangs after detecting storage — no fix
+- No VNC installer mode (unlike RHEL/Debian)
+- TFTP agonizingly slow for large install.img
+- Custom Python TUI designed for VGA console, not automation
+- No major provisioning tool has first-class XCP-ng support
+
+### What works
+- **BIOS PXE** more reliable than UEFI
+- **IPMI virtual media** with remastered ISO is most reliable
+- Answer file XML with `` and `
+
+```
+
+### Post-install puppet enrollment
+The `filesystem-populated` stage script drops a firstboot script:
+```bash
+#!/bin/bash
+MOUNT=$1
+cat > "$MOUNT/etc/firstboot.d/99-lab-enroll" << 'SCRIPT'
+#!/bin/bash
+# Install puppet agent (XCP-ng is CentOS-based, yum works)
+yum install -y puppet-agent
+# Configure and start
+puppet config set server puppet.lab.internal
+systemctl enable --now puppet
+SCRIPT
+chmod +x "$MOUNT/etc/firstboot.d/99-lab-enroll"
+```
+
+## Lab Install Profile Abstraction
+
+Lab needs an `InstallerPlugin` interface so the same `lab onboard` command works
+for all OS types. Each plugin handles answer file generation, PXE chain setup,
+and post-install enrollment for its OS type.
+
+```go
+type InstallerPlugin interface {
+    Name() string
+    SupportedArchitectures() []string
+
+    // Generate the answer/config file for unattended install
+    GenerateAnswerFile(config InstallConfig) ([]byte, error)
+
+    // Set up PXE boot artifacts (kernel, initrd, bootloader configs)
+    PreparePXE(config PXEConfig) error
+
+    // Generate post-install enrollment script
+    GenerateEnrollmentScript(token string, labels []string) ([]byte, error)
+}
+```
+
+Built-in installer plugins:
+- `installer-autoinstall` — Ubuntu (cloud-init based autoinstall YAML)
+- `installer-kickstart` — Fedora, AlmaLinux, RHEL (kickstart .ks files)
+- `installer-preseed` — Debian (preseed.cfg)
+- `installer-xcpng` — XCP-ng (custom XML + firstboot.d scripts)
+- `installer-vyos` — VyOS (config.boot)
+
+## Auto-Onboard Rules
+
+Automatic onboarding based on detected hardware characteristics:
+
+```yaml
+auto-onboard:
+  rules:
+    - name: large-compute-to-xcpng
+      conditions:
+        cores: ">= 40"
+        memory: ">= 500GB"
+        provider: ovh
+      action:
+        image: xcpng-8.3
+        labels: [xen-host, production]
+
+    - name: arm-to-ubuntu
+      conditions:
+        arch: aarch64
+      action:
+        image: ubuntu-24.04
+        labels: [arm, k8s-worker]
+```
+
+Must support:
+- Preview: show which existing servers match/don't match rules
+- Dry-run: show what would happen for pending servers
+- Apply: actually onboard matching servers
+
+## Deployment Approach: Universal PXE Agent + Rootfs Images
+
+### Decision: NOT using native installers
+
+Instead of dealing with 5 different installer formats (autoinstall, kickstart, preseed,
+XCP-ng XML, VyOS config), Lab uses a universal approach:
+
+1. PXE boot ONE agent OS (same for all target distros)
+2. Agent contacts Lab server, gets instructions
+3. Agent partitions disk, deploys rootfs tarball, injects config, reboots
+4. Target OS boots with lab-agent, enrolls with OpenVox
+
+This avoids the nightmare of maintaining 5 installer plugins × 3 architectures.
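The agent side of the four-step flow above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Instructions` payload, the `Step` type, and the `RunSteps` and `DryRunSteps` helpers are hypothetical names that do not appear in the Lab spec, and the steps here only validate input instead of touching a disk.

```go
package main

import (
	"errors"
	"fmt"
)

// Instructions is a hypothetical payload the Lab server hands to the PXE agent.
type Instructions struct {
	Disk      string // target block device, e.g. /dev/sda
	RootfsURL string // where the agent fetches the rootfs tarball
	Hostname  string // injected into the deployed system
}

// Step is one unit of work in the deploy sequence.
type Step struct {
	Name string
	Run  func(Instructions) error
}

// RunSteps executes steps in order and stops at the first failure.
// It returns the names of the steps that completed.
func RunSteps(in Instructions, steps []Step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.Run(in); err != nil {
			return done, fmt.Errorf("step %q failed: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

// DryRunSteps mirrors "partition, deploy rootfs, inject config, reboot",
// but each step only checks its inputs (assumption: no real disk access).
func DryRunSteps() []Step {
	ok := func(Instructions) error { return nil }
	return []Step{
		{"partition", func(in Instructions) error {
			if in.Disk == "" {
				return errors.New("no target disk")
			}
			return nil
		}},
		{"deploy-rootfs", func(in Instructions) error {
			if in.RootfsURL == "" {
				return errors.New("no rootfs URL")
			}
			return nil
		}},
		{"inject-config", ok},
		{"reboot", ok},
	}
}

func main() {
	in := Instructions{
		Disk:      "/dev/sda",
		RootfsURL: "http://lab.example/ubuntu-24.04-x86_64.tar.gz", // placeholder URL
		Hostname:  "node1",
	}
	done, err := RunSteps(in, DryRunSteps())
	fmt.Println(done, err)
}
```

Keeping the sequence as data (a slice of named steps) means the same list could back a dry-run mode and a real deploy, which fits the "one engine everywhere" principle, though the real agent may be structured differently.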
+
+### Tool Evaluation
+
+| Tool | What It Does | For Lab? |
+|------|-------------|----------|
+| **Tinkerbell (CNCF)** | PXE → HookOS agent → workflow actions (partition, deploy, inject) | **Best candidate to wrap** |
+| **LinuxKit** | Build minimal agent OS (used by Tinkerbell's HookOS) | Build our PXE agent |
+| **mkosi** | Build rootfs tarballs for any distro (Fedora, Ubuntu, Debian, etc.) | **Image production** |
+| **iPXE** | Universal PXE bootloader with scripting | PXE foundation |
+| **Pixiecore** | Simple Go PXE server with per-MAC API mode | PXE building block |
+| **bootc** | Bootable OCI containers → install to disk (RHEL-family) | Image format option |
+| **cloud-init** | First-boot config injection | Post-deploy config |
+| **Packer** | Build VM/machine images | Golden image building |
+| **MAAS/Curtin** | Production-grade, same pattern, but Ubuntu-centric + heavy | Too opinionated |
+| **Warewulf** | Stateless/diskless boot from container images | Wrong model (RAM-only) |
+| **Kairos** | Immutable k8s-focused OS from containers | Too opinionated |
+| **FOG/Clonezilla** | Block-level disk cloning | Too rigid |
+| **FAI** | Debian-centric installer framework | Too narrow |
+| **Razor (Puppet)** | Dead (archived 2019) | Dead |
+| **netboot.xyz** | PXE boot menu into native installers | Opposite of what we want |
+
+### Tinkerbell — Closest Match
+
+Tinkerbell already implements this pattern:
+- **HookOS**: minimal agent OS built with LinuxKit, boots via PXE, multi-arch (x86 + ARM)
+- **Tink Worker**: runs inside HookOS, contacts server via gRPC, executes workflows
+- **Workflow Actions**:
+  - `rootio` — partition disks, create filesystems
+  - `archive2disk` — stream compressed rootfs tarball to mounted filesystem
+  - `image2disk` — write raw disk image (dd-style)
+  - `oci2disk` — pull OCI container image, write to disk
+  - `writefile` — write individual files (puppet certs, config, enrollment token)
+  - `cexec` — chroot and run commands (install bootloader, etc.)
+  - `kexec` — kexec into new kernel (avoids reboot)
+
+**Tinkerbell's limitation:** requires Kubernetes to run (Tink Server is k8s-native).
+Options:
+- Run on bootstrap node's k3s (works but adds k3s dependency before we have k3s)
+- Extract just HookOS + actions, replace Tink Server with Lab's own API
+- Use Tinkerbell after initial bootstrap
+
+### Option A: Wrap Tinkerbell
+Use Tinkerbell's HookOS and actions, Lab translates `lab onboard` into Tinkerbell
+workflows. Proven, multi-arch, battle-tested by Equinix Metal.
+
+### Option B: Build our own lightweight agent
+If Tinkerbell's k8s dependency is too heavy:
+- Build agent OS with LinuxKit (like HookOS but simpler)
+- Small Go binary as the agent: contacts lab-server, gets instructions, partitions,
+  deploys rootfs, injects files, installs bootloader, reboots
+- Embedded in Lab binary — no k8s dependency
+- Essentially "Tinkerbell actions without Tinkerbell's workflow engine"
+
+### Decision: TBD — needs hands-on evaluation of Tinkerbell
+
+### VyOS Inspiration
+
+VyOS proves this pattern works:
+- Image-based install (rootfs deployed to partition)
+- Also runs as Docker container (same config system)
+- Same concept as Lab: one definition → VM image, bare metal, or container
+
+### Image Production Pipeline
+
+Lab needs to produce rootfs tarballs for each OS × architecture:
+
+```
+$ lab image build ubuntu-24.04 --arch x86_64,aarch64
+  → Uses mkosi or debootstrap to build rootfs
+  → Injects lab-agent, cloud-init datasource
+  → Produces: ubuntu-24.04-x86_64.tar.gz, ubuntu-24.04-aarch64.tar.gz
+
+$ lab image build xcpng-8.3 --arch x86_64
+  → Extract/capture rootfs from XCP-ng installer/installed system
+  → Produces: xcpng-8.3-x86_64.tar.gz
+
+$ lab image list
+IMAGE          ARCH              SIZE    BUILT
+ubuntu-24.04   x86_64, aarch64   850MB   2026-03-15
+debian-12      x86_64, aarch64   620MB   2026-03-14
+fedora-41      x86_64, aarch64   920MB   2026-03-14
+almalinux-9    x86_64, aarch64   780MB   2026-03-13
+xcpng-8.3      x86_64            1.2GB   2026-03-10
+vyos-1.4       x86_64, aarch64   450MB   2026-03-12
+```
+
+Image build tools per OS:
+- Ubuntu/Debian: debootstrap or mkosi
+- Fedora/AlmaLinux: dnf --installroot or mkosi
+- XCP-ng: install in QEMU + Packer, capture rootfs (only viable method)
+- VyOS: extract squashfs from ISO (`unsquashfs /mnt/live/filesystem.squashfs`)
+- Asahi Linux: NOT BUILDABLE — SSH onboard only, OS already installed by user
+
+## XCP-ng Rootfs Production — Detailed
+
+### Why package-based build doesn't work
+- `install.img` is the installer ramdisk, NOT the target system
+- The installer (`host-installer/backend.py`) does post-install XAPI setup that
+  can't be replicated with just yum --installroot
+- Nobody has successfully built XCP-ng from packages alone
+- `create-install-image` scripts only produce ISOs
+
+### Viable approach: Packer + QEMU capture
+```
+1. Boot XCP-ng ISO in QEMU with answerfile (unattended)
+2. Installer runs normally, does all XAPI/Xen setup
+3. Mount resulting disk image
+4. Tar up root partition
+5. Generalize: remove SSH keys, XAPI state.db, hostname, UUIDs, persistent net rules
+6. Output: xcpng-8.3-x86_64.tar.gz
+```
+
+### XCP-ng partition layout (PXE agent must recreate this)
+```
+sda1: 18GB   ext3  /           (dom0 root)
+sda2: 18GB   ext3  (backup)    (upgrade slot)
+sda3: rest   LVM   (SR)        (VM storage repository)
+sda4: 512MB  vfat  /boot/efi   (UEFI ESP)
+sda5: 4GB    ext3  /var/log
+sda6: 1GB    swap
+```
+
+## Asahi Linux — Special Case
+
+### Why it can't follow the standard path
+- No PXE boot — Apple Silicon only boots from internal NVMe or USB (iBoot)
+- Firmware partition — m1n1 must be in Apple's APFS container, coexists with macOS
+- Device tree — generated per-chip at install time
+- GPU drivers — Asahi's reverse-engineered drivers are kernel-specific
+- Boot chain: iBoot → m1n1 → U-Boot/GRUB → Linux (completely non-standard)
+
+### How Lab handles it
+- SSH onboard only: `lab onboard mac-studio --provider ssh --host `
+- Asahi is already installed (user did this manually or via Asahi installer)
+- Lab manages the userspace (Fedora-based) via puppet normally
+- Kernel updates from Asahi repos, managed by puppet/dnf
+- m1n1/U-Boot/firmware layer is untouched by Lab
+
+### Lesson
+Not everything is PXE-bootable. Lab needs two onboard paths:
+- **PXE onboard**: bare metal with no OS (Beelinks, OVH servers, XCP-ng hosts)
+- **SSH onboard**: OS already installed (Mac Studio, DGX Spark, cloud VMs)
+
+## Image Deployment Matrix
+
+```
+               PXE Deploy     SSH Onboard    Container   VM Image
+Ubuntu 24.04   ✓ rootfs       ✓              ✓           ✓ qcow2
+Debian 12      ✓ rootfs       ✓              ✓           ✓ qcow2
+Fedora 41      ✓ rootfs       ✓              ✓           ✓ qcow2
+AlmaLinux 9    ✓ rootfs       ✓              ✓           ✓ qcow2
+XCP-ng 8.3     ✓ rootfs       ✓ (existing)   ✗           ✗
+VyOS 1.4       ✓ rootfs       ✓ (existing)   ✓ docker    ✓ qcow2
+Asahi Linux    ✗ impossible   ✓ (only way)   ✗           ✗
+```
+
+## Automated Image Pipeline
+
+Images must be rebuilt regularly to include security updates and new lab-agent versions.
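One way to read that rebuild rule is as a small staleness check: an image is due for a rebuild when it is older than its rebuild interval or was built with an older lab-agent. The sketch below is a guess at how that could look, not Lab's actual logic; `ImageBuild`, `NeedsRebuild`, and the version strings are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// ImageBuild is a hypothetical record of one finished image build.
type ImageBuild struct {
	Image        string
	BuiltAt      time.Time
	AgentVersion string // lab-agent version baked into the image
}

// NeedsRebuild reports whether the image should be rebuilt and why.
// Two triggers, both taken from the sentence above: a newer lab-agent,
// or an image at least maxAge old (a stand-in for "security updates").
func NeedsRebuild(b ImageBuild, maxAge time.Duration, currentAgent string, now time.Time) (bool, string) {
	if b.AgentVersion != currentAgent {
		return true, "lab-agent version changed"
	}
	if now.Sub(b.BuiltAt) >= maxAge {
		return true, "image older than rebuild interval"
	}
	return false, ""
}

func main() {
	now := time.Date(2026, 3, 15, 0, 0, 0, 0, time.UTC)
	b := ImageBuild{
		Image:        "ubuntu-24.04",
		BuiltAt:      now.AddDate(0, 0, -8), // built eight days ago
		AgentVersion: "0.1.0",              // made-up version
	}
	rebuild, why := NeedsRebuild(b, 7*24*time.Hour, "0.1.0", now)
	fmt.Println(rebuild, why)
}
```

Passing `now` in explicitly keeps the check deterministic, which makes it easy to exercise from a dry-run command or a unit test.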
+
+### Pipeline Configuration
+```yaml
+image-pipelines:
+  ubuntu-24.04:
+    method: debootstrap
+    schedule: weekly
+    architectures: [x86_64, aarch64]
+    outputs: [rootfs-tarball, container-base, qcow2]
+    retention: 4 builds
+
+  xcpng-8.3:
+    method: packer-qemu  # install in QEMU, capture
+    schedule: monthly
+    architectures: [x86_64]
+    outputs: [rootfs-tarball]
+    retention: 3 builds
+
+  vyos-1.4:
+    method: squashfs-extract  # extract from ISO
+    schedule: monthly
+    architectures: [x86_64, aarch64]
+    outputs: [rootfs-tarball, container-base]
+    retention: 3 builds
+```
+
+### Build runs on Lab itself (dogfooding)
+- x86 images build on x86 machines (Beelink SER9 MAX)
+- ARM images build on ARM machines (DGX Spark, Minisforum)
+- XCP-ng builds on any x86 with QEMU/KVM
+- Lab picks the right builder based on architecture
+
+### Upgrade flow
+- New image built → Lab knows which servers run old version
+- `lab image diff` shows package changes
+- `lab image promote` makes new image the default for new deploys
+- Existing servers: puppet manages package updates (not re-imaged unless requested)
+
+### Connection to Puppet → Container Artifact Builder
+
+Same pipeline, different output targets:
+
+```
+Label "mailserver" + base image "ubuntu-24.04":
+  → rootfs + puppet classes = bare metal image (tar.gz for PXE deploy)
+  → rootfs + puppet classes = container image (OCI for k8s/docker)
+  → rootfs + puppet classes = VM image (qcow2/vmdk for XCP-ng/AWS)
+
+One label, one set of puppet modules, three deployment formats.
+```
+
+## Multi-Architecture Considerations
+
+- PXE boot chain differs between x86 (BIOS/UEFI) and ARM (UEFI only)
+- Need separate kernel/initrd per architecture for the agent OS
+- Rootfs tarballs are architecture-specific
+- Some OS images don't exist for all architectures (XCP-ng = x86 only)
+- Lab must track architecture per image and refuse mismatches
+- Tinkerbell's HookOS already builds for x86_64 and aarch64
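The "track architecture per image and refuse mismatches" requirement could be enforced with a check like the one below before any deploy. This is a minimal sketch, assuming a hypothetical `Image` catalog entry and a `CheckArch` helper. The alias handling (`amd64`/`arm64` versus `x86_64`/`aarch64`) is an added assumption, because Go's `GOARCH` and `uname -m` name the same architectures differently.

```go
package main

import "fmt"

// Image is a hypothetical catalog entry: an image name plus the
// architectures it has been built for.
type Image struct {
	Name  string
	Archs []string
}

// normalizeArch maps Go-style names to the uname-style names used in this
// document (amd64 -> x86_64, arm64 -> aarch64). Other values pass through.
func normalizeArch(a string) string {
	switch a {
	case "amd64":
		return "x86_64"
	case "arm64":
		return "aarch64"
	}
	return a
}

// CheckArch returns nil when img has a build for machineArch,
// and a descriptive error otherwise so the deploy can be refused.
func CheckArch(img Image, machineArch string) error {
	want := normalizeArch(machineArch)
	for _, a := range img.Archs {
		if normalizeArch(a) == want {
			return nil
		}
	}
	return fmt.Errorf("image %s has no build for %s (available: %v)", img.Name, want, img.Archs)
}

func main() {
	xcp := Image{Name: "xcpng-8.3", Archs: []string{"x86_64"}}
	fmt.Println(CheckArch(xcp, "x86_64"))  // matching architecture
	fmt.Println(CheckArch(xcp, "aarch64")) // XCP-ng is x86 only, so this is refused
}
```

Running the check at plan time rather than inside the PXE agent would surface a mismatch before a machine is rebooted into the agent OS.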