first commit
246
architecture.md
Normal file
@@ -0,0 +1,246 @@
|
||||
# Architecture Decisions
|
||||
|
||||
## Core Principles
|
||||
|
||||
1. Build for homelab first, design for AWS/multi-cloud from the start
|
||||
2. Labels as the universal abstraction — config attaches to labels, not machines
|
||||
3. Code is the policy — declarations grant access, no separate policy management
|
||||
4. Availability over consistency — stale data is acceptable, no data is not
|
||||
5. No single point of failure — everything works offline with local cache
|
||||
6. Don't reinvent the wheel — wrap existing tools, build the glue and UX
|
||||
7. One engine everywhere — CLI, server, and init all use the same code path
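
Principles 4 and 5 can be illustrated with a small read-through cache. This is only a sketch under stated assumptions: the `fetch` callable, the JSON cache file, and the `StaleOkCache` name are hypothetical and are not part of the lab spec. It prefers fresh data, but serves the last good copy when the server is unreachable.

```python
import json
import time
from pathlib import Path
from typing import Callable


class StaleOkCache:
    """Hypothetical read-through cache: fresh if possible, stale if necessary."""

    def __init__(self, path: Path, fetch: Callable[[], dict]):
        self.path = path      # local cache file (assumed JSON)
        self.fetch = fetch    # assumed callable that contacts the server

    def get(self) -> tuple[dict, bool]:
        """Return (data, is_stale)."""
        try:
            data = self.fetch()
        except OSError:
            # Server unreachable: stale data is acceptable, no data is not.
            if not self.path.exists():
                raise
            cached = json.loads(self.path.read_text())
            return cached["data"], True
        # Fresh answer: refresh the local copy for the next outage.
        self.path.write_text(json.dumps({"fetched_at": time.time(), "data": data}))
        return data, False
```

The only case that still fails is a machine that has never synced, which is the one situation where there is no stale copy to fall back on.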
|
||||
|
||||
## The Tool: "lab"
|
||||
|
||||
Unified infrastructure lifecycle platform. Full spec in `lab-tool-spec.md`.
|
||||
|
||||
### Component Dependency Map
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────┐
|
||||
│ LAB PLATFORM │
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────────────────────────────┐ │
|
||||
│ │ CORE (no external deps) │ │
|
||||
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────────┐ │ │
|
||||
│ │ │ Label │ │ Group │ │ Targeting│ │ Render Engine │ │ │
|
||||
│ │ │ Engine │ │ Engine │ │ Engine │ │ (CLI tables, │ │ │
|
||||
│ │ │ │ │ │ │ │ │ TUI, diff) │ │ │
|
||||
│ │ └──────────┘ └──────────┘ └──────────┘ └───────────────┘ │ │
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │
|
||||
│ │ │ Profile │ │ State Store │ │ Plugin Registry │ │ │
|
||||
│ │ │ Engine │ │ (SQLite + │ │ │ │ │
|
||||
│ │ │ (t-shirt │ │ Litestream) │ │ │ │ │
|
||||
│ │ │ sizes) │ │ │ │ │ │ │
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │
|
||||
│ └─────────────────────────────────────────────────────────────┘ │
|
||||
│ ▲ depends on core │
|
||||
│ ┌────┴────────────────────────────────────────────────────────┐ │
|
||||
│ │ LIFECYCLE (depends on: core + providers) │ │
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │
|
||||
│ │ │ Lifecycle │ │ Artifact │ │ K8s Deployer │ │ │
|
||||
│ │ │ Manager │ │ Builder │ │ │ │ │
|
||||
│ │ │ (plan/apply/ │ │ (puppet → │ │ │ │ │
|
||||
│ │ │ destroy) │ │ container) │ │ │ │ │
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │
|
||||
│ └─────────────────────────────────────────────────────────────┘ │
|
||||
│ ▲ depends on lifecycle │
|
||||
│ ┌────┴────────────────────────────────────────────────────────┐ │
|
||||
│ │ IDENTITY & SECRETS (depends on: lifecycle) │ │
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │
|
||||
│ │ │ Identity │ │ Secret Store │ │ Token Issuer │ │ │
|
||||
│ │ │ Manager │ │ (privileged │ │ (one-time join │ │ │
|
||||
│ │ │ (enroll, │ │ label, local│ │ tokens) │ │ │
|
||||
│ │ │ DNS, certs, │ │ cache, git │ │ │ │ │
|
||||
│ │ │ SSH keys) │ │ backup) │ │ │ │ │
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │
|
||||
│ └─────────────────────────────────────────────────────────────┘ │
|
||||
│ ▲ depends on identity │
|
||||
│ ┌────┴────────────────────────────────────────────────────────┐ │
|
||||
│ │ OBSERVABILITY (depends on: core + identity) │ │
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │
|
||||
│ │ │ Health │ │ Alert │ │ Audit Log │ │ │
|
||||
│ │ │ Aggregator │ │ Generator │ │ │ │ │
|
||||
│ │ │ │ │ (auto + user │ │ │ │ │
|
||||
│ │ │ │ │ defined) │ │ │ │ │
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │
|
||||
│ └─────────────────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────────────────────────────┐ │
|
||||
│ │ INTERFACES (depends on: everything above) │ │
|
||||
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │ │
|
||||
│ │ │ gRPC/REST│ │ CLI │ │ TUI │ │ Web UI │ │ │
|
||||
│ │ │ API │ │ (cobra) │ │(bubbletea)│ │ (future) │ │ │
|
||||
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │ │
|
||||
│ └─────────────────────────────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
PROVIDER PLUGINS (external, loaded at runtime):
|
||||
┌────────────┐ ┌────────────┐ ┌──────────────┐ ┌────────────┐
|
||||
│provider-aws│ │provider- │ │provider- │ │provider-k8s│
|
||||
│ (Pulumi) │ │xcpng (XO) │ │baremetal │ │ (Pulumi) │
|
||||
└────────────┘ └────────────┘ │(Tinkerbell) │ └────────────┘
|
||||
└──────────────┘
|
||||
HEALTH PLUGINS: IDENTITY PLUGINS:
|
||||
┌────────────┐ ┌──────────┐ ┌───────────┐ ┌─────────────┐
|
||||
│health- │ │health- │ │id-openvox │ │id-dns │
|
||||
│prometheus │ │naemon │ │ │ │ │
|
||||
└────────────┘ └──────────┘ └───────────┘ └─────────────┘
|
||||
┌────────────┐ ┌───────────┐ ┌─────────────┐
|
||||
│health- │ │id-ssh-ca │ │id-secret │
|
||||
│cloudwatch │ │ │ │ │
|
||||
└────────────┘ └───────────┘ └─────────────┘
|
||||
```
|
||||
|
||||
### Build Order (what depends on what)
|
||||
|
||||
```
|
||||
Phase 1: CORE (can be built and tested independently)
|
||||
├── Label Engine
|
||||
├── Group Engine (depends on: labels)
|
||||
├── Targeting Engine (depends on: labels, groups)
|
||||
├── Profile Engine (t-shirt sizes)
|
||||
├── Render Engine
|
||||
├── State Store (SQLite + Litestream)
|
||||
├── Plugin Registry
|
||||
├── CLI framework (cobra)
|
||||
└── gRPC/REST API skeleton
|
||||
|
||||
Phase 2: PROVIDERS (can be built in parallel, each independent)
|
||||
├── provider-ssh (simplest, needed for onboarding existing machines)
|
||||
├── provider-baremetal (PXE boot — embedded DHCP/TFTP/HTTP server)
|
||||
├── provider-portainer (deploy via Portainer API)
|
||||
├── provider-k8s (needed for k8s deployments)
|
||||
├── provider-aws (Pulumi AWS)
|
||||
└── provider-xcpng (Pulumi XO / XO REST API)
|
||||
|
||||
Phase 3: LIFECYCLE (depends on: core + at least one provider)
|
||||
├── Lifecycle Manager (plan/apply/destroy)
|
||||
├── Onboarding (lab onboard — SSH detect + PXE boot + auto-enroll)
|
||||
├── Hardware detection (suggest labels from detected CPU/GPU/RAM/disk)
|
||||
├── Local mode (lab init --local, engine on user device)
|
||||
├── Self-deploy (lab init — deploy to remote target)
|
||||
├── Self-migration (lab server migrate)
|
||||
└── Artifact Builder (puppet → container)
|
||||
|
||||
Phase 4: IDENTITY (depends on: lifecycle)
|
||||
├── Token Issuer (one-time join tokens)
|
||||
├── OpenVox Enroller (cert signing, node classification)
|
||||
├── DNS Manager (auto-registration, IP mobility)
|
||||
├── SSH CA integration
|
||||
└── Secret Store (privileged label, local cache, git backup)
|
||||
|
||||
Phase 5: OBSERVABILITY (depends on: core + identity)
|
||||
├── Health Aggregator (Prometheus, Naemon, CloudWatch plugins)
|
||||
├── Alert Generator (auto + user-defined, targeting engine)
|
||||
├── Four-pillar status (sync + puppet + health + identity)
|
||||
└── Audit log
|
||||
|
||||
Phase 6: UX POLISH
|
||||
├── TUI (bubbletea, k9s-style, cross-linked navigation)
|
||||
├── lab show / lab targets (visibility commands)
|
||||
├── lab render (multi-provider comparison)
|
||||
└── Web UI (future)
|
||||
```
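
The phase ordering above is a topological sort of the dependency map. As a rough sketch, Python's standard `graphlib` can derive both the phase order and the sets of components that can be built in parallel. The node names are shorthand invented for this example, and the `providers -> core` edge is an assumption (providers register with the Plugin Registry); the other edges are taken from the lists and diagram above.

```python
from graphlib import TopologicalSorter

# node -> set of nodes it depends on
PHASES = {
    "core": set(),
    "providers": {"core"},  # assumption: providers need the Plugin Registry
    "lifecycle": {"core", "providers"},
    "identity": {"lifecycle"},
    "observability": {"core", "identity"},
    "interfaces": {"core", "providers", "lifecycle", "identity", "observability"},
}

# Phase 1 components, with the dependencies stated in the build order
CORE = {
    "labels": set(),
    "groups": {"labels"},
    "targeting": {"labels", "groups"},
    "profiles": set(),
    "render": set(),
    "state-store": set(),
    "plugin-registry": set(),
}


def build_order(deps: dict[str, set[str]]) -> list[str]:
    """One valid build order: every node appears after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into batches; everything in one batch can be built in parallel."""
    ts = TopologicalSorter(deps)
    ts.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while ts.is_active():
        ready = sorted(ts.get_ready())
        batches.append(ready)
        ts.done(*ready)
    return batches


if __name__ == "__main__":
    print(build_order(PHASES))
    print(parallel_batches(CORE))
```

A useful side effect of `prepare()` is that a circular dependency between components fails loudly instead of producing a misleading order.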
|
||||
|
||||
### Key Concepts
|
||||
|
||||
| Concept | Description |
|---------|-------------|
| **Labels** | Universal abstraction. Config (puppet classes, alerts, secrets, sizes) attaches to labels |
| **Groups** | Composable, nested, with exclusions. Target by label, group, server, environment |
| **Targeting** | Unified query syntax used everywhere: alerts, secrets, puppet, queries |
| **Four Pillars** | Every resource shows: Sync + Puppet + Health + Identity |
| **Profiles** | T-shirt sizing with per-provider mappings, user-owned |
| **Secret Store** | Privileged label holding all secrets, machines get only entitled subset |
| **Code = Policy** | `lab::secret()` in puppet code = usage AND access declaration |
| **Artifact Builder** | Same puppet modules → VM config OR container image |
| **Self-deploy** | Lab deploys itself using same engine as everything else |
| **Visibility** | Two-way: server→everything applied, label→all servers affected |
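
To make the Labels / Groups / Targeting rows concrete, here is a minimal sketch of how a nested group with exclusions might resolve to a set of servers. The `Group` fields and the `resolve` function are hypothetical; the real data model and query syntax are defined in `lab-tool-spec.md`, not here.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    # All three fields are illustrative assumptions, not the spec's schema.
    labels: set[str] = field(default_factory=set)   # match servers carrying any of these labels
    include: set[str] = field(default_factory=set)  # names of nested groups
    exclude: set[str] = field(default_factory=set)  # server names removed from the result


def resolve(name: str, groups: dict[str, Group],
            servers: dict[str, set[str]], _path: frozenset = frozenset()) -> set[str]:
    """Return the server names targeted by group `name`."""
    if name in _path:
        raise ValueError(f"group cycle at {name!r}")
    group = groups[name]
    # Direct members: servers that carry at least one of the group's labels
    members = {srv for srv, labels in servers.items() if labels & group.labels}
    # Nested groups contribute their own resolved members
    for child in group.include:
        members |= resolve(child, groups, servers, _path | {name})
    # Exclusions apply last, so they also remove members of nested groups
    return members - group.exclude


if __name__ == "__main__":
    servers = {"web1": {"web", "prod"}, "web2": {"web", "staging"}, "db1": {"db", "prod"}}
    groups = {
        "web": Group(labels={"web"}),
        "data": Group(labels={"db"}),
        "apps": Group(include={"web", "data"}, exclude={"web2"}),
    }
    print(sorted(resolve("apps", groups, servers)))
```

Running the same resolution in reverse (for each label, which servers carry it) would give the other half of the two-way visibility described in the table.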
|
||||
|
||||
## Infrastructure Stack
|
||||
|
||||
| Layer | Homelab | AWS Equivalent | Status |
|-------|---------|----------------|--------|
| Orchestration | k3s | EKS | Decided |
| IaC engine | Pulumi | Pulumi | Decided |
| GitOps | ArgoCD | ArgoCD | Decided |
| Monitoring (k8s) | Prometheus + Grafana | Prometheus + Grafana | Decided |
| Monitoring (infra) | Naemon | N/A (bare metal only) | Decided |
| Secrets backend | TBD | TBD | Needs investigation |
| DNS | PowerDNS + ExternalDNS | Route53 + ExternalDNS | Decided — see `dns-research.md` |
| TLS / CA | TBD | TBD | Needs investigation |
| SSH CA | TBD | TBD | Needs investigation |
| Storage | Longhorn | EBS CSI | Decided |
| Config mgmt | OpenVox | OpenVox | Decided |
| Bare metal boot | Tinkerbell / iPXE | N/A | Needs investigation |
| State store | SQLite + Litestream | SQLite + Litestream | Leading candidate |
| Container build | Buildah / Docker | Buildah / Docker | Needs investigation |
|
||||
|
||||
## Decisions Made
|
||||
|
||||
| Decision | Choice | Why | Alternatives Considered |
|----------|--------|-----|------------------------|
| IaC engine | Pulumi | Real languages, plan/preview, component packages, XCP-ng provider exists | Terraform (no abstraction), Crossplane (no plan) |
| Config mgmt | OpenVox | Puppet fork, Apache 2.0, existing modules, active community | Puppet (Perforce EULA, 25-node limit) |
| Multi-cloud abstraction | Custom (Lab) | Nothing exists that does labels + plan + bare metal + XCP-ng | Crossplane (no plan), Terraform (re-implement per cloud) |
| Kubernetes | k3s | Puppet-friendly, multi-arch, lightweight, same K8s API as EKS | OpenShift (fights puppet), Talos (no SSH/puppet), MicroK8s (snap-based) |
| Target OS list | Ubuntu, Debian, Fedora, AlmaLinux, XCP-ng, VyOS | Multi-arch, each with different install automation | See `os-install-research.md` |
| State store | NOT etcd | etcd crashes over serving stale data, availability > consistency | Leading: SQLite + Litestream |
| Secret access model | Code = policy | Declarations in code/labels auto-grant access, no manual Vault policies | Manual Vault policy management |
| Secret distribution | Privileged store + local cache | Prevents secret sprawl, machines only get entitled secrets | Peer-to-peer sync (leaks secrets sideways) |
| Resilience model | Offline-capable | Local cache keeps everything running, git backup for DR | Central server dependency (FreeIPA burned us) |
| Bootstrap | Self-deploying | lab init uses same engine as lab apply, no special codepath | Separate init provider interface |
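
The "Code = policy" decision can be sketched as deriving entitlements directly from declarations. The example below is an illustration only: the argument form `lab::secret('name')` is an assumption (this document only shows `lab::secret()`), and a real implementation would likely use the Puppet parser rather than a regular expression.

```python
import re

# Assumed call shape: lab::secret('some/name') or lab::secret("some/name")
SECRET_CALL = re.compile(r"""lab::secret\(\s*['"]([^'"]+)['"]\s*\)""")


def declared_secrets(manifest: str) -> set[str]:
    """Secret names a manifest declares, and is therefore entitled to."""
    return set(SECRET_CALL.findall(manifest))


def entitlements(label_manifests: dict[str, str],
                 machine_labels: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each machine to the union of secrets declared by its labels' manifests."""
    per_label = {label: declared_secrets(src) for label, src in label_manifests.items()}
    return {
        machine: set().union(*(per_label.get(label, set()) for label in labels))
        for machine, labels in machine_labels.items()
    }


if __name__ == "__main__":
    manifests = {
        "web": "class web { $key = lab::secret('tls/web-key') }",
        "db": 'class db { $pw = lab::secret("db/password") }',
    }
    machines = {"web1": {"web"}, "db1": {"db"}, "spare": set()}
    print(entitlements(manifests, machines))
```

A machine with no labels ends up with an empty entitlement set, which matches the "machines only get entitled secrets" goal of the secret distribution decision.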
|
||||
|
||||
## Evaluated and Rejected
|
||||
|
||||
| Tool | Why Rejected | Details |
|------|-------------|---------|
| **Crossplane** | No plan/preview — dealbreaker for enterprise | `crossplane-evaluation.md` |
| **Foreman** | Obsolete, poor UX, user has used it | Memory: `feedback_foreman.md` |
| **Terraform/OpenTofu** | No multi-platform abstraction | Re-implement per cloud at thousands of nodes |
| **MAAS** | Bare metal only | No cloud VMs, no Puppet integration |
| **OpenShift** | Fights external config mgmt, heavy, limited ARM | See `kubernetes-flavors.md` |
| **Talos** | Immutable OS, no SSH, no puppet | Incompatible with our approach |
| **MicroK8s** | Snap-based | Puppet managing snaps is awkward |
| **HashiCorp Vault** | Not impressed, central-server mindset | Will evaluate alternatives (OpenBao, Infisical, etc.) |
| **etcd** | Consistency over availability | Crashes rather than serving stale data |
| **FreeIPA** | Unstable | Good features (DNS, SSH, CA, secrets) but unreliable |
|
||||
|
||||
## Investigation Queue
|
||||
|
||||
Things we've identified but haven't evaluated yet, in rough priority order:
|
||||
|
||||
| # | Topic | Context | Options to Investigate |
|---|-------|---------|----------------------|
| 1 | Secret backend | Distributed, offline-capable, policy-filtered | OpenBao, Infisical, Conjur, SOPS+age, custom encrypted SQLite |
| 2 | ~~DNS auto-registration~~ | ~~Every managed resource auto-registered~~ | **DECIDED: PowerDNS + ExternalDNS** — see `dns-research.md` |
| 3 | SSH CA | CA-signed host keys, short-lived user certs | Vault SSH engine, OpenVox CA, step-ca, Teleport, Boundary |
| 4 | TLS / Internal CA | Machine certs, auto-renewal | OpenVox CA, Vault PKI, step-ca, cert-manager |
| 5 | Bare metal provisioning | Universal PXE agent + rootfs deploy (NOT native installers) | Wrap Tinkerbell vs build own agent — see `os-install-research.md` |
| 6 | State store | Embedded, auto-backup, auto-recover | SQLite+Litestream, bbolt, Badger |
| 7 | Container build | Puppet modules → OCI images | Buildah, Docker, Kaniko |
| 8 | Local cache encryption | Machine-specific key for secret cache | TPM 2.0, kernel keyring, LUKS-bound, secure enclave |
| 9 | Alert rendering | Generate monitoring configs from lab alerts | Prometheus rules, Naemon configs, CloudWatch |
| 10 | Input format | How users define resources and labels | YAML (Compose-like), Pkl, KCL, CUE, TypeScript |
| 11 | Auth (CLI to server) | Secure CLI-to-lab-server communication | mTLS, OIDC, Vault tokens |
| 12 | XCP-ng Pulumi provider | May need Upjet wrapper or direct API | Existing Terraform provider via Upjet, Pulumi XO provider |
| 13 | Multi-tenancy | Team scoping for labels/resources | Namespaces, RBAC, org hierarchy |
| 14 | Image production pipeline | Build rootfs tarballs per OS per arch | mkosi, debootstrap, dnf --installroot, Packer |
| 15 | Tinkerbell evaluation | Hands-on: does wrapping it work, or build our own agent? | HookOS + actions vs custom LinuxKit agent |
| 16 | XCP-ng rootfs extraction | How to produce deployable XCP-ng rootfs (not native installer) | Extract from ISO, capture installed system |
| 17 | VyOS rootfs extraction | How to produce deployable VyOS rootfs | VyOS build system, published images, Docker mode |
| 18 | Multi-arch PXE | Different boot chains for x86 BIOS, x86 UEFI, ARM UEFI | Per-arch agent OS builds, iPXE configs |
|
||||
|
||||
## Project Files
|
||||
|
||||
| File | Contents |
|------|----------|
| `lab-tool-spec.md` | Full platform specification (CLI examples, plugin interfaces, secrets, identity, bootstrap) |
| `architecture.md` | This file — decisions, dependencies, investigation queue |
| `hardware.md` | Homelab hardware inventory and node roles |
| `crossplane-evaluation.md` | Crossplane evaluation and rejection rationale |
| `config-format-research.md` | YAML alternatives research (Pkl, KCL, CUE, CDK8s, etc.) |
| `os-install-research.md` | OS install automation, rootfs production, image pipeline, deployment matrix |
| `kubernetes-flavors.md` | k3s chosen, OpenShift/Talos/MicroK8s rejected with rationale |
| `dns-research.md` | PowerDNS + ExternalDNS chosen, domain claims, health-checked DNS |
|
||||
337
bastion.sh
Executable file
@@ -0,0 +1,337 @@
|
||||
#!/usr/bin/env bash
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
# Lab PXE Bastion — ephemeral PXE server for bare-metal provisioning
|
||||
#
|
||||
# Turns this machine into a temporary PXE boot server. Target machines
|
||||
# on the same network can PXE boot and get Fedora installed automatically.
|
||||
#
|
||||
# Usage:
|
||||
# sudo bash bastion.sh # interactive, auto-detect everything
|
||||
# sudo TARGET_HOSTNAME=puppet SSH_PUBKEY=~/.ssh/id_ed25519.pub bash bastion.sh
|
||||
#
|
||||
# Requirements: Fedora/RHEL or Debian/Ubuntu host with python3, curl, and iproute2
#               (dnsmasq is installed automatically if missing)
|
||||
# ─────────────────────────────────────────────────────────────────────
|
||||
set -euo pipefail
|
||||
|
||||
# ──── Defaults (override via environment) ──────────────────────────
|
||||
FEDORA_VERSION="${FEDORA_VERSION:-41}"
|
||||
ARCH="${ARCH:-x86_64}"
|
||||
HTTP_PORT="${HTTP_PORT:-8080}"
|
||||
TARGET_HOSTNAME="${TARGET_HOSTNAME:-lab-node}"
|
||||
TARGET_DISK="${TARGET_DISK:-}" # empty = anaconda auto-picks
|
||||
SSH_PUBKEY="${SSH_PUBKEY:-}" # path to .pub file, auto-detected
|
||||
TIMEZONE="${TIMEZONE:-Europe/London}"
|
||||
LOCALE="${LOCALE:-en_GB.UTF-8}"
|
||||
BASTION_DIR="${BASTION_DIR:-/tmp/lab-bastion}"
|
||||
|
||||
# ──── Colors ───────────────────────────────────────────────────────
|
||||
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'
|
||||
CYAN='\033[0;36m'; BOLD='\033[1m'; NC='\033[0m'
|
||||
|
||||
log() { echo -e "${GREEN}[bastion]${NC} $*"; }
|
||||
warn() { echo -e "${YELLOW}[bastion]${NC} $*"; }
|
||||
err() { echo -e "${RED}[bastion]${NC} $*" >&2; }
|
||||
die() { err "$@"; exit 1; }
|
||||
|
||||
# ──── Preflight ────────────────────────────────────────────────────
|
||||
[[ $EUID -eq 0 ]] || die "Must run as root (need DHCP/TFTP ports). Use: sudo bash bastion.sh"
|
||||
|
||||
command -v python3 >/dev/null || die "python3 not found"
|
||||
command -v curl >/dev/null || die "curl not found"
|
||||
|
||||
# Install dnsmasq if missing
|
||||
if ! command -v dnsmasq >/dev/null; then
|
||||
log "Installing dnsmasq..."
|
||||
if command -v dnf >/dev/null; then
|
||||
dnf install -y dnsmasq
|
||||
elif command -v apt-get >/dev/null; then
|
||||
apt-get install -y dnsmasq
|
||||
else
|
||||
die "Cannot install dnsmasq — install it manually"
|
||||
fi
|
||||
fi
|
||||
|
||||
# ──── Auto-detect network ─────────────────────────────────────────
|
||||
IFACE="${IFACE:-$(ip route | awk '/default/ {print $5; exit}')}"
|
||||
SERVER_IP="$(ip -4 addr show "$IFACE" | awk '/inet / {split($2,a,"/"); print a[1]; exit}')"
|
||||
# NOTE: assumes a /24 network; override NETWORK by hand for other prefix lengths
NETWORK="$(echo "$SERVER_IP" | awk -F. '{print $1"."$2"."$3".0"}')"
|
||||
|
||||
[[ -n "$SERVER_IP" ]] || die "Cannot detect IP on interface $IFACE"
|
||||
log "Interface: ${BOLD}$IFACE${NC} IP: ${BOLD}$SERVER_IP${NC} Network: ${BOLD}$NETWORK${NC}"
|
||||
|
||||
# ──── Auto-detect SSH pubkey ───────────────────────────────────────
|
||||
if [[ -z "$SSH_PUBKEY" ]]; then
|
||||
# When run via sudo, check the real user's home
|
||||
REAL_HOME="${HOME}"
|
||||
if [[ -n "${SUDO_USER:-}" ]]; then
|
||||
REAL_HOME="$(getent passwd "$SUDO_USER" | cut -d: -f6)"
|
||||
fi
|
||||
for keyfile in "$REAL_HOME/.ssh/id_ed25519.pub" "$REAL_HOME/.ssh/id_rsa.pub" "$REAL_HOME/.ssh/id_ecdsa.pub"; do
|
||||
if [[ -f "$keyfile" ]]; then
|
||||
SSH_PUBKEY="$keyfile"
|
||||
break
|
||||
fi
|
||||
done
|
||||
fi
|
||||
|
||||
if [[ -n "$SSH_PUBKEY" && -f "$SSH_PUBKEY" ]]; then
|
||||
SSH_KEY_CONTENT="$(cat "$SSH_PUBKEY")"
|
||||
log "SSH key: ${BOLD}$SSH_PUBKEY${NC}"
|
||||
else
|
||||
warn "No SSH public key found. Root password will be set to 'changeme' (console login only: the kickstart disables SSH password auth)."
|
||||
warn "Set SSH_PUBKEY=/path/to/key.pub to use key-based auth instead."
|
||||
SSH_KEY_CONTENT=""
|
||||
fi
|
||||
|
||||
# ──── Prepare directories ─────────────────────────────────────────
|
||||
TFTPDIR="$BASTION_DIR/tftp"
|
||||
HTTPDIR="$BASTION_DIR/http"
|
||||
mkdir -p "$TFTPDIR" "$HTTPDIR"
|
||||
|
||||
# ──── Cleanup handler ─────────────────────────────────────────────
|
||||
DNSMASQ_PID=""
|
||||
HTTP_PID=""
|
||||
FW_OPENED=false
|
||||
|
||||
cleanup() {
|
||||
echo ""
|
||||
log "Shutting down..."
|
||||
[[ -n "$DNSMASQ_PID" ]] && kill "$DNSMASQ_PID" 2>/dev/null && log "Stopped dnsmasq"
|
||||
[[ -n "$HTTP_PID" ]] && kill "$HTTP_PID" 2>/dev/null && log "Stopped HTTP server"
|
||||
|
||||
if $FW_OPENED && command -v firewall-cmd >/dev/null; then
|
||||
log "Removing firewall rules..."
|
||||
firewall-cmd --quiet --remove-service=dhcp 2>/dev/null || true
|
||||
firewall-cmd --quiet --remove-service=tftp 2>/dev/null || true
|
||||
firewall-cmd --quiet --remove-port=${HTTP_PORT}/tcp 2>/dev/null || true
|
||||
        # Matches the 4011/udp port opened for ProxyDHCP further down
        firewall-cmd --quiet --remove-port=4011/udp 2>/dev/null || true
|
||||
fi
|
||||
|
||||
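    # Restart the system dnsmasq if this script stopped it earlier
    if [[ "${RESTART_DNSMASQ:-false}" == true ]]; then
        systemctl start dnsmasq 2>/dev/null && log "Restarted system dnsmasq" \
            || warn "Could not restart system dnsmasq"
    fi

    log "Done. Bastion artifacts remain in $BASTION_DIR"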
|
||||
log "Re-run this script to reprovision. Remove with: rm -rf $BASTION_DIR"
|
||||
}
|
||||
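# Run cleanup exactly once, on exit. Signals only trigger a normal exit;
# otherwise Ctrl-C would run cleanup twice and then resume the script.
trap cleanup EXIT
trap 'exit 130' INT
trap 'exit 143' TERM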
|
||||
|
||||
# ──── Download artifacts (cached) ─────────────────────────────────
|
||||
download() {
|
||||
local url="$1" dest="$2" label="$3"
|
||||
if [[ -f "$dest" ]]; then
|
||||
log " ${label} — cached"
|
||||
return
|
||||
fi
|
||||
log " ${label} — downloading..."
|
||||
curl -# -L -o "$dest" "$url" || die "Failed to download $label from $url"
|
||||
}
|
||||
|
||||
FEDORA_MIRROR="https://download.fedoraproject.org/pub/fedora/linux/releases/${FEDORA_VERSION}/Everything/${ARCH}/os"
|
||||
|
||||
log "Fetching boot artifacts (Fedora ${FEDORA_VERSION} ${ARCH})..."
|
||||
download "https://boot.ipxe.org/undionly.kpxe" "$TFTPDIR/undionly.kpxe" "iPXE BIOS"
|
||||
download "https://boot.ipxe.org/ipxe.efi" "$TFTPDIR/ipxe.efi" "iPXE UEFI"
|
||||
download "${FEDORA_MIRROR}/images/pxeboot/vmlinuz" "$HTTPDIR/vmlinuz" "Fedora kernel"
|
||||
download "${FEDORA_MIRROR}/images/pxeboot/initrd.img" "$HTTPDIR/initrd.img" "Fedora initrd"
|
||||
|
||||
# ──── Generate kickstart ──────────────────────────────────────────
|
||||
log "Generating kickstart for ${BOLD}${TARGET_HOSTNAME}${NC}..."
|
||||
|
||||
# Disk config
|
||||
if [[ -n "$TARGET_DISK" ]]; then
|
||||
DISK_CMDS="ignoredisk --only-use=${TARGET_DISK}
|
||||
clearpart --all --initlabel --drives=${TARGET_DISK}
|
||||
autopart --type=plain"
|
||||
else
|
||||
DISK_CMDS="clearpart --all --initlabel
|
||||
autopart --type=plain"
|
||||
fi
|
||||
|
||||
# Auth config
|
||||
if [[ -n "$SSH_KEY_CONTENT" ]]; then
|
||||
AUTH_CMDS="rootpw --lock
|
||||
sshkey --username=root \"${SSH_KEY_CONTENT}\""
|
||||
else
|
||||
AUTH_CMDS='rootpw --plaintext changeme'
|
||||
fi
|
||||
|
||||
cat > "$HTTPDIR/ks.cfg" << KICKSTART
|
||||
# Lab Bastion — Fedora ${FEDORA_VERSION} kickstart
|
||||
# Generated: $(date -Iseconds)
|
||||
# Target: ${TARGET_HOSTNAME}
|
||||
|
||||
# Install mode
|
||||
text
|
||||
reboot
|
||||
|
||||
# Locale
|
||||
lang ${LOCALE}
|
||||
keyboard uk
|
||||
timezone ${TIMEZONE} --utc
|
||||
|
||||
# Network
|
||||
network --bootproto=dhcp --activate --hostname=${TARGET_HOSTNAME}
|
||||
|
||||
# Auth
|
||||
${AUTH_CMDS}
|
||||
|
||||
# Disk
|
||||
${DISK_CMDS}
|
||||
|
||||
# Bootloader
|
||||
bootloader --append="console=tty0 console=ttyS0,115200n8"
|
||||
|
||||
# Install source
|
||||
url --mirrorlist=https://mirrors.fedoraproject.org/mirrorlist?repo=fedora-\$releasever&arch=\$basearch
|
||||
|
||||
# Packages — minimal server + essentials
|
||||
%packages
|
||||
@core
|
||||
@server-product
|
||||
openssh-server
|
||||
vim-enhanced
|
||||
tmux
|
||||
git
|
||||
curl
|
||||
python3
|
||||
dnf-plugins-core
|
||||
%end
|
||||
|
||||
# Post-install
|
||||
%post --log=/root/bastion-post-install.log
|
||||
#!/bin/bash
|
||||
set -x
|
||||
|
||||
# Ensure SSH is enabled
|
||||
# (no --now: services cannot be started inside the installer chroot)
systemctl enable sshd
|
||||
|
||||
# Allow root SSH with key (password auth disabled)
|
||||
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin prohibit-password/' /etc/ssh/sshd_config
|
||||
sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
|
||||
|
||||
# Set hostname
|
||||
# hostnamectl needs a running systemd, which the installer chroot lacks
echo "${TARGET_HOSTNAME}" > /etc/hostname
|
||||
|
||||
# Leave a breadcrumb
|
||||
# \$ is escaped so the date is taken at install time, not when this kickstart is generated
echo "Provisioned by lab-bastion on \$(date -Iseconds)" > /etc/lab-provisioned
|
||||
|
||||
# Placeholder: puppet enrollment will go here later
|
||||
# puppet is not installed yet — this IS the puppet server
|
||||
echo "# Lab bootstrap node — puppet server setup pending" > /root/README
|
||||
|
||||
%end
|
||||
KICKSTART
|
||||
|
||||
log "Kickstart written to ${HTTPDIR}/ks.cfg"
|
||||
|
||||
# ──── Generate iPXE boot script ───────────────────────────────────
|
||||
cat > "$HTTPDIR/boot.ipxe" << IPXE
|
||||
#!ipxe
|
||||
|
||||
echo
|
||||
echo =======================================
|
||||
echo Lab PXE Bastion — Fedora ${FEDORA_VERSION}
|
||||
echo Target: ${TARGET_HOSTNAME}
|
||||
echo =======================================
|
||||
echo
|
||||
|
||||
kernel http://${SERVER_IP}:${HTTP_PORT}/vmlinuz inst.ks=http://${SERVER_IP}:${HTTP_PORT}/ks.cfg inst.repo=${FEDORA_MIRROR} inst.text
|
||||
initrd http://${SERVER_IP}:${HTTP_PORT}/initrd.img
|
||||
boot
|
||||
IPXE
|
||||
|
||||
# ──── Generate dnsmasq config ─────────────────────────────────────
|
||||
cat > "$BASTION_DIR/dnsmasq.conf" << DNSMASQ
|
||||
# Lab PXE Bastion — dnsmasq config
|
||||
# ProxyDHCP mode: adds PXE options without replacing existing DHCP
|
||||
|
||||
# Disable DNS (we only want DHCP/TFTP)
|
||||
port=0
|
||||
|
||||
# Listen on the right interface
|
||||
interface=${IFACE}
|
||||
bind-interfaces
|
||||
|
||||
# ProxyDHCP — works alongside existing DHCP (UniFi etc)
|
||||
dhcp-range=${NETWORK},proxy
|
||||
|
||||
# TFTP for initial PXE boot
|
||||
enable-tftp
|
||||
tftp-root=${TFTPDIR}
|
||||
|
||||
# Detect client architecture
|
||||
dhcp-match=set:bios,option:client-arch,0
|
||||
dhcp-match=set:efi64,option:client-arch,7
|
||||
dhcp-match=set:efi64,option:client-arch,9
|
||||
|
||||
# Detect iPXE clients (already chainloaded)
|
||||
dhcp-userclass=set:ipxe,iPXE
|
||||
|
||||
# First PXE boot → serve iPXE binary via TFTP
|
||||
dhcp-boot=tag:bios,tag:!ipxe,undionly.kpxe
|
||||
dhcp-boot=tag:efi64,tag:!ipxe,ipxe.efi
|
||||
|
||||
# iPXE clients → chain to boot script via HTTP
|
||||
dhcp-boot=tag:ipxe,http://${SERVER_IP}:${HTTP_PORT}/boot.ipxe
|
||||
|
||||
# Verbose logging (see what's happening)
|
||||
log-dhcp
|
||||
DNSMASQ
|
||||
|
||||
# ──── Open firewall ───────────────────────────────────────────────
|
||||
if command -v firewall-cmd >/dev/null && firewall-cmd --state >/dev/null 2>&1; then
|
||||
log "Opening firewall ports (DHCP, TFTP, HTTP:${HTTP_PORT})..."
|
||||
firewall-cmd --quiet --add-service=dhcp
|
||||
firewall-cmd --quiet --add-service=tftp
|
||||
firewall-cmd --quiet --add-port=${HTTP_PORT}/tcp
|
||||
# ProxyDHCP uses port 4011
|
||||
firewall-cmd --quiet --add-port=4011/udp 2>/dev/null || true
|
||||
FW_OPENED=true
|
||||
fi
|
||||
|
||||
# ──── Stop conflicting services ───────────────────────────────────
|
||||
# dnsmasq might be running as a system service
|
||||
if systemctl is-active --quiet dnsmasq 2>/dev/null; then
|
||||
warn "System dnsmasq is running — stopping it temporarily"
|
||||
systemctl stop dnsmasq
|
||||
RESTART_DNSMASQ=true
|
||||
fi
|
||||
|
||||
# ──── Start HTTP server ───────────────────────────────────────────
|
||||
log "Starting HTTP server on :${HTTP_PORT}..."
|
||||
# Launch python directly (no subshell) so HTTP_PID is the server's own PID
python3 -m http.server "$HTTP_PORT" --bind 0.0.0.0 --directory "$HTTPDIR" >/dev/null 2>&1 &
|
||||
HTTP_PID=$!
|
||||
sleep 0.5
|
||||
|
||||
if ! kill -0 "$HTTP_PID" 2>/dev/null; then
|
||||
die "HTTP server failed to start — is port ${HTTP_PORT} in use?"
|
||||
fi
|
||||
|
||||
# ──── Start dnsmasq (proxyDHCP + TFTP) ────────────────────────────
|
||||
log "Starting PXE server (proxyDHCP on ${IFACE})..."
|
||||
echo ""
|
||||
echo -e "${CYAN}${BOLD}════════════════════════════════════════════════════════${NC}"
|
||||
echo -e "${CYAN}${BOLD} PXE Bastion ready!${NC}"
|
||||
echo -e "${CYAN}${BOLD}════════════════════════════════════════════════════════${NC}"
|
||||
echo ""
|
||||
echo -e " Network: ${BOLD}${NETWORK}/24${NC} via ${BOLD}${IFACE}${NC}"
|
||||
echo -e " HTTP: ${BOLD}http://${SERVER_IP}:${HTTP_PORT}/${NC}"
|
||||
echo -e " OS: ${BOLD}Fedora ${FEDORA_VERSION} (${ARCH})${NC}"
|
||||
echo -e " Hostname: ${BOLD}${TARGET_HOSTNAME}${NC}"
|
||||
echo -e " Kickstart: ${BOLD}http://${SERVER_IP}:${HTTP_PORT}/ks.cfg${NC}"
|
||||
echo ""
|
||||
echo -e " ${YELLOW}Now PXE-boot the target machine.${NC}"
|
||||
echo -e " ${YELLOW}Set boot order to Network/PXE in BIOS, or use one-time boot menu.${NC}"
|
||||
echo ""
|
||||
echo -e " Press ${BOLD}Ctrl-C${NC} to stop the bastion."
|
||||
echo ""
|
||||
echo -e "${CYAN}──── dnsmasq log (watch for DHCP/PXE requests) ────${NC}"
|
||||
echo ""
|
||||
|
||||
# Run dnsmasq in foreground so logs stream to terminal
|
||||
dnsmasq --no-daemon --conf-file="$BASTION_DIR/dnsmasq.conf" &
|
||||
DNSMASQ_PID=$!
|
||||
|
||||
# Wait for dnsmasq — if it exits, something went wrong
|
||||
wait "$DNSMASQ_PID" || {
|
||||
err "dnsmasq exited unexpectedly. Check if another DHCP/TFTP service is running."
|
||||
err "Try: ss -ulnp | grep -E ':(67|69|4011) '"
|
||||
exit 1
|
||||
}
|
||||
121
config-format-research.md
Normal file
@@ -0,0 +1,121 @@
|
||||
# Configuration Format Research
|
||||
|
||||
## Decision: PENDING — exploring alternatives to raw Kubernetes YAML
|
||||
|
||||
## The Problem
|
||||
|
||||
Kubernetes YAML is verbose, repetitive, lacks type safety, and forces users to specify
|
||||
every layer of concern (intent, team defaults, org standards, k8s boilerplate) in one file.
|
||||
Helm "solves" this with Go templating, which produces unreadable template spaghetti.
|
||||
|
||||
Docker Compose is the gold standard for UX — 6 lines vs 35 for the same deployment.
|
||||
The problem was never YAML itself; it was being forced to write too much of it.
|
||||
|
||||
## Core Design Principle
|
||||
|
||||
Users should only define what they care about. Everything else should be inherited from
|
||||
expert-defined defaults. YAML (or JSON) can exist underneath as:
|
||||
- Easy, non-binary backup format
|
||||
- Live editing capability
|
||||
- Debugging / inspection output
|
||||
|
||||
## Layered Architecture
|
||||
|
||||
```
Layer 1: User intent     "I want an api service running myapp"       ← USER WRITES THIS
Layer 2: Team defaults   "Our services get health checks, limits"    ← Team lead defines
Layer 3: Org standards   "All pods need security context, labels"    ← Platform team defines
Layer 4: Output          Full YAML/JSON for kubectl, backup, debug   ← GENERATED
```
|
||||
|
||||
Docker Compose feels good because it's only Layer 1 — Docker handles the rest.
|
||||
Kubernetes forces all 4 layers into one file.
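
One way to picture the layering is as a deep merge where each more specific layer overrides the one below it. The sketch below assumes that precedence (org standards, then team defaults, then user intent), which is not settled in this document, and all keys in the sample data are made up for illustration.

```python
from copy import deepcopy


def merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` onto `base` without mutating either."""
    out = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)   # descend into nested mappings
        else:
            out[key] = deepcopy(value)          # scalars and lists replace wholesale
    return out


def render(user: dict, team: dict, org: dict) -> dict:
    """Layer 4 output: org standards, overridden by team defaults, then user intent."""
    return merge(merge(org, team), user)


if __name__ == "__main__":
    org = {"securityContext": {"runAsNonRoot": True}, "labels": {"managed-by": "lab"}}
    team = {"healthCheck": {"path": "/healthz"}, "resources": {"memory": "256Mi"}}
    user = {"image": "myapp:latest", "resources": {"memory": "512Mi"}}
    print(render(user, team, org))
```

If org standards are meant to be mandatory rather than overridable, the merge order for Layer 3 would need to be reversed or enforced by validation instead.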
|
||||
|
||||
## Evaluated Alternatives
|
||||
|
||||
### Tier 1 — Strong Contenders
|
||||
|
||||
**Pkl (Apple)**
|
||||
- Best syntax for "amend a template" via `amends` keyword
|
||||
- Strong static typing, clean readable syntax
|
||||
- Lowest ceremony for simple cases
|
||||
- Risk: Apple may abandon it; the toolchain is JVM-based (native executables are published, so a JVM is not strictly required at runtime)
|
||||
- K8s support: `pkl-k8s` package exists
|
||||
|
||||
**KCL (CNCF Sandbox)**
- Python-like syntax, lowest learning curve of typed options
- Schema defaults, validation, constraints built in
- CNCF backing gives legitimacy
- Risk: primarily driven by Ant Group (Alibaba)

**CUE**
- Most principled — constraint-based unification, not inheritance
- Used by Timoni (Helm replacement), KubeVela, Dagger
- Defaults marked with `*`, types and values on same spectrum
- Risk: steep learning curve, novel paradigm
- Most mature K8s ecosystem of the three

### Tier 2 — Viable But Weaker Fit

**CDK8s+ (TypeScript)**
- Full IDE support, strongest type safety
- cdk8s+ has intent-driven APIs ("I want a web service" → generates Deployment+Service)
- Risk: brings software engineering complexity into config, AWS-centric
- Good if team is TypeScript-native

**Jsonnet (via Tanka)**
- Proven at scale (Grafana uses it across hundreds of services)
- Object mixins via `+` operator for composition
- Risk: weak type safety, no compile-time validation of field names

### Tier 3 — Not Recommended

**Dhall** — strongest type safety but Haskell-like syntax, small/stale community
**Nickel** — elegant contracts system but tiny K8s ecosystem
**Starlark** — no type safety, no schema system, just a scripting layer
**HCL** — great for infra provisioning, wrong fit for k8s manifests

### Dead Projects
- **Winglang** — shut down April 2025
- **Klotho** — archived, pivoted to InfraCopilot
- **Acorn** — pivoted to AI agents (Obot)

## Compose-Like Input Format (Preferred Direction)

The user prefers Docker Compose brevity. The tool we build could use a Compose-inspired
input format at Layer 1, generating full k8s manifests + provider-specific resources underneath:

```yaml
# What the user writes
services:
  api:
    image: myapp:latest
    size: medium
    ports: [8080]
    env:
      DB_HOST: postgres

# System generates: full k8s Deployment, Service, NetworkPolicy,
# resource limits, security context, health checks, etc.
```

YAML is fine for Layer 1 if it's short enough. The problem was never the format —
it was the verbosity. Compose proves short YAML works.
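
As a rough sketch of how a field like `size: medium` might expand into generated settings, the Go below resolves a t-shirt size to resource limits. The `Service` and `Resources` types, the `ResolveResources` function and the CPU/memory values per size are all assumptions made for illustration; the notes do not define them.

```go
package main

import "fmt"

// Service mirrors the Compose-like Layer 1 input (hypothetical type).
type Service struct {
	Image string
	Size  string
	Ports []int
	Env   map[string]string
}

// Resources is the part of the generated output derived from Size.
type Resources struct {
	CPU    string
	Memory string
}

// sizeProfiles maps t-shirt sizes to limits; the values are illustrative only.
var sizeProfiles = map[string]Resources{
	"small":  {CPU: "250m", Memory: "256Mi"},
	"medium": {CPU: "500m", Memory: "512Mi"},
	"large":  {CPU: "1", Memory: "1Gi"},
}

// ResolveResources looks up the profile for a service and rejects unknown sizes.
func ResolveResources(s Service) (Resources, error) {
	r, ok := sizeProfiles[s.Size]
	if !ok {
		return Resources{}, fmt.Errorf("unknown size %q", s.Size)
	}
	return r, nil
}

func main() {
	api := Service{Image: "myapp:latest", Size: "medium", Ports: []int{8080},
		Env: map[string]string{"DB_HOST": "postgres"}}
	r, err := ResolveResources(api)
	if err != nil {
		panic(err)
	}
	fmt.Printf("cpu=%s memory=%s\n", r.CPU, r.Memory)
}
```

Rejecting unknown sizes up front keeps validation at Layer 1, where the user can still see their own short input.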

## Open Questions

1. Should Layer 1 input be YAML (Compose-like), or a typed language (Pkl/KCL/CUE)?
2. How do team defaults (Layer 2) and org standards (Layer 3) get defined and distributed?
3. Should the render view show the generated YAML diff when changing Layer 1 input?
4. How does this integrate with the Pulumi multi-cloud abstraction layer?
5. Could the input format support both k8s workloads AND infrastructure resources
   (VMs, networks, storage) in the same spec?

## GUI/TUI Space — Underserved Opportunity

No tool has achieved significant adoption for visually *defining* infrastructure.
Existing tools (K9s, Lens, Rancher) are for monitoring/management, not authoring.

The ideal: platform engineers define schemas with constraints/defaults,
developers interact with a form/wizard showing only fields they need,
validated config generated underneath. Nobody has built this well yet.
106
crossplane-evaluation.md
Normal file
@@ -0,0 +1,106 @@
# Crossplane Evaluation

## Decision: NOT ADOPTING

Crossplane will not be used in this stack. The lack of a plan/preview mechanism is a dealbreaker
for enterprise adoption and safe infrastructure management.

---

## Why We Evaluated It

The core problem: Terraform/OpenTofu requires re-implementing the same infrastructure concepts
per platform (AWS, XCP-ng, bare metal). At thousands of nodes across multiple platforms, this is
a massive maintenance burden. Crossplane's XRD/Composition model promised a unified API:

```
XRD: "VirtualMachine" (universal API)
├── Composition: AWS → EC2 instance
├── Composition: XCP-ng → XO VM
└── Composition: bare metal → MAAS / Ansible
```

One API, multiple backends — teams request a "VirtualMachine" and the right composition handles it.

## Strengths

- **CNCF Graduated** (Nov 2025, v2.2) — Apache 2.0 license, top-tier maturity
- **Continuous drift detection** — automatically reverts manual changes, unlike Terraform's on-demand plan/apply
- **No state file management** — no remote backends, locking issues, or state corruption
- **Kubernetes-native** — works with ArgoCD, Flux, kubectl, RBAC out of the box
- **XRDs/Compositions** — genuine multi-platform abstraction layer, solves the "re-implement per cloud" problem
- **Eventual consistency** — resources with complex dependencies don't get stuck like Terraform's dependency graph
- **Enterprise adoption** — Deutsche Kreditbank, Elastic, Nike, Apple, NASA, Grafana Labs, 60+ orgs
- **Deutsche Kreditbank** replaced Terraform; deployments went from weeks to under one hour

## Dealbreaker: No Plan/Preview

The single biggest issue. Terraform's `terraform plan` lets operators see exactly what will change
before applying. Crossplane applies changes immediately upon resource creation/modification.

- Discussed in the community for 2+ years with no resolution
- A Kubernetes-native solution would be a `Plan` CRD that shows proposed changes before approval
- ArgoCD `sync --dry-run` is a partial workaround but only shows k8s resource diffs, not what the
  cloud provider will actually do underneath
- **For regulated environments and SRE teams at scale, change preview is non-negotiable**

Possible reasons it hasn't been implemented:
- The continuous reconciliation architecture may make point-in-time snapshots fundamentally hard
- Upbound (commercial entity) may be reserving it for their paid platform
- Or simply not prioritised

## Other Significant Concerns

### CRD Bloat
- `provider-aws` installs 900+ CRDs — can make API server unresponsive for up to an hour (GitHub #2649)
- Exceeds Kubernetes' recommended ~500 CRD limit
- Mitigated by "Provider Families" (install per-service sub-providers) but requires careful planning

### Debugging Difficulty
- Errors propagate through layers: Claim → XR → Composition → Managed Resource → Provider → Cloud API
- Multiple sources report debugging compositions is painful
- Pipeline Inspector (alpha in v2.2) is being introduced but not production-ready

### Chicken-and-Egg Problem
- Crossplane runs inside Kubernetes — cannot provision the cluster it runs on
- Requires a "management cluster" bootstrapped by other means (Terraform, Puppet, etc.)
- If the management cluster dies, no drift detection or reconciliation runs
- Recovery: applying YAMLs to a new cluster works if deterministic resource names are used,
  otherwise risks creating duplicate cloud resources

### Cluster Loss / Immutability Concerns
- State lives in etcd, not a versionable state file
- No independent audit trail or easy way to diff historical states
- On new cluster: resources with explicit external names get adopted; auto-named resources get duplicated
- Need etcd backups as insurance, and deterministic naming everywhere

### Performance at Scale
- ~2000 composites took 6+ minutes to reconcile on k3d (GitHub #2256)
- Reconciliation interval not easily configurable globally (GitHub #5934)

### YAML Limitations
- No native loops, conditionals, or programming constructs
- Complex compositions require changes in multiple locations

## XCP-ng Provider Gap

- No Crossplane provider for XCP-ng exists today
- A mature Terraform provider (`terraform-provider-xenorchestra`) exists, maintained by Vates
- Could be wrapped via Upjet to auto-generate a Crossplane provider — but nobody has done it
- Would be a greenfield open-source project

## Real Issues Reported

- API server unresponsiveness with too many CRDs (GitHub #2649)
- CRD scaling issues beyond ~500 CRDs (GitHub #2895)
- GCP SQL resources randomly marked for deletion — dangerous for production databases
- Reconciliation rate limiting at scale (GitHub #2256)

## Conclusion

Crossplane solves a real problem (multi-platform abstraction) that we need, but the lack of
plan/preview makes it unsuitable for enterprise-scale production infrastructure management.
The operational concerns (CRD bloat, debugging, cluster dependency) add further risk.

We need to find an alternative approach to the multi-platform abstraction problem that Crossplane
solves, while retaining plan/preview capabilities.
143
dns-research.md
Normal file
@@ -0,0 +1,143 @@
# DNS Solution Research

## Decision: PowerDNS Authoritative + ExternalDNS

### Why PowerDNS

| Feature | PowerDNS | CoreDNS | BIND9 | Technitium |
|---------|----------|---------|-------|------------|
| REST API | Full | No (needs etcd) | No (nsupdate) | Yes |
| Database backend | PostgreSQL/MySQL/SQLite | etcd | Zone files | Custom |
| Health-aware DNS | Lua records (ifportup, ifurlup) | No | No | No |
| ExternalDNS provider | Yes | Yes (via etcd) | Yes (RFC 2136) | No |
| DNSSEC | Yes | Limited | Best | Yes |
| Split DNS | dnsdist routing | Corefile blocks | Views (best) | APP records |
| Maturity | ISP-grade | K8s-focused | Oldest | Newer |

PowerDNS wins on: REST API (critical for Lab), health-check-aware Lua records,
database backend for HA, and ExternalDNS integration.

### Architecture

```
             Lab Server
           (control plane)
                  │
                  │ PowerDNS REST API
                  ▼
          ┌───────────────┐
          │   PowerDNS    │
          │ Authoritative │──── PostgreSQL/SQLite backend
          │    Server     │
          └───────┬───────┘
                  │
      ┌───────────┼───────────┐
      │           │           │
      ▼           ▼           ▼
Internal DNS ExternalDNS   dnsdist
.lab.internal (k8s syncs  (split DNS
               Services/   routing)
               Ingress)
```

### How Lab Uses DNS

#### Auto-registration on onboard
When `lab onboard` completes, Lab calls PowerDNS API:
- A record: `<server>.lab.internal → <ip>`
- PTR record: `<reverse-ip>.in-addr.arpa → <server>.lab.internal`
- Both created/updated atomically
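
To make the `<reverse-ip>.in-addr.arpa` naming concrete, here is a minimal sketch that derives both record names from a server name and an IPv4 address. `RecordPair`, `ReverseName` and `BuildRecords` are hypothetical names, not part of Lab or PowerDNS, and the sketch only computes names; how the two writes are submitted and made atomic is not specified in these notes.

```go
package main

import (
	"fmt"
	"net/netip"
)

// RecordPair holds the forward and reverse names registered for one server.
type RecordPair struct {
	AName   string // e.g. web01.lab.internal
	PTRName string // e.g. 25.1.0.10.in-addr.arpa
	IP      string
}

// ReverseName returns the in-addr.arpa name for an IPv4 address.
func ReverseName(ip string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", err
	}
	if !addr.Is4() {
		return "", fmt.Errorf("%s is not an IPv4 address", ip)
	}
	b := addr.As4()
	// Octets are written in reverse order under in-addr.arpa.
	return fmt.Sprintf("%d.%d.%d.%d.in-addr.arpa", b[3], b[2], b[1], b[0]), nil
}

// BuildRecords derives both names up front so they can be submitted together.
func BuildRecords(server, zone, ip string) (RecordPair, error) {
	ptr, err := ReverseName(ip)
	if err != nil {
		return RecordPair{}, err
	}
	return RecordPair{AName: server + "." + zone, PTRName: ptr, IP: ip}, nil
}

func main() {
	p, err := BuildRecords("web01", "lab.internal", "10.0.1.25")
	if err != nil {
		panic(err)
	}
	fmt.Printf("A   %s -> %s\nPTR %s -> %s\n", p.AName, p.IP, p.PTRName, p.AName)
}
```

Computing both names before any API call means a bad address is rejected before either record is touched.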

#### Domain claims via labels
Labels can claim shared domain names:
```yaml
labels:
  mailserver:
    dns:
      records:
        - type: A
          name: "{{server.name}}.lab.internal"
      claims:
        - name: mail.example.com
          type: A
          health_check: { port: 25 }
```
All servers with label `mailserver` contribute to `mail.example.com` round-robin.
PowerDNS Lua records remove unhealthy servers automatically.
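
One way a port-based claim could map onto a PowerDNS Lua record is sketched below. It builds the content string of a `LUA` record using the `ifportup` function mentioned in the comparison table; the helper name `IfPortUpContent` and the example addresses are assumptions, and the way Lab would submit the record through the API is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// IfPortUpContent builds the content of a PowerDNS LUA record that answers
// with only those addresses whose TCP port is reachable.
func IfPortUpContent(port int, ips []string) string {
	quoted := make([]string, len(ips))
	for i, ip := range ips {
		quoted[i] = "'" + ip + "'"
	}
	return fmt.Sprintf(`A "ifportup(%d, {%s})"`, port, strings.Join(quoted, ", "))
}

func main() {
	// Addresses of all servers carrying the mailserver label (example values).
	members := []string{"10.0.0.5", "10.0.0.6"}
	fmt.Println("mail.example.com IN LUA", IfPortUpContent(25, members))
}
```

Under this approach the health check lives in the DNS server itself, so Lab only has to rewrite the record when label membership changes.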

#### IP mobility
Lab agent on machine reports IP change → Lab server updates PowerDNS API →
A record, PTR, and all claimed domains updated.

#### K8s integration
ExternalDNS runs in k8s, syncs Service/Ingress records to same PowerDNS instance.
Same DNS server serves both bare metal and k8s records.

#### Groups claiming domains
Groups can claim domains for all member servers:
```yaml
groups:
  production-web:
    match:
      labels: [web-frontend]
      environment: prod
    dns:
      claims:
        - name: www.example.com
          type: A
          health_check: { url: "https://{{server.ip}}/healthz" }
```

### DNS Plugin Interface

```go
type DNSPlugin interface {
	Name() string

	// Record management
	CreateRecord(zone, name, recordType string, targets []string, ttl int) error
	UpdateRecord(zone, name, recordType string, targets []string, ttl int) error
	DeleteRecord(zone, name, recordType string) error
	ListRecords(zone string) ([]Record, error)

	// Health-checked records
	CreateHealthCheckedRecord(zone, name string, targets []string, check HealthCheck) error

	// Zone management
	CreateZone(name string, kind string) error
	DeleteZone(name string) error
}
```

Built-in:
- `dns-powerdns` — PowerDNS REST API (primary)
- `dns-route53` — AWS Route53 (for cloud deployments)
- `dns-rfc2136` — RFC 2136 dynamic updates (BIND/Knot fallback)

### Split DNS Setup

Internal zones (`.lab.internal`) served by PowerDNS authoritatively.
External queries forwarded upstream (8.8.8.8, ISP DNS).

Options:
- **dnsdist** (PowerDNS ecosystem) routes by source subnet
- **CoreDNS as resolver** — serves internal from PowerDNS, forwards external
- **BIND views** — if we need view-based split on same zone (unlikely)

### Evaluated and Not Chosen

| Tool | Why Not |
|------|---------|
| CoreDNS | No REST API, needs etcd intermediary, k8s-focused |
| BIND9 | No REST API, nsupdate is cumbersome for automation |
| Technitium | No ExternalDNS provider, newer/smaller community |
| dnsmasq | Not suitable — caching forwarder, no API, ~1000 client limit |
| Knot DNS | No REST API, better as secondary/downstream |

### DNS-as-Code (Optional Layer)

For static DNS infrastructure (SOA, NS, MX, base zone config):
- **octoDNS** (GitHub) or **DNSControl** (Stack Exchange)
- GitOps workflow: PR → review → merge → sync to PowerDNS
- Dynamic records (server A records, claims) managed by Lab directly via API
- Static records managed via DNS-as-code in Git
37
hardware.md
Normal file
@@ -0,0 +1,37 @@
# Homelab Hardware Inventory

## Compute Nodes

| Node | CPU Arch | RAM | Role | Cost |
|------|----------|-----|------|------|
| Beelink SER9 MAX | x86_64 | 64GB | k3s worker, ROCm GPU, Longhorn storage | ~£869 |
| Beelink SER9 Pro | x86_64 | 32GB | Bootstrap: Puppet, DNS, UniFi, Vault, Naemon | ~£300 |
| Minisforum MS-R1 | ARM (aarch64) | 64GB | k3s node | ~£500-640 |
| Nvidia DGX Spark | ARM (Grace) | 128GB | CUDA/AI inference | ~£3,700 |
| Mac Studio M1 Max | ARM (aarch64) | 32GB | k3s server #1 (etcd) | ~£775 |

## Networking

| Device | Specs | Cost |
|--------|-------|------|
| USW-Flex-XG x2 | 8x 10GbE ports total (4 per switch) | £458 |

## Summary

- **Total RAM:** 320GB
- **Architectures:** x86_64, aarch64 (Apple Silicon + ARM + Grace)
- **GPU compute:** ROCm (SER9 MAX), CUDA (DGX Spark)
- **Estimated total:** ~£6,600-6,740 (compute nodes plus networking)

## Node Roles

### Bootstrap Node (Beelink SER9 Pro) — Outside k3s
- Puppet (bare metal config management)
- DNS (CoreDNS or PowerDNS)
- UniFi controller
- Vault (secrets management)
- Naemon (bare metal, network, black-box endpoint monitoring)

### k3s Cluster
- **Server (control plane + etcd):** Mac Studio M1 Max
- **Workers:** Beelink SER9 MAX, Minisforum MS-R1, DGX Spark
120
kubernetes-flavors.md
Normal file
@@ -0,0 +1,120 @@
# Kubernetes Flavor Decision

## Decision: k3s (confirmed)

k3s is the best fit for Lab. OpenShift and most other flavors conflict with
the puppet-managed, multi-arch, lightweight approach.

## Evaluation

| Flavor | Puppet-Friendly | ARM | Multi-arch | Enterprise | License | Verdict |
|--------|:-:|:-:|:-:|:-:|---------|---------|
| **k3s** | ✓ binary + config files | ✓ | ✓ | Rancher/SUSE | Apache 2.0 | **CHOSEN** |
| **k0s** | ✓ single binary, config-driven | ✓ | ✓ | Mirantis | Apache 2.0 | Good alternative |
| **kubeadm** | ✓ well-understood bootstrap | ✓ | ✓ | Upstream K8s | Apache 2.0 | Viable but heavier |
| **RKE2** | ✓ config files | ✓ | ✓ | Rancher/SUSE | Apache 2.0 | Heavier k3s |
| **OpenShift** | ✗ operator-driven, fights puppet | ✗ limited | ✗ limited | Red Hat | Proprietary | REJECTED |
| **MicroK8s** | ⚠ snap-based, puppet+snaps awkward | ✓ | ✓ | Canonical | Apache 2.0 | Not great |
| **Talos** | ✗ immutable OS, no SSH, no puppet | ✓ | ✓ | Sidero Labs | MPL 2.0 | Incompatible |

## Why NOT OpenShift — Deep Analysis

### OpenShift Does Overlap With Lab

OpenShift is the closest existing thing to what Lab does. The overlap is real:

| Capability | OpenShift | Lab |
|-----------|-----------|-----|
| Manages nodes end-to-end | Yes (RHCOS + MCO) | Yes (OpenVox + labels) |
| Immutable infrastructure | Yes (rpm-ostree, operator-driven) | Yes (puppet convergence) |
| Fights config drift | Yes (operators reconcile) | Yes (puppet + sync pillar) |
| Built-in monitoring | Yes (Prometheus + Alertmanager bundled) | Yes (health aggregator) |
| Built-in secrets | Yes (etcd-encrypted secrets) | Yes (secret store + local cache) |
| Certificate management | Yes (internal CA, auto-rotation) | Yes (identity layer) |
| Node lifecycle | Yes (MachineSet, MachinePools) | Yes (onboard, labels, providers) |
| Self-managing | Yes (operators update themselves) | Yes (lab manages itself) |

### Why OpenShift Still Doesn't Fit

**1. Single OS** — OpenShift control plane = RHCOS only. Can't run on Apple Silicon,
Asahi Linux, or any non-RHCOS system. Lab needs Ubuntu, Debian, Fedora, AlmaLinux,
XCP-ng, VyOS across x86 and ARM.

**2. K8s only** — OpenShift manages k8s nodes. Lab manages everything: k8s nodes,
standalone VMs, bare metal hypervisors, network appliances, physical servers that
will never run k8s. Not everything is a container.

**3. Single cluster scope** — OpenShift manages one cluster. Lab manages homelab k3s +
enterprise AWS EKS + XCP-ng hypervisors + bare metal + OVH vRack. Cross-provider,
cross-cluster.

**4. Fights puppet** — OpenShift has ~30+ operators that each own a piece of the system.
If puppet changes kubelet config, the Machine Config Operator detects "drift" and
reverts it. Two reconciliation loops fighting each other, possibly rebooting nodes
in a loop. You're supposed to change everything via CRDs, not external tools.

**5. No XCP-ng/hypervisor management** — Can't provision VMs on XCP-ng, manage Xen
hosts, or understand hypervisors that aren't VMware/OpenStack.

**6. Throws away puppet modules** — Company has existing puppet modules. OpenShift's
model is operators, not puppet. Complete rewrite of config management.

**7. Heavyweight** — Minimum 6 nodes, 88GB RAM just for the platform. k3s uses 512MB.
Our entire homelab is 5 nodes, 320GB RAM.

**8. ARM limited** — RHCOS on Apple Silicon doesn't exist. ARM support is limited to
AWS Graviton and some server ARM platforms.

### The Scope Difference

```
OpenShift:  "I am your platform. Everything runs in me. I control the OS."
            Scope: Kubernetes cluster + its nodes

Lab:        "I manage your infrastructure. K8s is one thing I deploy."
            Scope: Everything — VMs, bare metal, hypervisors, k8s,
            network gear, containers, across any provider
```

Lab is closer to what OpenShift + Satellite + RHCOS + ACM (Advanced Cluster Management)
do **together** — but unified, lighter, open source, and not locked to Red Hat's ecosystem.

## Why k3s

- **Puppet-friendly** — it's just a binary and config files in `/etc/rancher/k3s/`
- **Ultra-light** — runs on Mac Studio, ARM boxes, small VMs
- **Multi-arch** — native x86 and ARM
- **Same K8s API** as EKS/GKE — portable to cloud
- **Single binary** — trivial to manage with puppet
- **Proven** — CNCF certified, widely used in edge/IoT/homelab

## k3s via Puppet (OpenVox)

```puppet
# Label: k8s-server → puppet class
class kubernetes::server {
  class { 'k3s::server':
    token        => lab::secret('k8s/cluster-token'),
    cluster_init => true,
    tls_san      => [$facts['fqdn'], 'k8s.lab.internal'],
  }
}

# Label: k8s-worker → puppet class
class kubernetes::worker {
  class { 'k3s::worker':
    server_url => 'https://k8s.lab.internal:6443',
    token      => lab::secret('k8s/cluster-token'),
  }
}
```

Same puppet classes work on bare metal, XCP-ng VM, EC2 instance, any architecture.

## k0s as Backup Option

If k3s ever becomes problematic, k0s is the closest alternative:
- Also single binary, config-driven, multi-arch
- `k0sctl` adds cluster management (bootstrap, upgrade, reset)
- Mirantis backing (Lens, Docker EE)
- Worth monitoring but no reason to switch from k3s today
1537
lab-tool-spec.md
Normal file
File diff suppressed because it is too large
356
os-install-research.md
Normal file
@@ -0,0 +1,356 @@
# OS Installation Research

## Target Operating Systems

All must support unattended network installation and automated OpenVox enrollment.
All must work across multiple CPU architectures where the OS supports it.

| OS | Install System | Answer Format | Architectures | PXE Difficulty |
|-----|---------------|--------------|---------------|---------------|
| Ubuntu 24.04 | autoinstall (cloud-init) | YAML | x86_64, aarch64, RISC-V | Easy |
| Debian 12 | preseed | preseed.cfg | x86_64, aarch64, many others | Medium |
| Fedora 41+ | Anaconda/kickstart | .ks file | x86_64, aarch64 | Easy |
| AlmaLinux 9 | Anaconda/kickstart | .ks file | x86_64, aarch64 | Easy |
| XCP-ng 8.3 | Custom Python TUI | XML answer file | x86_64 only | HARD |
| VyOS 1.4 | Custom installer | config.boot | x86_64, aarch64 | Medium |

## XCP-ng Network Install — Known Hard

### Why it's difficult
- iPXE UEFI is fundamentally broken (open bug, multiboot module corruption)
- Serial/headless install hangs after detecting storage — no fix
- No VNC installer mode (unlike RHEL/Debian)
- TFTP is agonizingly slow for the large install.img
- Custom Python TUI designed for VGA console, not automation
- No major provisioning tool has first-class XCP-ng support

### What works
- **BIOS PXE** is more reliable than UEFI
- **IPMI virtual media** with remastered ISO is most reliable
- Answer file XML with `<post-install-script>` and `<script stage="filesystem-populated">`
- Post-install puppet enrollment via `/etc/firstboot.d/` scripts
- XCP-ng enables SSH by default after install

### Answer file format (XML, custom to XenServer/XCP-ng)
```xml
<?xml version="1.0"?>
<installation mode="fresh" srtype="ext">
  <primary-disk>sda</primary-disk>
  <keymap>us</keymap>
  <root-password type="hash">$6$...</root-password>
  <source type="url">http://server/xcp-ng/</source>
  <admin-interface name="eth0" proto="dhcp" />
  <hostname>xcphost01</hostname>
  <timezone>Europe/London</timezone>
  <ntp-server>pool.ntp.org</ntp-server>
  <network-backend>openvswitch</network-backend>
  <post-install-script type="url">http://server/scripts/post-install.sh</post-install-script>
  <script stage="filesystem-populated" type="url">http://server/scripts/fs-setup.sh</script>
</installation>
```

### Post-install puppet enrollment
The `filesystem-populated` stage script drops a firstboot script:
```bash
#!/bin/bash
# Runs at the filesystem-populated stage; $1 is the mount point of the new root.
set -euo pipefail
MOUNT="$1"
mkdir -p "$MOUNT/etc/firstboot.d"
cat > "$MOUNT/etc/firstboot.d/99-lab-enroll" << 'SCRIPT'
#!/bin/bash
# Install puppet agent (XCP-ng is CentOS-based, yum works)
yum install -y puppet-agent
# Configure and start
puppet config set server puppet.lab.internal
systemctl enable --now puppet
SCRIPT
chmod +x "$MOUNT/etc/firstboot.d/99-lab-enroll"
```

## Lab Install Profile Abstraction

Lab needs an `InstallerPlugin` interface so the same `lab onboard` command works
for all OS types. Each plugin handles answer file generation, PXE chain setup,
and post-install enrollment for its OS type.

```go
type InstallerPlugin interface {
	Name() string
	SupportedArchitectures() []string

	// Generate the answer/config file for unattended install
	GenerateAnswerFile(config InstallConfig) ([]byte, error)

	// Set up PXE boot artifacts (kernel, initrd, bootloader configs)
	PreparePXE(config PXEConfig) error

	// Generate post-install enrollment script
	GenerateEnrollmentScript(token string, labels []string) ([]byte, error)
}
```

Built-in installer plugins:
- `installer-autoinstall` — Ubuntu (cloud-init based autoinstall YAML)
- `installer-kickstart` — Fedora, AlmaLinux, RHEL (kickstart .ks files)
- `installer-preseed` — Debian (preseed.cfg)
- `installer-xcpng` — XCP-ng (custom XML + firstboot.d scripts)
- `installer-vyos` — VyOS (config.boot)

## Auto-Onboard Rules

Automatic onboarding based on detected hardware characteristics:

```yaml
auto-onboard:
  rules:
    - name: large-compute-to-xcpng
      conditions:
        cores: ">= 40"
        memory: ">= 500GB"
        provider: ovh
      action:
        image: xcpng-8.3
        labels: [xen-host, production]

    - name: arm-to-ubuntu
      conditions:
        arch: aarch64
      action:
        image: ubuntu-24.04
        labels: [arm, k8s-worker]
```

Must support:
- Preview: show which existing servers match/don't match rules
- Dry-run: show what would happen for pending servers
- Apply: actually onboard matching servers
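
The preview mode could be sketched roughly as follows. Everything here is hypothetical: the `Server` and `Rule` types, the idea that string conditions such as `">= 40"` have already been parsed into minimum values, and the first-match-wins ordering are assumptions, since the notes do not specify how rules are evaluated.

```go
package main

import "fmt"

// Server describes detected hardware (hypothetical shape).
type Server struct {
	Name     string
	Cores    int
	MemoryGB int
	Arch     string
	Provider string
}

// Rule is an auto-onboard rule with conditions already parsed into minimums.
// Empty Arch or Provider means "any".
type Rule struct {
	Name        string
	MinCores    int
	MinMemoryGB int
	Arch        string
	Provider    string
	Image       string
	Labels      []string
}

// Matches reports whether every condition of the rule holds for the server.
func (r Rule) Matches(s Server) bool {
	if s.Cores < r.MinCores || s.MemoryGB < r.MinMemoryGB {
		return false
	}
	if r.Arch != "" && r.Arch != s.Arch {
		return false
	}
	if r.Provider != "" && r.Provider != s.Provider {
		return false
	}
	return true
}

// Preview maps each server name to the first matching rule name ("" if none).
// It has no side effects, which is what distinguishes it from apply.
func Preview(rules []Rule, servers []Server) map[string]string {
	out := make(map[string]string, len(servers))
	for _, s := range servers {
		out[s.Name] = ""
		for _, r := range rules {
			if r.Matches(s) {
				out[s.Name] = r.Name
				break
			}
		}
	}
	return out
}

// exampleRules mirrors the two YAML rules above.
var exampleRules = []Rule{
	{Name: "large-compute-to-xcpng", MinCores: 40, MinMemoryGB: 500, Provider: "ovh",
		Image: "xcpng-8.3", Labels: []string{"xen-host", "production"}},
	{Name: "arm-to-ubuntu", Arch: "aarch64",
		Image: "ubuntu-24.04", Labels: []string{"arm", "k8s-worker"}},
}

func main() {
	servers := []Server{
		{Name: "ovh-big", Cores: 48, MemoryGB: 512, Arch: "x86_64", Provider: "ovh"},
		{Name: "ms-r1", Cores: 12, MemoryGB: 64, Arch: "aarch64"},
	}
	for name, rule := range Preview(exampleRules, servers) {
		fmt.Printf("%s -> %q\n", name, rule)
	}
}
```

Dry-run and apply could reuse the same `Matches` logic, differing only in whether the selected action is executed.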

## Deployment Approach: Universal PXE Agent + Rootfs Images

### Decision: NOT using native installers

Instead of dealing with five different installer formats (autoinstall, kickstart, preseed,
XCP-ng XML, VyOS config), Lab uses a universal approach:

1. PXE boot ONE agent OS (same for all target distros)
2. Agent contacts Lab server, gets instructions
3. Agent partitions disk, deploys rootfs tarball, injects config, reboots
4. Target OS boots with lab-agent, enrolls with OpenVox

This avoids the nightmare of maintaining five installer plugins × 3 architectures.

### Tool Evaluation

| Tool | What It Does | For Lab? |
|------|-------------|----------|
| **Tinkerbell (CNCF)** | PXE → HookOS agent → workflow actions (partition, deploy, inject) | **Best candidate to wrap** |
| **LinuxKit** | Build minimal agent OS (used by Tinkerbell's HookOS) | Build our PXE agent |
| **mkosi** | Build rootfs tarballs for any distro (Fedora, Ubuntu, Debian, etc.) | **Image production** |
| **iPXE** | Universal PXE bootloader with scripting | PXE foundation |
| **Pixiecore** | Simple Go PXE server with per-MAC API mode | PXE building block |
| **bootc** | Bootable OCI containers → install to disk (RHEL-family) | Image format option |
| **cloud-init** | First-boot config injection | Post-deploy config |
| **Packer** | Build VM/machine images | Golden image building |
| **MAAS/Curtin** | Production-grade, same pattern, but Ubuntu-centric + heavy | Too opinionated |
| **Warewulf** | Stateless/diskless boot from container images | Wrong model (RAM-only) |
| **Kairos** | Immutable k8s-focused OS from containers | Too opinionated |
| **FOG/Clonezilla** | Block-level disk cloning | Too rigid |
| **FAI** | Debian-centric installer framework | Too narrow |
| **Razor (Puppet)** | Dead (archived 2019) | Dead |
| **netboot.xyz** | PXE boot menu into native installers | Opposite of what we want |

### Tinkerbell — Closest Match

Tinkerbell already implements this pattern:
- **HookOS**: minimal agent OS built with LinuxKit, boots via PXE, multi-arch (x86 + ARM)
- **Tink Worker**: runs inside HookOS, contacts server via gRPC, executes workflows
- **Workflow Actions**:
  - `rootio` — partition disks, create filesystems
  - `archive2disk` — stream compressed rootfs tarball to mounted filesystem
  - `image2disk` — write raw disk image (dd-style)
  - `oci2disk` — pull OCI container image, write to disk
  - `writefile` — write individual files (puppet certs, config, enrollment token)
  - `cexec` — chroot and run commands (install bootloader, etc.)
  - `kexec` — kexec into new kernel (avoids reboot)

**Tinkerbell's limitation:** requires Kubernetes to run (Tink Server is k8s-native).
Options:
- Run on bootstrap node's k3s (works but adds k3s dependency before we have k3s)
- Extract just HookOS + actions, replace Tink Server with Lab's own API
- Use Tinkerbell after initial bootstrap

### Option A: Wrap Tinkerbell
Use Tinkerbell's HookOS and actions, Lab translates `lab onboard` into Tinkerbell
workflows. Proven, multi-arch, battle-tested by Equinix Metal.

### Option B: Build our own lightweight agent
If Tinkerbell's k8s dependency is too heavy:
- Build agent OS with LinuxKit (like HookOS but simpler)
- Small Go binary as the agent: contacts lab-server, gets instructions, partitions,
  deploys rootfs, injects files, installs bootloader, reboots
- Embedded in Lab binary — no k8s dependency
- Essentially "Tinkerbell actions without Tinkerbell's workflow engine"

### Decision: TBD — needs hands-on evaluation of Tinkerbell

### VyOS Inspiration

VyOS proves this pattern works:
- Image-based install (rootfs deployed to partition)
- Also runs as Docker container (same config system)
- Same concept as Lab: one definition → VM image, bare metal, or container

### Image Production Pipeline

Lab needs to produce rootfs tarballs for each OS × architecture:

```
$ lab image build ubuntu-24.04 --arch x86_64,aarch64
  → Uses mkosi or debootstrap to build rootfs
  → Injects lab-agent, cloud-init datasource
  → Produces: ubuntu-24.04-x86_64.tar.gz, ubuntu-24.04-aarch64.tar.gz

$ lab image build xcpng-8.3 --arch x86_64
  → Extract/capture rootfs from XCP-ng installer/installed system
  → Produces: xcpng-8.3-x86_64.tar.gz

$ lab image list
IMAGE          ARCH              SIZE    BUILT
ubuntu-24.04   x86_64, aarch64   850MB   2026-03-15
debian-12      x86_64, aarch64   620MB   2026-03-14
fedora-41      x86_64, aarch64   920MB   2026-03-14
almalinux-9    x86_64, aarch64   780MB   2026-03-13
xcpng-8.3      x86_64            1.2GB   2026-03-10
vyos-1.4       x86_64, aarch64   450MB   2026-03-12
```

Image build tools per OS:
- Ubuntu/Debian: debootstrap or mkosi
- Fedora/AlmaLinux: dnf --installroot or mkosi
- XCP-ng: install in QEMU + Packer, capture rootfs (only viable method)
- VyOS: extract squashfs from ISO (`unsquashfs /mnt/live/filesystem.squashfs`)
- Asahi Linux: NOT BUILDABLE — SSH onboard only, OS already installed by user
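
The artifact naming shown in the pipeline output (`<image>-<arch>.tar.gz`, one tarball per architecture) can be captured in a few lines. The function name `ArtifactNames` is hypothetical and the sketch assumes `--arch` is a comma-separated list, as in the example commands; it does not build anything.

```go
package main

import (
	"fmt"
	"strings"
)

// ArtifactNames returns one rootfs tarball name per requested architecture.
func ArtifactNames(image string, archs []string) []string {
	names := make([]string, 0, len(archs))
	for _, a := range archs {
		names = append(names, fmt.Sprintf("%s-%s.tar.gz", image, a))
	}
	return names
}

func main() {
	// Equivalent of: lab image build ubuntu-24.04 --arch x86_64,aarch64
	archFlag := "x86_64,aarch64"
	for _, n := range ArtifactNames("ubuntu-24.04", strings.Split(archFlag, ",")) {
		fmt.Println(n)
	}
}
```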
|
||||
## XCP-ng Rootfs Production: Detailed

### Why a package-based build doesn't work
- `install.img` is the installer ramdisk, NOT the target system
- The installer (`host-installer/backend.py`) does post-install XAPI setup that
  can't be replicated with just `yum --installroot`
- No known successful build of XCP-ng from packages alone
- `create-install-image` scripts only produce ISOs

### Viable approach: Packer + QEMU capture
```
1. Boot XCP-ng ISO in QEMU with answerfile (unattended)
2. Installer runs normally, does all XAPI/Xen setup
3. Mount resulting disk image
4. Tar up root partition
5. Generalize: remove SSH keys, XAPI state.db, hostname, UUIDs, persistent net rules
6. Output: xcpng-8.3-x86_64.tar.gz
```

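The tar and generalize steps can be combined by filtering host-specific files out while the archive is written. The sketch below uses Python's standard `tarfile` module. The exclude patterns are assumptions for illustration (in particular the XAPI `state.db` location and the persistent net rules path) and should be checked against a real install. Rewriting UUIDs embedded inside config files is not covered here.

```python
import fnmatch
import tarfile
from pathlib import Path

# Paths relative to the mounted root that are dropped during capture.
# These patterns are assumptions, not verified against an XCP-ng install.
GENERALIZE_EXCLUDES = [
    "etc/ssh/ssh_host_*",                        # SSH host keys
    "var/lib/xcp/state.db",                      # XAPI state database (assumed path)
    "etc/hostname",                              # host identity
    "etc/udev/rules.d/70-persistent-net.rules",  # persistent net rules (assumed path)
]


def is_excluded(rel_path: str) -> bool:
    """True if rel_path matches one of the generalization patterns."""
    return any(fnmatch.fnmatch(rel_path, pat) for pat in GENERALIZE_EXCLUDES)


def capture_rootfs(mounted_root: Path, output: Path) -> None:
    """Tar up a mounted root partition, skipping host-specific files."""

    def _filter(info: tarfile.TarInfo):
        # Returning None tells tarfile to leave this member out of the archive.
        return None if is_excluded(info.name) else info

    with tarfile.open(output, "w:gz") as tar:
        for entry in sorted(mounted_root.iterdir()):
            tar.add(entry, arcname=entry.name, filter=_filter)
```

Filtering at archive time avoids modifying the mounted disk image, so the same captured disk can be re-archived with a different exclude list.
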
### XCP-ng partition layout (PXE agent must recreate this)
```
sda1:  18GB   ext3   /           (dom0 root)
sda2:  18GB   ext3   (backup)    (upgrade slot)
sda3:  rest   LVM    (SR)        (VM storage repository)
sda4:  512MB  vfat   /boot/efi   (UEFI ESP)
sda5:  4GB    ext3   /var/log
sda6:  1GB    swap
```

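One way the PXE agent could recreate this layout is to describe it as data and derive a partitioning command from it. The sketch below builds an `sgdisk` argument list without executing it. The `Partition` class and `sgdisk_command` function are hypothetical names, and the use of `sgdisk` itself is an assumption; the document does not say which partitioning tool the agent uses. Filesystem creation is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Partition:
    number: int
    size: Optional[str]  # sgdisk size such as "18G"; None means "rest of disk"
    typecode: str        # sgdisk type code (8300 Linux, 8e00 LVM, ef00 ESP, 8200 swap)
    purpose: str


XCPNG_LAYOUT = [
    Partition(1, "18G", "8300", "dom0 root"),
    Partition(2, "18G", "8300", "backup / upgrade slot"),
    Partition(3, None, "8e00", "LVM storage repository"),
    Partition(4, "512M", "ef00", "UEFI ESP"),
    Partition(5, "4G", "8300", "/var/log"),
    Partition(6, "1G", "8200", "swap"),
]


def sgdisk_command(disk: str, layout: list[Partition]) -> list[str]:
    """Build one sgdisk invocation; the 'rest of disk' partition is created last."""
    rest = [p for p in layout if p.size is None]
    if len(rest) > 1:
        raise ValueError("at most one partition may take the rest of the disk")
    ordered = [p for p in layout if p.size is not None] + rest
    cmd = ["sgdisk"]
    for p in ordered:
        end = f"+{p.size}" if p.size else "0"  # 0 = use all remaining space
        cmd += ["-n", f"{p.number}:0:{end}", "-t", f"{p.number}:{p.typecode}"]
    return cmd + [disk]
```

The fixed-size partitions are created first so that partition 3 can claim whatever space remains, while still keeping its partition number.
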
## Asahi Linux: Special Case

### Why it can't follow the standard path
- No PXE boot: Apple Silicon only boots from internal NVMe or USB (iBoot)
- Firmware partition: m1n1 must be in Apple's APFS container, coexists with macOS
- Device tree: generated per-chip at install time
- GPU drivers: Asahi's reverse-engineered drivers are kernel-specific
- Boot chain: iBoot → m1n1 → U-Boot/GRUB → Linux (completely non-standard)

### How Lab handles it
- SSH onboard only: `lab onboard mac-studio --provider ssh --host <ip>`
- Asahi is already installed (the user did this manually or via the Asahi installer)
- Lab manages the userspace (Fedora-based) via Puppet as usual
- Kernel updates come from the Asahi repos, managed by Puppet/dnf
- The m1n1/U-Boot/firmware layer is untouched by Lab

### Lesson
Not everything is PXE-bootable. Lab needs two onboard paths:
- **PXE onboard**: bare metal with no OS (Beelinks, OVH servers, XCP-ng hosts)
- **SSH onboard**: OS already installed (Mac Studio, DGX Spark, cloud VMs)

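The choice between the two paths can be reduced to a small decision function. This is a sketch under simplifying assumptions: the `OnboardPath` enum and `choose_onboard_path` function are hypothetical, and a machine with an OS already installed is always routed to SSH onboard, even though such a machine could in principle also be re-imaged over PXE.

```python
from enum import Enum


class OnboardPath(Enum):
    PXE = "pxe"
    SSH = "ssh"


def choose_onboard_path(os_installed: bool, pxe_capable: bool) -> OnboardPath:
    """Pick an onboard path from two facts about the machine."""
    if os_installed:
        # Mac Studio, DGX Spark, cloud VMs: keep the existing OS.
        return OnboardPath.SSH
    if pxe_capable:
        # Bare metal with no OS: deploy a rootfs over the network.
        return OnboardPath.PXE
    raise ValueError(
        "machine has no OS and cannot PXE boot; install an OS manually first"
    )
```
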
## Image Deployment Matrix

```
               PXE Deploy     SSH Onboard    Container   VM Image
Ubuntu 24.04   ✓ rootfs       ✓              ✓           ✓ qcow2
Debian 12      ✓ rootfs       ✓              ✓           ✓ qcow2
Fedora 41      ✓ rootfs       ✓              ✓           ✓ qcow2
AlmaLinux 9    ✓ rootfs       ✓              ✓           ✓ qcow2
XCP-ng 8.3     ✓ rootfs       ✓ (existing)   ✗           ✗
VyOS 1.4       ✓ rootfs       ✓ (existing)   ✓ docker    ✓ qcow2
Asahi Linux    ✗ impossible   ✓ (only way)   ✗           ✗
```

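The matrix lends itself to a lookup that the CLI could consult before attempting a deploy. The sketch below encodes the table as sets of supported targets; the image keys, target names, and function names are illustrative choices, not Lab's actual identifiers.

```python
# Supported deployment targets per image, transcribed from the matrix above.
# Keys and target names are hypothetical identifiers.
ALL_TARGETS = {"pxe", "ssh", "container", "vm"}

DEPLOYMENT_MATRIX = {
    "ubuntu-24.04": set(ALL_TARGETS),
    "debian-12": set(ALL_TARGETS),
    "fedora-41": set(ALL_TARGETS),
    "almalinux-9": set(ALL_TARGETS),
    "xcpng-8.3": {"pxe", "ssh"},
    "vyos-1.4": set(ALL_TARGETS),
    "asahi": {"ssh"},
}


def supports(image: str, target: str) -> bool:
    """True if the image can be deployed through the given target."""
    if target not in ALL_TARGETS:
        raise ValueError(f"unknown target: {target}")
    return target in DEPLOYMENT_MATRIX.get(image, set())


def images_for(target: str) -> list[str]:
    """List images that support a target, sorted by name."""
    return sorted(img for img in DEPLOYMENT_MATRIX if supports(img, target))
```
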
## Automated Image Pipeline

Images must be rebuilt regularly to include security updates and new lab-agent versions.

### Pipeline Configuration
```yaml
image-pipelines:
  ubuntu-24.04:
    method: debootstrap
    schedule: weekly
    architectures: [x86_64, aarch64]
    outputs: [rootfs-tarball, container-base, qcow2]
    retention: 4 builds

  xcpng-8.3:
    method: packer-qemu          # install in QEMU, capture
    schedule: monthly
    architectures: [x86_64]
    outputs: [rootfs-tarball]
    retention: 3 builds

  vyos-1.4:
    method: squashfs-extract     # extract from ISO
    schedule: monthly
    architectures: [x86_64, aarch64]
    outputs: [rootfs-tarball, container-base]
    retention: 3 builds
```

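The `retention` field implies a pruning step after each build. A minimal sketch of how a value such as `4 builds` might be interpreted follows. It assumes builds are identified by ISO 8601 date strings, as in the `lab image list` output. The function names are hypothetical.

```python
import re


def parse_retention(value: str) -> int:
    """Parse a retention string such as '4 builds' into a build count."""
    match = re.fullmatch(r"(\d+) builds?", value.strip())
    if not match:
        raise ValueError(f"unsupported retention value: {value!r}")
    count = int(match.group(1))
    if count < 1:
        raise ValueError("retention must keep at least one build")
    return count


def builds_to_prune(build_dates: list[str], retention: str) -> list[str]:
    """Return the builds to delete, keeping the newest N.

    ISO 8601 dates sort correctly as plain strings, so no date parsing is needed.
    """
    keep = parse_retention(retention)
    newest_first = sorted(build_dates, reverse=True)
    return newest_first[keep:]
```

For example, with five weekly builds and `retention: 4 builds`, only the oldest build is returned for deletion.
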
### Build runs on Lab itself (dogfooding)
- x86 images build on x86 machines (Beelink SER9 MAX)
- ARM images build on ARM machines (DGX Spark, Minisforum)
- XCP-ng builds on any x86 with QEMU/KVM
- Lab picks the right builder based on architecture

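Builder selection could be as simple as matching the image architecture against an inventory of build-capable machines. In the sketch below, the `Builder` class, the `has_kvm` flag, and `pick_builder` are assumptions made for illustration; in Lab this information would presumably come from labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Builder:
    name: str
    arch: str
    has_kvm: bool = False  # needed for the install-in-QEMU build method


def pick_builder(builders: list[Builder], image_arch: str,
                 needs_kvm: bool = False) -> Builder:
    """Return the first builder whose architecture (and KVM support) fits."""
    for builder in builders:
        if builder.arch == image_arch and (builder.has_kvm or not needs_kvm):
            return builder
    raise LookupError(f"no builder available for {image_arch} (kvm={needs_kvm})")
```

A hypothetical inventory might list a Beelink as `Builder("beelink-ser9-max", "x86_64", has_kvm=True)` and a DGX Spark as `Builder("dgx-spark", "aarch64")`; an XCP-ng build would then call `pick_builder(inventory, "x86_64", needs_kvm=True)`.
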
### Upgrade flow
- When a new image is built, Lab knows which servers still run the old version
- `lab image diff` shows package changes
- `lab image promote` makes the new image the default for new deploys
- Existing servers: Puppet manages package updates (servers are not re-imaged unless requested)

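The package comparison behind `lab image diff` can be sketched as a diff of two name-to-version maps. The actual output format of the command is not specified in this document, so the return structure below is an assumption, as is the function name.

```python
def diff_packages(old: dict[str, str], new: dict[str, str]) -> dict[str, list]:
    """Compare two {package: version} maps from an old and a new image build."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # Packages present in both builds whose version string differs.
    changed = sorted(
        (name, old[name], new[name])
        for name in set(old) & set(new)
        if old[name] != new[name]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

Version strings are compared for equality only, which is enough to list changes without implementing dpkg or RPM version ordering.
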
### Connection to Puppet → Container Artifact Builder

Same pipeline, different output targets:

```
Label "mailserver" + base image "ubuntu-24.04":
  → rootfs + Puppet classes = bare metal image (tar.gz for PXE deploy)
  → rootfs + Puppet classes = container image (OCI for k8s/docker)
  → rootfs + Puppet classes = VM image (qcow2/vmdk for XCP-ng/AWS)

One label, one set of Puppet modules, three deployment formats.
```

## Multi-Architecture Considerations

- PXE boot chain differs between x86 (BIOS/UEFI) and ARM (UEFI only)
- Need separate kernel/initrd per architecture for the agent OS
- Rootfs tarballs are architecture-specific
- Some OS images don't exist for all architectures (XCP-ng = x86 only)
- Lab must track architecture per image and refuse mismatches
- Tinkerbell's HookOS already builds for x86_64 and aarch64

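The "refuse mismatches" rule can be enforced with a small check before any deploy. The sketch below also normalizes common architecture aliases (`amd64`, `arm64`), since different tools report them differently. The alias table and all names are assumptions, not part of Lab.

```python
# Map common aliases to the names used in this document (x86_64, aarch64).
ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "aarch64",
    "arm64": "aarch64",
}


class ArchMismatch(Exception):
    """Raised when an image has no build for the target machine's architecture."""


def normalize_arch(arch: str) -> str:
    try:
        return ARCH_ALIASES[arch.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown architecture: {arch}") from None


def check_deploy(image: str, image_archs: set[str], machine_arch: str) -> str:
    """Return the normalized architecture to deploy, or refuse the deploy."""
    wanted = normalize_arch(machine_arch)
    available = {normalize_arch(a) for a in image_archs}
    if wanted not in available:
        raise ArchMismatch(
            f"{image} is built for {sorted(available)}, not {wanted}"
        )
    return wanted
```
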