Files
lab/bastion/tests/integration/helpers/pxe-vm.ts

182 lines
6.4 KiB
TypeScript
Raw Normal View History

feat: install logging, error trapping, PXE/ISO integration tests Kickstart installs on real hardware failed silently — no error reporting, only 3 progress callbacks, zero log streaming. This overhaul makes every install fully observable. Kickstart improvements: - Error trapping in %pre and %post (trap ERR sends failure details to bastion) - 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata - Background log streamer: tails %post output and batch-sends to /api/log - bastion_log() function for explicit log lines from kickstart scripts Bastion API: - POST /api/log — receives raw log lines from kickstart (single or batch) - InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence - GET /api/logs/:mac — now returns log_lines + log_total alongside stages - SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log) - Progress events forwarded to labd via bastion-progress WebSocket message - Post-provision k3s logs routed through progressBus (was console-only) dnsmasq fixes found during VM testing: - HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach) - pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode) - PXEClient vendor class echo for UEFI firmware compatibility Integration tests: - PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install - ISO boot test: blank VM boots from bastion-generated ISO → same flow - Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot) - test-provision.sh: runs both PXE + ISO tests with prerequisite checks - 250GB sparse QCOW2 disk (LVM layout needs ~204GB) 201 unit tests passing (11 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
// Create a blank UEFI VM for PXE boot testing.
// Unlike cloud image VMs, these have an empty disk and boot from network.
import { execSync, spawnSync, type SpawnSyncReturns } from "node:child_process";
import { existsSync } from "node:fs";
import { join } from "node:path";
import { log } from "./libvirt.js";
const IMAGE_DIR = "/var/lib/libvirt/images";
const IS_ROOT = process.getuid?.() === 0;
function run(cmd: string, opts?: { timeout?: number }): string {
const full = IS_ROOT ? cmd : `sudo ${cmd}`;
return execSync(full, { encoding: "utf-8", stdio: "pipe", timeout: opts?.timeout ?? 60_000 });
}
function virsh(...args: string[]): SpawnSyncReturns<string> {
const cmd = IS_ROOT ? "virsh" : "sudo";
const finalArgs = IS_ROOT ? args : ["virsh", ...args];
return spawnSync(cmd, finalArgs, { encoding: "utf-8", stdio: "pipe", timeout: 30_000 });
}
export interface PxeVmConfig {
name: string;
memory: number; // MB
vcpus: number;
diskSize: number; // GB
network: string; // libvirt network name
arch?: "x86_64" | "aarch64";
}
/** Create a blank UEFI VM that PXE boots from the network. */
export function createPxeVm(config: PxeVmConfig): void {
destroyPxeVm(config.name);
const arch = config.arch ?? "x86_64";
log(`Creating PXE VM: ${config.name} (${arch}, ${config.memory}MB RAM, ${config.vcpus} vCPU, ${config.diskSize}GB disk)`);
// Create blank disk
const diskPath = join(IMAGE_DIR, `${config.name}.qcow2`);
run(`qemu-img create -f qcow2 "${diskPath}" ${config.diskSize}G`);
// UEFI firmware paths (Fedora)
if (arch === "aarch64") {
const aavmf = "/usr/share/edk2/aarch64/QEMU_EFI.fd";
if (!existsSync(aavmf)) {
throw new Error(`AAVMF firmware not found at ${aavmf}. Install: sudo dnf install edk2-aarch64`);
}
} else {
const ovmf = "/usr/share/edk2/ovmf/OVMF_CODE.fd";
if (!existsSync(ovmf)) {
throw new Error(`OVMF firmware not found at ${ovmf}. Install: sudo dnf install edk2-ovmf`);
}
feat: install logging, error trapping, PXE/ISO integration tests Kickstart installs on real hardware failed silently — no error reporting, only 3 progress callbacks, zero log streaming. This overhaul makes every install fully observable. Kickstart improvements: - Error trapping in %pre and %post (trap ERR sends failure details to bastion) - 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata - Background log streamer: tails %post output and batch-sends to /api/log - bastion_log() function for explicit log lines from kickstart scripts Bastion API: - POST /api/log — receives raw log lines from kickstart (single or batch) - InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence - GET /api/logs/:mac — now returns log_lines + log_total alongside stages - SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log) - Progress events forwarded to labd via bastion-progress WebSocket message - Post-provision k3s logs routed through progressBus (was console-only) dnsmasq fixes found during VM testing: - HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach) - pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode) - PXEClient vendor class echo for UEFI firmware compatibility Integration tests: - PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install - ISO boot test: blank VM boots from bastion-generated ISO → same flow - Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot) - test-provision.sh: runs both PXE + ISO tests with prerequisite checks - 250GB sparse QCOW2 disk (LVM layout needs ~204GB) 201 unit tests passing (11 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
}
const virtInstallArgs = [
"virt-install",
`--name=${config.name}`,
`--memory=${config.memory}`,
`--vcpus=${config.vcpus}`,
`--disk=path=${diskPath},format=qcow2,bus=virtio`,
`--network=network=${config.network},model=virtio`,
// UEFI firmware — required for PXE boot in modern mode
`--boot=uefi,network`,
// No OS to install — PXE provides everything
"--os-variant=generic",
"--noautoconsole",
"--wait=0",
// Graphics for debugging (VNC, connect with virt-viewer if needed)
"--graphics=vnc,listen=127.0.0.1",
];
if (arch === "aarch64") {
virtInstallArgs.push("--arch=aarch64", "--machine=virt");
}
log(`Running: virt-install --name=${config.name} --boot=uefi,network ...`);
run(virtInstallArgs.join(" "), { timeout: 30_000 });
log(`PXE VM ${config.name} created and booting from network`);
}
/** Destroy a PXE VM and clean up its disk. */
export function destroyPxeVm(name: string): void {
const result = virsh("dominfo", name);
if (result.status !== 0) return;
log(`Destroying PXE VM: ${name}`);
virsh("destroy", name);
virsh("undefine", name, "--remove-all-storage", "--nvram");
}
/** Get the MAC address of a VM's first NIC. */
export function getVmMac(name: string): string | null {
const result = virsh("domiflist", name);
if (result.status !== 0) return null;
// Output format: Interface Type Source Model MAC
const match = result.stdout.match(/([0-9a-f]{2}:[0-9a-f]{2}:[0-9a-f]{2}:[0-9a-f]{2}:[0-9a-f]{2}:[0-9a-f]{2})/i);
return match ? match[1].toLowerCase() : null;
}
/** Reboot a VM (force off + start). */
export function rebootPxeVm(name: string): void {
log(`Rebooting PXE VM: ${name}`);
virsh("destroy", name);
// Brief pause to let resources release
spawnSync("sleep", ["2"]);
virsh("start", name);
log(`PXE VM ${name} restarted`);
}
/** Change VM boot order to disk first (skip PXE on next boot). */
export function setBootDisk(name: string): void {
log(`Setting ${name} boot order to disk first`);
virsh("destroy", name);
spawnSync("sleep", ["2"]);
// Get current XML, replace boot dev='network' with boot dev='hd'
// This preserves UEFI loader/nvram settings (virt-xml --boot hd can break them)
const dumpXml = virsh("dumpxml", name);
if (dumpXml.status !== 0) throw new Error("Failed to dump VM XML");
let xml = dumpXml.stdout;
// Replace any <boot dev='...' /> entries with hd
xml = xml.replace(/<boot dev='[^']*'\/>/g, "<boot dev='hd'/>");
// If no boot dev entry, add one before </os>
if (!xml.includes("<boot dev=")) {
xml = xml.replace("</os>", " <boot dev='hd'/>\n </os>");
}
const xmlPath = `/tmp/${name}-bootfix.xml`;
const { writeFileSync: writeFs, unlinkSync: unlinkFs } = require("node:fs") as typeof import("node:fs");
writeFs(xmlPath, xml);
run(`virsh define "${xmlPath}"`);
try { unlinkFs(xmlPath); } catch { /* ignore */ }
virsh("start", name);
log(`${name} restarted with disk boot (UEFI preserved)`);
}
feat: install logging, error trapping, PXE/ISO integration tests Kickstart installs on real hardware failed silently — no error reporting, only 3 progress callbacks, zero log streaming. This overhaul makes every install fully observable. Kickstart improvements: - Error trapping in %pre and %post (trap ERR sends failure details to bastion) - 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata - Background log streamer: tails %post output and batch-sends to /api/log - bastion_log() function for explicit log lines from kickstart scripts Bastion API: - POST /api/log — receives raw log lines from kickstart (single or batch) - InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence - GET /api/logs/:mac — now returns log_lines + log_total alongside stages - SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log) - Progress events forwarded to labd via bastion-progress WebSocket message - Post-provision k3s logs routed through progressBus (was console-only) dnsmasq fixes found during VM testing: - HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach) - pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode) - PXEClient vendor class echo for UEFI firmware compatibility Integration tests: - PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install - ISO boot test: blank VM boots from bastion-generated ISO → same flow - Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot) - test-provision.sh: runs both PXE + ISO tests with prerequisite checks - 250GB sparse QCOW2 disk (LVM layout needs ~204GB) 201 unit tests passing (11 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
export interface IsoVmConfig {
name: string;
memory: number; // MB
vcpus: number;
diskSize: number; // GB
network: string; // libvirt network name
isoPath: string; // path to boot ISO
arch?: "x86_64" | "aarch64";
feat: install logging, error trapping, PXE/ISO integration tests Kickstart installs on real hardware failed silently — no error reporting, only 3 progress callbacks, zero log streaming. This overhaul makes every install fully observable. Kickstart improvements: - Error trapping in %pre and %post (trap ERR sends failure details to bastion) - 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata - Background log streamer: tails %post output and batch-sends to /api/log - bastion_log() function for explicit log lines from kickstart scripts Bastion API: - POST /api/log — receives raw log lines from kickstart (single or batch) - InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence - GET /api/logs/:mac — now returns log_lines + log_total alongside stages - SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log) - Progress events forwarded to labd via bastion-progress WebSocket message - Post-provision k3s logs routed through progressBus (was console-only) dnsmasq fixes found during VM testing: - HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach) - pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode) - PXEClient vendor class echo for UEFI firmware compatibility Integration tests: - PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install - ISO boot test: blank VM boots from bastion-generated ISO → same flow - Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot) - test-provision.sh: runs both PXE + ISO tests with prerequisite checks - 250GB sparse QCOW2 disk (LVM layout needs ~204GB) 201 unit tests passing (11 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
}
/** Create a UEFI VM that boots from a CD-ROM ISO (not PXE). */
export function createIsoVm(config: IsoVmConfig): void {
destroyPxeVm(config.name);
const arch = config.arch ?? "x86_64";
log(`Creating ISO boot VM: ${config.name} (${arch}, ${config.memory}MB RAM, ${config.vcpus} vCPU, ${config.diskSize}GB disk)`);
feat: install logging, error trapping, PXE/ISO integration tests Kickstart installs on real hardware failed silently — no error reporting, only 3 progress callbacks, zero log streaming. This overhaul makes every install fully observable. Kickstart improvements: - Error trapping in %pre and %post (trap ERR sends failure details to bastion) - 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata - Background log streamer: tails %post output and batch-sends to /api/log - bastion_log() function for explicit log lines from kickstart scripts Bastion API: - POST /api/log — receives raw log lines from kickstart (single or batch) - InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence - GET /api/logs/:mac — now returns log_lines + log_total alongside stages - SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log) - Progress events forwarded to labd via bastion-progress WebSocket message - Post-provision k3s logs routed through progressBus (was console-only) dnsmasq fixes found during VM testing: - HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach) - pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode) - PXEClient vendor class echo for UEFI firmware compatibility Integration tests: - PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install - ISO boot test: blank VM boots from bastion-generated ISO → same flow - Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot) - test-provision.sh: runs both PXE + ISO tests with prerequisite checks - 250GB sparse QCOW2 disk (LVM layout needs ~204GB) 201 unit tests passing (11 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
// Create blank disk
const diskPath = join(IMAGE_DIR, `${config.name}.qcow2`);
run(`qemu-img create -f qcow2 "${diskPath}" ${config.diskSize}G`);
const virtInstallArgs = [
"virt-install",
`--name=${config.name}`,
`--memory=${config.memory}`,
`--vcpus=${config.vcpus}`,
`--disk=path=${diskPath},format=qcow2,bus=virtio`,
// Boot ISO as CD-ROM
`--disk=path=${config.isoPath},device=cdrom,readonly=on`,
`--network=network=${config.network},model=virtio`,
// UEFI firmware, boot from cdrom (not network)
"--boot=uefi,cdrom",
"--os-variant=generic",
"--noautoconsole",
"--wait=0",
"--graphics=vnc,listen=127.0.0.1",
];
if (arch === "aarch64") {
virtInstallArgs.push("--arch=aarch64", "--machine=virt");
}
feat: install logging, error trapping, PXE/ISO integration tests Kickstart installs on real hardware failed silently — no error reporting, only 3 progress callbacks, zero log streaming. This overhaul makes every install fully observable. Kickstart improvements: - Error trapping in %pre and %post (trap ERR sends failure details to bastion) - 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata - Background log streamer: tails %post output and batch-sends to /api/log - bastion_log() function for explicit log lines from kickstart scripts Bastion API: - POST /api/log — receives raw log lines from kickstart (single or batch) - InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence - GET /api/logs/:mac — now returns log_lines + log_total alongside stages - SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log) - Progress events forwarded to labd via bastion-progress WebSocket message - Post-provision k3s logs routed through progressBus (was console-only) dnsmasq fixes found during VM testing: - HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach) - pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode) - PXEClient vendor class echo for UEFI firmware compatibility Integration tests: - PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install - ISO boot test: blank VM boots from bastion-generated ISO → same flow - Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot) - test-provision.sh: runs both PXE + ISO tests with prerequisite checks - 250GB sparse QCOW2 disk (LVM layout needs ~204GB) 201 unit tests passing (11 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
log(`Running: virt-install --name=${config.name} --boot=uefi,cdrom ...`);
run(virtInstallArgs.join(" "), { timeout: 60_000 });
feat: install logging, error trapping, PXE/ISO integration tests Kickstart installs on real hardware failed silently — no error reporting, only 3 progress callbacks, zero log streaming. This overhaul makes every install fully observable. Kickstart improvements: - Error trapping in %pre and %post (trap ERR sends failure details to bastion) - 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata - Background log streamer: tails %post output and batch-sends to /api/log - bastion_log() function for explicit log lines from kickstart scripts Bastion API: - POST /api/log — receives raw log lines from kickstart (single or batch) - InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence - GET /api/logs/:mac — now returns log_lines + log_total alongside stages - SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log) - Progress events forwarded to labd via bastion-progress WebSocket message - Post-provision k3s logs routed through progressBus (was console-only) dnsmasq fixes found during VM testing: - HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach) - pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode) - PXEClient vendor class echo for UEFI firmware compatibility Integration tests: - PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install - ISO boot test: blank VM boots from bastion-generated ISO → same flow - Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot) - test-provision.sh: runs both PXE + ISO tests with prerequisite checks - 250GB sparse QCOW2 disk (LVM layout needs ~204GB) 201 unit tests passing (11 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
log(`ISO boot VM ${config.name} created and booting from ISO`);
}