feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
// Libvirt network for PXE boot testing.
|
|
|
|
|
// Unlike the regular test network, this one has NO DHCP —
|
|
|
|
|
// the bastion provides full DHCP + PXE on this network.
|
|
|
|
|
|
|
|
|
|
import { execSync, spawnSync } from "node:child_process";
|
|
|
|
|
import { writeFileSync, unlinkSync } from "node:fs";
|
|
|
|
|
import { log } from "./libvirt.js";
|
|
|
|
|
|
|
|
|
|
export const PXE_NETWORK_NAME = "lab-pxe-test";
|
|
|
|
|
export const PXE_BRIDGE = "virbr-pxe";
|
|
|
|
|
export const PXE_SUBNET = "192.168.251";
|
|
|
|
|
export const PXE_GATEWAY = `${PXE_SUBNET}.1`;
|
|
|
|
|
|
|
|
|
|
const IS_ROOT = process.getuid?.() === 0;
|
|
|
|
|
|
|
|
|
|
function run(cmd: string): string {
|
|
|
|
|
const full = IS_ROOT ? cmd : `sudo ${cmd}`;
|
|
|
|
|
return execSync(full, { encoding: "utf-8", stdio: "pipe" });
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function virsh(...args: string[]): { status: number; stdout: string } {
|
|
|
|
|
const cmd = IS_ROOT ? "virsh" : "sudo";
|
|
|
|
|
const finalArgs = IS_ROOT ? args : ["virsh", ...args];
|
|
|
|
|
const result = spawnSync(cmd, finalArgs, { encoding: "utf-8", stdio: "pipe" });
|
|
|
|
|
return { status: result.status ?? 1, stdout: result.stdout ?? "" };
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// No <dhcp> section — bastion dnsmasq provides full DHCP + PXE
|
|
|
|
|
const NETWORK_XML = `<network>
|
|
|
|
|
<name>${PXE_NETWORK_NAME}</name>
|
|
|
|
|
<forward mode='nat'/>
|
|
|
|
|
<bridge name='${PXE_BRIDGE}' stp='on' delay='0'/>
|
|
|
|
|
<ip address='${PXE_GATEWAY}' netmask='255.255.255.0'>
|
|
|
|
|
</ip>
|
|
|
|
|
</network>`;
|
|
|
|
|
|
|
|
|
|
/** Ensure the PXE test network exists and is active (no DHCP). */
|
|
|
|
|
export function ensurePxeNetwork(): void {
|
|
|
|
|
const result = virsh("net-info", PXE_NETWORK_NAME);
|
|
|
|
|
|
|
|
|
|
if (result.status === 0 && result.stdout.includes("Active: yes")) {
|
|
|
|
|
log(`Network ${PXE_NETWORK_NAME} already active`);
|
2026-03-28 20:24:14 +00:00
|
|
|
} else {
|
|
|
|
|
// Destroy existing if present but inactive
|
|
|
|
|
if (result.status === 0) {
|
|
|
|
|
virsh("net-destroy", PXE_NETWORK_NAME);
|
|
|
|
|
virsh("net-undefine", PXE_NETWORK_NAME);
|
|
|
|
|
}
|
feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
|
2026-03-28 20:24:14 +00:00
|
|
|
const xmlPath = "/tmp/lab-pxe-test-network.xml";
|
|
|
|
|
writeFileSync(xmlPath, NETWORK_XML);
|
|
|
|
|
|
|
|
|
|
log(`Creating PXE libvirt network: ${PXE_NETWORK_NAME} (${PXE_SUBNET}.0/24, no DHCP)`);
|
|
|
|
|
run(`virsh net-define "${xmlPath}"`);
|
|
|
|
|
run(`virsh net-start "${PXE_NETWORK_NAME}"`);
|
feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
|
2026-03-28 20:24:14 +00:00
|
|
|
try { unlinkSync(xmlPath); } catch { /* ignore */ }
|
feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
|
2026-03-28 20:24:14 +00:00
|
|
|
log(`Network ${PXE_NETWORK_NAME} created and active`);
|
|
|
|
|
}
|
feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
|
2026-03-28 20:24:14 +00:00
|
|
|
// Libvirt adds nftables reject rules for NAT networks that block host→VM SSH.
|
|
|
|
|
// Delete them now and after every VM reboot (libvirt recreates them).
|
|
|
|
|
deleteNftablesRejectRules();
|
|
|
|
|
}
|
feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
|
2026-03-28 20:24:14 +00:00
|
|
|
/** Delete libvirt's nftables reject rules for our bridge so host→VM traffic works.
|
|
|
|
|
* Must be called after every VM start/restart — libvirt recreates them. */
|
|
|
|
|
export function deleteNftablesRejectRules(): void {
|
|
|
|
|
// libvirt uses "ip libvirt_network" table (not "inet libvirt")
|
|
|
|
|
const tables = ["ip libvirt_network", "ip6 libvirt_network", "inet libvirt"];
|
|
|
|
|
for (const table of tables) {
|
|
|
|
|
try {
|
|
|
|
|
for (const chain of ["guest_input", "guest_output"]) {
|
|
|
|
|
const output = run(`nft -a list chain ${table} ${chain} 2>/dev/null || true`);
|
|
|
|
|
for (const line of output.split("\n")) {
|
|
|
|
|
if (line.includes(PXE_BRIDGE) && line.includes("reject")) {
|
|
|
|
|
const handleMatch = line.match(/# handle (\d+)/);
|
|
|
|
|
if (handleMatch) {
|
|
|
|
|
run(`nft delete rule ${table} ${chain} handle ${handleMatch[1]}`);
|
|
|
|
|
}
|
feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
2026-03-28 20:24:14 +00:00
|
|
|
} catch { /* table may not exist */ }
|
feat: install logging, error trapping, PXE/ISO integration tests
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:26:33 +00:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/** Destroy the PXE test network. */
|
|
|
|
|
export function destroyPxeNetwork(): void {
|
|
|
|
|
log(`Destroying PXE network: ${PXE_NETWORK_NAME}`);
|
|
|
|
|
// nftables rules are cleaned up when the network is destroyed (libvirt removes them)
|
|
|
|
|
virsh("net-destroy", PXE_NETWORK_NAME);
|
|
|
|
|
virsh("net-undefine", PXE_NETWORK_NAME);
|
|
|
|
|
}
|