Root cause found: console=ttyS0,115200n8 causes 30-second timeout at every
systemd boot phase on hardware without a physical serial UART. Each phase
transition blocks waiting for the non-existent UART.
Changes:
- Remove console=ttyS0 from kickstart bootloader args and %post setup
- Enable Anaconda syslog forwarding (logging --host --port) for install visibility
- Improve syslog IP→MAC resolution (register from kickstart fetch + progress)
- Fix disk auto-detect: default to empty string (not /dev/sda) for NVMe support
- Enable SysRq magic keys (kernel.sysrq=1) for emergency reboot via JetKVM
- Simplify debug command: remove --sshd flag (inst.sshd always available),
add /debug-setup.sh HTTP endpoint for nc listener setup
- Add labctl provision logs -f (follow mode with polling)
- Add syslog listener unit tests
- Enable syslog log capture test in integration suite
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- logging --host blocks Anaconda when syslog UDP port not reachable
- nomodeset prevents amdgpu hang on SER9MAX (Radeon 780M)
- JetKVM helper script for device control (status, reboot, power)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Changed admin user from 'michal' to generic 'lab' user.
SSH key auth works for both root and lab user.
21/22 tests pass (1 skipped: log lines, needs log streamer redesign).
Bisection complete — all features work except background log streamer
which prevents Anaconda from syncing filesystem writes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All accumulated changes to kickstart template, test infrastructure,
and dnsmasq config. None of these produce a clean boot yet — saving
state before reverting to baseline for bisection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- VMs get serial console on TCP (PXE: port 4555, ISO: port 4556)
- serialExec() helper: runs commands via telnet when SSH/network is down
- PXE test: on SSH failure, dumps hostname, IP, NetworkManager, sshd,
failed units, and fstab via serial console before failing
- Kickstart enables serial-getty@ttyS0 for auto-login on serial
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Kickstart %post now restores network-first EFI boot order (undoes
Anaconda's disk-first default). Grep pattern includes HTTP boot entries.
- Test force-restarts VM after install so OVMF rereads NVRAM.
- VM successfully network-boots after install, hits /dispatch, bastion
returns exit (local boot). Confirmed in test logs.
- nofail on /boot/efi fstab entry prevents emergency mode.
- Remaining: Fedora disk boot after iPXE exit may still fail.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ARM integration test:
- arm-iso-provision.test.ts: aarch64 VM boots from bastion-generated ISO
- Uses QEMU aarch64 emulation (slow but validates the R1 scenario)
- Generous timeouts for emulated boot (15min discovery, 60min install)
- test-provision.sh updated: `sudo ./scripts/test-provision.sh arm`
VM boot fixes:
- setBootDisk() preserves UEFI loader/nvram when switching to disk boot
- /boot/efi mount gets nofail in fstab (prevents emergency mode in VMs)
- chronyd enable uses || true (fails in kickstart chroot)
- createIsoVm supports arch parameter for ARM VMs
Note: SSH-after-reboot in OVMF VMs still fails — OVMF doesn't respect
efibootmgr changes and loops PXE/HTTP Boot. Real hardware works fine.
The install flow itself (discovery → kickstart → complete) is validated.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kickstart installs on real hardware failed silently — no error reporting,
only 3 progress callbacks, zero log streaming. This overhaul makes every
install fully observable.
Kickstart improvements:
- Error trapping in %pre and %post (trap ERR sends failure details to bastion)
- 12+ granular progress stages (was 3): SSH, hostname, k3s prep, EFI boot, metadata
- Background log streamer: tails %post output and batch-sends to /api/log
- bastion_log() function for explicit log lines from kickstart scripts
Bastion API:
- POST /api/log — receives raw log lines from kickstart (single or batch)
- InstallLogBuffer — per-MAC ring buffer (2000 lines) + file persistence
- GET /api/logs/:mac — now returns log_lines + log_total alongside stages
- SSE /api/logs/:mac/follow — uses named events (event: stage vs event: log)
- Progress events forwarded to labd via bastion-progress WebSocket message
- Post-provision k3s logs routed through progressBus (was console-only)
dnsmasq fixes found during VM testing:
- HTTP Boot filename: ipxe-real.efi → ipxe.efi (leftover from old 2-stage approach)
- pxe-service directives: only in proxy mode (breaks OVMF PXE in full mode)
- PXEClient vendor class echo for UEFI firmware compatibility
Integration tests:
- PXE boot test: blank UEFI VM → dnsmasq → HTTP Boot → iPXE → bastion → install
- ISO boot test: blank VM boots from bastion-generated ISO → same flow
- Shared helpers: pxe-network (no DHCP, nftables fix), pxe-vm (UEFI + ISO boot)
- test-provision.sh: runs both PXE + ISO tests with prerequisite checks
- 250GB sparse QCOW2 disk (LVM layout needs ~204GB)
201 unit tests passing (11 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>