Kickstart Reference — Lessons Learned
This documents pitfalls discovered during PXE boot testing. Read before modifying
the kickstart template (src/bastion/src/templates/install.ks.ts).
Package requirements
kernel-modules is mandatory
@core only installs kernel-modules-core, which lacks common modules like vfat,
zram, and many network/filesystem drivers. Without kernel-modules:
- /boot/efi (FAT32) cannot mount → systemd-remount-fs fails → root stays read-only → sshd-keygen can't write host keys → SSH unreachable
- zram-generator fails → can trigger emergency mode
Always include kernel-modules in %packages. This matches what the real
labmaster (192.168.8.11) has installed.
This regression was introduced in commit fac14b6, which removed @server-product
(that group pulled in kernel-modules via fedora-release-server).
dosfstools is needed
Provides mkfs.vfat and ensures FAT filesystem support is available. The real
labmaster has it installed.
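One way to catch this kind of regression before booting anything is to check the rendered kickstart itself. The following is a minimal sketch, not part of the repo: ks_missing_packages is a hypothetical helper, and it assumes the rendered kickstart arrives on stdin with package names listed one per line between %packages and %end.

```shell
# Hypothetical helper: report required packages absent from %packages.
# Reads a rendered kickstart on stdin; arguments are the package names to look for.
# Prints "missing: <pkg>" for each absent package and returns 1 if any are missing.
ks_missing_packages() {
  local ks pkg missing=0
  ks="$(cat)"
  for pkg in "$@"; do
    if ! printf '%s\n' "$ks" | awk -v want="$pkg" '
        /^%packages/ { in_pkgs = 1; next }   # entering a %packages section
        /^%end/      { in_pkgs = 0 }         # leaving whatever section we were in
        in_pkgs && $1 == want { found = 1 }  # exact match on the package name
        END { exit !found }'; then
      echo "missing: $pkg"
      missing=1
    fi
  done
  return "$missing"
}
```

Assuming the template has been rendered to a file (the name rendered.ks is made up here), usage would look like: ks_missing_packages kernel-modules dosfstools < rendered.ks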
Verify against the real machine
Before changing the package list, SSH to the labmaster and compare:
ssh 192.168.8.11 "rpm -q <package>"
Anaconda %post execution order
This is critical and not well documented:
- %pre scripts run
- Disk partitioning and formatting
- Package installation
- Anaconda writes system config (fstab, hostname, etc.)
- %post scripts run (in chroot of installed system)
- %post --nochroot scripts run
- Anaconda MAY overwrite fstab again after %post scripts
Consequence: You cannot reliably modify /etc/fstab from %post or
%post --nochroot. Anaconda overwrites it. Tested and confirmed — both
sed in %post and %post --nochroot had no effect on the final fstab.
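To see what actually ended up in the final fstab, it helps to look at the installed system after first boot rather than trusting the %post script. A small sketch, assuming the standard whitespace-separated fstab layout; fstab_options is a hypothetical helper name:

```shell
# Hypothetical helper: print the mount options (4th field) of the fstab
# entry whose mount point (2nd field) equals "$1". Comment lines are skipped.
fstab_options() {
  awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $4 }'
}
```

On the booted machine this would be run as: fstab_options /boot/efi < /etc/fstab. If the output does not contain the options a %post sed was supposed to add, the edit was lost.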
What DOES work from %post:
- Writing files to /etc/ (systemd units, config files, SSH keys)
- Enabling/disabling systemd services
- Installing additional packages
- Running systemctl enable/mask
What does NOT work from %post:
- Modifying /etc/fstab (Anaconda overwrites it)
- --fsoptions on part /boot/efi (Anaconda ignores it for EFI partitions)
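As an illustration of the kind of change that does survive, the sketch below writes a systemd unit under /etc/ from a %post body. Everything named here is hypothetical: the helper write_firstboot_unit, the unit lab-firstboot.service, and the marker file it touches. The optional root prefix argument exists only so the function can be tried outside the installer; inside the %post chroot it would be called without arguments.

```shell
# Hypothetical %post helper: drop a oneshot unit into the installed system.
# "$1" is an optional root prefix for dry runs; empty inside the %post chroot.
write_firstboot_unit() {
  local root="${1:-}"
  install -d -m 0755 "$root/etc/systemd/system"
  cat > "$root/etc/systemd/system/lab-firstboot.service" <<'EOF'
[Unit]
Description=Lab first-boot marker (hypothetical example)

[Service]
Type=oneshot
ExecStart=/usr/bin/touch /var/lib/lab-firstboot.done

[Install]
WantedBy=multi-user.target
EOF
}
```

Inside %post this would be followed by systemctl enable lab-firstboot.service, which is one of the operations listed above as working.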
UEFI / EFI partition
- Anaconda always creates an EFI System Partition for UEFI installs
- The EFI partition is FAT32 and requires the vfat kernel module to mount
- If /boot/efi fails to mount, systemd-remount-fs fails, which leaves root as read-only. This cascades to break ALL services that need to write
- The EFI partition is used by firmware directly for the bootloader; the OS doesn't strictly need it mounted, but Anaconda adds it to fstab
VM-specific issues (libvirt/QEMU/OVMF)
iPXE exit behavior
- exit (no args) returns EFI_SUCCESS → OVMF retries PXE, never reaches disk
- exit 1 returns EFI_ABORTED → OVMF moves to next boot device (disk)
- VM boot order needs both network and hd: --boot=uefi,network,hd
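The exit behaviour above suggests what a "boot from disk" iPXE script should end with. A minimal sketch follows; the function name ipxe_boot_local_script and the echo text are made up for illustration, and how the bastion actually serves its iPXE scripts is not shown in this document.

```shell
# Hypothetical generator for an iPXE script that hands off to the next
# boot device. Per the notes above, "exit 1" (not a bare "exit") is what
# makes OVMF move on to the disk.
ipxe_boot_local_script() {
  cat <<'EOF'
#!ipxe
echo Installed system present, falling through to disk
exit 1
EOF
}
```

The output can be redirected to whatever file the PXE server hands out for already-installed hosts.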
nftables
- libvirt creates reject rules for NAT networks in table ip libvirt_network (NOT inet libvirt; this wrong table name cost hours of debugging)
- These rules block new host→VM connections (SSH)
- Rules are recreated on every virsh start, so they must be deleted after each VM restart
- Chains: guest_input and guest_output
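Deleting the rules after each VM start can be scripted. The sketch below is one possible approach, not what the test harness necessarily does: it lists each chain with rule handles (nft -a), picks out the lines containing reject, and deletes those rules by handle. The helper names are hypothetical, and the parsing assumes nft prints "# handle N" at the end of each rule line. It needs root and loosens the host firewall for the VM network, so it is meant for a lab host only.

```shell
# Hypothetical helper: read "nft -a list chain ..." output on stdin and
# print the handle number of every rule line that contains "reject".
reject_rule_handles() {
  awk '/reject/ && /# handle [0-9]+$/ { print $NF }'
}

# Hypothetical helper: delete the reject rules from both libvirt chains.
# Note the table is "ip libvirt_network", as described above.
delete_libvirt_rejects() {
  local chain handle
  for chain in guest_input guest_output; do
    nft -a list chain ip libvirt_network "$chain" | reject_rule_handles |
      while read -r handle; do
        nft delete rule ip libvirt_network "$chain" handle "$handle"
      done
  done
}
```

Since the rules come back on every virsh start, delete_libvirt_rejects would have to run after each start, before the first SSH attempt.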
Serial console
- VM serial port: --serial=tcp,host=127.0.0.1:4555,mode=bind,protocol=telnet
- Use virsh console <vm-name> for interactive access (handles telnet protocol)
- Raw socat works for reading but pagers/readline break interactive use
- Add console=ttyS0,115200n8 to kernel args for boot output on serial
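For the last point, a tiny helper can make sure the console argument is present exactly once in a kernel argument string. This is a sketch under the assumption that the arguments are assembled as a single space-separated string; with_serial_console is a hypothetical name, and where the string is consumed (iPXE kernel line, bootloader config) is left open.

```shell
# Hypothetical helper: append console=ttyS0,115200n8 to a kernel argument
# string unless it is already there. Prints the resulting string.
with_serial_console() {
  local args="$1" console="console=ttyS0,115200n8"
  case " $args " in
    *" $console "*) printf '%s\n' "$args" ;;                  # already present
    *)              printf '%s\n' "${args:+$args }$console" ;; # append
  esac
}
```

For example, with_serial_console "quiet rhgb" prints quiet rhgb console=ttyS0,115200n8, and calling it again on that result leaves the string unchanged.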
SELinux on labmaster
- Set to permissive — this is for k3s/kubernetes, NOT because SSH needs it
- SSH works fine with SELinux enforcing on a properly installed Fedora system
- The ld.so.cache AVC denials seen during debugging were caused by the read-only root filesystem, not by SELinux policy
Testing checklist
Before merging kickstart changes:
- Check the real labmaster has the same packages: ssh 192.168.8.11 "rpm -q <pkg>"
- Run the PXE integration test: sudo pnpm run test:integration:pxe
- Verify via serial console (root / lab-root-pw) if SSH fails
- Check mount | grep " / " (must show rw, not ro)
- Check systemctl --failed (no critical failures)
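The read-write check is easy to get wrong by eye on a busy mount table, so it can be scripted. A minimal sketch, assuming the usual "device on mountpoint type fstype (options)" format printed by mount; root_mount_mode is a hypothetical helper name:

```shell
# Hypothetical helper: read "mount" output on stdin and print "rw" or "ro"
# for the root filesystem, based on the option list in parentheses.
root_mount_mode() {
  awk '$2 == "on" && $3 == "/" {
         opts = $NF
         gsub(/[()]/, "", opts)          # strip the surrounding parentheses
         n = split(opts, o, ",")
         for (i = 1; i <= n; i++)
           if (o[i] == "rw" || o[i] == "ro") { print o[i]; exit }
       }'
}
```

On the installed machine: mount | root_mount_mode. Anything other than rw means the cascade described under "kernel-modules is mandatory" is likely in play.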