Commit Graph

10 Commits

Author SHA1 Message Date
Michal
d01b675cca feat: firewall management + reprovision SSH key fix
- Open firewall ports (dhcp, tftp, http, 4011) on bastion start
- Close firewall ports on bastion shutdown
- Auto-detect firewall zone for interface
- Fix reprovision SSH to use execFileSync with explicit key path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:39:57 +00:00
Michal
62f896593d feat: CLI subcommands, PID self-restart, unit tests (22 passing)
CLI restructured:
  lab init bastion standalone start/stop/status
  lab provision list/install/reprovision/forget

- Nested commander subcommand groups (init > bastion > standalone, provision)
- PID file management: auto-kills old bastion on start, cleans up on stop
- stop command reads PID file and sends SIGTERM
- status command shows running state, port, machine counts
- forget command (DELETE /api/machines/:mac) removes from all state

Unit tests (22 tests, 3 files):
- kickstart.test.ts: worker/infra roles, SSH keys, partitions, admin user
- state.test.ts: load/save, atomic writes
- dispatch.test.ts: install/discover/local-boot routing, progress, forget

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:12:17 +00:00
Michal
64533b2dcf refactor: restructure bastion as pnpm monorepo (@lab/shared, @lab/bastion, @lab/cli)
- Split into 3 workspace packages: shared (types/constants), bastion (server), cli
- CLI binary renamed from "bastion" to "lab"
- Cross-package imports via @lab/shared and @lab/bastion workspace references
- Extracted BastionConfig, BastionState, HardwareInfo types into @lab/shared
- Added APP_NAME/APP_VERSION constants
- tsconfig.base.json with project references for build ordering
- Root workspace scripts: build, test, typecheck, clean

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:05:41 +00:00
Michal
937c01f5d9 fix: add --skip-dnsmasq/--skip-artifacts flags, fix config propagation
Enables running the TS bastion without dnsmasq for testing.
VM-tested: SSH works, partitions correct, k3s prereqs configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 03:11:29 +00:00
Michal
177e993736 feat: TypeScript bastion rewrite (initial scaffold)
Full rewrite of the bash bastion.sh into a TypeScript application:
- Fastify HTTP server with typed routes (dispatch, kickstart, API)
- Commander CLI (serve, install, list, reprovision)
- Kickstart templates as TypeScript template literals (no more heredoc hell)
- dnsmasq management via execa subprocess
- Merged machine list view (hardware + install info in one table)
- Containerized via podman-compose (Dockerfile + docker-compose.yml)
- All partition logic preserved (LVM, reprovision detection, role-based)

Not yet tested end-to-end — needs VM validation before replacing bash version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 02:55:52 +00:00
Michal
fac14b6d4a feat: server kickstart with LVM, user creation, progress callbacks, reprovision
- LVM partition layout: /, /var, /var/log, /home, /srv, swap, tmpfs /tmp
  plus /var/lib/longhorn for worker role (grows to fill disk)
- Reprovision preserves /home, /srv, /var/lib/longhorn via %pre detection
- Admin user created matching the user running the bastion script
  with SSH keys from authorized_keys + local pubkeys, passwordless sudo
- Progress callbacks from %pre and %post to /api/progress endpoint
  with IP reported on completion (ssh command printed)
- Installed machines boot from local disk (iPXE exit) instead of
  re-entering discovery mode
- --role worker|infra flag (infra skips longhorn partition)
- reprovision subcommand: queues install + SSH reboot into PXE
- Self-cleanup: kills old bastion instances on start
- Domain config (DOMAIN env, default ad.itaz.eu)
- efibootmgr in %post to set local disk first in boot order
- k3s prereqs: kernel modules, sysctl, firewalld disabled, chrony
- VM reprovision test script (test-reprovision.sh)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 02:40:40 +00:00
Michal
75d17eb87c fix: HTTP Content-Length, firewall zones, UEFI boot improvements
- Fix Content-Length using byte count instead of character count
  (em dash in iPXE scripts caused mismatch, breaking iPXE chain)
- Use firewall zone-aware commands matching interface zone
- Add UEFI HTTP Boot support (arch 16/20) alongside PXE TFTP
- Add pxe-service directives for proper proxy DHCP responses
- Use bind-dynamic instead of bind-interfaces for bridge compat
- Add tftp-no-blocksize for UEFI firmware compatibility
- Use local ipxe packages instead of downloading from internet
- Add custom UEFI PXE loader stub (pxeloader.c) for chainloading
- Enable HTTP request logging for debugging boot issues

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 00:59:27 +00:00
Michal Rydlikowski
2a429088c5 bastion: discover-first PXE provisioning with multi-arch support
Rewrote bastion from install-only to discover-first flow:
- Default mode discovers hardware (PXE boot → inventory → poweroff)
- Discovered machines promoted to install via subcommand
- Per-MAC iPXE dispatch (/dispatch?mac=) routes discover vs install
- Python HTTP server with discovery API, state management, kickstart gen
- Added full DHCP mode (DHCP_MODE=full) for isolated/test networks
- Added arm64 UEFI support (client-arch 11, iPXE arm64 binary)
- Added QEMU test script (aarch64+KVM on Asahi Linux)
- All API endpoints unit tested and working

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 00:06:04 +00:00
Michal Rydlikowski
5ba22b94ea first commit 2026-03-16 00:00:13 +00:00
Michal Rydlikowski
ac695f506f first commit 2026-03-15 23:50:43 +00:00