Two changes prompted by today's etcd raft panic on worker1-k8s0
(tocommit out of range, lost-write on follower) and the cascading
disk pressure that surfaced underneath it.
Audit logs to journald
- kube-apiserver now uses audit-log-path=- so audit events flow to
k3s.service stdout and into journald instead of growing files in
/var/log/kubernetes. The previous setup combined apiserver's
internal rotation with a logrotate *.log glob that double-rotated
the rotated files into permanent orphans (observed: 7+ GB).
- New journald-limits operation writes a SystemMaxUse=2G drop-in so
audit volume cannot fill /var/log even under bursty load.
- log-rotation operation repurposed to decommission the obsolete
logrotate rule and reap leftover audit files. Idempotent: no-op
on fresh installs.
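The journald cap described above can be sketched as a small pure function that renders the drop-in. This is an illustration, not the actual journald-limits operation: `renderJournaldLimits` and the file name `90-labctl-limits.conf` are hypothetical, and only the `SystemMaxUse=2G` value comes from the change description. `/etc/systemd/journald.conf.d/` is the standard systemd drop-in directory.

```typescript
// Hypothetical helper: renders a journald drop-in that caps journal disk usage.
// Only SystemMaxUse=2G comes from the change description; the function name
// and the drop-in file name are illustrative assumptions.
interface JournaldDropIn {
  path: string;
  content: string;
}

function renderJournaldLimits(maxUse: string = "2G"): JournaldDropIn {
  // journald sizes are a number with an optional K, M, G, T, P or E suffix.
  if (!/^\d+[KMGTPE]?$/.test(maxUse)) {
    throw new Error(`invalid journald size: ${maxUse}`);
  }
  return {
    path: "/etc/systemd/journald.conf.d/90-labctl-limits.conf",
    content: `[Journal]\nSystemMaxUse=${maxUse}\n`,
  };
}

console.log(renderJournaldLimits().content);
```

Rendering the file separately from writing it keeps the operation easy to unit test and easy to make idempotent: the caller can compare the rendered content with what is already on disk and skip the write (and the `systemd-journald` restart) when nothing changed.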
Etcd member recovery
- New recoverEtcdMember(broken, peer, hostname) codifies the
documented k3s recovery: stop k3s, etcdctl member remove, wipe
/var/lib/rancher/k3s/server/{db,tls,cred}, restart, poll for
rejoin. Refuses to operate when cluster size < 3 to preserve
quorum.
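The recovery sequence above can be sketched as a side-effect-free plan that a caller would then execute over SSH. This is not the actual `recoverEtcdMember` implementation: `planEtcdMemberRecovery`, `RecoveryStep`, and the `memberId` and `clusterSize` parameters are hypothetical, and the TLS and endpoint flags that `etcdctl` needs against a k3s-embedded etcd are omitted. Only the step order, the wiped directories, and the cluster-size guard come from the change description.

```typescript
// Sketch only: builds the ordered command plan without executing anything.
// All names here are hypothetical; etcdctl TLS/endpoint flags are omitted.
interface RecoveryStep {
  host: string;    // node the command runs on
  command: string; // shell command for that node
}

const K3S_SERVER_DIR = "/var/lib/rancher/k3s/server";

function planEtcdMemberRecovery(
  broken: string,
  peer: string,
  memberId: string,
  clusterSize: number,
): RecoveryStep[] {
  // Guard from the change description: refuse below three members.
  if (clusterSize < 3) {
    throw new Error(`refusing to recover ${broken}: cluster size ${clusterSize} is below 3`);
  }
  // etcd member IDs are hexadecimal; rejecting anything else avoids shell injection.
  if (!/^[0-9a-f]+$/.test(memberId)) {
    throw new Error(`invalid etcd member ID: ${memberId}`);
  }
  const wipe = ["db", "tls", "cred"].map((d) => `${K3S_SERVER_DIR}/${d}`).join(" ");
  return [
    { host: broken, command: "systemctl stop k3s" },
    { host: peer, command: `etcdctl member remove ${memberId}` },
    { host: broken, command: `rm -rf ${wipe}` },
    { host: broken, command: "systemctl start k3s" },
    // The caller would repeat this until the broken node shows up again.
    { host: peer, command: "etcdctl member list" },
  ];
}

for (const step of planEtcdMemberRecovery("worker1-k8s0", "worker2-k8s0", "8e9e05c52164694d", 3)) {
  console.log(`${step.host}: ${step.command}`);
}
```

Separating planning from execution means the quorum guard and the exact command order can be asserted in unit tests without touching a real cluster. The peer host name and member ID in the usage lines are made-up values.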
Tests
- 7 new unit tests covering both decommission paths and the
recovery procedure (54 total, all green).
- install.test.ts asserts the file-based audit args are gone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
labctl
Infrastructure management platform for bare-metal servers, Kubernetes clusters, and cloud resources.
Install
# From Gitea packages (Fedora/RHEL)
sudo dnf config-manager --add-repo https://mysources.co.uk/michal/-/packages/rpm/
sudo dnf install labctl
# From source
cd bastion && pnpm install && pnpm build
bun build src/cli/src/index.ts --compile --outfile dist/labctl
sudo cp dist/labctl /usr/bin/labctl
Quick Start
# Start the bastion (PXE provisioning server)
sudo labctl init bastion standalone start
# PXE boot a machine — it gets discovered automatically
labctl provision list
# Install Fedora on a discovered machine
labctl provision install 78:55:36:08:35:14 labmaster --role infra
# Reprovision (SSH reboot into PXE, preserves /home /srv /var/lib/rancher)
labctl provision reprovision 78:55:36:08:35:14 labmaster --role infra
Commands
Bastion (PXE Provisioning)
# Lifecycle
sudo labctl init bastion standalone start # Start bastion (daemonized)
sudo labctl init bastion standalone start --foreground # Start in foreground
sudo labctl init bastion standalone stop # Stop bastion
labctl init bastion standalone status # Show status, PID, machine count
# Options
sudo labctl init bastion standalone start \
--port 8080 \
--dir /tmp/lab-bastion \
--domain ad.itaz.eu \
--dhcp-mode proxy \
--fedora 43 \
--timezone Europe/London
Provisioning
# List all machines (discovered, queued, installing, installed)
labctl provision list
# Queue a machine for Fedora install
labctl provision install <mac> <hostname> --role worker # k3s worker (gets longhorn)
labctl provision install <mac> <hostname> --role infra # infra node (gets k3s server + /var/lib/rancher)
# Reprovision — queues install, SSHes in, sets PXE boot, reboots
labctl provision reprovision <mac> <hostname> --role infra
# Remove a machine from state
labctl provision forget <mac>
# Options
labctl provision install <mac> <hostname> \
--role worker \
--disk nvme0n1 \
--port 8080
Server Management (planned)
# List servers with filters
labctl get servers
labctl get servers --env production
labctl get servers --cloud baremetal
labctl get servers --cloud aws
labctl get servers --label role=k3s-worker
labctl get servers --label asg=web-servers
# Detailed server info
labctl describe server/puppet
labctl describe server/ser9
Remote Execution (planned)
# Execute commands on servers (audited, RBAC-checked)
labctl exec server/puppet -- whoami
labctl exec server/puppet -- systemctl status k3s
labctl exec server/puppet -it -- bash # interactive TTY
labctl exec server/puppet --timeout 30s -- long-running-task
Kubernetes (planned)
# Proxied kubectl — audited, RBAC-checked, no kubeconfig needed
labctl kubectl --cluster lab get pods
labctl kubectl --cluster lab get nodes
labctl kubectl --cluster lab logs pod/nginx -f
labctl kubectl --cluster lab exec pod/nginx -- bash
labctl kubectl --cluster lab apply -f deployment.yaml
labctl kubectl --cluster aws-prod get pods --namespace app
# Cluster management
labctl clusters add lab --kubeconfig ~/.kube/config
labctl clusters list
labctl clusters remove staging
Logs (planned)
# Server logs (journalctl passthrough via agent)
labctl logs server/puppet # all journal
labctl logs server/puppet -f # follow (live stream)
labctl logs server/puppet -n 100 # last 100 lines
labctl logs server/puppet -u k3s # specific unit
labctl logs server/puppet -u sshd --since "1h ago" # time range
labctl logs server/puppet --since "2026-03-17" --until "2026-03-18"
labctl logs server/puppet -k # kernel only
labctl logs server/puppet -p err # errors only
labctl logs server/puppet --file /var/log/nginx/error.log # tail a file
labctl logs server/puppet --file /var/log/nginx/error.log -n 50
# App logs (k8s pod logs)
labctl logs app/bastion
labctl logs app/bastion -f
labctl logs app/labd --container postgres
# Pulumi execution logs
labctl logs pulumi/run-abc123
labctl logs pulumi/run-abc123 -f # follow active run
# Bastion logs
labctl logs bastion/lab
labctl logs bastion/lab --mac 78:55:36:08:35:14 # specific machine's install
# Agent daemon logs
labctl logs agent/puppet
# Audit logs
labctl logs audit
labctl logs audit --user michal
labctl logs audit --user michal --since "1h ago"
labctl logs audit/michal-20260317-abc123 # specific session
labctl logs audit --action kubectl --cluster lab
labctl logs audit --action exec --server puppet
Apps (planned, replaces Helm)
# Install Pulumi-based apps to Kubernetes
labctl apps list # available apps
labctl apps install bastion # deploy bastion
labctl apps install bastion --set port=8080 # with overrides
labctl apps install bastion -f values.yaml # from values file
labctl apps install monitoring # Prometheus + Grafana
# Manage deployed apps
labctl apps status bastion # health, version, config
labctl apps upgrade bastion # rolling upgrade
labctl apps history bastion # version history
labctl apps rollback bastion 2 # rollback to version 2
labctl apps uninstall bastion
Infrastructure as Code (planned)
# Execute Pulumi programs via labd (RBAC-checked)
labctl apply -f infra/k3s-cluster.ts --env lab
labctl plan -f infra/k3s-cluster.ts --env lab # dry run
labctl destroy -f infra/k3s-cluster.ts --env lab
RBAC (planned)
# Roles and permissions
labctl get roles
labctl get users
labctl create role viewer --allow "read:*:*:*"
labctl create role lab-admin --allow "*:baremetal:lab:*" --deny "destroy:*:*:*"
labctl bind role lab-admin --user michal
labctl unbind role lab-admin --user michal
# Permission model: action:cloud:environment:server
# read:*:*:* — read everything
# exec:baremetal:lab:* — exec on any lab server
# kubectl:*:*:* — kubectl on any cluster
# *:baremetal:lab:puppet — full access to puppet only
# manage:*:*:* — manage apps, clusters, tokens
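The `action:cloud:environment:server` model above could be evaluated roughly as follows. This is a sketch under two assumptions the README does not confirm: `*` matches exactly one segment, and a matching deny always overrides a matching allow. `RbacRole`, `matchesPattern`, and `isAllowed` are hypothetical names, and the server `web1` is made up.

```typescript
// Sketch of matching a request against action:cloud:environment:server patterns.
// Assumptions (not confirmed by the README): "*" matches one segment, and a
// matching deny pattern always wins over a matching allow pattern.
interface RbacRole {
  allow: string[];
  deny: string[];
}

function matchesPattern(pattern: string, request: string): boolean {
  const p = pattern.split(":");
  const r = request.split(":");
  // Both sides must have exactly four segments.
  if (p.length !== 4 || r.length !== 4) return false;
  return p.every((segment, i) => segment === "*" || segment === r[i]);
}

function isAllowed(role: RbacRole, request: string): boolean {
  if (role.deny.some((pattern) => matchesPattern(pattern, request))) return false;
  return role.allow.some((pattern) => matchesPattern(pattern, request));
}

// Mirrors: labctl create role lab-admin --allow "*:baremetal:lab:*" --deny "destroy:*:*:*"
const labAdmin: RbacRole = {
  allow: ["*:baremetal:lab:*"],
  deny: ["destroy:*:*:*"],
};

console.log(isAllowed(labAdmin, "exec:baremetal:lab:puppet"));    // true
console.log(isAllowed(labAdmin, "destroy:baremetal:lab:puppet")); // false (deny wins)
console.log(isAllowed(labAdmin, "exec:aws:staging:web1"));        // false (no allow matches)
```

Checking deny rules first keeps the `lab-admin` example safe by construction: the broad `*:baremetal:lab:*` allow can never re-enable `destroy`.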
Environments and Clouds (planned)
labctl get environments
labctl get clouds
labctl create environment staging --cloud aws
labctl create environment lab --cloud baremetal
Partition Layout
Machines installed by the bastion get this LVM layout:
Worker role (k3s worker with Longhorn)
/boot/efi          600MB  EFI
/boot              3GB    ext4
── LVM VG: labvg ──
swap               27GB   (matches RAM)
/                  33GB   xfs
/var               100GB  xfs
/var/log           10GB   xfs
/home              10GB   xfs    ← preserved on reprovision
/srv               20GB   xfs    ← preserved on reprovision
/tmp               4GB    tmpfs
/var/lib/longhorn  rest   xfs    ← preserved on reprovision (Longhorn PVC storage)
Infra role (k3s server, labmaster)
/boot/efi         600MB  EFI
/boot             3GB    ext4
── LVM VG: labvg ──
swap              27GB   (matches RAM)
/                 33GB   xfs
/var              100GB  xfs
/var/log          10GB   xfs
/home             10GB   xfs    ← preserved on reprovision
/srv              20GB   xfs    ← preserved on reprovision
/var/lib/rancher  20GB   xfs    ← preserved on reprovision (k3s etcd data)
/tmp              4GB    tmpfs
On reprovision, OS partitions (/, /var, /var/log, swap) are wiped. Data partitions are preserved: /home and /srv on every machine, plus /var/lib/longhorn on workers and /var/lib/rancher on infra nodes.
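To make the rule concrete, the sketch below resolves a file path to its longest matching mount point and reports what happens to it on reprovision. It is an illustration built from the two layouts above, not code from the bastion: `fateOnReprovision`, `NodeRole`, and `Fate` are hypothetical names. It assumes that a path without its own partition inherits the fate of the parent mount, so on a worker `/var/lib/rancher` lives on `/var` and is wiped. `/boot` and `/boot/efi` are marked `unspecified` because the README does not say what happens to them.

```typescript
type NodeRole = "worker" | "infra";
type Fate = "wiped" | "preserved" | "unspecified";

// Mount points shared by both roles, taken from the layouts above.
const COMMON_MOUNTS: Record<string, Fate> = {
  "/": "wiped",
  "/var": "wiped",
  "/var/log": "wiped",
  "/home": "preserved",
  "/srv": "preserved",
  "/tmp": "wiped",            // tmpfs contents never survive a reboot
  "/boot": "unspecified",     // not covered by the README
  "/boot/efi": "unspecified", // not covered by the README
};

// Role-specific data partitions.
const ROLE_MOUNTS: Record<NodeRole, Record<string, Fate>> = {
  worker: { "/var/lib/longhorn": "preserved" },
  infra: { "/var/lib/rancher": "preserved" },
};

function fateOnReprovision(role: NodeRole, path: string): Fate {
  if (!path.startsWith("/")) throw new Error(`expected an absolute path: ${path}`);
  const mounts: Record<string, Fate> = { ...COMMON_MOUNTS, ...ROLE_MOUNTS[role] };
  // Pick the longest mount point that contains the path.
  let best = "/";
  for (const mount of Object.keys(mounts)) {
    const covers = path === mount || path.startsWith(mount + "/");
    if (covers && mount.length > best.length) best = mount;
  }
  return mounts[best];
}

console.log(fateOnReprovision("infra", "/var/lib/rancher/k3s/server/db"));  // preserved
console.log(fateOnReprovision("worker", "/var/lib/rancher/k3s/server/db")); // wiped
console.log(fateOnReprovision("worker", "/var/log/kubernetes/audit.log"));  // wiped
```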
Architecture
┌──────────────────────────────────────────────────────────────┐
│                          labctl CLI                          │
│ init | provision | get | exec | logs | apply | apps | kubectl│
└───────────────────────────┬──────────────────────────────────┘
                            │ mTLS
                            ▼
┌──────────────────────────────────────────────────────────────┐
│           labd (master daemon - stateless, on k3s)           │
│   ┌─────┐ ┌──────┐ ┌──────┐ ┌────────┐ ┌──────┐ ┌────────┐   │
│   │ CA  │ │ RBAC │ │ Logs │ │ Pulumi │ │ Apps │ │kubectl │   │
│   │     │ │      │ │relay │ │executor│ │      │ │ proxy  │   │
│   └─────┘ └──────┘ └──────┘ └────────┘ └──────┘ └────────┘   │
│                         CockroachDB                          │
└──────────────┬─────────────────────────┬─────────────────────┘
               │ mTLS                    │ mTLS
    ┌──────────▼───────────┐  ┌──────────▼───────────┐
    │      lab-agent       │  │      lab-agent       │
    │  bare-metal server   │  │  AWS EC2 / cloud VM  │
    │  ┌────────────────┐  │  │  ┌────────────────┐  │
    │  │ heartbeat      │  │  │  │ heartbeat      │  │
    │  │ exec handler   │  │  │  │ exec handler   │  │
    │  │ log streamer   │  │  │  │ log streamer   │  │
    │  │ module runner  │  │  │  │ module runner  │  │
    │  └────────────────┘  │  │  └────────────────┘  │
    └──────────────────────┘  └──────────────────────┘
Technology Stack
| Component | Technology |
|---|---|
| Language | TypeScript (ESM) |
| CLI | Commander.js |
| HTTP Server | Fastify + WebSocket |
| Database | CockroachDB (PostgreSQL compatible) |
| ORM | Prisma |
| IaC | Pulumi (TypeScript) |
| k8s CNI | Cilium |
| Auth | mTLS (built-in CA) |
| Packaging | nfpm (RPM/DEB), bun compile |
| Containers | Podman + podman-compose |
| CI/CD | Gitea Actions |
| Testing | Vitest |
Development
cd bastion
# Install dependencies
pnpm install
# Build all packages
pnpm build
# Run tests
pnpm test:run
# Type check
pnpm typecheck
# Lint
pnpm lint
# Generate shell completions
pnpm completions:generate
# Build standalone binary
bun build src/cli/src/index.ts --compile --outfile dist/labctl
# Build RPM/DEB packages (both architectures)
bash scripts/build-rpm.sh --all
# Build Docker image
bash scripts/build-bastion.sh
# Full release (build + publish + install)
bash scripts/release.sh
Project Structure
bastion/
├── src/
│ ├── shared/ # @lab/shared — types, constants
│ ├── bastion/ # @lab/bastion — PXE provisioning server
│ ├── cli/ # @lab/cli — CLI binary (labctl)
│ ├── labd/ # @lab/labd — master daemon (planned)
│ └── agent/ # @lab/agent — server agent (planned)
├── modules/ # Built-in configuration modules (planned)
├── deploy/
│ └── k3s/ # Kubernetes manifests
├── stack/
│ ├── Dockerfile
│ └── docker-compose.yml
├── scripts/ # Build, publish, release scripts
├── completions/ # Generated shell completions
└── ARCHITECTURE.md
License
MIT