fix(k3s): audit logs via journald + etcd recovery #13

Merged
michal merged 1 commit from fix/k3s-audit-via-journald into main 2026-05-05 20:29:52 +00:00

Summary

Two changes prompted by today's etcd raft panic on worker1-k8s0 (tocommit(19232321) is out of range [lastIndex(19232320)]) and the cascading disk pressure that surfaced underneath it.

Audit logs through journald

  • kube-apiserver now uses audit-log-path=- so audit events flow to k3s.service stdout and into journald instead of growing files in /var/log/kubernetes.
  • The previous setup mixed the apiserver's internal rotation with a logrotate *.log glob, which rotated already-rotated files a second time and left them as permanent orphans (observed at 7+ GB on worker0/labmaster).
  • New journald-limits operation writes a SystemMaxUse=2G drop-in so audit volume cannot fill /var/log even under bursty load.
  • log-rotation operation repurposed to decommission the obsolete logrotate rule and reap leftover audit files. Idempotent: a no-op on fresh installs.
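
For illustration, the two configuration pieces described above could be rendered roughly as follows. This is a minimal sketch, not the PR's code: the drop-in file name and both helper names are assumptions, while `SystemMaxUse` under `[Journal]` and the `audit-log-path=-` value come from the description.

```typescript
// Sketch only: the file name and helper names are hypothetical, not taken from the PR.
const JOURNALD_DROPIN_PATH = "/etc/systemd/journald.conf.d/90-k3s-audit.conf"; // assumed name

// Render the drop-in body; SystemMaxUse caps the journal's persistent disk usage.
function renderJournaldDropIn(systemMaxUse: string): string {
  return ["[Journal]", `SystemMaxUse=${systemMaxUse}`, ""].join("\n");
}

// Build the k3s server flag that points the apiserver audit log at stdout ("-").
function auditToStdoutArg(): string {
  return "--kube-apiserver-arg=audit-log-path=-";
}

console.log(JOURNALD_DROPIN_PATH);
console.log(renderJournaldDropIn("2G"));
console.log(auditToStdoutArg());
```

A drop-in like this only takes effect once systemd-journald is restarted, so an operation that writes it would presumably restart the service when the file content changes.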

Etcd member recovery codified

  • New recoverEtcdMember({broken, peer, brokenHostname}) does the documented k3s recovery: stop k3s, etcdctl member remove, wipe /var/lib/rancher/k3s/server/{db,tls,cred}, restart, poll for rejoin.
  • Refuses to operate when cluster size < 3 (preserves quorum).
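
The recovery sequence and its quorum guard can be sketched as a pure planning step. This is an assumption-laden illustration rather than the actual `recoverEtcdMember` implementation: `planEtcdRecovery`, the types, and the rule that an etcd member name starts with the node hostname are all hypothetical, and the etcdctl connection flags are left out.

```typescript
// Sketch only: types, helper name, and the name-matching rule are assumptions.
interface EtcdMember { id: string; name: string }
interface RecoveryTarget { broken: string; peer: string; brokenHostname: string }
interface Step { host: string; command: string }

// Directories wiped on the broken node, as listed in the description.
const WIPE_DIRS = ["db", "tls", "cred"].map(
  (dir) => `/var/lib/rancher/k3s/server/${dir}`,
);

function planEtcdRecovery(target: RecoveryTarget, members: EtcdMember[]): Step[] {
  // Guard from the description: refuse when the cluster has fewer than 3 members.
  if (members.length < 3) {
    throw new Error(`refusing recovery: cluster size ${members.length} < 3`);
  }
  // Assumption: the etcd member name starts with the node's hostname.
  const member = members.find((m) => m.name.startsWith(target.brokenHostname));
  if (!member) {
    throw new Error(`no etcd member found for ${target.brokenHostname}`);
  }
  return [
    { host: target.broken, command: "systemctl stop k3s" },
    // Endpoint and certificate flags for etcdctl are omitted in this sketch.
    { host: target.peer, command: `etcdctl member remove ${member.id}` },
    { host: target.broken, command: `rm -rf ${WIPE_DIRS.join(" ")}` },
    { host: target.broken, command: "systemctl start k3s" },
  ];
}

// Usage with made-up member IDs and addresses.
const exampleMembers: EtcdMember[] = [
  { id: "a1", name: "worker0-k8s0-x" },
  { id: "b2", name: "worker1-k8s0-y" },
  { id: "c3", name: "worker2-k8s0-z" },
];
const examplePlan = planEtcdRecovery(
  { broken: "10.0.0.2", peer: "10.0.0.1", brokenHostname: "worker1-k8s0" },
  exampleMembers,
);
for (const step of examplePlan) console.log(`${step.host}: ${step.command}`);
```

Separating the plan from its execution keeps the quorum check and command ordering unit-testable without a live cluster; polling for rejoin would follow the last step.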

Already deployed live

The fix has been applied to the running cluster (worker0/1/2-k8s0, labmaster, spark-2935) — rolling k3s restart on the 3 control planes, all 3 etcd endpoints healthy, audit events confirmed flowing through journalctl -u k3s.

Disk reclaim from removing dead orphan files:

  • worker0: /var/log 96% → 9%
  • labmaster: 82% → 13%
  • worker1: 86% → 26%
  • worker2: 22% → 12%

Test plan

  • vitest --run src/modules/modules/k3s — 54 pass, 0 fail (7 new tests cover both decommission paths and the recovery procedure)
  • etcd cluster health verified post-rolling-restart
  • audit events visible in journalctl -u k3s on all 4 control planes
  • /var/log/kubernetes directory removed on all control planes
  • Future install run on a fresh node should leave /var/log/kubernetes absent (not yet exercised)
michal added 1 commit 2026-05-05 20:29:44 +00:00
fix(k3s): route audit logs through journald, codify etcd member recovery
Some checks failed
CI/CD / typecheck (pull_request) Failing after 13s
CI/CD / lint (pull_request) Failing after 23s
CI/CD / test (pull_request) Failing after 10s
CI/CD / build (pull_request) Has been skipped
CI/CD / publish-rpm (pull_request) Has been skipped
CI/CD / publish-deb (pull_request) Has been skipped
dd92147341
Two changes prompted by today's etcd raft panic on worker1-k8s0
(tocommit out of range, lost-write on follower) and the cascading
disk pressure that surfaced underneath it.

Audit logs to journald
- kube-apiserver now uses audit-log-path=- so audit events flow to
  k3s.service stdout and into journald instead of growing files in
  /var/log/kubernetes. The previous setup combined apiserver's
  internal rotation with a logrotate *.log glob that double-rotated
  the rotated files into permanent orphans (observed: 7+ GB).
- New journald-limits operation writes a SystemMaxUse=2G drop-in so
  audit volume cannot fill /var/log even under bursty load.
- log-rotation operation repurposed to decommission the obsolete
  logrotate rule and reap leftover audit files. Idempotent: no-op
  on fresh installs.

Etcd member recovery
- New recoverEtcdMember(broken, peer, hostname) codifies the
  documented k3s recovery: stop k3s, etcdctl member remove, wipe
  /var/lib/rancher/k3s/server/{db,tls,cred}, restart, poll for
  rejoin. Refuses to operate when cluster size < 3 to preserve
  quorum.

Tests
- 7 new unit tests covering both decommission paths and the
  recovery procedure (54 total, all green).
- install.test.ts asserts the file-based audit args are gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
michal merged commit f5af24699a into main 2026-05-05 20:29:52 +00:00
Reference: michal/lab#13