dd921473414ae069628df3478bea06a6cdd98c22
Some checks failed
CI/CD / typecheck (pull_request) Failing after 13s
CI/CD / lint (pull_request) Failing after 23s
CI/CD / test (pull_request) Failing after 10s
CI/CD / build (pull_request) Has been skipped
CI/CD / publish-rpm (pull_request) Has been skipped
CI/CD / publish-deb (pull_request) Has been skipped
Two changes prompted by today's etcd raft panic on worker1-k8s0
(tocommit out of range, lost-write on follower) and the cascading
disk pressure that surfaced underneath it.
Audit logs to journald
- kube-apiserver now uses audit-log-path=- so audit events flow to
k3s.service stdout and into journald instead of growing files in
/var/log/kubernetes. The previous setup combined apiserver's
internal rotation with a logrotate *.log glob, which re-rotated the
already rotated files and left them as permanent orphans (observed: 7+ GB).
- New journald-limits operation writes a SystemMaxUse=2G drop-in so
audit volume cannot fill /var/log even under bursty load.
- log-rotation operation repurposed to decommission the obsolete
logrotate rule and reap leftover audit files. Idempotent: no-op
on fresh installs.
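
A minimal sketch of the two operations above, written against an injected
filesystem so it can be exercised with a fake. The `FileOps` interface, the
drop-in file name, the logrotate rule path, and the "audit" file-name prefix
are all assumptions made for illustration; only SystemMaxUse=2G and
/var/log/kubernetes come from this commit.

```typescript
// Hypothetical filesystem surface; the real operations may use a different one.
interface FileOps {
  exists(path: string): boolean;
  list(dir: string): string[]; // entry names, not full paths
  remove(path: string): void;
  write(path: string, content: string): void;
}

// Assumed locations; the commit does not name the drop-in or the logrotate rule.
const DROP_IN = "/etc/systemd/journald.conf.d/90-audit-limits.conf";
const LOGROTATE_RULE = "/etc/logrotate.d/kubernetes";
const AUDIT_DIR = "/var/log/kubernetes";

// Render a journald drop-in that caps persistent journal disk usage.
function renderJournaldLimits(maxUse: string = "2G"): string {
  return `[Journal]\nSystemMaxUse=${maxUse}\n`;
}

// Writing the same content every time keeps the operation idempotent.
function applyJournaldLimits(fs: FileOps): void {
  fs.write(DROP_IN, renderJournaldLimits());
}

// Remove the obsolete logrotate rule and leftover audit files.
// Returns the paths removed; an empty array means nothing was left to do.
function decommissionFileAuditLogs(fs: FileOps): string[] {
  const removed: string[] = [];
  if (fs.exists(LOGROTATE_RULE)) {
    fs.remove(LOGROTATE_RULE);
    removed.push(LOGROTATE_RULE);
  }
  if (fs.exists(AUDIT_DIR)) {
    for (const name of fs.list(AUDIT_DIR)) {
      // Assumption: leftover audit files all start with "audit".
      if (name.startsWith("audit")) {
        const path = `${AUDIT_DIR}/${name}`;
        fs.remove(path);
        removed.push(path);
      }
    }
  }
  return removed;
}
```

On a fresh install neither the rule nor the directory exists, so the
function returns an empty list, which matches the no-op behaviour described
above. journald only picks up a new drop-in after systemd-journald is
restarted, so a real operation would likely follow the write with that step.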
Etcd member recovery
- New recoverEtcdMember(broken, peer, hostname) codifies the
documented k3s recovery: stop k3s, etcdctl member remove, wipe
/var/lib/rancher/k3s/server/{db,tls,cred}, restart, poll for
rejoin. Refuses to operate when cluster size < 3 to preserve
quorum.
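
The recovery sequence above could be sketched roughly as follows. This is
not the committed implementation: the `Host` interface, the polling options,
and the member-name matching are assumptions. It also assumes etcdctl on the
peer is already configured (for example through ETCDCTL_ENDPOINTS and the
certificate environment variables) and that the etcd member name begins with
the node's hostname.

```typescript
// Hypothetical remote-host handle: runs a shell command, resolves with stdout.
interface Host {
  run(cmd: string): Promise<string>;
}

interface Member {
  id: string;
  status: string;
  name: string;
}

// Parse the default "simple" output of `etcdctl member list`:
// "<id>, <status>, <name>, <peer URLs>, <client URLs>, <is learner>"
function parseMembers(out: string): Member[] {
  return out
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const f = line.split(",").map((s) => s.trim());
      return { id: f[0] ?? "", status: f[1] ?? "", name: f[2] ?? "" };
    });
}

const SERVER_DIR = "/var/lib/rancher/k3s/server";

async function recoverEtcdMember(
  broken: Host,
  peer: Host,
  hostname: string,
  opts: { attempts: number; delayMs: number } = { attempts: 30, delayMs: 5000 },
): Promise<void> {
  const members = parseMembers(await peer.run("etcdctl member list"));
  // Removing a member from a cluster smaller than 3 would cost quorum.
  if (members.length < 3) {
    throw new Error(`refusing recovery: cluster size ${members.length} < 3`);
  }
  const target = members.find((m) => m.name.startsWith(hostname));
  if (!target) {
    throw new Error(`no etcd member found for ${hostname}`);
  }

  await broken.run("systemctl stop k3s");
  await peer.run(`etcdctl member remove ${target.id}`);
  await broken.run(`rm -rf ${SERVER_DIR}/db ${SERVER_DIR}/tls ${SERVER_DIR}/cred`);
  await broken.run("systemctl start k3s");

  // Poll the healthy peer until the node shows up again under a new member ID.
  for (let i = 0; i < opts.attempts; i++) {
    const now = parseMembers(await peer.run("etcdctl member list"));
    const rejoined = now.some(
      (m) => m.name.startsWith(hostname) && m.status === "started" && m.id !== target.id,
    );
    if (rejoined) return;
    await new Promise<void>((resolve) => setTimeout(resolve, opts.delayMs));
  }
  throw new Error(`${hostname} did not rejoin the etcd cluster in time`);
}
```

Checking the cluster size before touching the broken node means a refusal
leaves everything as it was, which is the safer failure mode for a
destructive procedure.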
Tests
- 7 new unit tests covering both decommission paths and the
recovery procedure (54 total, all green).
- install.test.ts asserts the file-based audit args are gone.
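
For illustration, a check like the one in install.test.ts might look like
the sketch below. The helper name and the exact set of flags are
assumptions; the commit only states that the file-based audit args are gone
and that audit-log-path is now "-". The flags are assumed to be passed as
k3s `--kube-apiserver-arg=<flag>=<value>` arguments.

```typescript
// kube-apiserver rotation flags that only apply when auditing to a file.
const FILE_AUDIT_FLAGS = ["audit-log-maxage", "audit-log-maxbackup", "audit-log-maxsize"];

// Return every k3s argument that still configures file-based audit logging.
// Hypothetical helper; an empty result means audit events go to stdout only.
function fileAuditArgs(args: string[]): string[] {
  return args.filter((arg) => {
    const m = /^--kube-apiserver-arg=([^=]+)(?:=(.*))?$/.exec(arg);
    if (!m) return false;
    const flag = m[1] ?? "";
    const value = m[2];
    // "-" sends audit events to stdout; any other path is a file.
    if (flag === "audit-log-path") return value !== "-";
    return FILE_AUDIT_FLAGS.indexOf(flag) !== -1;
  });
}
```

A test can then assert that `fileAuditArgs(installArgs)` is empty, where
`installArgs` stands for whatever argument list the installer generates.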
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>