diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..81cdb63
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,19 @@
+## Skill routing
+
+When the user's request matches an available skill, ALWAYS invoke it using the Skill
+tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
+The skill has specialized workflows that produce better results than ad-hoc answers.
+
+Key routing rules:
+- Product ideas, "is this worth building", brainstorming → invoke gstack-office-hours
+- Bugs, errors, "why is this broken", 500 errors → invoke gstack-investigate
+- Ship, deploy, push, create PR → invoke gstack-ship
+- QA, test the site, find bugs → invoke gstack-qa
+- Code review, check my diff → invoke gstack-review
+- Update docs after shipping → invoke gstack-document-release
+- Weekly retro → invoke gstack-retro
+- Design system, brand → invoke gstack-design-consultation
+- Visual audit, design polish → invoke gstack-design-review
+- Architecture review → invoke gstack-plan-eng-review
+- Save progress, checkpoint, resume → invoke gstack-checkpoint
+- Code quality, health check → invoke gstack-health
diff --git a/TODOS.md b/TODOS.md
new file mode 100644
index 0000000..4cac586
--- /dev/null
+++ b/TODOS.md
@@ -0,0 +1,47 @@
+# TODOS
+
+## P1 — Ship with Phase 1
+
+### v2.0 Architecture Document Update
+Update `bastion/docs/ARCHITECTURE.md` to cover v2.0: driver model, fleet system,
+Pulumi integration, Vault secrets, Deno evaluator, new CLI grammar. The existing
+doc covers v1.0 comprehensively (432 lines). v2.0 adds 5+ major subsystems.
+**Effort:** M (human: 1 week / CC: 1-2 days)
+**Depends on:** Phase 1 complete
+**Source:** CEO review 2026-04-01
+
+## P2 — Post-v2.0 Core
+
+### SSH Emergency Mode (scoped)
+SSH-based operations limited to: (1) earliest necessary box provisioning before agent
+is installed, and (2) emergency debugging/fixing operations that can't be done via agent.
+NOT a general-purpose DeploymentTarget alternative. The v1.0 `recheck` and `fix-ssh-root.sh`
+patterns are the model. Agent stays the primary management path.
+**Effort:** S (human: 1 week / CC: 1 day)
+**Depends on:** Phase 2 complete (DeploymentTarget interface exists)
+**Source:** CEO review 2026-04-01
+
+### Prometheus Metrics Endpoint
+Add `/metrics` endpoint to labd: resource counts by status, apply duration histograms,
+driver operation latency, fleet pipeline completion rates. Standard Prometheus scraping
+for Grafana dashboards and alerting.
+**Effort:** S (human: 2-3 days / CC: 2-3 hours)
+**Depends on:** Phase 1 (labd exists with resource store)
+**Source:** CEO review 2026-04-01 (observability gap)
+
+## P3 — Future Enhancements
+
+### Infrastructure Graph Visualization
+Visual representation of resource dependencies, environment topology, fleet status.
+Could be a web UI or terminal-based (like `kubectl tree`).
+**Source:** CEO review 2026-04-01
+
+### `labctl import` for Existing Cloud Resources
+Discover and import existing AWS/GCP resources into the state store.
+Pulumi's import functionality could be leveraged.
+**Source:** CEO review 2026-04-01
+
+### Built-in Secrets Rotation
+Automatic rotation of managed secrets (database passwords, API keys).
+Vault handles rotation but a labctl-native workflow could simplify.
+**Source:** CEO review 2026-04-01