michal/mcpctl

feat: Kubernetes operator for MCP server management #47

Merged

michal merged 7 commits from feat/k8s-operator into main

2026-04-09 22:46:22 +00:00

Author	SHA1	Message	Date
Michal	016f8abe68	fix: accurate instance status — STARTING until pod is actually running Instance status now reflects actual container state: - startOne() sets STARTING (not RUNNING) after container creation - syncStatus() promotes STARTING→RUNNING when pod is ready - syncStatus() demotes RUNNING→STARTING if pod restarts (CrashLoop) - External servers still get RUNNING immediately (no container) Previously, CrashLooping pods showed as RUNNING in mcpctl get instances. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 23:45:10 +01:00
Michal	1bd5087052	fix: add prompts/templates to backup + STDIO attach for docker-image servers Two bugs fixed: 1. Backup completeness: JSON backup API now includes prompts and templates. Previously these were silently dropped during backup/restore, causing data loss on migration. 2. STDIO proxy for docker-image servers: servers with dockerImage but no packageName/command (like docmost) now use k8s Attach to connect to the container's PID 1 stdin/stdout instead of exec. This fixes "has no packageName or command" errors. Changes: - backup-service.ts: add BackupPrompt/BackupTemplate types, export them - restore-service.ts: restore prompts (with project FK) and templates - mcp-proxy-service.ts: sendViaPersistentAttach for docker-image STDIO - orchestrator.ts: add attachInteractive to McpOrchestrator interface - kubernetes-orchestrator.ts: implement attachInteractive via k8s Attach - k8s-client-official.ts: expose Attach client Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 23:37:16 +01:00
Michal	d293df738a	feat: automatic reconciliation loop for MCP server instances mcpd now runs a periodic reconcileAll() every 30s that: - Detects crashed/missing containers (syncStatus) - Cleans up ERROR instances - Creates replacement pods to match desired replica count This replaces the old syncStatus-only timer. Servers migrated from another deployment or recovering from node failures will automatically get their instances recreated. 6 new tests for reconcileAll covering: missing instances, skip replicas=0, already-at-count, ERROR cleanup, multi-server, error isolation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 19:00:19 +01:00
Michal	14be2fa18e	feat: nodeSelector for MCP server pods + restore fix - Add MCPD_NODE_SELECTOR env var support in manifest generator for mixed-arch clusters (e.g. arm64+amd64) - Fix backup restore: resolve system user ID instead of hardcoded 'system' string Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 13:04:34 +01:00
Michal	3663963a32	fix: resolve system user ID in backup restore for projects The restore service hardcoded ownerId as the literal string 'system' instead of looking up the actual system user ID. This caused FK constraint violations when restoring projects to a fresh database. Now resolves the system user by email, falling back to the first available user. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:04:32 +01:00
Michal	5e45960a18	feat: add Kubernetes orchestrator for MCP server pod management mcpd can now deploy MCP server instances as Kubernetes pods instead of Docker containers. Set MCPD_ORCHESTRATOR=kubernetes to enable. - Add @kubernetes/client-node with thin wrapper (context enforcement via MCPD_K8S_CONTEXT to prevent multi-cluster mishaps) - Rewrite KubernetesOrchestrator: pod CRUD, pod IP extraction, exec via SPDY (one-shot + interactive), log streaming - Manifest generator: stdin:true for STDIO servers, args (not command) to preserve runner image entrypoint, security hardening - Orchestrator selection in main.ts via MCPD_ORCHESTRATOR env var - 25 unit tests for k8s orchestrator, all 624 tests pass Tested end-to-end on local k3s: - mcpd deployed via Pulumi, creates pods in mcpctl-servers namespace - NetworkPolicy verified: only mcpd can reach MCP server pods - Python runner (uvx) successfully runs aws-documentation-mcp-server Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 01:55:13 +01:00
Michal	f409952b0c	chore: add gstack skill routing rules to CLAUDE.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 01:33:56 +01:00
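
Note on 016f8abe68: the status transitions described in that commit can be sketched as a small pure function. This is an illustration, not the code in the PR. The names `PodSnapshot`, `initialStatus`, and `nextStatus` are hypothetical, and treating "pod restarts" as "pod is no longer ready" is an assumption made for the sketch.

```typescript
// Only the two statuses named in the commit message are modelled here.
type InstanceStatus = "STARTING" | "RUNNING";

// Hypothetical view of the pod as seen by a status sync pass.
interface PodSnapshot {
  ready: boolean; // assumption: false while the pod is restarting or crash-looping
}

// Status assigned right after an instance is created.
// External servers have no container, so they are RUNNING immediately.
function initialStatus(isExternal: boolean): InstanceStatus {
  return isExternal ? "RUNNING" : "STARTING";
}

// One sync step: promote when the pod becomes ready, demote when it stops being ready.
function nextStatus(current: InstanceStatus, pod: PodSnapshot): InstanceStatus {
  if (current === "STARTING" && pod.ready) return "RUNNING";
  if (current === "RUNNING" && !pod.ready) return "STARTING";
  return current;
}
```

With this shape, a crash-looping pod never stays reported as RUNNING for longer than one sync interval, which is the behaviour the commit message describes.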
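
Note on d293df738a: the reconciliation pass (clean up ERROR instances, then create replacements up to the desired replica count) might look roughly like the planning function below. It is a sketch under assumptions: `ServerSpec`, `Instance`, `ReconcilePlan`, and `planReconcile` are hypothetical names, and the real `reconcileAll()` also runs `syncStatus` and talks to the orchestrator, which is omitted here.

```typescript
interface ServerSpec {
  id: string;
  replicas: number; // desired instance count
}

interface Instance {
  id: string;
  serverId: string;
  status: "STARTING" | "RUNNING" | "ERROR";
}

interface ReconcilePlan {
  remove: string[];                                // ids of ERROR instances to clean up
  create: { serverId: string; count: number }[];   // replacements needed per server
}

// Compute what one reconcile pass would do, without performing any side effects.
function planReconcile(servers: ServerSpec[], instances: Instance[]): ReconcilePlan {
  const plan: ReconcilePlan = { remove: [], create: [] };
  for (const server of servers) {
    const own = instances.filter((i) => i.serverId === server.id);
    const errored = own.filter((i) => i.status === "ERROR");
    plan.remove.push(...errored.map((i) => i.id));

    // Instances that still count toward the desired replica total.
    const surviving = own.length - errored.length;
    const missing = server.replicas - surviving;
    // replicas = 0 and already-at-count servers both yield missing <= 0, so nothing is created.
    if (missing > 0) plan.create.push({ serverId: server.id, count: missing });
  }
  return plan;
}
```

Keeping the planning step pure makes the cases listed in the commit (missing instances, replicas=0, already at count, ERROR cleanup, multiple servers) easy to test without a cluster. Error isolation between servers would belong in the code that executes the plan.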
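
Note on 5e45960a18 and 14be2fa18e: a minimal sketch of how orchestrator selection and the manifest options mentioned in those commits could fit together. Several details are assumptions rather than facts from the PR: the `key=value,key=value` format for `MCPD_NODE_SELECTOR`, the `"docker"` fallback label, the image name, and all function and type names. The pod fields used (`nodeSelector`, `args`, `stdin`) are standard Kubernetes Pod spec fields.

```typescript
type Env = Record<string, string | undefined>;

// MCPD_ORCHESTRATOR=kubernetes enables the Kubernetes orchestrator; anything else keeps Docker.
function selectOrchestrator(env: Env): "kubernetes" | "docker" {
  return env["MCPD_ORCHESTRATOR"] === "kubernetes" ? "kubernetes" : "docker";
}

// Assumed format: "kubernetes.io/arch=amd64,disktype=ssd". Malformed pairs are skipped.
function parseNodeSelector(raw: string | undefined): Record<string, string> {
  const selector: Record<string, string> = {};
  for (const pair of (raw ?? "").split(",")) {
    const idx = pair.indexOf("=");
    if (idx <= 0) continue;
    selector[pair.slice(0, idx).trim()] = pair.slice(idx + 1).trim();
  }
  return selector;
}

interface ServerDef {
  name: string;
  image: string;
  args: string[];
  transport: "STDIO" | "HTTP";
}

interface PodManifest {
  apiVersion: "v1";
  kind: "Pod";
  metadata: { name: string; namespace: string };
  spec: {
    nodeSelector?: Record<string, string>;
    containers: { name: string; image: string; args: string[]; stdin: boolean }[];
  };
}

function buildPodManifest(server: ServerDef, env: Env): PodManifest {
  const nodeSelector = parseNodeSelector(env["MCPD_NODE_SELECTOR"]);
  const manifest: PodManifest = {
    apiVersion: "v1",
    kind: "Pod",
    metadata: { name: server.name, namespace: "mcpctl-servers" },
    spec: {
      containers: [
        {
          name: "server",
          image: server.image,
          // args rather than command, so the runner image's entrypoint is preserved
          args: server.args,
          // STDIO servers need stdin kept open
          stdin: server.transport === "STDIO",
        },
      ],
    },
  };
  if (Object.keys(nodeSelector).length > 0) manifest.spec.nodeSelector = nodeSelector;
  return manifest;
}
```

The security hardening mentioned in 5e45960a18 is left out because the commit message does not say which settings it applies.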
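
Note on 3663963a32: the owner lookup described there (resolve the system user by email, fall back to the first available user) can be illustrated as follows. `User`, `resolveOwnerId`, and the `systemEmail` parameter are hypothetical, and throwing when no users exist is an assumption about the empty-database case, which the commit message does not cover.

```typescript
interface User {
  id: string;
  email: string;
}

// Return a real user id to use as ownerId, instead of the literal string 'system'.
function resolveOwnerId(users: User[], systemEmail: string): string {
  const systemUser = users.find((u) => u.email === systemEmail);
  const owner = systemUser ?? users[0]; // fall back to the first available user
  if (!owner) {
    // Assumption: with no users at all there is nothing valid to reference.
    throw new Error("cannot restore projects: no users exist");
  }
  return owner.id;
}
```

Because the returned value is always the id of an existing row, the foreign-key constraint on restored projects is satisfied on a fresh database.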