Commit Graph

41 Commits

Author SHA1 Message Date
Michal
515206685b feat(openbao): kubernetes ServiceAccount auth — no static token in DB
Some checks failed
CI/CD / lint (push) Successful in 52s
CI/CD / test (push) Successful in 1m5s
CI/CD / typecheck (push) Successful in 2m8s
CI/CD / smoke (push) Failing after 3m38s
CI/CD / build (push) Successful in 4m15s
CI/CD / publish (push) Has been skipped
Why: requiring a static OpenBao root token to live on the plaintext backend
(even if only for the initial bootstrap) is the weakest link in the chain.
With the Kubernetes auth method enabled on the bao side, mcpd's pod can
authenticate using its own projected SA token, exchange it for a short-lived
Vault client token, and keep the database free of any vault credentials at all.

Driver changes (src/mcpd/src/services/secret-backends/openbao.ts):
- New `OpenBaoConfig.auth = 'token' | 'kubernetes'`. Defaults to 'token' so
  existing rows keep working. Both shapes share url + mount + pathPrefix +
  namespace; auth-specific fields are mutually exclusive in the config schema.
- Kubernetes auth flow: read JWT from /var/run/secrets/.../token, POST to
  /v1/auth/<authMount>/login {role, jwt}, cache the returned client_token
  for `lease_duration - 60s` (grace window), then re-login.
- One-shot 403-retry: if a request comes back 403 (revoked / clock skew),
  purge cache and retry the original request once with a fresh login.
- Reads + writes go through the same getToken() path so token-auth is
  unchanged for existing deployments.
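
The lease-based caching and one-shot 403 retry described above could look
roughly like the sketch below. It is not the driver's code: `LoginResult`,
`LoginFn` and `CachedTokenSource` are hypothetical names, the login call is
injected rather than doing real HTTP, and the clock is injectable so the
window can be tested.

```typescript
// Hypothetical names throughout; a sketch of the caching idea, not the real driver.
interface LoginResult {
  clientToken: string;
  leaseDurationSec: number;
}

type LoginFn = () => Promise<LoginResult>;

const GRACE_SEC = 60; // re-login this many seconds before the lease ends

class CachedTokenSource {
  private token: string | null = null;
  private expiresAtMs = 0;
  private login: LoginFn;
  private now: () => number;

  constructor(login: LoginFn, now: () => number = Date.now) {
    this.login = login;
    this.now = now;
  }

  // Return the cached token while it is inside its usable window, else log in again.
  async getToken(): Promise<string> {
    if (this.token !== null && this.now() < this.expiresAtMs) return this.token;
    const res = await this.login();
    const usableSec = Math.max(0, res.leaseDurationSec - GRACE_SEC);
    this.token = res.clientToken;
    this.expiresAtMs = this.now() + usableSec * 1000;
    return res.clientToken;
  }

  purge(): void {
    this.token = null;
    this.expiresAtMs = 0;
  }

  // Send a request; on a 403, purge the cache and retry exactly once.
  async withRetry(
    send: (token: string) => Promise<{ status: number }>,
  ): Promise<{ status: number }> {
    const first = await send(await this.getToken());
    if (first.status !== 403) return first;
    this.purge();
    return send(await this.getToken());
  }
}
```

A second 403 is returned to the caller as-is, which keeps the retry bounded.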

CLI (src/cli/src/commands/create.ts):
- `mcpctl create secretbackend bao --type openbao --auth kubernetes \
     --url https://bao.example:8200 --role mcpctl`
- Optional `--auth-mount` (default 'kubernetes') + `--sa-token-path` (default
  the standard projected-token path) for non-default deployments.
- Token-auth path unchanged: `--auth token --token-secret SECRET/KEY`
  (or omit `--auth` since 'token' is the default).

Validation (factory.ts) gates on the auth strategy: each path enforces its
own required fields and produces a clear error if misconfigured.
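
As an illustration of gating validation on the auth strategy, here is a
minimal sketch. The function name and error strings are assumptions, and the
real factory.ts may well use a schema library instead; `tokenSecretRef` and
`role` are the per-strategy fields named in this commit.

```typescript
// Hypothetical validator; field names beyond auth/url/role/tokenSecretRef are not implied.
type OpenBaoConfigInput = {
  url?: string;
  auth?: string;
  tokenSecretRef?: string; // token auth only
  role?: string; // kubernetes auth only
};

function validateOpenBaoConfig(cfg: OpenBaoConfigInput): string[] {
  const errors: string[] = [];
  const auth = cfg.auth ?? "token"; // 'token' is the default
  if (!cfg.url) errors.push("url is required");
  if (auth === "token") {
    if (!cfg.tokenSecretRef) errors.push("auth 'token' requires tokenSecretRef");
    if (cfg.role) errors.push("role is only valid with auth 'kubernetes'");
  } else if (auth === "kubernetes") {
    if (!cfg.role) errors.push("auth 'kubernetes' requires role");
    if (cfg.tokenSecretRef) errors.push("tokenSecretRef is only valid with auth 'token'");
  } else {
    errors.push(`unknown auth strategy '${auth}'`);
  }
  return errors;
}
```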

Tests: 6 new k8s-auth unit cases (login wire shape, lease-based caching,
custom authMount, 403-on-login, missing-role rejection, missing-tokenSecretRef
rejection). Full suite 1859/1859. Completions regenerated for the new flags.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:23:05 +01:00
Michal
4d8ee23d0e feat(mcplocal): RBAC-bounded vllm-managed failover + name-based llm lookup
Why: when mcpd's inference proxy is unreachable, clients with a local
vllm-managed provider should be able to substitute — but only if they still
have view permission on the centralized Llm. Otherwise revoking an Llm
wouldn't actually stop a misbehaving client.

Infrastructure (the agent + mcplocal HTTP-mode wire-up will land separately
when those clients pivot to mcpd's proxy):

- LlmProviderFileEntry gains optional `failoverFor: <central llm name>`. The
  entry is otherwise the same local provider it always was; the new field
  just declares which central Llm it can substitute for.
- ProviderRegistry tracks a failover map (registerFailover / getFailoverFor /
  listFailovers). Unregister removes any failover entry pointing at the
  removed provider so we don't end up with dangling references.
- New FailoverRouter wraps a primary inference call. On primary failure: if
  a local provider is registered for the Llm, HEAD-probe `mcpd /api/v1/llms/
  :name` with the caller's bearer to verify view permission, then either
  invoke the local provider (allowed) or re-throw the primary error (403,
  401, network unreachable, anything else — all fail-closed).
- Server: GET /api/v1/llms/:idOrName accepts both CUID and human name. Lets
  FailoverRouter probe by name without a separate id-resolution call. HEAD
  derives automatically from GET in Fastify, which runs the same RBAC hook
  and drops the body — exactly what the probe needs.
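
The fail-closed decision flow can be sketched as below. This is an
illustration only: `inferWithFailover`, `ProbeFn` and the parameter shapes
are hypothetical, and the probe is injected instead of issuing a real HEAD
request against mcpd.

```typescript
// Hypothetical signature; the probe resolves to the HTTP status of the HEAD request.
type ProbeFn = (llmName: string, bearer: string) => Promise<number>;

async function inferWithFailover<T>(
  llmName: string,
  bearer: string,
  primary: () => Promise<T>,
  localFor: (name: string) => (() => Promise<T>) | undefined,
  probe: ProbeFn,
): Promise<T> {
  try {
    return await primary();
  } catch (primaryErr) {
    const local = localFor(llmName);
    if (local === undefined) throw primaryErr; // no failover registered
    let status: number;
    try {
      status = await probe(llmName, bearer);
    } catch {
      throw primaryErr; // mcpd unreachable: fail closed
    }
    if (status < 200 || status >= 300) throw primaryErr; // 401, 403, anything else
    return local(); // view permission confirmed
  }
}
```

Only a 2xx probe unlocks the local provider; every other outcome surfaces
the original primary error.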

Tests: 11 failover unit tests (registry map, decision flow, fail-closed for
forbidden + unreachable, checkAuth status mapping) + 4 new route tests
(name lookup, HEAD existing/missing). Full suite 1844/1844 (+14 from Phase
2's 1830). TypeScript clean across mcpd + mcplocal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 13:05:43 +01:00
Michal
23f53a0798 feat(mcpd): inference proxy — POST /api/v1/llms/:name/infer
Why: the point of the Llm resource (Phase 1) is that credentials never leave
the server. This lands the proxy: clients POST OpenAI chat/completions to
mcpd, mcpd attaches the provider API key server-side, and the response
streams back as OpenAI-format SSE.

Design:
- Wire format client-side is always OpenAI chat/completions — every existing
  SDK speaks it. Adapters translate on the provider side.
- `openai | vllm | deepseek | ollama` → pure passthrough (they already speak
  OpenAI). `anthropic` → translator to/from Anthropic Messages API
  (system-string extraction, content-block flattening, SSE event remap).
- Plain fetch; no @anthropic-ai/sdk dep. Consistent with the OpenBao driver
  shape and keeps the proxy layer thin.
- `gemini-cli` intentionally rejected — subprocess providers need extra
  lifecycle plumbing; deferred to a follow-up.
- Streaming: adapters yield `StreamingChunk`s; the route frames them as
  `data: <json>\n\n` + terminal `data: [DONE]\n\n` so any OpenAI client
  works unchanged.
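
For reference, the framing described in the last bullet amounts to something
like the following sketch. `frameSse` is a hypothetical helper name, and the
chunks are taken as an already-collected array rather than a live stream.

```typescript
// Hypothetical helper: frames chunks as OpenAI-style SSE with a terminal [DONE].
function frameSse(chunks: unknown[]): string {
  const frames = chunks.map((chunk) => `data: ${JSON.stringify(chunk)}\n\n`);
  return frames.join("") + "data: [DONE]\n\n";
}
```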

RBAC:
- New URL special-case in mapUrlToPermission: `POST /api/v1/llms/:name/infer`
  → `run:llms:<name>` (not the default create:llms). Users need an explicit
  `{role: 'run', resource: 'llms', [name: X]}` binding to call infer.
- Possession of `edit:llms` does NOT imply `run` — keeps catalogue
  management separate from spend.
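
A minimal sketch of the URL special-case, assuming the permission is
expressed as a `role:resource:name` string as written above. The function
name `mapInferPermission` is hypothetical; the real mapUrlToPermission covers
every route, while this only shows the infer branch.

```typescript
// Hypothetical helper: returns null when the request is not an infer call,
// so the caller can fall through to the default method-based mapping.
function mapInferPermission(method: string, url: string): string | null {
  const match = /^\/api\/v1\/llms\/([^/?]+)\/infer(?:\?.*)?$/.exec(url);
  if (method !== "POST" || match === null) return null;
  return `run:llms:${decodeURIComponent(match[1] ?? "")}`;
}
```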

Audit: route emits an `llm_inference_call` event per request (llm name,
model, user/tokenSha, streaming, duration, status). main.ts wires it to the
structured logger for now; hook is in place for a richer audit sink later.

Tests:
- 11 adapter tests (passthrough POST shape + default URLs + no-auth ollama +
  SSE forwarding; anthropic translate request/response + non-2xx wrap + SSE
  event translation; registry dispatch + caching + unsupported-provider).
- 7 route tests (404, 400, non-streaming dispatch + audit, apiKey failure,
  null apiKeyRef path, streaming SSE output, 502 on adapter error).
- Full suite 1830/1830 (+18 from Phase 1's 1812). TypeScript clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 22:43:55 +01:00
Michal
6ff90a8228 feat(mcpd): Llm resource — CRUD + CLI + apply
Why: every client that wants an LLM (the agent, HTTP-mode mcplocal, Claude
Code's STDIO mcplocal) today has to know the provider URL + key, and each
user's ~/.mcpctl/config.json carries them. Centralising the catalogue on the
server is the prerequisite for Phase 2 (mcpd proxies inference so credentials
never leave the cluster).

This phase adds the `Llm` resource and its CRUD surface — no proxy yet, no
client pivot yet. Just enough to register what you have.

Schema:
- New `Llm` model: name/type/model/url/tier/description + {apiKeySecretId,
  apiKeySecretKey} FK pair. Reverse `llms` relation on Secret.
- Provider types: anthropic | openai | deepseek | vllm | ollama | gemini-cli.
- Tiers: fast | heavy.

mcpd:
- LlmRepository + LlmService + Zod validation schema + /api/v1/llms routes.
- API surface exposes `apiKeyRef: {name, key}` — the service translates to/
  from the FK pair so clients never deal in cuids.
- `resolveApiKey(llmName)` reads through SecretService (which itself dispatches
  to the right SecretBackend). That's the hook Phase 2's inference proxy uses.
- RBAC: added `'llms'` to RBAC_RESOURCES + resource alias. Standard
  view/create/edit/delete semantics.
- Wired into main.ts (repo, service, routes).

CLI:
- `mcpctl create llm <name> --type X --model Y --tier fast|heavy --api-key-ref SECRET/KEY [--url ...] [--extra k=v ...]`
- `mcpctl get|describe|delete llm` — standard resource verbs.
- `mcpctl apply -f` with `kind: llm` (single- or multi-doc yaml/json).
  Applied after secrets, before servers — apiKeyRef resolves an existing Secret.
- Shell completions regenerated.

Tests: 11 service unit tests + 9 route tests (happy path, 404s, 409, validation).
Full suite 1812/1812 (+20 from the 1792 Phase 0 baseline). TypeScript clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:28:43 +01:00
Michal
029c3d5f34 feat(mcpd): pluggable SecretBackend abstraction + OpenBao driver + migrate
All checks were successful
CI/CD / typecheck (pull_request) Successful in 51s
CI/CD / lint (pull_request) Successful in 1m47s
CI/CD / test (pull_request) Successful in 1m3s
CI/CD / smoke (pull_request) Successful in 4m34s
CI/CD / build (pull_request) Successful in 3m50s
CI/CD / publish (pull_request) Has been skipped
Why: API keys live in Postgres as plaintext JSON. A DB read exposes every
credential in the system. Before centralising more secrets (LLM keys, etc.)
we want to be able to point at an external KV store and drop DB access to
sensitive rows.

New model:
- `SecretBackend` resource (CRUD + isDefault invariant) owns how a secret is
  stored. `Secret` gains `backendId` FK and `externalRef`. Reads/writes
  dispatch through a driver.
- `plaintext` driver (near-noop, uses existing Secret.data column) is seeded
  as the `default` row at startup. Acts as trust root / bootstrap.
- `openbao` driver (also HashiCorp Vault KV v2 compatible) talks plain HTTP,
  no SDK dependency. Auth via static token pulled from a plaintext-backed
  `Secret` through the injected SecretRefResolver. Caches resolved token.
- `SecretMigrateService` moves secrets one-at-a-time: read → write dest →
  flip row → best-effort source delete. Interrupted runs are idempotent
  (skips secrets already on destination).
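
The per-secret migration order (read, write destination, flip row,
best-effort source delete) can be illustrated with in-memory stores. This is
a sketch under simplifying assumptions: `SecretRow`, `Store` and
`migrateSecrets` are hypothetical, everything is synchronous, and the real
service talks to drivers and the database instead.

```typescript
// Hypothetical in-memory model of the migration loop.
interface SecretRow {
  name: string;
  backend: string;
}
type Store = Map<string, string>; // secret name -> payload

function migrateSecrets(
  rows: SecretRow[],
  stores: Record<string, Store>,
  from: string,
  to: string,
  keepSource = false,
): string[] {
  const src = stores[from];
  const dst = stores[to];
  if (src === undefined || dst === undefined) throw new Error("unknown backend");
  const moved: string[] = [];
  for (const row of rows) {
    if (row.backend !== from) continue; // already migrated: reruns skip it
    const payload = src.get(row.name); // 1. read from source
    if (payload === undefined) continue;
    dst.set(row.name, payload); // 2. write destination
    row.backend = to; // 3. flip the row
    if (!keepSource) {
      try {
        src.delete(row.name); // 4. best-effort source delete
      } catch {
        // a failed delete leaves a stale copy but does not undo the move
      }
    }
    moved.push(row.name);
  }
  return moved;
}
```

Because the row is flipped only after the destination write, an interrupted
run can be repeated and will pick up where it stopped.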

CLI surface:
- `mcpctl create|get|describe|delete secretbackend` + `--default` on create.
- `mcpctl migrate secrets --from X --to Y [--names a,b] [--keep-source] [--dry-run]`
- `apply -f` round-trips secretbackends (yaml/json multi-doc + grouped).
- RBAC: `secretbackends` resource + `run:migrate-secrets` operation.
- Fish + bash completions regenerated.

docs/secret-backends.md covers the OpenBao policy, chicken-and-egg auth flow,
and the migration semantics.

Broke the circular dep (OpenBao needs SecretService to resolve its own token,
SecretService needs SecretBackendService) with a deferred-resolver bridge in
mcpd startup. 11 new driver unit tests; existing env-resolver/secret-route/
backup tests updated for the new service signatures. Full suite: 1792/1792.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:29:55 +01:00
Michal
a151b2e756 feat: mcpctl mcptoken verbs + mcpd auth dispatch + audit plumbing
Adds the end-to-end CLI surface for McpTokens and the mcpd auth dispatch
that recognizes them.

mcpd auth middleware:
  - Dispatch on the `mcpctl_pat_` bearer prefix. McpToken bearers resolve
    through a new `findMcpToken(hash)` dep, populating `request.mcpToken`
    and `request.userId = ownerId`. Everything else follows the existing
    session path.
  - Returns 401 for revoked / expired / unknown tokens.
  - Global RBAC hook now threads `mcpTokenSha` into `canAccess` /
    `canRunOperation` / `getAllowedScope`, and enforces a hard
    project-scope check: a McpToken principal can only hit
    `/api/v1/projects/<its-project>/...`.
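
The prefix dispatch and the hard project-scope check might look roughly like
this. The helper names `classifyBearer` and `projectScopeAllows` are
hypothetical, and whether the URL segment carries the project name or id is
an assumption left open here.

```typescript
const TOKEN_PREFIX = "mcpctl_pat_";

// Hypothetical: decide which auth path a bearer takes.
function classifyBearer(bearer: string): "mcptoken" | "session" {
  return bearer.startsWith(TOKEN_PREFIX) ? "mcptoken" : "session";
}

// Hypothetical: a McpToken principal may only hit URLs under its own project.
function projectScopeAllows(tokenProject: string, url: string): boolean {
  const path = url.split("?")[0] ?? "";
  const prefix = `/api/v1/projects/${encodeURIComponent(tokenProject)}`;
  return path === prefix || path.startsWith(prefix + "/");
}
```

Matching on `prefix + "/"` rather than the bare prefix stops a token for
`demo` from reaching a sibling project such as `demo2`.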

CLI verbs:
  - `mcpctl create mcptoken <name> -p <proj> [--rbac empty|clone]
    [--bind role:view,resource:servers] [--ttl 30d|never|ISO]
    [--description ...] [--force]` — returns the raw token once.
  - `mcpctl get mcptokens [-p <proj>]` — table with
    NAME/PROJECT/PREFIX/CREATED/LAST USED/EXPIRES/STATUS.
  - `mcpctl get mcptoken <name> -p <proj>` and
    `mcpctl describe mcptoken <name> -p <proj>` — describe surfaces the
    auto-created RBAC bindings.
  - `mcpctl delete mcptoken <name> -p <proj>`.
  - `apply -f` support with `kind: mcptoken`. Tokens are immutable, so
    apply creates if missing and skips if the name is already active.

Audit plumbing:
  - `AuditEvent` / collector now carry optional `tokenName` / `tokenSha`.
    `setSessionMcpToken` sits alongside `setSessionUserName`; both feed a
    per-session principal map used at emit time.
  - `AuditEventService` query accepts `tokenName` / `tokenSha` filters.
  - Console `AuditEvent` type carries the new fields so a follow-up can
    add a TOKEN column.

Completions regenerated. 1764/1764 tests pass workspace-wide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 01:12:43 +01:00
Michal
2ddb493bb0 feat(mcpd): McpToken schema + CRUD routes + introspection
Adds a new McpToken Prisma model (project-scoped, SHA-256 hashed at rest,
optional expiry, revocable) plus backing repository, service, and REST
routes. Tokens are a first-class RBAC subject: new 'McpToken' kind is
added to the subject enum and the service auto-creates an RbacDefinition
with subject McpToken:<sha> when bindings are provided.

Creator-permission ceiling: the service rejects any requested binding
the creator cannot already satisfy themselves (re-uses
rbacService.canAccess / canRunOperation). rbacMode=clone snapshots the
creator's full permissions into the token.

Routes:
  POST   /api/v1/mcptokens              create (returns raw token once)
  GET    /api/v1/mcptokens              list (filter by project)
  GET    /api/v1/mcptokens/:id          describe (no secret in response)
  POST   /api/v1/mcptokens/:id/revoke   soft-delete + remove RbacDef
  DELETE /api/v1/mcptokens/:id          hard-delete
  GET    /api/v1/mcptokens/introspect   validate raw bearer (used by mcplocal)

Extends AuditEvent with optional tokenName/tokenSha fields (indexed) so
token-driven activity can be filtered later. Adds token helpers in
@mcpctl/shared: TOKEN_PREFIX='mcpctl_pat_', generateToken, hashToken,
isMcpToken, timingSafeEqualHex.

Follow-up PRs add the auth-hook dispatch on the prefix, the CLI verbs,
and the HTTP-mode mcplocal that calls /introspect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 01:00:04 +01:00
Michal
3149ea3ae7 fix: MCP proxy resilience — discovery cache, default liveness probes
Some checks failed
CI/CD / lint (push) Successful in 52s
CI/CD / typecheck (push) Successful in 1m51s
CI/CD / test (push) Successful in 1m1s
CI/CD / smoke (push) Failing after 3m21s
CI/CD / build (push) Successful in 4m9s
CI/CD / publish (push) Has been skipped
Adds a per-server tools/list cache in McpRouter (positive + negative TTL)
so a slow or dead upstream only stalls the first discovery call, not every
subsequent client request. Invalidated on upstream add/remove.
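
A positive plus negative TTL cache of this kind could be sketched as
follows. `DiscoveryCache` and its method names are hypothetical, the TTL
values are left to the caller, and the clock is injectable for testing; the
McpRouter implementation may differ.

```typescript
// Hypothetical cache: successes and failures are both remembered, with separate TTLs.
type CacheEntry<T> =
  | { ok: true; value: T; expiresAt: number }
  | { ok: false; error: string; expiresAt: number };

class DiscoveryCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private positiveTtlMs: number;
  private negativeTtlMs: number;
  private now: () => number;

  constructor(positiveTtlMs: number, negativeTtlMs: number, now: () => number = Date.now) {
    this.positiveTtlMs = positiveTtlMs;
    this.negativeTtlMs = negativeTtlMs;
    this.now = now;
  }

  // Returns a live entry, or undefined when the caller must ask the upstream.
  lookup(server: string): CacheEntry<T> | undefined {
    const entry = this.entries.get(server);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(server);
      return undefined;
    }
    return entry;
  }

  rememberSuccess(server: string, value: T): void {
    this.entries.set(server, { ok: true, value, expiresAt: this.now() + this.positiveTtlMs });
  }

  rememberFailure(server: string, error: string): void {
    this.entries.set(server, { ok: false, error, expiresAt: this.now() + this.negativeTtlMs });
  }

  // Called when an upstream is added or removed.
  invalidate(server: string): void {
    this.entries.delete(server);
  }
}
```

A short negative TTL lets a dead upstream fail fast for a while without
hiding its recovery for long.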

Health probes now apply a default liveness spec (tools/list via the real
production path) to any RUNNING instance without an explicit healthCheck,
so synthetic and real failures converge on the same signal.

Includes supporting updates in mcpd-client, discovery, upstream/mcpd,
seeder, and fulldeploy/release scripts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 00:48:57 +01:00
Michal
9ff2dcc3d9 fix: actually wire STDIO attach for docker-image MCP servers
All checks were successful
CI/CD / typecheck (pull_request) Successful in 52s
CI/CD / lint (pull_request) Successful in 1m43s
CI/CD / test (pull_request) Successful in 1m2s
CI/CD / build (pull_request) Successful in 1m45s
CI/CD / publish-rpm (pull_request) Has been skipped
CI/CD / publish-deb (pull_request) Has been skipped
CI/CD / smoke (pull_request) Successful in 9m51s
Commit 1bd5087 added attachInteractive to the orchestrator interface
but never hooked it up in mcp-proxy-service — sendViaPersistentAttach
was promised in the commit message but missing from the diff. Servers
with a distroless image whose entrypoint IS the MCP server (gitea-mcp)
ended up needing a bogus `command: [node, dist/index.js]` workaround
that silently failed on every exec, leaving clients with empty tool
lists.

Changes:
- PersistentStdioClient: take a StdioMode discriminated union. Exec
  mode runs a command via execInteractive; attach mode talks to PID 1
  via attachInteractive.
- mcp-proxy-service: dispatch by config — command → exec; packageName
  → exec via runtime runner; dockerImage-only → attach. Error
  serialization no longer drops non-Error objects as "[object Object]".
- templates/gitea.yaml: remove the command workaround; the image CMD
  runs as PID 1 and mcpd attaches.
- Add unit tests covering both modes and the unsupported-orchestrator
  paths.

Also required (separate repo): mcpd's k8s Role needed pods/attach
added alongside pods/exec; updated in kubernetes-deployment/…/mcpctl/server.ts
and kubectl-patched on the live cluster.

Verified end-to-end against mcpctl.ad.itaz.eu:
- gitea (attach): 49 tools listed, real tools/call round-trip.
- aws-docs (exec via packageName): 4 tools, no regression.
- docmost (exec via command): 11 tools, no regression.
- mcpd suite: 634/634 passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:26:26 +01:00
Michal
016f8abe68 fix: accurate instance status — STARTING until pod is actually running
All checks were successful
CI/CD / typecheck (pull_request) Successful in 52s
CI/CD / lint (pull_request) Successful in 1m53s
CI/CD / test (pull_request) Successful in 1m2s
CI/CD / build (pull_request) Successful in 4m0s
CI/CD / smoke (pull_request) Successful in 8m38s
CI/CD / publish-rpm (pull_request) Has been skipped
CI/CD / publish-deb (pull_request) Has been skipped
Instance status now reflects actual container state:
- startOne() sets STARTING (not RUNNING) after container creation
- syncStatus() promotes STARTING→RUNNING when pod is ready
- syncStatus() demotes RUNNING→STARTING if pod restarts (CrashLoop)
- External servers still get RUNNING immediately (no container)
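
The transitions listed above can be summarised in a small sketch. The
function names mirror the ones in this message but the bodies are
illustrative only, and "pod restarts" is modelled here simply as the pod no
longer being ready, which is an assumption.

```typescript
type InstanceStatus = "STARTING" | "RUNNING" | "ERROR";

// External servers have no container, so they are RUNNING from the start.
function initialStatus(external: boolean): InstanceStatus {
  return external ? "RUNNING" : "STARTING";
}

// Sketch of the sync step for container-backed instances only.
function syncStatus(current: InstanceStatus, podReady: boolean): InstanceStatus {
  if (current === "STARTING" && podReady) return "RUNNING"; // promote
  if (current === "RUNNING" && !podReady) return "STARTING"; // demote (e.g. CrashLoop)
  return current;
}
```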

Previously, CrashLooping pods showed as RUNNING in mcpctl get instances.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:45:10 +01:00
Michal
1bd5087052 fix: add prompts/templates to backup + STDIO attach for docker-image servers
Two bugs fixed:

1. Backup completeness: JSON backup API now includes prompts and
   templates. Previously these were silently dropped during
   backup/restore, causing data loss on migration.

2. STDIO proxy for docker-image servers: servers with dockerImage
   but no packageName/command (like docmost) now use k8s Attach
   to connect to the container's PID 1 stdin/stdout instead of
   exec. This fixes "has no packageName or command" errors.

Changes:
- backup-service.ts: add BackupPrompt/BackupTemplate types, export them
- restore-service.ts: restore prompts (with project FK) and templates
- mcp-proxy-service.ts: sendViaPersistentAttach for docker-image STDIO
- orchestrator.ts: add attachInteractive to McpOrchestrator interface
- kubernetes-orchestrator.ts: implement attachInteractive via k8s Attach
- k8s-client-official.ts: expose Attach client

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:37:16 +01:00
Michal
d293df738a feat: automatic reconciliation loop for MCP server instances
mcpd now runs a periodic reconcileAll() every 30s that:
- Detects crashed/missing containers (syncStatus)
- Cleans up ERROR instances
- Creates replacement pods to match desired replica count

This replaces the old syncStatus-only timer. Servers migrated
from another deployment or recovering from node failures will
automatically get their instances recreated.
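
One reconcile pass per server can be thought of as computing a small plan.
The sketch below is an assumption about the shape of that computation, not
the reconcileAll() code: `ServerState`, `ReconcilePlan` and `planReconcile`
are hypothetical names.

```typescript
// Hypothetical planning step: which instances to clean up, how many to create.
interface ServerState {
  name: string;
  replicas: number;
  instances: { id: string; status: "STARTING" | "RUNNING" | "ERROR" }[];
}

interface ReconcilePlan {
  remove: string[];
  create: number;
}

function planReconcile(server: ServerState): ReconcilePlan {
  const remove = server.instances
    .filter((instance) => instance.status === "ERROR")
    .map((instance) => instance.id);
  const surviving = server.instances.length - remove.length;
  return { remove, create: Math.max(0, server.replicas - surviving) };
}
```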

6 new tests for reconcileAll covering: missing instances, skip
replicas=0, already-at-count, ERROR cleanup, multi-server,
error isolation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 19:00:19 +01:00
Michal
5e45960a18 feat: add Kubernetes orchestrator for MCP server pod management
mcpd can now deploy MCP server instances as Kubernetes pods instead of
Docker containers. Set MCPD_ORCHESTRATOR=kubernetes to enable.

- Add @kubernetes/client-node with thin wrapper (context enforcement
  via MCPD_K8S_CONTEXT to prevent multi-cluster mishaps)
- Rewrite KubernetesOrchestrator: pod CRUD, pod IP extraction,
  exec via SPDY (one-shot + interactive), log streaming
- Manifest generator: stdin:true for STDIO servers, args (not command)
  to preserve runner image entrypoint, security hardening
- Orchestrator selection in main.ts via MCPD_ORCHESTRATOR env var
- 25 unit tests for k8s orchestrator, all 624 tests pass

Tested end-to-end on local k3s:
- mcpd deployed via Pulumi, creates pods in mcpctl-servers namespace
- NetworkPolicy verified: only mcpd can reach MCP server pods
- Python runner (uvx) successfully runs aws-documentation-mcp-server

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 01:55:13 +01:00
Michal
7818cb2194 feat: Git-based backup system replacing JSON bundle backup/restore
The DB is the source of truth, with git as a downstream replica. An SSH key
is generated on first start, and all resource mutations are committed as
apply-compatible YAML. Supports manual commit import, conflict resolution
(DB wins), disaster recovery (an empty DB restores from git), and timeline
branches on restore.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 01:14:28 +00:00
Michal
0995851810 feat: remove proxyMode — all traffic goes through mcplocal proxy
proxyMode "direct" was a security hole (leaked secrets as plaintext env
vars in .mcp.json) and bypassed all mcplocal features (gating, audit,
RBAC, content pipeline, namespacing). Removed from schema, API, CLI,
and all tests. Old configs with proxyMode are accepted but silently
stripped via Zod .transform() for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:36:36 +00:00
Michal
86c5a61eaa feat: add userName tracking to audit events
- Add userName column to AuditEvent schema with index and migration
- Add GET /api/v1/auth/me endpoint returning current user identity
- AuditCollector auto-fills userName from session→user map, resolved
  lazily via /auth/me on first session creation
- Support userName and date range (from/to) filtering on audit events
  and sessions endpoints
- Audit console sidebar groups sessions by project → user
- Add date filter presets (d key: all/today/1h/24h/7d) to console
- Add scrolling and page up/down to sidebar navigation
- Tests: auth-me (4), audit-username collector (4), route filters (2),
  smoke tests (2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 00:18:58 +00:00
Michal
5d859ca7d8 feat: audit console TUI, system prompt management, and CLI improvements
Audit Console Phase 1: tool_call_trace emission from mcplocal router,
session_bind/rbac_decision event kinds, GET /audit/sessions endpoint,
full Ink TUI with session sidebar, event timeline, and detail view
(mcpctl console --audit).

System prompts: move 6 hardcoded LLM prompts to mcpctl-system project
with extensible ResourceRuleRegistry validation framework, template
variable enforcement ({{maxTokens}}, {{pageCount}}), and delete-resets-
to-default behavior. All consumers fetch via SystemPromptFetcher with
hardcoded fallbacks.

CLI: -p shorthand for --project across get/create/delete/config commands,
console auto-scroll improvements, shell completions regenerated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 23:50:54 +00:00
Michal
03827f11e4 feat: eager vLLM warmup and smart page titles in paginate stage
- Add warmup() to LlmProvider interface for eager subprocess startup
- ManagedVllmProvider.warmup() starts vLLM in background on project load
- ProviderRegistry.warmupAll() triggers all managed providers
- NamedProvider proxies warmup() to inner provider
- paginate stage generates LLM-powered descriptive page titles when
  available, cached by content hash, falls back to generic "Page N"
- project-mcp-endpoint calls warmupAll() on router creation so vLLM
  is loading while the session initializes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:07:39 +00:00
Michal
69867bd47a feat: mcpctl v0.0.1 — first public release
Comprehensive MCP server management with kubectl-style CLI.

Key features in this release:
- Declarative YAML apply/get round-trip with project cloning support
- Gated sessions with prompt intelligence for Claude
- Interactive MCP console with traffic inspector
- Persistent STDIO connections for containerized servers
- RBAC with name-scoped bindings
- Shell completions (fish + bash) auto-generated
- Rate-limit retry with exponential backoff in apply
- Project-scoped prompt management
- Credential scrubbing from git history

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 17:05:05 +00:00
Michal
d4aa677bfc fix: bootstrap system user before system project (FK constraint)
The system project needs a valid ownerId that references an existing user.
Create a system@mcpctl.local user via upsert before creating the project.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 23:27:59 +00:00
Michal
ecc9c48597 feat: gated project experience & prompt intelligence
Implements the full gated session flow and prompt intelligence system:

- Prisma schema: add gated, priority, summary, chapters, linkTarget fields
- Session gate: state machine (gated → begin_session → ungated) with LLM-powered
  tool selection based on prompt index
- Tag matcher: intelligent prompt-to-tool matching with project/server/action tags
- LLM selector: tiered provider selection (fast for gating, heavy for complex tasks)
- Link resolver: cross-project MCP resource references (project/server:uri format)
- Prompt summary service: LLM-generated summaries and chapter extraction
- System project bootstrap: ensures default project exists on startup
- Structural link health checks: enrichWithLinkStatus on prompt GET endpoints
- CLI: create prompt --priority/--link, create project --gated/--no-gated,
  describe project shows prompts section, get prompts shows PRI/LINK/STATUS
- Apply/edit: priority, linkTarget, gated fields supported
- Shell completions: fish updated with new flags
- 1,253 tests passing across all packages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 23:22:42 +00:00
Michal
b025ade2b0 feat: add prompt resources, fix MCP proxy transport, enrich tool descriptions
- Fix MCP proxy to support SSE and STDIO transports (not just HTTP POST)
- Enrich tool descriptions with server context for LLM clarity
- Add Prompt and PromptRequest resources with two-resource RBAC model
- Add propose_prompt MCP tool for LLM to create pending prompt requests
- Add prompt resources visible in MCP resources/list (approved + session's pending)
- Add project-level prompt/instructions in MCP initialize response
- Add ServiceAccount subject type for RBAC (SA identity from X-Service-Account header)
- Add CLI commands: create prompt, get prompts/promptrequests, approve promptrequest
- Add prompts to apply config schema
- 956 tests passing across all packages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 14:53:00 +00:00
Michal
9badb0e478 feat: add tests.sh runner and project routes integration tests
- tests.sh: run all tests with `bash tests.sh`, summary with `--short`
- tests.sh --filter mcpd/cli: run a specific package
- project-routes.test.ts: 17 new route-level tests covering CRUD,
  attach/detach, and the ownerId filtering bug fix

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 18:57:46 +00:00
Michal
329315ec71 feat: remove ProjectMember, add expose RBAC role, attach/detach-server commands
- Remove ProjectMember model entirely (RBAC manages project access)
- Add 'expose' RBAC role for /mcp-config endpoint access (edit implies expose)
- Rename CLI flags: --llm-provider → --proxy-mode-llm-provider, --llm-model → --proxy-mode-llm-model
- Add attach-server / detach-server CLI commands (mcpctl --project NAME attach-server SERVER)
- Add POST/DELETE /api/v1/projects/:id/servers endpoints for server attach/detach
- Remove members from backup/restore, apply, get, describe
- Prisma migration to drop ProjectMember table

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:50:01 +00:00
Michal
f0faa764e2 fix: RBAC name-scoped access — CUID resolution + list filtering
Two bugs fixed:
- GET /api/v1/servers/:cuid now resolves CUID→name before RBAC check,
  so name-scoped bindings match correctly
- List endpoints now filter responses via preSerialization hook using
  getAllowedScope(), so name-scoped users only see their resources

Also adds fulldeploy.sh orchestrator script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:26:37 +00:00
Michal
ddc95134fb fix: migrate legacy admin role to granular roles at startup
- Add migrateAdminRole() that runs on mcpd boot
- Converts { role: 'admin', resource: X } → edit + run bindings
- Adds operation bindings for wildcard admin (impersonate, logs, etc.)
- Add tests verifying unknown/legacy roles are denied by canAccess
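
The resource-binding half of the conversion can be sketched like this. The
`Binding` shape and `migrateAdminBindings` are hypothetical, and the extra
operation bindings for wildcard admin are deliberately left out.

```typescript
// Hypothetical: each legacy admin binding becomes an edit binding plus a run binding.
interface Binding {
  role: string;
  resource: string;
  name?: string;
}

function migrateAdminBindings(bindings: Binding[]): Binding[] {
  const out: Binding[] = [];
  for (const binding of bindings) {
    if (binding.role !== "admin") {
      out.push(binding);
      continue;
    }
    out.push({ ...binding, role: "edit" }, { ...binding, role: "run" });
  }
  return out;
}
```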

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:31:15 +00:00
Michal
c5147e8270 feat: granular RBAC with resource/operation bindings, users, groups
- Replace admin role with granular roles: view, create, delete, edit, run
- Two binding types: resource bindings (role+resource+optional name) and
  operation bindings (role:run + action like backup, logs, impersonate)
- Name-scoped resource bindings for per-instance access control
- Remove role from project members (all permissions via RBAC)
- Add users, groups, RBAC CRUD endpoints and CLI commands
- describe user/group shows all RBAC access (direct + inherited)
- create rbac supports --subject, --binding, --operation flags
- Backup/restore handles users, groups, RBAC definitions
- mcplocal project-based MCP endpoint discovery
- Full test coverage for all new functionality
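
To make the two binding types concrete, here is a minimal matching sketch.
It assumes exact role matching with no role hierarchy, which may not reflect
the real rules; the function names echo canAccess / canRunOperation but the
bodies are illustrative only.

```typescript
// Hypothetical binding shapes based on the description above.
type RbacBinding =
  | { role: string; resource: string; name?: string } // resource binding
  | { role: "run"; action: string }; // operation binding

function canAccess(bindings: RbacBinding[], role: string, resource: string, name?: string): boolean {
  return bindings.some(
    (b) =>
      "resource" in b &&
      b.role === role &&
      b.resource === resource &&
      (b.name === undefined || b.name === name), // unscoped bindings match any name
  );
}

function canRunOperation(bindings: RbacBinding[], action: string): boolean {
  return bindings.some((b) => "action" in b && b.action === action);
}
```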

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:05:19 +00:00
Michal
738bfafd46 feat: MCP health probe runner — periodic tool-call probes for instances
Implements Kubernetes-style liveness probes that call MCP tools defined
in server healthCheck configs. For STDIO servers, uses docker exec to
spawn a disposable MCP client that sends initialize + tool call. For
HTTP/SSE servers, sends JSON-RPC directly.

- HealthProbeRunner service with configurable interval/threshold/timeout
- execInContainer added to orchestrator interface + Docker implementation
- Instance findById now includes server relation (fixes describe showing IDs)
- Events appended to instance (last 50), healthStatus tracked as
  healthy/degraded/unhealthy
- 12 unit tests covering probing, thresholds, intervals, cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 00:38:48 +00:00
Michal
8a4ff6e378 fix: remove unused variables from profile cleanup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 18:43:32 +00:00
Michal
6d9a9f572c feat: replace profiles with kubernetes-style secrets
Replace the confused Profile abstraction with a dedicated Secret resource
following Kubernetes conventions. Servers now have env entries with inline
values or secretRef references. Env vars are resolved and passed to
containers at startup (fixes existing gap).

- Add Secret CRUD (model, repo, service, routes, CLI commands)
- Server env: {name, value} or {name, valueFrom: {secretRef: {name, key}}}
- Add env-resolver utility shared by instance startup and config generation
- Remove all profile-related code (models, services, routes, CLI, tests)
- Update backup/restore for secrets instead of profiles
- describe secret masks values by default, --show-values to reveal
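
The two env entry shapes and their resolution can be illustrated as below.
`resolveEnv` and `SecretData` are hypothetical names for this sketch, and
throwing on a missing secret or key is an assumed policy, not necessarily
what the env-resolver utility does.

```typescript
// Entry shapes as described above: inline value or secretRef reference.
type EnvEntry =
  | { name: string; value: string }
  | { name: string; valueFrom: { secretRef: { name: string; key: string } } };

type SecretData = Record<string, Record<string, string>>; // secret name -> key -> value

function resolveEnv(entries: EnvEntry[], secrets: SecretData): Record<string, string> {
  const out: Record<string, string> = {};
  for (const entry of entries) {
    if ("value" in entry) {
      out[entry.name] = entry.value; // inline value
      continue;
    }
    const ref = entry.valueFrom.secretRef;
    const resolved = secrets[ref.name]?.[ref.key];
    if (resolved === undefined) {
      throw new Error(`env ${entry.name}: secret ${ref.name}/${ref.key} not found`);
    }
    out[entry.name] = resolved;
  }
  return out;
}
```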

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 18:40:58 +00:00
Michal
bd09ae9687 feat: kubectl-style CLI + Deployment/Pod model for servers/instances
Server = Deployment (defines what to run + desired replicas)
Instance = Pod (ephemeral, auto-created by reconciliation)

Backend:
- Add replicas field to McpServer schema
- Add reconcile() to InstanceService (scales instances to match replicas)
- Remove manual start/stop/restart - instances are auto-managed
- Cascade: deleting a server stops all its containers, then cascades the DB delete
- Server create/update auto-triggers reconciliation
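
The scaling step of reconciliation can be pictured as a pure planning function. The sketch below is illustrative only: `planReconcile`, `ReconcilePlan`, and the choice to remove the trailing instances when scaling down are assumptions, not the behavior of `InstanceService.reconcile()`.

```typescript
// Hypothetical minimal instance record.
interface Instance {
  id: string;
  serverId: string;
}

interface ReconcilePlan {
  toCreate: number; // how many new instances to start
  toRemove: Instance[]; // which existing instances to stop
}

// Compares the desired replica count with the running instances and
// returns what has to change. Assumed policy: surplus instances are
// taken from the end of the list.
function planReconcile(desiredReplicas: number, running: Instance[]): ReconcilePlan {
  if (!Number.isInteger(desiredReplicas) || desiredReplicas < 0) {
    throw new RangeError("replicas must be a non-negative integer");
  }
  const diff = desiredReplicas - running.length;
  if (diff >= 0) {
    return { toCreate: diff, toRemove: [] };
  }
  return { toCreate: 0, toRemove: running.slice(desiredReplicas) };
}
```

Keeping the plan separate from the side effects (starting and stopping containers) makes the scaling logic easy to unit test.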

CLI:
- Add top-level delete command (servers, instances, profiles, projects)
- Add top-level logs command
- Remove instance compound command (use get/delete/logs instead)
- Clean up project command (list/show/delete → top-level get/describe/delete)
- Enhance describe for instances with container inspect info
- Add replicas to apply command's ServerSpec

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 13:30:46 +00:00
Michal
5f66fc82ef test: add integration test for full MCP server flow
Tests the complete lifecycle through Fastify routes with in-memory
repositories and a fake streamable-http MCP server:
- External server: register → start virtual instance → proxy tools/list
- Managed server: register with dockerImage → start container → verify spec
- Full lifecycle: register → start → list → stop → remove → delete
- Proxy auth enforcement
- Server update flow
- Error handling (Docker failure → ERROR status)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 12:34:55 +00:00
Michal
6da4ae495c feat: add backup and restore with encrypted secrets
BackupService exports servers/profiles/projects to JSON bundle.
RestoreService imports with skip/overwrite/fail conflict strategies.
AES-256-GCM encryption for sensitive env vars via scrypt-derived keys.
REST endpoints and CLI commands for backup/restore operations.
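
AES-256-GCM with a scrypt-derived key can be sketched with Node's built-in `node:crypto` module as follows. The `encryptValue`/`decryptValue` names, the `EncryptedValue` layout, and the salt and IV sizes are assumptions for illustration, not the format BackupService actually writes.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical on-disk shape: everything base64-encoded for JSON bundles.
interface EncryptedValue {
  salt: string;
  iv: string;
  tag: string;
  data: string;
}

function encryptValue(plaintext: string, password: string): EncryptedValue {
  const salt = randomBytes(16); // fresh salt per value
  const key = scryptSync(password, salt, 32); // 32 bytes = AES-256 key
  const iv = randomBytes(12); // 96-bit IV, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    salt: salt.toString("base64"),
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

function decryptValue(enc: EncryptedValue, password: string): string {
  const key = scryptSync(password, Buffer.from(enc.salt, "base64"), 32);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(enc.iv, "base64"));
  decipher.setAuthTag(Buffer.from(enc.tag, "base64"));
  // final() throws if the auth tag does not match (wrong password or tampering).
  const plain = Buffer.concat([
    decipher.update(Buffer.from(enc.data, "base64")),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}
```

Because GCM is authenticated, a wrong password or a modified bundle surfaces as an exception during decryption rather than as garbage output.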

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 05:40:46 +00:00
Michal
9a67e51307 feat: add health monitoring with metrics collection and REST API
MetricsCollector tracks per-instance request counts, error rates, latency,
and uptime. HealthAggregator computes system-wide health status. REST
endpoints at /api/v1/health/overview, /health/instances/:id, /metrics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 05:34:20 +00:00
Michal
9e660140b3 feat: add Kubernetes orchestrator for MCP server deployment
KubernetesOrchestrator implements McpOrchestrator interface with K8s API
client, manifest generation (Pod/Deployment), namespace management,
resource limits, and security contexts. 39 new tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 05:30:49 +00:00
Michal
4d796e2aa7 feat: add instance lifecycle management with restart, inspect, and CLI commands
Adds restart/inspect methods to InstanceService, state validation for stop,
REST endpoints for restart and inspect, and full CLI command suite for
instance list/start/stop/restart/remove/logs/inspect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 05:11:48 +00:00
Michal
7c07749580 feat: add audit logging repository, service, and query API
Implements IAuditLogRepository with Prisma, AuditLogService with
configurable retention policy and purge, and REST routes for
querying/filtering audit logs at /api/v1/audit-logs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 05:09:14 +00:00
Michal
d1390313a3 feat: add Docker container management for MCP servers
McpOrchestrator interface with DockerContainerManager implementation,
instance service for lifecycle management, instance API routes,
and docker-compose with mcpd service. 127 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 04:52:12 +00:00
Michal
0ff5c85cf6 feat: add project management APIs with MCP config generation
Project CRUD, profile association, and MCP config generation that
filters secret env vars. 104 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 04:35:00 +00:00
Michal
3fa2bc5ffa feat: add MCP server and profile management API
Add validation schemas (Zod), repository pattern with Prisma, service layer
with business logic (NotFoundError, ConflictError), and REST routes for
MCP server and profile CRUD. 86 mcpd tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 04:35:00 +00:00
Michal
47f10f62c7 feat: implement mcpd core server framework with Fastify
Add Fastify server with config validation (Zod), health/healthz endpoints,
auth middleware (Bearer token + session lookup), security plugins (CORS,
Helmet, rate limiting), error handler, audit logging, and graceful shutdown.
36 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 04:35:00 +00:00