Wires the Stage 2 services into HTTP. New routes:
GET /api/v1/agents — list
GET /api/v1/agents/:idOrName — describe
POST /api/v1/agents — create
PUT /api/v1/agents/:idOrName — update
DELETE /api/v1/agents/:idOrName — delete
GET /api/v1/projects/:p/agents — project-scoped list (mcplocal discovery)
POST /api/v1/agents/:name/chat — chat (non-streaming or SSE stream)
POST /api/v1/agents/:name/threads — create thread explicitly
GET /api/v1/agents/:name/threads — list threads
GET /api/v1/threads/:id/messages — replay history
The chat endpoint reuses the SSE pattern from llm-infer.ts (same headers
incl. X-Accel-Buffering:no, same `data: …\n\n` framing, same `[DONE]`
terminator). Each ChatService chunk is one frame. Non-streaming returns
{threadId, assistant, turnIndex} as JSON.
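Roughly, the framing (a sketch, not the actual llm-infer.ts code; chunk shape assumed):

```ts
import type { ServerResponse } from "node:http";

// Minimal sketch of the SSE framing described above (names hypothetical).
async function streamSse(res: ServerResponse, chunks: AsyncIterable<unknown>): Promise<void> {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "X-Accel-Buffering": "no", // tell nginx-style proxies not to buffer the stream
  });
  for await (const chunk of chunks) {
    res.write(`data: ${JSON.stringify(chunk)}\n\n`); // one ChatService chunk per frame
  }
  res.write("data: [DONE]\n\n"); // same terminator as llm-infer.ts
  res.end();
}
```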
RBAC mapping in main.ts:mapUrlToPermission:
- /agents/:name/{chat,threads*} → run:agents:<name>
- /threads/:id/* → view:agents (service-level owner check
handles fine-grained access since the URL doesn't carry the agent name)
- /agents and /agents/:idOrName → default {GET:view, POST:create,
PUT:edit, DELETE:delete} on resource 'agents'.
'agents' added to nameResolvers so RBAC's CUID→name lookup works.
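The agents mapping, sketched (the real mapUrlToPermission shape may differ):

```ts
// Hypothetical sketch of the URL→permission mapping described above.
function mapAgentUrl(method: string, url: string): { role: string; resource: string; name?: string } {
  const chat = url.match(/^\/api\/v1\/agents\/([^/]+)\/(chat|threads)/);
  if (chat) return { role: "run", resource: "agents", name: decodeURIComponent(chat[1]) };
  if (url.startsWith("/api/v1/threads/")) return { role: "view", resource: "agents" }; // owner check is service-level
  const byMethod: Record<string, string> = { GET: "view", POST: "create", PUT: "edit", DELETE: "delete" };
  return { role: byMethod[method] ?? "view", resource: "agents" };
}
```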
ChatToolDispatcherImpl bridges ChatService to McpProxyService: it lists a
project's MCP servers, fans out tools/list calls to each, namespaces tool
names as `<server>__<tool>`, and routes tools/call back to the right
serverId on dispatch. tools/list errors on a single server are logged and
that server's tools are dropped from the turn's tool surface — one bad
server doesn't poison the whole list.
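The namespacing round-trip, sketched (helper names hypothetical):

```ts
const SEP = "__";

// tools/list: prefix each tool with its server so names stay unique across servers.
const namespaced = (server: string, tool: string): string => `${server}${SEP}${tool}`;

// tools/call: split on the FIRST separator so tool names containing "__" survive.
function unmarshal(name: string): { server: string; tool: string } {
  const i = name.indexOf(SEP);
  if (i < 0) throw new Error(`not a namespaced tool: ${name}`);
  return { server: name.slice(0, i), tool: name.slice(i + SEP.length) };
}
```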
Tests:
agent-routes.test.ts (15) — full HTTP CRUD round-trip, 404/409 paths,
project-scoped list, non-streaming + SSE chat, thread create/list,
/threads/:id/messages replay, body-required 400.
chat-tool-dispatcher.test.ts (7) — empty list when no project / no
servers, namespacing + inputSchema forwarding, partial-failure
skipping with audit log, callTool dispatch shape, missing-server
rejection, JSON-RPC error surfacing.
All 22 new tests green; mcpd suite now 759/759 (was 737).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Layers the persistence-side logic on top of the Stage 1 schema. AgentService
mirrors LlmService's CRUD shape with name-resolved llm/project references and
yaml round-trip support; ChatService is the orchestrator that drives one chat
turn end-to-end: build the merged system block (agent.systemPrompt + project
Prompts ordered by priority desc + per-call systemAppend), persist the user
turn, run the adapter, dispatch any tool_calls through an injected
ChatToolDispatcher, persist tool turns linked back via toolCallId, and loop
until the model returns terminal text.
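The loop, sketched (all shapes assumed; the pending/complete persistence described below is elided):

```ts
type ToolCall = { id: string; name: string; args: unknown };
type Msg = { role: string; content?: string; toolCalls?: ToolCall[]; toolCallId?: string };
type Adapter = { chat(history: Msg[]): Promise<{ text: string; toolCalls?: ToolCall[] }> };
type ChatToolDispatcher = { callTool(name: string, args: unknown): Promise<string> };

async function runTurn(adapter: Adapter, tools: ChatToolDispatcher, history: Msg[], maxIters = 10): Promise<string> {
  for (let i = 0; i < maxIters; i++) {
    const out = await adapter.chat(history);
    if (!out.toolCalls?.length) return out.text; // terminal text → done
    history.push({ role: "assistant", toolCalls: out.toolCalls });
    for (const call of out.toolCalls) {
      const result = await tools.callTool(call.name, call.args);
      history.push({ role: "tool", toolCallId: call.id, content: result }); // linked back via toolCallId
    }
  }
  throw new Error("max iterations exceeded"); // pending rows then get flipped to error
}
```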
Per-call params resolve LiteLLM-style: request body → agent.defaultParams →
adapter default. The escape hatch `extra` is forwarded as-is so each adapter
can cherry-pick provider-specific knobs (Anthropic metadata, vLLM
repetition_penalty, etc.) without code changes here.
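Resolution is plain object spread, lowest precedence first (a sketch; field names assumed):

```ts
type ChatParams = { temperature?: number; maxTokens?: number; extra?: Record<string, unknown> };

// adapter default < agent.defaultParams < request body.
function resolveParams(adapterDefault: ChatParams, agentDefault: ChatParams, body: ChatParams): ChatParams {
  return {
    ...adapterDefault,
    ...agentDefault,
    ...body,
    // `extra` is forwarded as-is for adapters to cherry-pick from.
    extra: body.extra ?? agentDefault.extra ?? adapterDefault.extra,
  };
}
```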
Persistence is non-transactional across the loop because tool calls can take
minutes; long-held DB transactions would starve other writers. Instead each
in-flight assistant turn is written `pending` and flipped to `complete` only
after its tool results land. On failure or max-iter overrun, every `pending`
row in the thread is flipped to `error` so the trail is auditable.
Tools are namespaced on the wire as `<server>__<tool>`, unmarshalled at
dispatch time; `tools_allowlist` filters before the model sees the list.
Tests:
agent-service.test.ts (7) — CRUD with name-resolved llm/project, conflict
on duplicate, llm switch, project detach, listByProject filtering,
upsertByName branch coverage.
chat-service.test.ts (9) — plain text turn, full text→tool→text loop with
toolCallId linkage, max-iter cap leaves zero pending, adapter-throws
leaves zero pending, body→defaultParams merge, `extra` passthrough,
project-Prompt priority ordering in the system block, tool-without-
project rejection, tools_allowlist filtering.
All 16 green; full mcpd suite still 737/737.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lazy backfill in SecretService.getById covers rows one at a time on read, but
list views still show 'KEYS: -' until each row is described. New
backfillSecretKeyNames bootstrap runs once at startup, finds Secrets
where keyNames=[] AND data={} (i.e. backend-stored, pre-existing rows),
calls resolveData to learn the keys, persists. Sequential to be kind to
the upstream backend on cold start. Idempotent + non-fatal.
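The bootstrap, sketched (assuming a Prisma-style client and the resolveData used above):

```ts
type Db = {
  secret: {
    findMany(args: unknown): Promise<Array<{ id: string; name: string }>>;
    update(args: unknown): Promise<unknown>;
  };
};

// Runs once at startup; sequential and non-fatal by design.
async function backfillSecretKeyNames(db: Db, resolveData: (id: string) => Promise<Record<string, string>>) {
  const rows = await db.secret.findMany({
    where: { keyNames: { equals: [] }, data: { equals: {} } }, // backend-stored, pre-existing rows
  });
  for (const row of rows) {
    try {
      const data = await resolveData(row.id); // one backend read per row
      await db.secret.update({ where: { id: row.id }, data: { keyNames: Object.keys(data).sort() } });
    } catch (err) {
      console.warn(`keyNames backfill skipped ${row.name}:`, err); // idempotent; retried next startup
    }
  }
}
```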
Post-migration, every Secret on a non-plaintext backend had an empty `data`
column (values live in the backend; only externalRef on the row). The CLI's
`get secrets` showed `KEYS: -` and `describe secret` showed `(empty)` for
all 9 migrated secrets — useless without --show-values.
Fix: dedicated `keyNames Json` column on Secret that stores the sorted key
list independently from the values. Populated on every write path, lazily
backfilled on first read for rows that pre-date the column.
Schema default `[]` keeps prisma db push self-healing on rolling upgrades.
- src/db/prisma/schema.prisma: add Secret.keyNames Json @default("[]")
- src/mcpd/src/repositories/secret.repository.ts: pipe keyNames through create
+ update
- src/mcpd/src/services/secret.service.ts:
- create/update populate keyNames = sorted Object.keys(data)
- getById lazy-backfills empty keyNames (cheap: derives from data for
plaintext, single backend read for openbao)
- src/mcpd/src/services/secret-migrate.service.ts: migrate writes keyNames
alongside the new backendId so freshly-migrated rows are populated without
a follow-up read
- src/cli/src/commands/get.ts: KEYS column reads keyNames first, falls back
to Object.keys(data) for older rows
- src/cli/src/commands/describe.ts: shows the Data section keys whenever
keyNames OR data has entries (so backend-stored secrets render their key
list); --show-values still resolves through the backend
After deploy, the 9 already-migrated secrets backfill their keyNames on the
next describe-by-id, with no operator action needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-migration, every Secret on a non-plaintext backend has empty `Secret.data`
(the actual value lives in the backend; only externalRef is on the row).
`describe secret --show-values` was reading the raw row, so the user saw
"Data: (empty)" for every migrated secret.
- Route GET /api/v1/secrets/:id accepts ?reveal=true; when set, resolves the
value via SecretService.resolveData() so the response carries the actual
data dispatched through the right driver.
- CLI --show-values flips that query param. Without --show-values the route
returns the raw row exactly as before (no leak risk).
Caught running the wizard end-to-end on the live cluster after the
ClusterMesh fix on the kubernetes-deployment side made bao reachable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Debugging the wizard migration flow, every OpenBao error was just
`HTTP 403` with no context. The response body often carries the actual
reason (missing capability, specific path, namespace mismatch), so
surfacing it makes operator debugging a one-step task. Added a shared
bodyText() helper that trims huge HTML error pages to 400 chars.
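Something like (a sketch; the real helper may differ):

```ts
// Pull the response body into the error message, capped so a giant HTML
// error page can't flood the logs.
async function bodyText(res: Response, max = 400): Promise<string> {
  try {
    const text = (await res.text()).trim();
    return text.length > max ? text.slice(0, max) + "…" : text;
  } catch {
    return ""; // body unreadable; the status line still stands
  }
}
```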
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two production issues caught running the wizard end-to-end:
1. `mcpctl migrate secrets --from default --to bao` listed `bao-creds` as a
candidate — the very token that lets mcpd reach bao. Moving it would
brick the auth chain (destination backend needs its own bootstrap token
to read its own bootstrap token). Fix: SecretMigrateService now calls
backends.list() and filters out any Secret whose name matches ANY
SecretBackend's `config.tokenSecretRef.name`. dryRun mirrors the same
   filter so candidates match reality. `--names` explicitly bypasses the
   filter for operators who really mean it (see the sketch after this list).
2. Initial rotation in the wizard 403'd because the global RBAC hook
demands the `rotate-secretbackend` operation, which wasn't in
bootstrap-admin — migrateAdminRole only added ops when processing a
legacy `role: admin` entry, so already-migrated admin rows missed every
new op added after their initial migration. Fix: migrateAdminRole now
also runs a back-fill pass on rows that look admin-equivalent (have both
`edit:*` and `run:*`), appending any missing op from ADMIN_OPS. Writes
only when something actually changed, so restarts stay quiet. Same path
also retroactively grants `migrate-secrets` which had the same problem
yesterday.
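The filter from point 1, sketched (row shapes assumed):

```ts
type SecretBackendRow = { config?: { tokenSecretRef?: { name?: string } } };

// Never offer a backend's own bootstrap token as a migration candidate.
function filterBootstrapTokens<T extends { name: string }>(
  candidates: T[],
  backends: SecretBackendRow[],
  explicitNames?: string[],
): T[] {
  if (explicitNames?.length) return candidates; // --names bypasses the filter on purpose
  const protectedNames = new Set(
    backends.map((b) => b.config?.tokenSecretRef?.name).filter((n): n is string => !!n),
  );
  return candidates.filter((c) => !protectedNames.has(c.name));
}
```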
Tests: 4 new migrate-service cases (bootstrap filter on/off, dryRun parity,
--names bypass). Full suite 1889/1889.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One-command setup replaces the 6-step manual flow — `mcpctl create
secretbackend bao --type openbao --wizard` takes the OpenBao admin token
once, provisions a narrow policy + token role, mints the first periodic
token, stores it on mcpd, verifies end-to-end, and prints the migration
command. The admin token is NEVER persisted.
The stored credential auto-rotates daily: mcpd mints a successor via the
token role (self-rotation capability is part of the policy it was issued
with), verifies the successor, writes it over the backing Secret, then
revokes the predecessor by accessor. TTL 720h means a week of rotation
failures still leaves 20+ days of runway.
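The rotate step, sketched against the @mcpctl/shared/vault helpers listed below (signatures assumed; tokenMeta bookkeeping and error paths elided):

```ts
type BackendRow = { url: string; tokenRole: string };
declare function resolveToken(b: BackendRow): Promise<{ value: string; accessor: string }>; // read backing Secret
declare function mintRoleToken(url: string, token: string, role: string): Promise<{ clientToken: string }>;
declare function lookupSelf(url: string, token: string): Promise<void>;
declare function persistToken(b: BackendRow, value: string): Promise<void>; // write over backing Secret
declare function revokeAccessor(url: string, token: string, accessor: string): Promise<void>;

async function rotateOne(backend: BackendRow): Promise<void> {
  const current = await resolveToken(backend);
  const next = await mintRoleToken(backend.url, current.value, backend.tokenRole); // mint successor
  await lookupSelf(backend.url, next.clientToken);       // verify before trusting it
  await persistToken(backend, next.clientToken);         // old token keeps working until here
  await revokeAccessor(backend.url, next.clientToken, current.accessor); // retire predecessor by accessor
}
```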
Shared:
- New `@mcpctl/shared/vault` — pure HTTP wrappers (verifyHealth,
ensureKvV2, writePolicy, ensureTokenRole, mintRoleToken, revokeAccessor,
lookupSelf, testWriteReadDelete) and policy HCL builder.
mcpd:
- `tokenMeta Json @default("{}")` on SecretBackend. Self-healing schema
migration — empty default lets `prisma db push` add the column cleanly.
- SecretBackendRotator.rotateOne: mint → verify → persist → revoke-old →
update tokenMeta. Failures surface via `lastRotationError` on the row;
the old token keeps working.
- SecretBackendRotatorLoop: on startup rotates overdue backends, schedules
per-backend timers with ±10min jitter. Stops cleanly on shutdown.
- New `POST /api/v1/secretbackends/:id/rotate` (operation
`rotate-secretbackend` — added to bootstrap-admin's auto-migrated ops
alongside migrate-secrets, which was previously missing too).
CLI:
- `--wizard` on `create secretbackend` delegates to the interactive flow.
All prompts can be pre-answered via flags (--url, --admin-token,
--mount, --path-prefix, --policy-name, --token-role,
--no-promote-default) for CI.
- `mcpctl rotate secretbackend <name>` — convenience verb; hits the new
rotate endpoint.
- `describe secretbackend` renders a Token health section (healthy /
STALE / WARNING / ERROR) with generated/renewal/expiry timestamps and
last rotation error. Only shown when tokenMeta.rotatable is true — the
existing k8s-auth + static-token backends don't surface it.
Tests: 15 vault-client unit tests (shared), 8 rotator unit tests (mcpd),
3 wizard flow tests (cli, including a regression test that the admin
token never appears in stdout). Full suite 1885/1885 (+32). Completions
regenerated for the new flags.
Out of scope (explicit): kubernetes-auth wizard, Vault Enterprise
namespaces in the wizard path, rotation for non-wizard static-token
backends. See plan file for details.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why: requiring a static OpenBao root token to live (even if only for bootstrap) on
the plaintext backend is the weakest link in the chain. With the bao-side
Kubernetes auth method enabled, mcpd's pod can authenticate using its own
projected SA token, exchange it for a short-lived Vault client token, and
keep the database free of any vault credentials at all.
Driver changes (src/mcpd/src/services/secret-backends/openbao.ts):
- New `OpenBaoConfig.auth = 'token' | 'kubernetes'`. Defaults to 'token' so
existing rows keep working. Both shapes share url + mount + pathPrefix +
namespace; auth-specific fields are mutually exclusive in the config schema.
- Kubernetes auth flow: read JWT from /var/run/secrets/.../token, POST to
/v1/auth/<authMount>/login {role, jwt}, cache the returned client_token
  for `lease_duration - 60s` (grace window), then re-login (sketched after this list).
- One-shot 403-retry: if a request comes back 403 (revoked / clock skew),
purge cache and retry the original request once with a fresh login.
- Reads + writes go through the same getToken() path so token-auth is
unchanged for existing deployments.
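The login + cache path, sketched (config shape assumed; single backend for brevity):

```ts
import { readFile } from "node:fs/promises";

type K8sAuthCfg = { url: string; authMount: string; role: string; saTokenPath: string };
let cached: { token: string; expiresAt: number } | null = null;

async function getToken(cfg: K8sAuthCfg): Promise<string> {
  if (cached && Date.now() < cached.expiresAt) return cached.token;
  const jwt = (await readFile(cfg.saTokenPath, "utf8")).trim(); // projected SA token
  const res = await fetch(`${cfg.url}/v1/auth/${cfg.authMount}/login`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ role: cfg.role, jwt }),
  });
  if (!res.ok) throw new Error(`bao login failed: HTTP ${res.status}`);
  const body = (await res.json()) as { auth: { client_token: string; lease_duration: number } };
  // Cache for lease_duration - 60s so re-login happens inside the grace window.
  cached = { token: body.auth.client_token, expiresAt: Date.now() + (body.auth.lease_duration - 60) * 1000 };
  return cached.token;
}
```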
CLI (src/cli/src/commands/create.ts):
- `mcpctl create secretbackend bao --type openbao --auth kubernetes \
--url https://bao.example:8200 --role mcpctl`
- Optional `--auth-mount` (default 'kubernetes') + `--sa-token-path` (default
the standard projected-token path) for non-default deployments.
- Token-auth path unchanged: `--auth token --token-secret SECRET/KEY`
(or omit `--auth` since 'token' is the default).
Validation (factory.ts) gates on the auth strategy: each path enforces its
own required fields and produces a clear error if misconfigured.
Tests: 6 new k8s-auth unit cases (login wire shape, lease-based caching,
custom authMount, 403-on-login, missing-role rejection, missing-tokenSecretRef
rejection). Full suite 1859/1859. Completions regenerated for the new flags.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why: when mcpd's inference proxy is unreachable, clients with a local
vllm-managed provider should be able to substitute — but only if they still
have view permission on the centralized Llm. Otherwise revoking an Llm
wouldn't actually stop a misbehaving client.
Infrastructure (the agent + mcplocal HTTP-mode wire-up will land separately
when those clients pivot to mcpd's proxy):
- LlmProviderFileEntry gains optional `failoverFor: <central llm name>`. The
entry is otherwise the same local provider it always was; the new field
just declares which central Llm it can substitute for.
- ProviderRegistry tracks a failover map (registerFailover / getFailoverFor /
listFailovers). Unregister removes any failover entry pointing at the
removed provider so we don't end up with dangling references.
- New FailoverRouter wraps a primary inference call. On primary failure: if
a local provider is registered for the Llm, HEAD-probe `mcpd /api/v1/llms/
:name` with the caller's bearer to verify view permission, then either
invoke the local provider (allowed) or re-throw the primary error (403,
  401, network unreachable, anything else — all fail-closed; see the sketch
  after this list).
- Server: GET /api/v1/llms/:idOrName accepts both CUID and human name. Lets
FailoverRouter probe by name without a separate id-resolution call. HEAD
derives automatically from GET in Fastify, which runs the same RBAC hook
and drops the body — exactly what the probe needs.
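The fail-closed decision, sketched (names hypothetical):

```ts
async function inferWithFailover(
  llmName: string,
  bearer: string,
  primary: () => Promise<string>,
  local: (() => Promise<string>) | undefined,
  mcpdBase: string,
): Promise<string> {
  try {
    return await primary();
  } catch (primaryErr) {
    if (!local) throw primaryErr; // no registered failover → fail closed
    // HEAD runs the same RBAC hook as GET and carries no body.
    const probe = await fetch(`${mcpdBase}/api/v1/llms/${encodeURIComponent(llmName)}`, {
      method: "HEAD",
      headers: { authorization: `Bearer ${bearer}` },
    }).catch(() => null);
    if (!probe || !probe.ok) throw primaryErr; // 401/403/unreachable → fail closed
    return local(); // caller still holds view on the central Llm
  }
}
```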
Tests: 11 failover unit tests (registry map, decision flow, fail-closed for
forbidden + unreachable, checkAuth status mapping) + 4 new route tests
(name lookup, HEAD existing/missing). Full suite 1844/1844 (+14 from Phase
2's 1830). TypeScript clean across mcpd + mcplocal.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why: the point of the Llm resource (Phase 1) is that credentials never leave
the server. This lands the proxy: clients POST OpenAI chat/completions to
mcpd, mcpd attaches the provider API key server-side, and the response
streams back as OpenAI-format SSE.
Design:
- Wire format client-side is always OpenAI chat/completions — every existing
SDK speaks it. Adapters translate on the provider side.
- `openai | vllm | deepseek | ollama` → pure passthrough (they already speak
OpenAI). `anthropic` → translator to/from Anthropic Messages API
  (system-string extraction, content-block flattening, SSE event remap; the
  request side is sketched after this list).
- Plain fetch; no @anthropic-ai/sdk dep. Consistent with the OpenBao driver
shape and keeps the proxy layer thin.
- `gemini-cli` intentionally rejected — subprocess providers need extra
lifecycle plumbing; deferred to a follow-up.
- Streaming: adapters yield `StreamingChunk`s; the route frames them as
`data: <json>\n\n` + terminal `data: [DONE]\n\n` so any OpenAI client
works unchanged.
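The request side of the anthropic translator, sketched (response/SSE remap omitted):

```ts
type OpenAiMsg = { role: "system" | "user" | "assistant"; content: string };

function toAnthropicRequest(model: string, messages: OpenAiMsg[], maxTokens: number) {
  // Anthropic takes system as a top-level string, not a message role.
  const system = messages.filter((m) => m.role === "system").map((m) => m.content).join("\n");
  return {
    model,
    max_tokens: maxTokens, // required by the Messages API
    ...(system ? { system } : {}),
    messages: messages
      .filter((m) => m.role !== "system")
      .map((m) => ({ role: m.role as "user" | "assistant", content: m.content })),
  };
}
```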
RBAC:
- New URL special-case in mapUrlToPermission: `POST /api/v1/llms/:name/infer`
→ `run:llms:<name>` (not the default create:llms). Users need an explicit
`{role: 'run', resource: 'llms', [name: X]}` binding to call infer.
- Possession of `edit:llms` does NOT imply `run` — keeps catalogue
management separate from spend.
Audit: route emits an `llm_inference_call` event per request (llm name,
model, user/tokenSha, streaming, duration, status). main.ts wires it to the
structured logger for now; hook is in place for a richer audit sink later.
Tests:
- 11 adapter tests (passthrough POST shape + default URLs + no-auth ollama +
SSE forwarding; anthropic translate request/response + non-2xx wrap + SSE
event translation; registry dispatch + caching + unsupported-provider).
- 7 route tests (404, 400, non-streaming dispatch + audit, apiKey failure,
null apiKeyRef path, streaming SSE output, 502 on adapter error).
- Full suite 1830/1830 (+18 from Phase 1's 1812). TypeScript clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why: every client that wants an LLM (the agent, HTTP-mode mcplocal, Claude
Code's STDIO mcplocal) today has to know the provider URL + key, and each
user's ~/.mcpctl/config.json carries them. Centralising the catalogue on the
server is the prerequisite for Phase 2 (mcpd proxies inference so credentials
never leave the cluster).
This phase adds the `Llm` resource and its CRUD surface — no proxy yet, no
client pivot yet. Just enough to register what you have.
Schema:
- New `Llm` model: name/type/model/url/tier/description + {apiKeySecretId,
apiKeySecretKey} FK pair. Reverse `llms` relation on Secret.
- Provider types: anthropic | openai | deepseek | vllm | ollama | gemini-cli.
- Tiers: fast | heavy.
mcpd:
- LlmRepository + LlmService + Zod validation schema + /api/v1/llms routes.
- API surface exposes `apiKeyRef: {name, key}` — the service translates to/
from the FK pair so clients never deal in cuids.
- `resolveApiKey(llmName)` reads through SecretService (which itself dispatches
to the right SecretBackend). That's the hook Phase 2's inference proxy uses.
- RBAC: added `'llms'` to RBAC_RESOURCES + resource alias. Standard
view/create/edit/delete semantics.
- Wired into main.ts (repo, service, routes).
CLI:
- `mcpctl create llm <name> --type X --model Y --tier fast|heavy --api-key-ref SECRET/KEY [--url ...] [--extra k=v ...]`
- `mcpctl get|describe|delete llm` — standard resource verbs.
- `mcpctl apply -f` with `kind: llm` (single- or multi-doc yaml/json).
Applied after secrets, before servers — apiKeyRef resolves an existing Secret.
- Shell completions regenerated.
Tests: 11 service unit tests + 9 route tests (happy path, 404s, 409, validation).
Full suite 1812/1812 (+20 from the 1792 Phase 0 baseline). TypeScript clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why: API keys live in Postgres as plaintext JSON. A DB read exposes every
credential in the system. Before centralising more secrets (LLM keys, etc.)
we want to be able to point at an external KV store and drop DB access to
sensitive rows.
New model:
- `SecretBackend` resource (CRUD + isDefault invariant) owns how a secret is
stored. `Secret` gains `backendId` FK and `externalRef`. Reads/writes
dispatch through a driver.
- `plaintext` driver (near-noop, uses existing Secret.data column) is seeded
as the `default` row at startup. Acts as trust root / bootstrap.
- `openbao` driver (also HashiCorp Vault KV v2 compatible) talks plain HTTP,
no SDK dependency. Auth via static token pulled from a plaintext-backed
`Secret` through the injected SecretRefResolver. Caches resolved token.
- `SecretMigrateService` moves secrets one-at-a-time: read → write dest →
flip row → best-effort source delete. Interrupted runs are idempotent
(skips secrets already on destination).
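The per-secret move, sketched (driver/repo shapes assumed):

```ts
type SecretRow = { id: string; name: string; backendId: string };
type Driver = {
  read(s: SecretRow): Promise<Record<string, string>>;
  write(s: SecretRow, data: Record<string, string>): Promise<string>; // returns externalRef
  delete(s: SecretRow): Promise<void>;
};
type Repo = { update(id: string, patch: { backendId: string; externalRef: string }): Promise<void> };

async function migrateOne(s: SecretRow, src: Driver, dest: Driver, destId: string, repo: Repo, keepSource: boolean) {
  if (s.backendId === destId) return;                          // idempotent: already on destination
  const data = await src.read(s);                              // 1. read
  const externalRef = await dest.write(s, data);               // 2. write dest
  await repo.update(s.id, { backendId: destId, externalRef }); // 3. flip row
  if (!keepSource) await src.delete(s).catch(() => {});        // 4. best-effort source delete
}
```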
CLI surface:
- `mcpctl create|get|describe|delete secretbackend` + `--default` on create.
- `mcpctl migrate secrets --from X --to Y [--names a,b] [--keep-source] [--dry-run]`
- `apply -f` round-trips secretbackends (yaml/json multi-doc + grouped).
- RBAC: `secretbackends` resource + `run:migrate-secrets` operation.
- Fish + bash completions regenerated.
docs/secret-backends.md covers the OpenBao policy, chicken-and-egg auth flow,
and the migration semantics.
Broke the circular dep (OpenBao needs SecretService to resolve its own token,
SecretService needs SecretBackendService) with a deferred-resolver bridge in
mcpd startup. 11 new driver unit tests; existing env-resolver/secret-route/
backup tests updated for the new service signatures. Full suite: 1792/1792.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symptom: LiteLLM → mcplocal → mcpd proxy calls for project-scoped MCP
tool discovery all 401'd with "Authentication failed: invalid or
expired token", even though the same mcpctl_pat_ bearer works against
/api/v1/mcptokens/introspect and /api/v1/projects/:name/servers. Result:
the new k8s mcplocal pod could accept the bearer and respond to
/projects/:name/mcp (initialize was 200), but every subsequent upstream
discovery call through /api/v1/mcp/proxy failed.
Root cause: registerMcpProxyRoutes installs its own route-scoped
createAuthMiddleware with the `authDeps` parameter it receives. In
main.ts that was being constructed with only `findSession` — missing
the `findMcpToken` that the GLOBAL auth hook already had. So a
mcpctl_pat_ bearer got all the way to the proxy route and then was
handed to an old-shape middleware that knew nothing about the prefix.
Fix: extract authDeps (findSession + findMcpToken) to a named const
and reuse it for both the global hook and the proxy route. Comment at
the declaration site warns future additions to keep the two paths in
sync — they have to agree or McpToken bearers silently break on
whichever one drifts.
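The fix, sketched (declarations stand in for the real deps):

```ts
type AuthDeps = { findSession(bearer: string): Promise<unknown>; findMcpToken(hash: string): Promise<unknown> };
declare const findSession: AuthDeps["findSession"];
declare const findMcpToken: AuthDeps["findMcpToken"];
declare function createAuthMiddleware(deps: AuthDeps): unknown;
declare const app: { addHook(name: string, hook: unknown): void };
declare function registerMcpProxyRoutes(app: unknown, opts: { authDeps: AuthDeps }): void;

// One named const, consumed by BOTH paths. If they drift, McpToken bearers
// silently break on whichever one is missing findMcpToken.
const authDeps: AuthDeps = { findSession, findMcpToken };

app.addHook("onRequest", createAuthMiddleware(authDeps)); // global hook
registerMcpProxyRoutes(app, { authDeps });                // route-scoped middleware
```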
Verified against the live cluster: LiteLLM's discoverTools path no
longer 401s; mcplocal logs now show successful upstream proxy calls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the end-to-end CLI surface for McpTokens and the mcpd auth dispatch
that recognizes them.
mcpd auth middleware:
- Dispatch on the `mcpctl_pat_` bearer prefix. McpToken bearers resolve
through a new `findMcpToken(hash)` dep, populating `request.mcpToken`
and `request.userId = ownerId`. Everything else follows the existing
  session path (prefix dispatch sketched after this list).
- Returns 401 for revoked / expired / unknown tokens.
- Global RBAC hook now threads `mcpTokenSha` into `canAccess` /
`canRunOperation` / `getAllowedScope`, and enforces a hard
project-scope check: a McpToken principal can only hit
`/api/v1/projects/<its-project>/...`.
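The dispatch, sketched (request/dep shapes assumed):

```ts
import { createHash } from "node:crypto";

const TOKEN_PREFIX = "mcpctl_pat_";
type McpTokenRow = { ownerId: string; revokedAt?: Date; expiresAt?: Date };
type Deps = {
  findMcpToken(hash: string): Promise<McpTokenRow | null>;
  findSession(bearer: string): Promise<{ userId: string } | null>;
};

async function authenticate(bearer: string, deps: Deps) {
  if (bearer.startsWith(TOKEN_PREFIX)) {
    const hash = createHash("sha256").update(bearer).digest("hex"); // tokens are hashed at rest
    const token = await deps.findMcpToken(hash);
    if (!token || token.revokedAt || (token.expiresAt && token.expiresAt < new Date())) {
      throw Object.assign(new Error("invalid or expired token"), { statusCode: 401 });
    }
    return { userId: token.ownerId, mcpToken: token };
  }
  const session = await deps.findSession(bearer); // existing session path, unchanged
  if (!session) throw Object.assign(new Error("authentication failed"), { statusCode: 401 });
  return { userId: session.userId };
}
```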
CLI verbs:
- `mcpctl create mcptoken <name> -p <proj> [--rbac empty|clone]
[--bind role:view,resource:servers] [--ttl 30d|never|ISO]
[--description ...] [--force]` — returns the raw token once.
- `mcpctl get mcptokens [-p <proj>]` — table with
NAME/PROJECT/PREFIX/CREATED/LAST USED/EXPIRES/STATUS.
- `mcpctl get mcptoken <name> -p <proj>` and
`mcpctl describe mcptoken <name> -p <proj>` — describe surfaces the
auto-created RBAC bindings.
- `mcpctl delete mcptoken <name> -p <proj>`.
- `apply -f` support with `kind: mcptoken`. Tokens are immutable, so
apply creates if missing and skips if the name is already active.
Audit plumbing:
- `AuditEvent` / collector now carry optional `tokenName` / `tokenSha`.
`setSessionMcpToken` sits alongside `setSessionUserName`; both feed a
per-session principal map used at emit time.
- `AuditEventService` query accepts `tokenName` / `tokenSha` filters.
- Console `AuditEvent` type carries the new fields so a follow-up can
add a TOKEN column.
Completions regenerated. 1764/1764 tests pass workspace-wide.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new McpToken Prisma model (project-scoped, SHA-256 hashed at rest,
optional expiry, revocable) plus backing repository, service, and REST
routes. Tokens are a first-class RBAC subject: new 'McpToken' kind is
added to the subject enum and the service auto-creates an RbacDefinition
with subject McpToken:<sha> when bindings are provided.
Creator-permission ceiling: the service rejects any requested binding
the creator cannot already satisfy themselves (re-uses
rbacService.canAccess / canRunOperation). rbacMode=clone snapshots the
creator's full permissions into the token.
Routes:
POST /api/v1/mcptokens create (returns raw token once)
GET /api/v1/mcptokens list (filter by project)
GET /api/v1/mcptokens/:id describe (no secret in response)
POST /api/v1/mcptokens/:id/revoke soft-delete + remove RbacDef
DELETE /api/v1/mcptokens/:id hard-delete
GET /api/v1/mcptokens/introspect validate raw bearer (used by mcplocal)
Extends AuditEvent with optional tokenName/tokenSha fields (indexed) so
token-driven activity can be filtered later. Adds token helpers in
@mcpctl/shared: TOKEN_PREFIX='mcpctl_pat_', generateToken, hashToken,
isMcpToken, timingSafeEqualHex.
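The helpers, sketched (exact encodings assumed):

```ts
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

export const TOKEN_PREFIX = "mcpctl_pat_";

export const generateToken = (): string => TOKEN_PREFIX + randomBytes(32).toString("base64url");
export const hashToken = (raw: string): string => createHash("sha256").update(raw).digest("hex");
export const isMcpToken = (bearer: string): boolean => bearer.startsWith(TOKEN_PREFIX);

// Constant-time compare of two hex digests; length check first because
// timingSafeEqual throws on mismatched buffer lengths.
export function timingSafeEqualHex(a: string, b: string): boolean {
  const ba = Buffer.from(a, "hex");
  const bb = Buffer.from(b, "hex");
  return ba.length === bb.length && timingSafeEqual(ba, bb);
}
```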
Follow-up PRs add the auth-hook dispatch on the prefix, the CLI verbs,
and the HTTP-mode mcplocal that calls /introspect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a per-server tools/list cache in McpRouter (positive + negative TTL)
so a slow or dead upstream only stalls the first discovery call, not every
subsequent client request. Invalidated on upstream add/remove.
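The cache, sketched (TTL values assumed):

```ts
type Entry = { tools?: unknown[]; error?: Error; expiresAt: number };

const POSITIVE_TTL_MS = 60_000;
const NEGATIVE_TTL_MS = 5_000; // failures expire fast so recovery isn't delayed
const cache = new Map<string, Entry>();

async function cachedToolsList(serverId: string, fetchTools: () => Promise<unknown[]>): Promise<unknown[]> {
  const hit = cache.get(serverId);
  if (hit && Date.now() < hit.expiresAt) {
    if (hit.error) throw hit.error; // negative hit: don't re-stall on a dead upstream
    return hit.tools!;
  }
  try {
    const tools = await fetchTools();
    cache.set(serverId, { tools, expiresAt: Date.now() + POSITIVE_TTL_MS });
    return tools;
  } catch (error) {
    cache.set(serverId, { error: error as Error, expiresAt: Date.now() + NEGATIVE_TTL_MS });
    throw error;
  }
}

const invalidate = (serverId: string) => cache.delete(serverId); // on upstream add/remove
```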
Health probes now apply a default liveness spec (tools/list via the real
production path) to any RUNNING instance without an explicit healthCheck,
so synthetic and real failures converge on the same signal.
Includes supporting updates in mcpd-client, discovery, upstream/mcpd,
seeder, and fulldeploy/release scripts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 1bd5087 added attachInteractive to the orchestrator interface
but never hooked it up in mcp-proxy-service — sendViaPersistentAttach
was promised in the commit message but missing from the diff. Servers
with a distroless image whose entrypoint IS the MCP server (gitea-mcp)
ended up needing a bogus `command: [node, dist/index.js]` workaround
that silently failed on every exec, leaving clients with empty tool
lists.
Changes:
- PersistentStdioClient: take a StdioMode discriminated union. Exec
mode runs a command via execInteractive; attach mode talks to PID 1
via attachInteractive.
- mcp-proxy-service: dispatch by config — command → exec; packageName
→ exec via runtime runner; dockerImage-only → attach. Error
serialization no longer drops non-Error objects as "[object Object]".
- templates/gitea.yaml: remove the command workaround; the image CMD
runs as PID 1 and mcpd attaches.
- Add unit tests covering both modes and the unsupported-orchestrator
paths.
Also required (separate repo): mcpd's k8s Role needed pods/attach
added alongside pods/exec; updated in kubernetes-deployment/…/mcpctl/server.ts
and kubectl-patched on the live cluster.
Verified end-to-end against mcpctl.ad.itaz.eu:
- gitea (attach): 49 tools listed, real tools/call round-trip.
- aws-docs (exec via packageName): 4 tools, no regression.
- docmost (exec via command): 11 tools, no regression.
- mcpd suite: 634/634 passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instance status now reflects actual container state:
- startOne() sets STARTING (not RUNNING) after container creation
- syncStatus() promotes STARTING→RUNNING when pod is ready
- syncStatus() demotes RUNNING→STARTING if pod restarts (CrashLoop)
- External servers still get RUNNING immediately (no container)
Previously, CrashLooping pods showed as RUNNING in mcpctl get instances.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs fixed:
1. Backup completeness: JSON backup API now includes prompts and
templates. Previously these were silently dropped during
backup/restore, causing data loss on migration.
2. STDIO proxy for docker-image servers: servers with dockerImage
but no packageName/command (like docmost) now use k8s Attach
to connect to the container's PID 1 stdin/stdout instead of
exec. This fixes "has no packageName or command" errors.
Changes:
- backup-service.ts: add BackupPrompt/BackupTemplate types, export them
- restore-service.ts: restore prompts (with project FK) and templates
- mcp-proxy-service.ts: sendViaPersistentAttach for docker-image STDIO
- orchestrator.ts: add attachInteractive to McpOrchestrator interface
- kubernetes-orchestrator.ts: implement attachInteractive via k8s Attach
- k8s-client-official.ts: expose Attach client
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mcpd now runs a periodic reconcileAll() every 30s that:
- Detects crashed/missing containers (syncStatus)
- Cleans up ERROR instances
- Creates replacement pods to match desired replica count
This replaces the old syncStatus-only timer. Servers migrated
from another deployment or recovering from node failures will
automatically get their instances recreated.
6 new tests for reconcileAll covering: missing instances, skip
replicas=0, already-at-count, ERROR cleanup, multi-server,
error isolation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add MCPD_NODE_SELECTOR env var support in manifest generator
for mixed-arch clusters (e.g. arm64+amd64)
- Fix backup restore: resolve system user ID instead of
hardcoded 'system' string
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The restore service hardcoded ownerId as the literal string 'system'
instead of looking up the actual system user ID. This caused FK
constraint violations when restoring projects to a fresh database.
Now resolves the system user by email, falling back to the first
available user.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mcpd can now deploy MCP server instances as Kubernetes pods instead of
Docker containers. Set MCPD_ORCHESTRATOR=kubernetes to enable.
- Add @kubernetes/client-node with thin wrapper (context enforcement
via MCPD_K8S_CONTEXT to prevent multi-cluster mishaps)
- Rewrite KubernetesOrchestrator: pod CRUD, pod IP extraction,
exec via SPDY (one-shot + interactive), log streaming
- Manifest generator: stdin:true for STDIO servers, args (not command)
to preserve runner image entrypoint, security hardening
- Orchestrator selection in main.ts via MCPD_ORCHESTRATOR env var
- 25 unit tests for k8s orchestrator, all 624 tests pass
Tested end-to-end on local k3s:
- mcpd deployed via Pulumi, creates pods in mcpctl-servers namespace
- NetworkPolicy verified: only mcpd can reach MCP server pods
- Python runner (uvx) successfully runs aws-documentation-mcp-server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the repo directory already existed from a previous init (e.g.
local-only init without remote), the origin remote was missing. Now
initRepo() verifies and sets/updates the remote on every startup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move backup SSH keys and repo URL from MCPD_BACKUP_REPO env var to a
"backup-ssh" secret in the database. Keys are auto-generated on first
init and stored back into the secret. Also fix ERR_HTTP_HEADERS_SENT
crash caused by reply.send() without return in routes when onSend hook
is registered.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Exempt /healthz and /health from rate limiter
- Increase rate limit from 500 to 2000 req/min
- Register backup routes even when disabled (status shows disabled)
- Guard restore endpoints with 503 when backup not configured
- Add retry with backoff on 429 in audit smoke tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs: (1) empty string env var treated as enabled (use || instead of ??),
(2) health routes missing return reply causing double-send with onSend hook.
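Bug (1) in two lines:

```ts
// ?? only falls back on null/undefined, so MCPD_BACKUP_REPO="" looked configured:
const viaNullish = process.env.MCPD_BACKUP_REPO ?? undefined; // "" → ""        (wrongly enabled)
const viaOr      = process.env.MCPD_BACKUP_REPO || undefined; // "" → undefined (disabled)
```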
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
DB is source of truth with git as downstream replica. SSH key generated
on first start, all resource mutations committed as apply-compatible YAML.
Supports manual commit import, conflict resolution (DB wins), disaster
recovery (empty DB restores from git), and timeline branches on restore.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
proxyMode "direct" was a security hole (leaked secrets as plaintext env
vars in .mcp.json) and bypassed all mcplocal features (gating, audit,
RBAC, content pipeline, namespacing). Removed from schema, API, CLI,
and all tests. Old configs with proxyMode are accepted but silently
stripped via Zod .transform() for backward compatibility.
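The strip, sketched (schema reduced to the relevant field):

```ts
import { z } from "zod";

// Legacy `proxyMode` is accepted on input, then dropped on parse.
const serverConfig = z
  .object({
    name: z.string(),
    proxyMode: z.string().optional(), // tolerated for old configs, never stored
  })
  .transform(({ proxyMode: _legacy, ...rest }) => rest);

serverConfig.parse({ name: "docmost", proxyMode: "direct" }); // → { name: "docmost" }
```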
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add userName column to AuditEvent schema with index and migration
- Add GET /api/v1/auth/me endpoint returning current user identity
- AuditCollector auto-fills userName from session→user map, resolved
lazily via /auth/me on first session creation
- Support userName and date range (from/to) filtering on audit events
and sessions endpoints
- Audit console sidebar groups sessions by project → user
- Add date filter presets (d key: all/today/1h/24h/7d) to console
- Add scrolling and page up/down to sidebar navigation
- Tests: auth-me (4), audit-username collector (4), route filters (2),
smoke tests (2)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Audit Console Phase 1: tool_call_trace emission from mcplocal router,
session_bind/rbac_decision event kinds, GET /audit/sessions endpoint,
full Ink TUI with session sidebar, event timeline, and detail view
(mcpctl console --audit).
System prompts: move 6 hardcoded LLM prompts to mcpctl-system project
with extensible ResourceRuleRegistry validation framework, template
variable enforcement ({{maxTokens}}, {{pageCount}}), and delete-resets-
to-default behavior. All consumers fetch via SystemPromptFetcher with
hardcoded fallbacks.
CLI: -p shorthand for --project across get/create/delete/config commands,
console auto-scroll improvements, shell completions regenerated.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add warmup() to LlmProvider interface for eager subprocess startup
- ManagedVllmProvider.warmup() starts vLLM in background on project load
- ProviderRegistry.warmupAll() triggers all managed providers
- NamedProvider proxies warmup() to inner provider
- paginate stage generates LLM-powered descriptive page titles when
available, cached by content hash, falls back to generic "Page N"
- project-mcp-endpoint calls warmupAll() on router creation so vLLM
is loading while the session initializes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive MCP server management with kubectl-style CLI.
Key features in this release:
- Declarative YAML apply/get round-trip with project cloning support
- Gated sessions with prompt intelligence for Claude
- Interactive MCP console with traffic inspector
- Persistent STDIO connections for containerized servers
- RBAC with name-scoped bindings
- Shell completions (fish + bash) auto-generated
- Rate-limit retry with exponential backoff in apply
- Project-scoped prompt management
- Credential scrubbing from git history
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The system project needs a valid ownerId that references an existing user.
Create a system@mcpctl.local user via upsert before creating the project.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the full gated session flow and prompt intelligence system:
- Prisma schema: add gated, priority, summary, chapters, linkTarget fields
- Session gate: state machine (gated → begin_session → ungated) with LLM-powered
tool selection based on prompt index
- Tag matcher: intelligent prompt-to-tool matching with project/server/action tags
- LLM selector: tiered provider selection (fast for gating, heavy for complex tasks)
- Link resolver: cross-project MCP resource references (project/server:uri format)
- Prompt summary service: LLM-generated summaries and chapter extraction
- System project bootstrap: ensures default project exists on startup
- Structural link health checks: enrichWithLinkStatus on prompt GET endpoints
- CLI: create prompt --priority/--link, create project --gated/--no-gated,
describe project shows prompts section, get prompts shows PRI/LINK/STATUS
- Apply/edit: priority, linkTarget, gated fields supported
- Shell completions: fish updated with new flags
- 1,253 tests passing across all packages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wait for stdout.write callback before process.exit in STDIO transport
to prevent truncation of large responses (e.g. grafana tools/list)
- Handle MCP notification methods (notifications/initialized, etc.) in
router instead of returning "Method not found" error
- Use -p shorthand in config claude output
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix MCP proxy to support SSE and STDIO transports (not just HTTP POST)
- Enrich tool descriptions with server context for LLM clarity
- Add Prompt and PromptRequest resources with two-resource RBAC model
- Add propose_prompt MCP tool for LLM to create pending prompt requests
- Add prompt resources visible in MCP resources/list (approved + session's pending)
- Add project-level prompt/instructions in MCP initialize response
- Add ServiceAccount subject type for RBAC (SA identity from X-Service-Account header)
- Add CLI commands: create prompt, get prompts/promptrequests, approve promptrequest
- Add prompts to apply config schema
- 956 tests passing across all packages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Only set Content-Type: application/json when request body is present (fixes
Fastify rejecting empty DELETE with "Body cannot be empty" 400 error)
- Changed PROJECT_INCLUDE to return full server objects instead of just {id, name}
so project server listings show transport, package, image columns
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- tests.sh: run all tests with `bash tests.sh`, summary with `--short`
- tests.sh --filter mcpd/cli: run specific package
- project-routes.test.ts: 17 new route-level tests covering CRUD,
attach/detach, and the ownerId filtering bug fix
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The list endpoint was filtering by ownerId before RBAC could include
projects the user has view access to via name-scoped bindings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs fixed:
- GET /api/v1/servers/:cuid now resolves CUID→name before RBAC check,
so name-scoped bindings match correctly
- List endpoints now filter responses via preSerialization hook using
getAllowedScope(), so name-scoped users only see their resources
Also adds fulldeploy.sh orchestrator script.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix resourceName assignment in mapUrlToPermission for strictness
- Use RbacRoleBinding type in restore-service instead of loose cast
- Remove stale ProjectMemberInput export from validation index
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace admin role with granular roles: view, create, delete, edit, run
- Two binding types: resource bindings (role+resource+optional name) and
operation bindings (role:run + action like backup, logs, impersonate)
- Name-scoped resource bindings for per-instance access control
- Remove role from project members (all permissions via RBAC)
- Add users, groups, RBAC CRUD endpoints and CLI commands
- describe user/group shows all RBAC access (direct + inherited)
- create rbac supports --subject, --binding, --operation flags
- Backup/restore handles users, groups, RBAC definitions
- mcplocal project-based MCP endpoint discovery
- Full test coverage for all new functionality
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SSE-transport MCP servers (like ha-mcp) use a different protocol flow:
GET /sse to establish event stream, read endpoint event, then POST
JSON-RPC messages to the /messages?session_id=... URL. Previously it
POSTed to the root path, which returned 404.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mcpd and MCP containers share the mcp-servers Docker network. HTTP probes
must use the container's internal IP + containerPort instead of localhost
+ host-mapped port. Also extracts container IP from Docker inspect.
Updated home-assistant template to use ghcr.io/homeassistant-ai/ha-mcp
Docker image (SSE transport) instead of broken npm package.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements Kubernetes-style liveness probes that call MCP tools defined
in server healthCheck configs. For STDIO servers, uses docker exec to
spawn a disposable MCP client that sends initialize + tool call. For
HTTP/SSE servers, sends JSON-RPC directly.
- HealthProbeRunner service with configurable interval/threshold/timeout
- execInContainer added to orchestrator interface + Docker implementation
- Instance findById now includes server relation (fixes describe showing IDs)
- Events appended to instance (last 50), healthStatus tracked as
healthy/degraded/unhealthy
- 12 unit tests covering probing, thresholds, intervals, cleanup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>