BREAKING: `mcpctl create rbac` no longer accepts `--binding` or
`--operation`. Use `--roleBindings` instead with key:value pairs:
    # resource binding
    --roleBindings role:view,resource:servers
    --roleBindings role:view,resource:servers,name:my-ha
    # operation binding (role:run is implied by action:)
    --roleBindings action:logs
The on-disk YAML shape (`roleBindings: [{role, resource, name?}]` or
`{role:'run', action}`) is unchanged, so Git backups and existing
`apply -f` files continue to work. Only the command-line input format
changes.
The parser is extracted to src/cli/src/commands/rbac-bindings.ts so the
upcoming `mcpctl create mcptoken --bind <kv>` verb can reuse it.
Completions, tests, and the new parser unit test all pass (406/406).
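A minimal sketch of how the key:value input could map onto the on-disk
binding shape. The function name, types, and error messages are
illustrative assumptions, not the contents of rbac-bindings.ts:

```typescript
// Hypothetical sketch; not the real parser in rbac-bindings.ts.
type ResourceBinding = { role: string; resource: string; name?: string };
type OperationBinding = { role: "run"; action: string };
type RoleBinding = ResourceBinding | OperationBinding;

function parseRoleBinding(input: string): RoleBinding {
  const fields = new Map<string, string>();
  for (const pair of input.split(",")) {
    const idx = pair.indexOf(":");
    // Reject entries with no key or no value, e.g. "role" or "role:".
    if (idx <= 0 || idx === pair.length - 1) {
      throw new Error(`expected key:value, got "${pair}"`);
    }
    fields.set(pair.slice(0, idx).trim(), pair.slice(idx + 1).trim());
  }

  const action = fields.get("action");
  if (action !== undefined) {
    // role:run is implied by action:
    return { role: "run", action };
  }

  const role = fields.get("role");
  const resource = fields.get("resource");
  if (role === undefined || resource === undefined) {
    throw new Error(`binding needs role and resource, or action: "${input}"`);
  }
  const name = fields.get("name");
  return name === undefined ? { role, resource } : { role, resource, name };
}
```

Under these assumptions, `role:view,resource:servers,name:my-ha` yields a
resource binding and `action:logs` yields `{ role: 'run', action: 'logs' }`.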
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new McpToken Prisma model (project-scoped, SHA-256 hashed at rest,
optional expiry, revocable) plus backing repository, service, and REST
routes. Tokens are a first-class RBAC subject: new 'McpToken' kind is
added to the subject enum and the service auto-creates an RbacDefinition
with subject McpToken:<sha> when bindings are provided.
Creator-permission ceiling: the service rejects any requested binding
the creator cannot already satisfy themselves (re-uses
rbacService.canAccess / canRunOperation). rbacMode=clone snapshots the
creator's full permissions into the token.
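The ceiling check can be pictured roughly as below. The CreatorAccess
interface is a hypothetical stand-in for rbacService.canAccess /
canRunOperation; the real signatures may differ:

```typescript
type ResourceBinding = { role: string; resource: string; name?: string };
type OperationBinding = { role: "run"; action: string };
type Binding = ResourceBinding | OperationBinding;

// Hypothetical stand-in for the two rbacService checks, bound to the creator.
interface CreatorAccess {
  canAccess(role: string, resource: string, name?: string): boolean;
  canRunOperation(action: string): boolean;
}

// Returns the requested bindings the creator could not satisfy themselves.
// A non-empty result means token creation should be rejected.
function findExceedingBindings(requested: Binding[], creator: CreatorAccess): Binding[] {
  return requested.filter((b) =>
    "action" in b
      ? !creator.canRunOperation(b.action)
      : !creator.canAccess(b.role, b.resource, b.name),
  );
}
```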
Routes:
  POST   /api/v1/mcptokens             create (returns raw token once)
  GET    /api/v1/mcptokens             list (filter by project)
  GET    /api/v1/mcptokens/:id         describe (no secret in response)
  POST   /api/v1/mcptokens/:id/revoke  soft-delete + remove RbacDef
  DELETE /api/v1/mcptokens/:id         hard-delete
  GET    /api/v1/mcptokens/introspect  validate raw bearer (used by mcplocal)
Extends AuditEvent with optional tokenName/tokenSha fields (indexed) so
token-driven activity can be filtered later. Adds token helpers in
@mcpctl/shared: TOKEN_PREFIX='mcpctl_pat_', generateToken, hashToken,
isMcpToken, timingSafeEqualHex.
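For orientation, the helpers could look roughly like this. Only the
prefix, the SHA-256 hashing, and the function names come from this commit;
the 32-byte hex token body and the exact signatures are assumptions:

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

const TOKEN_PREFIX = "mcpctl_pat_";

function generateToken(): string {
  // Assumption: 32 random bytes, hex-encoded. The real length/encoding may differ.
  return TOKEN_PREFIX + randomBytes(32).toString("hex");
}

function hashToken(raw: string): string {
  // Only this hex digest is stored at rest, never the raw token.
  return createHash("sha256").update(raw).digest("hex");
}

function isMcpToken(value: string): boolean {
  return value.startsWith(TOKEN_PREFIX);
}

function timingSafeEqualHex(a: string, b: string): boolean {
  const bufA = Buffer.from(a, "hex");
  const bufB = Buffer.from(b, "hex");
  // crypto.timingSafeEqual throws on length mismatch, so check first.
  if (bufA.length !== bufB.length) return false;
  return timingSafeEqual(bufA, bufB);
}
```

A lookup would hash the presented bearer with hashToken and compare it
against the stored digest with timingSafeEqualHex.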
Follow-up PRs add the auth-hook dispatch on the prefix, the CLI verbs,
and the HTTP-mode mcplocal that calls /introspect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a per-server tools/list cache in McpRouter (positive + negative TTL)
so a slow or dead upstream only stalls the first discovery call, not every
subsequent client request. Invalidated on upstream add/remove.
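A rough sketch of a positive + negative TTL cache keyed by server. The
class name, method names, and TTL parameters are hypothetical and not
McpRouter's actual API:

```typescript
type CacheEntry<T> =
  | { kind: "ok"; value: T; expiresAt: number }
  | { kind: "error"; error: Error; expiresAt: number };

class ToolsListCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private positiveTtlMs: number;
  private negativeTtlMs: number;
  private now: () => number;

  constructor(positiveTtlMs: number, negativeTtlMs: number, now: () => number = Date.now) {
    this.positiveTtlMs = positiveTtlMs;
    this.negativeTtlMs = negativeTtlMs;
    this.now = now;
  }

  async get(server: string, fetchTools: () => Promise<T>): Promise<T> {
    const hit = this.entries.get(server);
    if (hit && hit.expiresAt > this.now()) {
      if (hit.kind === "ok") return hit.value;
      // Negative hit: fail fast instead of stalling on a dead upstream again.
      throw hit.error;
    }
    try {
      const value = await fetchTools();
      this.entries.set(server, { kind: "ok", value, expiresAt: this.now() + this.positiveTtlMs });
      return value;
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      this.entries.set(server, { kind: "error", error, expiresAt: this.now() + this.negativeTtlMs });
      throw error;
    }
  }

  // Called when an upstream is added or removed.
  invalidate(server: string): void {
    this.entries.delete(server);
  }
}
```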
Health probes now apply a default liveness spec (tools/list via the real
production path) to any RUNNING instance without an explicit healthCheck,
so synthetic and real failures converge on the same signal.
Includes supporting updates in mcpd-client, discovery, upstream/mcpd,
seeder, and fulldeploy/release scripts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 1bd5087 added attachInteractive to the orchestrator interface
but never hooked it up in mcp-proxy-service — sendViaPersistentAttach
was promised in the commit message but missing from the diff. Servers
with a distroless image whose entrypoint IS the MCP server (gitea-mcp)
ended up needing a bogus `command: [node, dist/index.js]` workaround
that silently failed on every exec, leaving clients with empty tool
lists.
Changes:
- PersistentStdioClient: take a StdioMode discriminated union. Exec
mode runs a command via execInteractive; attach mode talks to PID 1
via attachInteractive.
- mcp-proxy-service: dispatch by config — command → exec; packageName
→ exec via runtime runner; dockerImage-only → attach. Error
serialization no longer drops non-Error objects as "[object Object]".
- templates/gitea.yaml: remove the command workaround; the image CMD
runs as PID 1 and mcpd attaches.
- Add unit tests covering both modes and the unsupported-orchestrator
paths.
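The dispatch can be sketched as follows. StdioMode is named in this
commit, but its exact shape, the ServerConfig fields as typed here, and
runnerCommand are illustrative assumptions:

```typescript
// Assumed shape of the discriminated union; the real one may carry more fields.
type StdioMode =
  | { kind: "exec"; command: string[] }
  | { kind: "attach" };

interface ServerConfig {
  command?: string[];
  packageName?: string;
  dockerImage?: string;
}

// Hypothetical: how a runtime runner might turn a package name into a command.
function runnerCommand(packageName: string): string[] {
  return ["npx", "-y", packageName];
}

function selectStdioMode(config: ServerConfig): StdioMode {
  if (config.command && config.command.length > 0) {
    return { kind: "exec", command: config.command };
  }
  if (config.packageName) {
    return { kind: "exec", command: runnerCommand(config.packageName) };
  }
  if (config.dockerImage) {
    // Image entrypoint is the MCP server: talk to PID 1 instead of exec'ing.
    return { kind: "attach" };
  }
  throw new Error("server has no command, packageName, or dockerImage");
}

// Serialize non-Error throwables without collapsing them to "[object Object]".
function describeError(err: unknown): string {
  if (err instanceof Error) return err.message;
  if (typeof err === "string") return err;
  try {
    return JSON.stringify(err) ?? String(err);
  } catch {
    return String(err);
  }
}
```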
Also required (separate repo): mcpd's k8s Role needed pods/attach
added alongside pods/exec; updated in kubernetes-deployment/…/mcpctl/server.ts
and kubectl-patched on the live cluster.
Verified end-to-end against mcpctl.ad.itaz.eu:
- gitea (attach): 49 tools listed, real tools/call round-trip.
- aws-docs (exec via packageName): 4 tools, no regression.
- docmost (exec via command): 11 tools, no regression.
- mcpd suite: 634/634 passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instance status now reflects actual container state:
- startOne() sets STARTING (not RUNNING) after container creation
- syncStatus() promotes STARTING→RUNNING when pod is ready
- syncStatus() demotes RUNNING→STARTING if pod restarts (CrashLoop)
- External servers still get RUNNING immediately (no container)
Previously, CrashLooping pods showed as RUNNING in mcpctl get instances.
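The transitions above, as a small pure function. This is a sketch: the
PodState shape is hypothetical, and it assumes a restarting (CrashLoop)
pod reports as not ready:

```typescript
type InstanceStatus = "STARTING" | "RUNNING" | "ERROR";

// Hypothetical view of the pod; assumes a restarting pod reports ready=false.
interface PodState {
  ready: boolean;
}

// External servers have no container, so they are RUNNING immediately.
function initialStatus(isExternal: boolean): InstanceStatus {
  return isExternal ? "RUNNING" : "STARTING";
}

function syncStatus(current: InstanceStatus, pod: PodState): InstanceStatus {
  if (current === "STARTING" && pod.ready) return "RUNNING"; // promote
  if (current === "RUNNING" && !pod.ready) return "STARTING"; // demote on restart
  return current;
}
```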
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs fixed:
1. Backup completeness: JSON backup API now includes prompts and
templates. Previously these were silently dropped during
backup/restore, causing data loss on migration.
2. STDIO proxy for docker-image servers: servers with dockerImage
but no packageName/command (like docmost) now use k8s Attach
to connect to the container's PID 1 stdin/stdout instead of
exec. This fixes "has no packageName or command" errors.
Changes:
- backup-service.ts: add BackupPrompt/BackupTemplate types, export them
- restore-service.ts: restore prompts (with project FK) and templates
- mcp-proxy-service.ts: sendViaPersistentAttach for docker-image STDIO
- orchestrator.ts: add attachInteractive to McpOrchestrator interface
- kubernetes-orchestrator.ts: implement attachInteractive via k8s Attach
- k8s-client-official.ts: expose Attach client
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mcpd now runs a periodic reconcileAll() every 30s that:
- Detects crashed/missing containers (syncStatus)
- Cleans up ERROR instances
- Creates replacement pods to match desired replica count
This replaces the old syncStatus-only timer. Servers migrated
from another deployment or recovering from node failures will
automatically get their instances recreated.
6 new tests for reconcileAll covering: missing instances, skip
replicas=0, already-at-count, ERROR cleanup, multi-server,
error isolation.
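The planning half of a reconcile pass might look like the sketch below.
Types and the planReconcile name are hypothetical; the real reconcileAll
also performs the I/O and isolates per-server errors, which this pure
function leaves out:

```typescript
interface Server {
  id: string;
  replicas: number;
}

interface Instance {
  id: string;
  serverId: string;
  status: "STARTING" | "RUNNING" | "ERROR";
}

interface ReconcilePlan {
  remove: string[]; // ERROR instances to clean up
  create: { serverId: string; count: number }[]; // replacements to start
}

function planReconcile(servers: Server[], instances: Instance[]): ReconcilePlan {
  const plan: ReconcilePlan = { remove: [], create: [] };
  for (const server of servers) {
    const own = instances.filter((i) => i.serverId === server.id);
    const errored = own.filter((i) => i.status === "ERROR");
    plan.remove.push(...errored.map((i) => i.id));

    // replicas=0 means the server is intentionally scaled down.
    if (server.replicas === 0) continue;

    const missing = server.replicas - (own.length - errored.length);
    if (missing > 0) plan.create.push({ serverId: server.id, count: missing });
  }
  return plan;
}
```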
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add MCPD_NODE_SELECTOR env var support in manifest generator
for mixed-arch clusters (e.g. arm64+amd64)
- Fix backup restore: resolve system user ID instead of
hardcoded 'system' string
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The restore service hardcoded ownerId as the literal string 'system'
instead of looking up the actual system user ID. This caused FK
constraint violations when restoring projects to a fresh database.
Now resolves the system user by email, falling back to the first
available user.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mcpd can now deploy MCP server instances as Kubernetes pods instead of
Docker containers. Set MCPD_ORCHESTRATOR=kubernetes to enable.
- Add @kubernetes/client-node with thin wrapper (context enforcement
via MCPD_K8S_CONTEXT to prevent multi-cluster mishaps)
- Rewrite KubernetesOrchestrator: pod CRUD, pod IP extraction,
exec via SPDY (one-shot + interactive), log streaming
- Manifest generator: stdin:true for STDIO servers, args (not command)
to preserve runner image entrypoint, security hardening
- Orchestrator selection in main.ts via MCPD_ORCHESTRATOR env var
- 25 unit tests for k8s orchestrator, all 624 tests pass
Tested end-to-end on local k3s:
- mcpd deployed via Pulumi, creates pods in mcpctl-servers namespace
- NetworkPolicy verified: only mcpd can reach MCP server pods
- Python runner (uvx) successfully runs aws-documentation-mcp-server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fan-out discovery methods (tools/list, prompts/list, resources/list)
used synthetic request IDs that couldn't be looked up in the
correlation map. This caused upstream_response events to have no
correlationId, making the console unable to find upstream content
for replay ("No content to replay").
Fix: pass correlationId through RouteContext → discovery methods →
onUpstreamCall callback, so the handler can use it directly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the repo directory already existed from a previous init (e.g.
local-only init without remote), the origin remote was missing. Now
initRepo() verifies and sets/updates the remote on every startup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move backup SSH keys and repo URL from MCPD_BACKUP_REPO env var to a
"backup-ssh" secret in the database. Keys are auto-generated on first
init and stored back into the secret. Also fix ERR_HTTP_HEADERS_SENT
crash caused by reply.send() without return in routes when onSend hook
is registered.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Exempt /healthz and /health from rate limiter
- Increase rate limit from 500 to 2000 req/min
- Register backup routes even when disabled (status shows disabled)
- Guard restore endpoints with 503 when backup not configured
- Add retry with backoff on 429 in audit smoke tests
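A minimal sketch of retrying on 429 with exponential backoff, in the
spirit of the smoke-test change. The helper name, attempt count, and base
delay are assumptions:

```typescript
// Hypothetical helper; attempt count and delays are illustrative defaults.
async function withRetryOn429<T extends { status: number }>(
  request: () => Promise<T>,
  maxAttempts: number = 5,
  baseDelayMs: number = 100,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let response = await request();
  for (let attempt = 1; attempt < maxAttempts && response.status === 429; attempt++) {
    // Delay doubles each time: base, 2x base, 4x base, ...
    await sleep(baseDelayMs * 2 ** (attempt - 1));
    response = await request();
  }
  // Either a non-429 response, or the last 429 after exhausting attempts.
  return response;
}
```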
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs fixed: (1) an empty-string env var was treated as enabled (fixed
by using || instead of ??); (2) health routes were missing `return reply`,
causing a double-send when an onSend hook is registered.
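Why ?? lets an empty string through while || does not, as a small sketch.
The env var name and function names are assumptions for illustration:

```typescript
type Env = Record<string, string | undefined>;

// Buggy: ?? only replaces null/undefined, so "" survives and counts as set.
function backupEnabledWithNullish(env: Env): boolean {
  const repo = env["MCPD_BACKUP_REPO"] ?? null;
  return repo !== null;
}

// Fixed: || also replaces "", so an empty variable means disabled.
function backupEnabledWithOr(env: Env): boolean {
  const repo = env["MCPD_BACKUP_REPO"] || null;
  return repo !== null;
}
```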
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The DB is the source of truth, with git as a downstream replica. An SSH
key is generated on first start, and all resource mutations are committed
as apply-compatible YAML. Supports manual commit import, conflict
resolution (DB wins), disaster recovery (an empty DB restores from git),
and timeline branches on restore.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Persistent file cache in ~/.mcpctl/cache/proxymodel/ with LRU eviction
- Pause queue for temporarily holding MCP traffic
- Hot-reload watcher for custom stages and proxymodel definitions
- CLI: mcpctl cache list/clear/stats commands
- HTTP endpoints for cache and pause management
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extends section drill-down (previously tool-only) to work with
prompts/get using _resultId + _section arguments. Shares the same
section store as tool results, enabling cross-method drill-down.
Large prompts (>2000 chars) are automatically split into sections.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
proxyMode "direct" was a security hole (leaked secrets as plaintext env
vars in .mcp.json) and bypassed all mcplocal features (gating, audit,
RBAC, content pipeline, namespacing). Removed from schema, API, CLI,
and all tests. Old configs with proxyMode are accepted but silently
stripped via Zod .transform() for backward compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rewrite README Content Pipeline section as Plugin System section
documenting built-in plugins (default, gate, content-pipeline),
plugin hooks, and the relationship between gating and proxyModel
- Update all README examples to use --proxy-model instead of --gated
- Add unit tests: proxyModel normalization in JSON/YAML output (4 tests),
Plugin Config section in describe output (2 tests)
- Add smoke tests: yaml/json output shows resolved proxyModel without
gated field, round-trip compatibility (4 tests)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolves proxyModel from gated boolean when the DB value is empty
(pre-migration projects). The gated field is no longer included in
get -o yaml/json output, making it apply-compatible with the new schema.
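A sketch of the fallback. That gated=false maps to 'content-pipeline'
follows the --no-gated mapping; that gated=true or an absent flag maps to
'default' is an assumption, as is the function name:

```typescript
interface StoredProject {
  proxyModel?: string | null;
  gated?: boolean;
}

function resolveProxyModel(project: StoredProject): string {
  // A non-empty DB value always wins.
  if (project.proxyModel) return project.proxyModel;
  // Pre-migration project: derive from the legacy gated boolean.
  // Assumption: gated=true (or unset) corresponds to the 'default' plugin.
  return project.gated === false ? "content-pipeline" : "default";
}
```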
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Exclude src/db/tests from workspace vitest config (needs test DB)
- Make global-setup.ts gracefully skip when test DB unavailable
- Fix exactOptionalPropertyTypes issues in proxymodel-endpoint.ts
- Use proper ProxyModelPlugin type for getPluginHooks function
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- proxyModel field now determines both YAML pipeline stages AND plugin
gating behavior ('default'/'gate' = gated, 'content-pipeline' = not)
- Deprecate --gated/--no-gated CLI flags (backward compat preserved:
--no-gated maps to --proxy-model content-pipeline)
- Replace GATED column with PLUGIN in `get projects` output
- Update `describe project` to show "Plugin Config" section
- Unify proxymodel discovery: GET /proxymodels now returns both YAML
pipeline models and TypeScript plugins with type field
- `describe proxymodel gate` shows plugin hooks and extends info
- Update CLI apply schema: gated is now optional (not required)
- Regenerate shell completions
- Tests: proxymodel endpoint (5), smoke tests (8)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add userName column to AuditEvent schema with index and migration
- Add GET /api/v1/auth/me endpoint returning current user identity
- AuditCollector auto-fills userName from session→user map, resolved
lazily via /auth/me on first session creation
- Support userName and date range (from/to) filtering on audit events
and sessions endpoints
- Audit console sidebar groups sessions by project → user
- Add date filter presets (d key: all/today/1h/24h/7d) to console
- Add scrolling and page up/down to sidebar navigation
- Tests: auth-me (4), audit-username collector (4), route filters (2),
smoke tests (2)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Audit Console Phase 1: tool_call_trace emission from mcplocal router,
session_bind/rbac_decision event kinds, GET /audit/sessions endpoint,
full Ink TUI with session sidebar, event timeline, and detail view
(mcpctl console --audit).
System prompts: move 6 hardcoded LLM prompts to mcpctl-system project
with extensible ResourceRuleRegistry validation framework, template
variable enforcement ({{maxTokens}}, {{pageCount}}), and delete-resets-
to-default behavior. All consumers fetch via SystemPromptFetcher with
hardcoded fallbacks.
CLI: -p shorthand for --project across get/create/delete/config commands,
console auto-scroll improvements, shell completions regenerated.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Qwen 7B sometimes returns fewer titles than pages (12 for 14).
Instead of rejecting the entire response, pad missing entries with
generic "Page N" titles and truncate extras. Also emphasize exact
count in the prompt.
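The pad-and-truncate step is small enough to sketch directly; the
function name is illustrative:

```typescript
function normalizeTitles(titles: string[], pageCount: number): string[] {
  // Drop extras beyond the page count.
  const result = titles.slice(0, pageCount);
  // Fill any missing entries with generic 1-based titles.
  for (let i = result.length; i < pageCount; i++) {
    result.push(`Page ${i + 1}`);
  }
  return result;
}
```

With 12 titles for 14 pages, entries 13 and 14 become "Page 13" and
"Page 14"; with 16 titles, the last two are dropped.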
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LLMProviderAdapter now tries all registered providers before giving up:
1. Named provider (if specified)
2. All 'fast' tier providers in order
3. All 'heavy' tier providers in order
4. Legacy active provider
Previously, if the first provider (e.g., vllm-local) failed, the adapter
threw immediately even though Anthropic and Gemini were available. Now it
logs the failure and tries the next candidate.
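The candidate ordering and fallback loop, sketched with a hypothetical
Provider interface (the real LlmProvider and registry APIs are not shown
in this commit):

```typescript
type Tier = "fast" | "heavy";

// Hypothetical minimal provider shape for illustration.
interface Provider {
  name: string;
  tier?: Tier;
  complete(prompt: string): Promise<string>;
}

// Order: named provider, all fast, all heavy, legacy active; no duplicates.
function candidateOrder(all: Provider[], named?: string, legacyActive?: string): Provider[] {
  const ordered: Provider[] = [];
  const add = (p: Provider | undefined): void => {
    if (p && ordered.indexOf(p) === -1) ordered.push(p);
  };
  if (named) add(all.find((p) => p.name === named));
  all.filter((p) => p.tier === "fast").forEach((p) => add(p));
  all.filter((p) => p.tier === "heavy").forEach((p) => add(p));
  if (legacyActive) add(all.find((p) => p.name === legacyActive));
  return ordered;
}

async function completeWithFallback(
  candidates: Provider[],
  prompt: string,
  log: (msg: string) => void = () => {},
): Promise<string> {
  let lastError: unknown = new Error("no providers registered");
  for (const provider of candidates) {
    try {
      return await provider.complete(prompt);
    } catch (err) {
      lastError = err;
      log(`${provider.name} failed, trying next candidate`);
    }
  }
  // Every candidate failed: surface the last failure.
  throw lastError;
}
```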
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add warmup() to LlmProvider interface for eager subprocess startup
- ManagedVllmProvider.warmup() starts vLLM in background on project load
- ProviderRegistry.warmupAll() triggers all managed providers
- NamedProvider proxies warmup() to inner provider
- paginate stage generates LLM-powered descriptive page titles when
available, cached by content hash, falls back to generic "Page N"
- project-mcp-endpoint calls warmupAll() on router creation so vLLM
is loading while the session initializes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive MCP server management with kubectl-style CLI.
Key features in this release:
- Declarative YAML apply/get round-trip with project cloning support
- Gated sessions with prompt intelligence for Claude
- Interactive MCP console with traffic inspector
- Persistent STDIO connections for containerized servers
- RBAC with name-scoped bindings
- Shell completions (fish + bash) auto-generated
- Rate-limit retry with exponential backoff in apply
- Project-scoped prompt management
- Credential scrubbing from git history
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ink statically imports react-devtools-core (only used when DEV=true).
With --external, bun compile leaves a runtime require that fails in the
standalone binary. Instead, provide a no-op stub that bun bundles inline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ink-based TUI that shows exactly what an LLM sees through MCP.
Browse tools/resources/prompts, execute them, and see raw JSON-RPC
traffic in a protocol log. Supports gated session flow with
begin_session, raw JSON-RPC input, and session reconnect.
- McpSession class wrapping HTTP transport with typed methods
- 12 React/Ink components (header, protocol-log, menu, tool/resource/prompt views, etc.)
- 21 unit tests for McpSession against a mock MCP server
- Fish + Bash completions with project name argument
- bun compile with --external react-devtools-core
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The system project needs a valid ownerId that references an existing user.
Create a system@mcpctl.local user via upsert before creating the project.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the full gated session flow and prompt intelligence system:
- Prisma schema: add gated, priority, summary, chapters, linkTarget fields
- Session gate: state machine (gated → begin_session → ungated) with LLM-powered
tool selection based on prompt index
- Tag matcher: intelligent prompt-to-tool matching with project/server/action tags
- LLM selector: tiered provider selection (fast for gating, heavy for complex tasks)
- Link resolver: cross-project MCP resource references (project/server:uri format)
- Prompt summary service: LLM-generated summaries and chapter extraction
- System project bootstrap: ensures default project exists on startup
- Structural link health checks: enrichWithLinkStatus on prompt GET endpoints
- CLI: create prompt --priority/--link, create project --gated/--no-gated,
describe project shows prompts section, get prompts shows PRI/LINK/STATUS
- Apply/edit: priority, linkTarget, gated fields supported
- Shell completions: fish updated with new flags
- 1,253 tests passing across all packages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The /llm/providers endpoint now runs isAvailable() on each provider in
parallel and returns health status per provider. The status command shows
✓/✗ per provider based on actual availability, not just the fast tier.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds tier-based LLM routing so fast local models (vLLM, Ollama) handle
structured tasks while cloud models (Gemini, Anthropic) are reserved for
heavy reasoning. Single-provider configs continue to work via fallback.
- Tier type + ProviderRegistry with assignTier/getProvider/fallback chain
- Multi-provider config format: { providers: [{ name, type, tier, ... }] }
- NamedProvider wrapper for multiple instances of same provider type
- Setup wizard: Simple (legacy) / Advanced (fast+heavy tiers) modes
- Status display: tiered view with /llm/providers endpoint
- Call sites use getProvider('fast') instead of getActive()
- Full backward compatibility with existing single-provider configs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Avoids burning tokens on every `mcpctl status` call. The /llm/health
endpoint now caches successful results for 10min, errors for 1min.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The pool refactor made ACP client creation lazy, causing the first
/llm/health call to spawn + initialize + prompt Gemini in one request
(30s+). Now warmup() eagerly starts the subprocess on mcplocal boot.
Also fetch models in parallel with LLM health check.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ACP session pool with per-model subprocesses and 8h idle eviction
- Per-project LLM config: local override → mcpd recommendation → global default
- Model override support in ResponsePaginator
- /llm/models endpoint + available models in mcpctl status
- Remove --llm-provider/--llm-model from create project (use edit/apply)
- 8 new smart pagination integration tests (e2e flow)
- 260 mcplocal tests, 330 CLI tests passing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Status command now queries mcplocal's /llm/health endpoint instead of
spawning the gemini binary. This uses the persistent ACP connection
(fast) and works for any configured provider, not just gemini-cli.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace per-call gemini CLI spawning (~10s cold start each time) with
persistent ACP (Agent Client Protocol) subprocess. First call absorbs
the cold start, subsequent calls are near-instant over JSON-RPC stdio.
- Add AcpClient: manages persistent gemini --experimental-acp subprocess
with lazy init, auto-restart on crash/timeout, NDJSON framing
- Add GeminiAcpProvider: LlmProvider wrapper with serial queue for
concurrent calls, same interface as GeminiCliProvider
- Add dispose() to LlmProvider interface + disposeAll() to registry
- Wire provider disposal into mcplocal shutdown handler
- Add status command spinner with progressive output and color-coded
LLM health check results (green checkmark/red cross)
- 25 new tests (17 ACP client + 8 provider)
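One common way to serialize concurrent calls over a single subprocess is
a promise-chain queue. This is a generic sketch, not GeminiAcpProvider's
actual code, and the SerialQueue name is hypothetical:

```typescript
class SerialQueue {
  // Tail of the chain; never rejects, so one failure cannot block later tasks.
  private tail: Promise<void> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after everything queued before it has settled.
    const result = this.tail.then(() => task());
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}
```

Callers await `queue.run(() => client.prompt(...))` (client.prompt being a
placeholder) and each call gets its own result or error while requests
reach the subprocess one at a time.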
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Setup wizard auto-detects gemini binary via `which`, saves full path
so systemd service can find it without user PATH
- `mcpctl status` tests LLM provider health (gemini: quick prompt test,
ollama: health check, API providers: key stored confirmation)
- Shows error details inline: "gemini-cli / gemini-2.5-flash (not authenticated)"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>