# mcpctl

**kubectl for MCP servers.** A management system for [Model Context Protocol](https://modelcontextprotocol.io) servers — define, deploy, and connect MCP servers to Claude using familiar kubectl-style commands.

```
mcpctl get servers
NAME             TRANSPORT   REPLICAS   DOCKER IMAGE                                DESCRIPTION
grafana          STDIO       1          grafana/mcp-grafana:latest                  Grafana MCP server
home-assistant   SSE         1          ghcr.io/homeassistant-ai/ha-mcp:latest      Home Assistant MCP
docmost          SSE         1          10.0.0.194:3012/michal/docmost-mcp:latest   Docmost wiki MCP
```

## What is this?

mcpctl manages MCP servers the same way kubectl manages Kubernetes pods. You define servers declaratively in YAML, group them into projects, and connect them to Claude Code or any MCP client through a local proxy.

**The architecture:**

```
Claude Code <--STDIO--> mcplocal (local proxy) <--HTTP--> mcpd (daemon) <--Docker--> MCP servers
```

- **mcpd** — the daemon. Runs on a server, manages MCP server containers (Docker/Podman), stores configuration in PostgreSQL.
- **mcplocal** — the local proxy. Runs on your machine and presents a single MCP endpoint to Claude that merges tools from all your servers. Handles namespacing (`grafana/search_dashboards`), plugin execution (gating, content pipelines), and prompt delivery.
- **mcpctl** — the CLI. Talks to mcpd (via mcplocal or directly) to manage everything.

## Quick Start

### 1. Install

```bash
# From RPM repository
sudo dnf config-manager --add-repo https://your-registry/api/packages/mcpctl/rpm.repo
sudo dnf install mcpctl

# Or build from source
git clone https://github.com/your-org/mcpctl.git
cd mcpctl
pnpm install
pnpm build
pnpm rpm:build   # requires bun and nfpm
```

### 2. Connect to a daemon

```bash
# Log in to an mcpd instance
mcpctl login --mcpd-url http://your-server:3000

# Check connectivity
mcpctl status
```

### 3. Create your first secret

Secrets store credentials that servers need — API tokens, passwords, etc.

```bash
mcpctl create secret grafana-token \
  --data TOKEN=glsa_xxxxxxxxxxxx
```
### 4. Create your first server

A server is an MCP server definition — what Docker image to run, what transport it speaks, what environment it needs.

```bash
mcpctl create server grafana \
  --docker-image grafana/mcp-grafana:latest \
  --transport STDIO \
  --env GRAFANA_URL=http://grafana.local:3000 \
  --env GRAFANA_AUTH_TOKEN=secretRef:grafana-token:TOKEN
```

mcpd pulls the image, starts a container, and keeps it running. Check on it:

```bash
mcpctl get instances             # See running containers
mcpctl logs grafana              # View server logs
mcpctl describe server grafana   # Full details
```

### 5. Create a project

A project groups servers together and configures how Claude interacts with them.

```bash
mcpctl create project monitoring \
  --description "Grafana dashboards and alerting" \
  --server grafana \
  --proxy-model content-pipeline
```

### 6. Connect Claude Code

Generate the `.mcp.json` config for Claude Code:

```bash
mcpctl config claude --project monitoring
```

This writes a `.mcp.json` that tells Claude Code to connect through mcplocal. Restart Claude Code and your Grafana tools appear:

```
mcpctl console monitoring   # Preview what Claude sees
```

## Declarative Configuration

Everything can be defined in YAML and applied with `mcpctl apply`:

```yaml
# infrastructure.yaml
secrets:
  - name: grafana-token
    data:
      TOKEN: "glsa_xxxxxxxxxxxx"

servers:
  - name: grafana
    description: "Grafana dashboards and alerting"
    dockerImage: grafana/mcp-grafana:latest
    transport: STDIO
    env:
      - name: GRAFANA_URL
        value: "http://grafana.local:3000"
      - name: GRAFANA_AUTH_TOKEN
        valueFrom:
          secretRef:
            name: grafana-token
            key: TOKEN

projects:
  - name: monitoring
    description: "Infrastructure monitoring"
    proxyModel: content-pipeline
    servers:
      - grafana
```

```bash
mcpctl apply -f infrastructure.yaml
```

Round-trip works too — export, edit, re-apply:

```bash
mcpctl get all --project monitoring -o yaml > backup.yaml
# edit backup.yaml...
```
```bash
mcpctl apply -f backup.yaml
```

## Plugin System (ProxyModel)

ProxyModel is mcpctl's plugin system. Each project is assigned a **plugin** that controls how Claude interacts with its servers. Plugins are composed from two layers: **TypeScript plugins** (MCP middleware hooks) and **YAML pipelines** (content transformation stages).

### Built-in Plugins

| Plugin | Includes gating | Content pipeline | Description |
|--------|:-:|:-:|---|
| **default** | Yes | Yes | Gate + content pipeline. The default for all projects. |
| **gate** | Yes | No | Gating only — `begin_session` gate with prompt delivery. |
| **content-pipeline** | No | Yes | Content transformation only — no gating. |

**Gating** means Claude initially sees only a `begin_session` tool. After calling it with a task description, relevant prompts are delivered and the full tool list is revealed. This keeps Claude's context focused.

```bash
# Create a gated project (default behavior)
mcpctl create project home --server my-ha --proxy-model default

# Create an ungated project (direct tool access, no gate)
mcpctl create project tools --server grafana --proxy-model content-pipeline
```

### Plugin Hooks

TypeScript plugins intercept MCP requests/responses at specific lifecycle points:

| Hook | When it fires |
|------|--------------|
| `onSessionCreate` | New MCP session established |
| `onSessionDestroy` | Session ends |
| `onInitialize` | MCP `initialize` request — can inject instructions |
| `onToolsList` | `tools/list` — can filter/modify the tool list |
| `onToolCallBefore` | Before forwarding a tool call — can intercept |
| `onToolCallAfter` | After receiving a tool result — can transform |
| `onResourcesList` | `resources/list` — can filter resources |
| `onResourceRead` | `resources/read` — can intercept resource reads |
| `onPromptsList` | `prompts/list` — can filter prompts |
| `onPromptGet` | `prompts/get` — can intercept prompt reads |

Plugins compose via `extends` — the `default` plugin extends both `gate` and
`content-pipeline`, inheriting all their hooks.

### Content Pipelines

Content pipelines transform tool results through ordered stages before delivering them to Claude:

| Pipeline | Stages | Use case |
|----------|--------|----------|
| **default** | `passthrough` → `paginate` (8KB pages) | Safe pass-through with pagination for large responses |
| **subindex** | `section-split` → `summarize-tree` | Splits large content into sections, returns a summary index |

#### How `subindex` Works

1. Upstream returns a large tool result (e.g., 50KB of device states)
2. `section-split` divides the content into logical sections (2KB-15KB each)
3. `summarize-tree` generates a compact index with section summaries (~200 tokens each)
4. The client receives the index and can request specific sections via the `_section` parameter

### Configuration

Set per-project:

```yaml
kind: project
name: home-automation
proxyModel: default
servers:
  - home-assistant
  - node-red
```

Via CLI:

```bash
mcpctl create project monitoring --server grafana --proxy-model content-pipeline
```

### Custom ProxyModels

Place YAML files in `~/.mcpctl/proxymodels/` to define custom pipelines:

```yaml
kind: ProxyModel
metadata:
  name: my-pipeline
spec:
  stages:
    - type: section-split
      config:
        minSectionSize: 1000
        maxSectionSize: 10000
    - type: summarize-tree
      config:
        maxTokens: 150
        maxDepth: 2
  appliesTo: [toolResult, prompt]
  cacheable: true
```

Inspect available plugins and pipelines:

```bash
mcpctl get proxymodels                # List all plugins and pipelines
mcpctl describe proxymodel default    # Pipeline details (stages, controller)
mcpctl describe proxymodel gate       # Plugin details (hooks, extends)
```

### Custom Stages

Drop `.js` or `.mjs` files in `~/.mcpctl/stages/` to add custom transformation stages.
Each file must `export default` an async function matching the `StageHandler` contract:

```javascript
// ~/.mcpctl/stages/redact-keys.js
export default async function (content, ctx) {
  // ctx provides: contentType, sourceName, projectName, sessionId,
  // originalContent, llm, cache, log, config
  const redacted = content.replace(/([A-Z_]+_KEY)=\S+/g, '$1=***');
  ctx.log.info(`Redacted ${content.length - redacted.length} chars of secrets`);
  return { content: redacted };
}
```

Stages loaded from disk appear as `local` source. Use them in a custom ProxyModel YAML:

```yaml
kind: ProxyModel
metadata:
  name: secure-pipeline
spec:
  stages:
    - type: redact-keys    # matches filename without extension
    - type: section-split
    - type: summarize-tree
```

**Stage contract reference:**

| Field | Type | Description |
|-------|------|-------------|
| `content` | `string` | Input content (from the previous stage or raw upstream) |
| `ctx.contentType` | `'toolResult' \| 'prompt' \| 'resource'` | What kind of content is being processed |
| `ctx.sourceName` | `string` | Tool name, prompt name, or resource URI |
| `ctx.originalContent` | `string` | The unmodified content before any stage ran |
| `ctx.llm` | `LLMProvider` | Call `ctx.llm.complete(prompt)` for LLM summarization |
| `ctx.cache` | `CacheProvider` | Call `ctx.cache.getOrCompute(key, fn)` to cache expensive results |
| `ctx.log` | `StageLogger` | `debug()`, `info()`, `warn()`, `error()` |
| `ctx.config` | `Record` | Config values from the ProxyModel YAML |

**Return value:**

```typescript
{ content: string; sections?: Section[]; metadata?: Record }
```

If `sections` is returned, the framework stores them and presents a table of contents to the client. The client can drill into individual sections via the `_resultId` + `_section` parameters on subsequent tool or prompt calls.

### Section Drill-Down

When a stage (like `section-split`) produces sections, the pipeline automatically:

1. Replaces the full content with a compact table of contents
2. Appends a `_resultId` for subsequent drill-down
3. Stores the full sections in memory (5-minute TTL)

Claude then calls the same tool (or `prompts/get`) again with the `_resultId` and `_section` parameters to retrieve a specific section. This works for both tool results and prompt responses.

```
# What Claude sees (tool result):
3 sections (json):
  [users]   Users   (4K chars)
  [config]  Config  (1K chars)
  [logs]    Logs    (8K chars)
_resultId: pm-abc123 — use _resultId and _section parameters to drill into a section.

# Claude drills down:
→ tools/call: grafana/query { _resultId: "pm-abc123", _section: "logs" }
← [full 8K content of the logs section]
```

### Hot-Reload

Stages and ProxyModels reload automatically when files change — no restart needed.

- **Stages** (`~/.mcpctl/stages/*.js`): File watcher with a 300ms debounce. Add, edit, or remove stage files and they take effect on the next tool call.
- **ProxyModels** (`~/.mcpctl/proxymodels/*.yaml`): Re-read from disk on every request, so changes are always picked up.

Force a manual reload via the HTTP API:

```bash
curl -X POST http://localhost:3200/proxymodels/reload
# {"loaded": 3}

curl http://localhost:3200/proxymodels/stages
# [{"name":"passthrough","source":"built-in"},{"name":"redact-keys","source":"local"},...]
```
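For illustration, the filename-to-stage-name convention that hot-reloaded stage files follow could be sketched like this (a hedged example; `stageNameFromFile` is an invented helper, not part of mcpctl):

```typescript
import * as path from "node:path";

// Maps a stage file to the stage `type` usable in ProxyModel YAML:
// only .js/.mjs files load, and the name is the basename without extension.
// Illustrative sketch of the convention described above, not mcpctl's loader.
function stageNameFromFile(file: string): string | null {
  const ext = path.extname(file);
  if (ext !== ".js" && ext !== ".mjs") return null; // other files are ignored
  return path.basename(file, ext);
}
```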
### Built-in Stages Reference

| Stage | Description | Key Config |
|-------|-------------|------------|
| `passthrough` | Returns content unchanged | — |
| `paginate` | Splits large content into numbered pages | `pageSize` (default: 8000 chars) |
| `section-split` | Splits content into named sections by structure (headers, JSON keys, code boundaries) | `minSectionSize` (500), `maxSectionSize` (15000) |
| `summarize-tree` | Generates LLM summaries for each section | `maxTokens` (200), `maxDepth` (2) |

`section-split` detects the content type automatically:

| Content Type | Split Strategy |
|--------------|----------------|
| JSON array | One section per array element, using `name`/`id`/`label` as the section ID |
| JSON object | One section per top-level key |
| YAML | One section per top-level key |
| Markdown | One section per `##` header |
| Code | One section per function/class boundary |
| XML | One section per top-level element |

### Pause Queue (Model Studio)

The pause queue lets you intercept pipeline results in real time — inspect what the pipeline produced, edit it, or drop it before Claude receives the response.
```bash
# Enable pause mode
curl -X PUT http://localhost:3200/pause -d '{"paused":true}'

# View queued items (blocked tool calls waiting for your decision)
curl http://localhost:3200/pause/queue

# Release an item (send the transformed content to Claude)
curl -X POST http://localhost:3200/pause/queue/<id>/release

# Edit and release (send your modified content instead)
curl -X POST http://localhost:3200/pause/queue/<id>/edit -d '{"content":"modified content"}'

# Drop an item (send an empty response)
curl -X POST http://localhost:3200/pause/queue/<id>/drop

# Release all queued items at once
curl -X POST http://localhost:3200/pause/release-all

# Disable pause mode
curl -X PUT http://localhost:3200/pause -d '{"paused":false}'
```

The pause queue is also available as MCP tools via `mcpctl console --stdin-mcp`, which gives Claude direct access to the `pause`, `get_pause_queue`, and `release_paused` tools for self-monitoring.

## LLM Providers

ProxyModel stages that need LLM capabilities (like `summarize-tree`) use configurable providers. Configure them in `~/.mcpctl/config.yaml`:

```yaml
llm:
  - name: vllm-local
    type: openai-compatible
    baseUrl: http://localhost:8000/v1
    model: Qwen/Qwen3-32B
  - name: anthropic
    type: anthropic
    model: claude-sonnet-4-20250514
    # API key from: mcpctl create secret llm-keys --data ANTHROPIC_API_KEY=sk-...
```

Providers support **tiered routing** (`fast` for quick summaries, `heavy` for complex analysis) and **automatic failover** — if one provider is down, the next is tried.

```bash
# Check active providers
mcpctl status    # Shows LLM provider status

# View provider details
curl http://localhost:3200/llm/providers
```

## Pipeline Cache

ProxyModel pipelines cache LLM-generated results (summaries, section indexes) to avoid redundant API calls. The cache is persistent across mcplocal restarts.
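As a rough sketch of the `getOrCompute` contract that stages use via `ctx.cache` (an in-memory stand-in for illustration; the real cache is file-backed, namespaced, and enforces TTLs):

```typescript
// In-memory stand-in for the pipeline cache: compute once, then reuse.
// `MemoryCache` is illustrative only; mcplocal's cache persists to disk.
class MemoryCache {
  private store = new Map<string, string>();

  async getOrCompute(key: string, fn: () => Promise<string>): Promise<string> {
    const hit = this.store.get(key);
    if (hit !== undefined) return hit; // cache hit: skip the expensive call
    const value = await fn();          // e.g. an LLM summarization
    this.store.set(key, value);
    return value;
  }
}
```

A `summarize-tree`-style stage would key entries by a hash of the input content, so identical tool results never trigger a second LLM call.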
### Namespace Isolation

Each combination of **LLM provider + model + ProxyModel** gets its own cache namespace:

```
~/.mcpctl/cache/openai--gpt-4o--content-pipeline/
~/.mcpctl/cache/anthropic--claude-sonnet-4-20250514--content-pipeline/
~/.mcpctl/cache/vllm--qwen-72b--subindex/
```

Switching LLM providers or models automatically uses a fresh cache — no stale results from a different model.

### CLI Management

```bash
# View cache statistics (per-namespace breakdown)
mcpctl cache stats

# Clear all cache entries
mcpctl cache clear

# Clear a specific namespace
mcpctl cache clear openai--gpt-4o--content-pipeline

# Clear entries older than 7 days
mcpctl cache clear --older-than 7
```

### Size Limits

The cache enforces a configurable maximum size (default: 256MB). When exceeded, the oldest entries are evicted (LRU). Entries older than 30 days are automatically expired. Size can be specified as bytes, human-readable units, or a percentage of the filesystem:

```typescript
new FileCache('ns', { maxSize: '512MB' })   // fixed size
new FileCache('ns', { maxSize: '1.5GB' })   // fractional units
new FileCache('ns', { maxSize: '10%' })     // 10% of partition
```

## Resources

| Resource | What it is | Example |
|----------|-----------|---------|
| **server** | MCP server definition | Docker image + transport + env vars |
| **instance** | Running container (immutable) | Auto-created from server replicas |
| **secret** | Key-value credentials | API tokens, passwords |
| **template** | Reusable server blueprint | Community server configs |
| **project** | Workspace grouping servers | "monitoring", "home-automation" |
| **prompt** | Curated content for Claude | Instructions, docs, guides |
| **promptrequest** | Pending prompt proposal | LLM-submitted, needs approval |
| **rbac** | Access control bindings | Who can do what |
| **serverattachment** | Server-to-project link | Virtual resource for `apply` |

## Commands

```bash
# List resources
mcpctl get servers
mcpctl get instances
```
```bash
mcpctl get projects
mcpctl get prompts --project myproject

# Detailed view
mcpctl describe server grafana
mcpctl describe project monitoring

# Create resources
mcpctl create server <name> [flags]
mcpctl create secret <name> --data KEY=value
mcpctl create project <name> --server <server> [--proxy-model <plugin>]
mcpctl create prompt <name> --project <project> --content "..."

# Modify resources
mcpctl edit server grafana                   # Opens in $EDITOR
mcpctl patch project myproj proxyModel=default
mcpctl apply -f config.yaml                  # Declarative create/update

# Delete resources
mcpctl delete server grafana

# Logs and debugging
mcpctl logs grafana                          # Container logs
mcpctl console monitoring                    # Interactive MCP console
mcpctl console --inspect                     # Traffic inspector
mcpctl console --audit                       # Audit event timeline
mcpctl console --stdin-mcp                   # Claude monitor (MCP tools for Claude)

# Backup and restore
mcpctl backup -o backup.json
mcpctl restore -i backup.json

# Project management
mcpctl --project monitoring get servers      # Project-scoped listing
mcpctl --project monitoring attach-server grafana
mcpctl --project monitoring detach-server grafana
```

## Templates

Templates are reusable server configurations. Create a server from a template without repeating all the config:

```bash
# Register a template
mcpctl create template home-assistant \
  --docker-image "ghcr.io/homeassistant-ai/ha-mcp:latest" \
  --transport SSE \
  --container-port 8086

# Create a server from it
mcpctl create server my-ha \
  --from-template home-assistant \
  --env-from-secret ha-secrets
```

## Gated Sessions

Projects using the `default` or `gate` plugin are **gated**. When Claude connects to a gated project:

1. Claude sees only a `begin_session` tool initially
2. Claude calls `begin_session` with a description of its task
3. mcplocal matches relevant prompts and delivers them
4. The full tool list is revealed

This keeps Claude's context focused — instead of dumping 100+ tools and pages of docs upfront, only the relevant ones are delivered based on the task at hand.
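Conceptually, the gate amounts to filtering `tools/list` on session state. A minimal sketch of the idea (names like `gateToolsList` are illustrative, not mcplocal's API):

```typescript
type Tool = { name: string; description?: string };

// Until begin_session is called, the tool list collapses to the single gate
// tool; afterwards the full upstream list is revealed. Sketch only.
const BEGIN_SESSION: Tool = {
  name: "begin_session",
  description: "Describe your task to unlock this project's tools.",
};

function gateToolsList(upstream: Tool[], sessionStarted: boolean): Tool[] {
  return sessionStarted ? upstream : [BEGIN_SESSION];
}
```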
```bash
# Gated (default)
mcpctl create project monitoring --server grafana --proxy-model default

# Ungated (direct tool access)
mcpctl create project tools --server grafana --proxy-model content-pipeline
```

## Prompts

Prompts are curated content delivered to Claude through the MCP protocol. They can be plain text or linked to external MCP resources (like wiki pages).

```bash
# Create a text prompt
mcpctl create prompt deployment-guide \
  --project monitoring \
  --content-file docs/deployment.md \
  --priority 7

# Create a linked prompt (content fetched live from an MCP resource)
mcpctl create prompt wiki-page \
  --project monitoring \
  --link "monitoring/docmost:docmost://pages/abc123" \
  --priority 5
```

Claude can also **propose** prompts during a session. These appear as prompt requests that you can review and approve:

```bash
mcpctl get promptrequests
mcpctl approve promptrequest proposed-guide
```

## Interactive Console

The console lets you see exactly what Claude sees — tools, resources, prompts — and call tools interactively:

```bash
mcpctl console monitoring
```

The traffic inspector watches MCP traffic from other clients in real time:

```bash
mcpctl console --inspect
```

### Claude Monitor (stdin-mcp)

Connect Claude itself as a monitor via the inspect MCP server:

```bash
mcpctl console --stdin-mcp
```

This exposes MCP tools that let Claude observe and control traffic:

| Tool | Description |
|------|-------------|
| `list_models` | List configured LLM providers and their status |
| `list_stages` | List all available pipeline stages (built-in + custom) |
| `switch_model` | Change the active LLM provider for pipeline stages |
| `get_model_info` | Get details about a specific LLM provider |
| `reload_stages` | Force reload custom stages from disk |
| `pause` | Toggle pause mode (intercept pipeline results) |
| `get_pause_queue` | List items held in the pause queue |
| `release_paused` | Release, edit, or drop a paused item |

## Architecture
```
┌──────────────┐           ┌─────────────────────────────────────────┐
│ Claude Code  │   STDIO   │  mcplocal (proxy)                       │
│              │◄─────────►│                                         │
│ (or any MCP  │           │  Namespace-merging MCP proxy            │
│  client)     │           │  Gated sessions + prompt delivery       │
│              │           │  Per-project endpoints                  │
└──────────────┘           │  Traffic inspection                     │
                           └──────────────┬──────────────────────────┘
                                          │ HTTP (REST + MCP proxy)
                                          │
                           ┌──────────────┴──────────────────────────┐
                           │  mcpd (daemon)                          │
                           │                                         │
                           │  REST API (/api/v1/*)                   │
                           │  MCP proxy (routes tool calls)          │
                           │  PostgreSQL (Prisma ORM)                │
                           │  Docker/Podman container management     │
                           │  Health probes (STDIO, SSE, HTTP)       │
                           │  RBAC enforcement                       │
                           │                                         │
                           │  ┌───────────────────────────────────┐  │
                           │  │  MCP Server Containers            │  │
                           │  │                                   │  │
                           │  │  grafana/  home-assistant/  ...   │  │
                           │  │  (managed + proxied by mcpd)      │  │
                           │  └───────────────────────────────────┘  │
                           └─────────────────────────────────────────┘
```

Clients never connect to MCP server containers directly — all tool calls go through mcplocal → mcpd, which proxies them to the right container via STDIO/SSE/HTTP. This keeps containers unexposed and lets mcpd enforce RBAC and health checks.

**Tool namespacing**: When Claude connects to a project with servers `grafana` and `slack`, it sees tools like `grafana/search_dashboards` and `slack/send_message`. mcplocal routes each call through mcpd to the correct upstream server.
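Splitting a namespaced name back into server and tool is the crux of that routing. A hedged sketch, assuming the first `/` is the separator (which matches the examples above; `splitNamespacedTool` is an invented name):

```typescript
// "grafana/search_dashboards" -> { server: "grafana", tool: "search_dashboards" }.
// Assumes the first "/" separates server from tool; illustrative only.
function splitNamespacedTool(name: string): { server: string; tool: string } {
  const i = name.indexOf("/");
  if (i <= 0 || i === name.length - 1) {
    throw new Error(`not a namespaced tool name: ${name}`);
  }
  return { server: name.slice(0, i), tool: name.slice(i + 1) };
}
```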
## Project Structure

```
mcpctl/
├── src/
│   ├── cli/          # mcpctl command-line interface (Commander.js)
│   ├── mcpd/         # Daemon server (Fastify 5, REST API)
│   ├── mcplocal/     # Local MCP proxy (namespace merging, gating)
│   ├── db/           # Database schema (Prisma) and migrations
│   └── shared/       # Shared types and utilities
├── deploy/           # Docker Compose for local development
├── stack/            # Production deployment (Portainer)
├── scripts/          # Build, release, and deploy scripts
├── examples/         # Example YAML configurations
└── completions/      # Shell completions (fish, bash)
```

## Development

```bash
# Prerequisites: Node.js 20+, pnpm 9+, Docker/Podman

# Install dependencies
pnpm install

# Start local database
pnpm db:up

# Generate Prisma client
cd src/db && npx prisma generate && cd ../..

# Build all packages
pnpm build

# Run tests
pnpm test:run

# Development mode (mcpd with hot-reload)
cd src/mcpd && pnpm dev
```

## License

MIT