# mcpctl — Comprehensive Project Summary

**kubectl for Model Context Protocol servers.** mcpctl is a production-grade management system for MCP servers, providing a Kubernetes-inspired declarative interface for deploying, orchestrating, and observing MCP servers that connect to Claude and other LLM clients.

---

## Table of Contents

1. [System Architecture](#1-system-architecture)
2. [Component Overview](#2-component-overview)
3. [Resource Model & Design Decisions](#3-resource-model--design-decisions)
4. [CLI Reference](#4-cli-reference)
5. [API Surface (mcpd)](#5-api-surface-mcpd)
6. [Database Schema](#6-database-schema)
7. [Local Proxy (mcplocal)](#7-local-proxy-mcplocal)
8. [ProxyModel Plugin System](#8-proxymodel-plugin-system)
9. [Gated Sessions](#9-gated-sessions)
10. [Content Pipeline & Stages](#10-content-pipeline--stages)
11. [LLM Provider Integration](#11-llm-provider-integration)
12. [Caching](#12-caching)
13. [Authentication & RBAC](#13-authentication--rbac)
14. [Audit Infrastructure & Trust Model](#14-audit-infrastructure--trust-model)
15. [Container Orchestration](#15-container-orchestration)
16. [Deployment & Distribution](#16-deployment--distribution)
17. [Testing Strategy](#17-testing-strategy)
18. [Technology Stack](#18-technology-stack)
19. [Project Structure](#19-project-structure)
20. [Deferred & Future Work](#20-deferred--future-work)

---

## 1. System Architecture

### Three-Tier Design

```
Claude Code / LLM Client
        |  (STDIO — MCP JSON-RPC protocol)
        v
mcplocal (Local Daemon — developer machine)
        |  (HTTP REST)
        v
mcpd (Remote Daemon — server/NAS, e.g. 10.0.0.194)
        |  (Docker/Podman API)
        v
MCP Server Containers (isolated network)
```

```
┌─────────────────┐  HTTP   ┌──────────────┐      ┌────────────┐
│   mcpctl CLI    │────────>│     mcpd     │─────>│ PostgreSQL │
│ (Commander.js)  │         │ (Fastify 5)  │      └────────────┘
└─────────────────┘         └──────┬───────┘
                                   │ Docker/Podman API
                                   v
                            ┌──────────────┐
                            │  Containers  │
                            │ (MCP servers)│
                            └──────────────┘

┌─────────────────┐  STDIO  ┌──────────────┐  STDIO/HTTP  ┌────────────┐
│  Claude / LLM   │────────>│   mcplocal   │─────────────>│ MCP Servers│
│                 │         │  (McpRouter) │              │            │
└─────────────────┘         └──────────────┘              └────────────┘
```

### Key Principles

- **mcpd owns the database** (PostgreSQL) — the only component that talks to the DB
- **mcplocal is stateless** — config-only, no database, acts as intelligent proxy
- **mcpctl stores only credentials** — `~/.mcpctl/config.json` and `~/.mcpctl/credentials.json`
- **All MCP servers run inside the mcpd container** on the NAS via podman (container-in-container)

---

## 2. Component Overview

### mcpctl (CLI)

- kubectl-like interface for managing the entire system
- Talks to mcplocal (local daemon) via HTTP REST, or directly to mcpd with `--direct`
- Distributed as RPM/DEB package via Gitea registry
- Built with Commander.js, Ink/React for TUI, Inquirer for prompts

### mcplocal (Local Daemon)

- Runs on developer machine as a systemd user service
- Exposes MCP protocol via STDIO to Claude
- Exposes HTTP REST API for mcpctl management commands
- Core responsibilities:
  - Tool namespacing and routing (`server/tool` format)
  - Gated sessions and prompt delivery
  - Content pipeline (transformation stages)
  - LLM integration for intelligent prompt selection
  - Pipeline result caching
  - Audit event collection

### mcpd (Remote Daemon)

- Server-side daemon on NAS/cloud (Fastify 5)
- Manages MCP server containers (Docker/Podman via dockerode)
- PostgreSQL for state, audit logs, access control
- Owns credentials (never exposed to mcplocal)
- REST API for all management operations
- MCP proxy endpoint for direct tool invocation
- Health probe runner for container monitoring
- Git-based backup system

### @mcpctl/db (Database Layer)

- Prisma ORM with PostgreSQL
- 22 models, 11 migrations
- Template seeding from YAML files at startup

### @mcpctl/shared (Shared Utilities)

- Constants, types, validation schemas (Zod)
- Secret encryption/decryption utilities
- Zero external dependencies beyond Zod

---

## 3. Resource Model & Design Decisions

### ADR-001: Kubernetes-Style Resource Model

| mcpctl Resource | K8s Analogy | Behavior |
|----------------|-------------|----------|
| **Server** | Deployment | Self-contained, complete definition. Contains image, command, transport, env refs, replicas. No external template dependencies at runtime. |
| **Instance** | Pod | Immutable, ephemeral, auto-managed by reconciliation loop. No `create instance` or `edit instance`. Delete triggers re-creation. |
| **Secret** | Secret | Holds sensitive key-value pairs. Servers reference via `env[].valueFrom.secretRef`. |
| **Project** | Namespace | Groups servers, configures ProxyModel and LLM provider. Generates `.mcp.json`. |
| **Prompt** | ConfigMap (sort of) | Instruction text delivered to Claude. Global or project-scoped. Priority-ranked. |
| **Template** | — | Blueprints for server creation. Used at create-time only, not runtime. |
| **RbacDefinition** | ClusterRoleBinding | Named policies with subjects and roleBindings. |

### ADR-002: Profiles Replaced with Secrets

The original `McpProfile` resource tried to be secrets, configmaps, and project-server links simultaneously. Environment variables declared in profiles were never actually passed to running containers.
**Decision:** Replace with dedicated `Secret` resource following Kubernetes conventions:

```yaml
servers:
  - name: ha-mcp
    env:
      - name: HOMEASSISTANT_TOKEN
        valueFrom:
          secretRef:
            name: ha-credentials
            key: HOMEASSISTANT_TOKEN
```

### ADR-003: Self-Contained Servers with Source Tracking

Servers store complete definitions (no runtime template dependencies). Optional `source` metadata enables registry-based upgrades via 3-way diff (old snapshot vs current server vs new template).

**Rationale:** Matches the kubectl mental model. `get server X -o yaml > new.yaml && edit && apply` works naturally. Duplication is minimal (~10 lines YAML).

### ADR-004: ConfigMaps Deferred

Only Secrets implemented. ConfigMap separation can be added later if needed. Keeps the model simple.

### ADR-005: Apply-Compatible YAML Round-Trip

`mcpctl get server ha-mcp -o yaml > s.yaml && mcpctl apply -f s.yaml` must work:

- `get -o yaml/json` strips internal fields (id, createdAt, updatedAt, version, ownerId)
- Output wrapped in resource key: `{ servers: [...] }`
- `describe -o yaml/json` keeps full raw output (for debugging)

### ADR-006: CLI Design Principles

1. Everything possible via `apply -f` MUST also be possible via `create` CLI flags
2. Support `-o yaml` and `-o json` like kubectl
3. `describe` shows visually clean sectioned output with tables
4. Name resolution works everywhere (not just IDs)
5. Instances are immutable (like pods) — no create/edit

---

## 4. CLI Reference

### Global Options

```
--daemon-url     mcplocal daemon URL
--direct         bypass mcplocal, connect directly to mcpd
-p, --project    Target project
-o, --output     table | json | yaml
-v, --version    Show version
```

### Resource Operations

| Command | Description |
|---------|-------------|
| `mcpctl get <resource> [name]` | List resources or fetch by name/ID. Supports glob patterns (`graf*`). |
| `mcpctl describe <resource> <name>` | Detailed view with sections and tables. |
| `mcpctl create <resource> [opts]` | Create resource. Mirrors `apply -f` capabilities. |
| `mcpctl edit <resource> <name>` | Open in `$EDITOR` as YAML, apply on save. |
| `mcpctl patch <resource> <name> key=val...` | Patch individual fields without editor. |
| `mcpctl delete <resource> <name>` | Delete resource. |
| `mcpctl apply -f <file>` | Declarative YAML/JSON application (like `kubectl apply`). Supports `--dry-run`. |

### Supported Resources

servers, projects, instances, secrets, templates, users, groups, rbac, prompts, promptrequests, serverattachments (virtual), proxymodels (virtual, from mcplocal), all (project export)

### Resource Aliases

```
server/srv            → servers
project/proj          → projects
instance/inst         → instances
secret/sec            → secrets
template/tpl          → templates
prompt                → prompts
user                  → users
group                 → groups
rbac/rbac-definition  → rbac
promptrequest/pr      → promptrequests
serverattachment/sa   → serverattachments
proxymodel/pm         → proxymodels
```

### Lifecycle & Diagnostics

| Command | Description |
|---------|-------------|
| `mcpctl status` | Show connectivity, auth status, LLM provider health, available models. |
| `mcpctl login` | Authenticate with mcpd (first login bootstraps initial user). |
| `mcpctl logout` | Clear stored credentials. |
| `mcpctl logs <server> [-t N] [-i index]` | Stream container logs. Resolves server name → running instance. |
| `mcpctl cache stats` | Show pipeline cache statistics per namespace. |
| `mcpctl cache clear [ns] [--older-than N]` | Clear pipeline cache. |
| `mcpctl backup` | Show git backup status, public SSH key. |
| `mcpctl backup log [-n N]` | Show backup commit history. |
| `mcpctl backup restore list/diff/to` | Restore to specific backup commit. |

### Console & Inspection

| Command | Description |
|---------|-------------|
| `mcpctl console [project]` | Interactive TUI — request/response timeline, tool inspection. |
| `mcpctl console --stdin-mcp` | MCP server mode over stdin/stdout (for Claude integration). |
| `mcpctl console --audit` | Browse audit events from mcpd interactively. |

### Configuration

| Command | Description |
|---------|-------------|
| `mcpctl config view` | Show current configuration. |
| `mcpctl config set <key> <value>` | Set config value (mcplocalUrl, mcpdUrl, registries, outputFormat, etc.). |
| `mcpctl config path` | Show config file path. |
| `mcpctl config setup` | Interactive configuration wizard. |
| `mcpctl config claude -p <project>` | Generate `.mcp.json` for Claude Code. |

### Create Subcommands

```bash
mcpctl create server <name> [--package-name X] [--docker-image X] [--transport STDIO|SSE|STREAMABLE_HTTP] [--runtime node|python] [--replicas N] [--env KEY=val] [--from-template name:version]
mcpctl create secret <name> [--data key=val ...] [--data-file path.json]
mcpctl create project <name> [-d desc] [--proxy-model default|gate|content-pipeline] [--server name ...]
mcpctl create user <email> [--password pass] [--name name]
mcpctl create group <name> [-d desc] [--member email ...]
mcpctl create rbac <name> [--subject kind:name] [--role-binding role:resource[:name]]
mcpctl create prompt <name> [--content text] [--project name] [--priority 1-10] [--link url]
```

### Apply File Format

```yaml
secrets:
  - name: my-secret
    data:
      KEY: value

servers:
  - name: my-server
    transport: STDIO
    packageName: "@modelcontextprotocol/server-example"
    env:
      - name: API_KEY
        valueFrom:
          secretRef:
            name: my-secret
            key: KEY

projects:
  - name: my-project
    proxyModel: default
    servers:
      - my-server

serverattachments:
  - server: my-server
    project: my-project

prompts:
  - name: my-prompt
    project: my-project
    content: "Instruction text..."
    priority: 5
```

---

## 5. API Surface (mcpd)

All endpoints under `/api/v1/` require Bearer token auth except `/auth/*` and `/health*`.
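As a minimal sketch of what an authenticated call against this API looks like, the helper below builds the `/api/v1` URL and the Bearer header described above. The helper names (`apiUrl`, `authHeaders`, `listServers`) are illustrative, not from the codebase; only the path prefix, the `/servers` endpoint, and the auth scheme come from this document.

```typescript
// Illustrative sketch only: the /api/v1 prefix and Bearer scheme are from the
// endpoint tables; these helper names are invented for the example.
const API_PREFIX = "/api/v1";

// Build the absolute URL for an mcpd endpoint, e.g. apiUrl(base, "/servers").
export function apiUrl(baseUrl: string, endpoint: string): string {
  return baseUrl.replace(/\/+$/, "") + API_PREFIX + endpoint;
}

// Headers every authenticated call carries (everything except /auth/* and /health*).
export function authHeaders(token: string): Record<string, string> {
  return {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  };
}

// Hypothetical usage: list servers (requires a running mcpd to actually call).
export async function listServers(baseUrl: string, token: string): Promise<unknown> {
  const res = await fetch(apiUrl(baseUrl, "/servers"), { headers: authHeaders(token) });
  if (!res.ok) throw new Error(`mcpd returned ${res.status}`);
  return res.json();
}
```

In practice mcpctl and mcplocal perform these calls for you; direct access is mainly useful for scripting against mcpd.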
### Authentication

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/auth/bootstrap` | POST | First-user setup (creates admin + bootstrap RBAC) |
| `/auth/status` | GET | `{hasUsers: boolean}` (unauthenticated) |
| `/auth/login` | POST | Returns token + user info |
| `/auth/logout` | POST | Invalidate session |
| `/auth/me` | GET | Current user identity |
| `/auth/impersonate` | POST | Create session for another user (requires `run:impersonate`) |

### Servers

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/servers` | GET | List all servers |
| `/servers/:id` | GET | Get server by CUID |
| `/servers` | POST | Create server (validates name uniqueness, image/package) |
| `/servers/:id` | PUT | Update server, re-reconciles replicas |
| `/servers/:id` | DELETE | Delete server + cascade-delete all instances |

### Instances

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/instances` | GET | List (optional `?serverId=` filter) |
| `/instances/:id` | GET | Get instance |
| `/instances/:id` | DELETE | Delete instance, triggers reconciliation |
| `/instances/:id/inspect` | GET | Docker inspect output (state, port, IP) |
| `/instances/:id/logs` | GET | Container logs (`?tail=N`) |

### Projects

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/projects` | GET | List (RBAC-filtered) |
| `/projects/:id` | GET/POST/PUT/DELETE | CRUD by CUID or name |
| `/projects/:id/mcp-config` | GET | Generate `.mcp.json` |
| `/projects/:id/instructions` | GET | Get prompt + attached servers for system message |
| `/projects/:id/servers` | GET/POST | List/attach servers |
| `/projects/:id/servers/:name` | DELETE | Detach server |

### Prompts & Prompt Requests

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/prompts` | GET/POST | List/create approved prompts |
| `/prompts/:id` | PUT/DELETE | Update/delete (system prompts reset to default) |
| `/prompts/:id/regenerate-summary` | POST | Force re-generate summary/chapters |
| `/promptrequests` | GET/POST | List/create pending requests |
| `/promptrequests/:id/approve` | POST | Atomic delete request → create prompt |
| `/projects/:name/prompts/visible` | GET | Approved + session's pending |
| `/projects/:name/prompt-index` | GET | Compact index for gating |

### Secrets, Users, Groups, RBAC

Standard CRUD on `/secrets`, `/users`, `/groups`, `/rbac-definitions`.

### Health & Monitoring

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health/overview` | GET | System health, instance counts, error rate |
| `/health/instances/:id` | GET | Instance-specific health, uptime, latency |
| `/metrics` | GET | Request counts, error counts, last request time |
| `/healthz` | GET | Liveness probe |

### Backup, Restore, Audit

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/backup` | POST | Create encrypted bundle (servers, secrets, projects, users, groups, rbac) |
| `/restore` | POST | Restore bundle (merge/skip/overwrite strategy) |
| `/audit/events` | POST/GET | Batch insert from mcplocal / query with filters |
| `/audit/sessions` | GET | Session aggregates (first/last seen, event counts) |
| `/git/backup/init` | POST | Initialize git backup with SSH credentials |
| `/git/backup/status` | GET | Backup sync status |
| `/git/backup/sync` | POST | Manually trigger sync |

### MCP Proxy

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/mcp/proxy` | POST | Forward JSON-RPC to running MCP server instance. Dispatches by transport (STDIO via docker exec, SSE/HTTP via direct HTTP). Maintains persistent STDIO connections. |

---

## 6. Database Schema

PostgreSQL via Prisma ORM. 22 models across 11 migrations.
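As a hedged TypeScript sketch of the two central MCP infrastructure models, the shapes below are inferred from the model summaries in this section; the real Prisma schema may differ in field names and nullability, and `runningCount` is an invented helper, not project code.

```typescript
// Assumed field shapes, inferred from the model descriptions below.
export type Transport = "STDIO" | "SSE" | "STREAMABLE_HTTP";
export type InstanceStatus = "STARTING" | "RUNNING" | "STOPPING" | "STOPPED" | "ERROR";

export interface McpServer {
  id: string;                    // CUID
  name: string;
  transport: Transport;
  dockerImage?: string;
  packageName?: string;
  runtime?: "node" | "python";
  replicas: number;              // desired count for the reconciliation loop
  env: unknown;                  // env vars stored as JSON
}

export interface McpInstance {
  id: string;
  serverId: string;
  status: InstanceStatus;
  containerId?: string;
  port?: number;
  healthStatus?: string;
}

// Illustrative helper (not from the codebase): the count a reconciliation
// loop would compare against McpServer.replicas.
export function runningCount(instances: McpInstance[]): number {
  return instances.filter((i) => i.status === "RUNNING").length;
}
```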
### Core Models

**User & Auth:**

- `User` — email/password (bcrypt), role (USER/ADMIN), optional OAuth
- `Session` — Bearer token with 30-day TTL
- `Group` / `GroupMember` — user groups for RBAC

**MCP Infrastructure:**

- `McpServer` — transport (STDIO/SSE/STREAMABLE_HTTP), docker image, package name, runtime (node/python), env vars (JSON), health check config, replicas, external URL
- `McpTemplate` — reusable blueprints for server creation (mirrors McpServer fields)
- `McpInstance` — running containers, status (STARTING/RUNNING/STOPPING/STOPPED/ERROR), container ID, port, health status, events

**Organization:**

- `Project` — LLM config (provider, model), proxy model, gated flag, prompt instructions, server overrides
- `ProjectServer` — junction table linking projects to servers
- `Secret` — named secret bundles (data as encrypted JSON), versioned

**Content:**

- `Prompt` — approved system prompts (global or project-scoped), priority, summary/chapters, optional link target
- `PromptRequest` — pending prompt proposals from LLM sessions

**Audit & Backup:**

- `AuditLog` — user action trail (action, resource, resourceId, details)
- `AuditEvent` — pipeline/gate/tool trace events from mcplocal (sessionId, projectName, eventKind, correlationId, userName)
- `BackupPending` — queue for git-based backup sync
- `RbacDefinition` — named RBAC policies

---

## 7. Local Proxy (mcplocal)

### Request Flow

```
Claude (STDIO JSON-RPC)
  ↓
StdioProxyServer (reads from stdin)
  ↓
McpRouter.route(request)
  ├→ PluginSessionContext (per-session state)
  ├→ ProxyModelPlugin hooks (intercept/transform)
  ├→ Upstream lookup (tool name prefix → server)
  └→ Response (with optional drill-down sections)
```

### Router Responsibilities

- Manages upstream connections (STDIO child processes, HTTP)
- Maps tools/resources/prompts to servers via name prefix (`servername/toolname`)
- Maintains prompt index + system prompt cache (TTL-based)
- Dispatches plugin hooks via `getOrCreatePluginContext`
- Section storage for drill-down navigation
- Audit event collection and batching
- Link resolution (relative → absolute URLs)

### Upstream Transports

| Transport | Implementation |
|-----------|---------------|
| STDIO | Spawns child process, bidirectional pipe, JSON-RPC over newline-delimited JSON |
| SSE | HTTP GET for event stream, POST for messages |
| Streamable HTTP | HTTP POST with JSON-RPC payloads |

### HTTP Endpoints (mcplocal)

| Endpoint | Description |
|----------|-------------|
| `GET /proxymodels` | List all models (YAML pipelines + TS plugins) |
| `GET /proxymodels/:name` | Single model details |
| `GET /proxymodels/stages` | List available stages |
| `POST /proxymodels/reload` | Force reload stages from disk |
| `GET /cache/stats` | Per-namespace cache statistics |
| `DELETE /cache` | Clear all or by age |
| `DELETE /cache/:namespace` | Clear specific namespace |
| `POST /mcp` | JSON-RPC request forwarding |

---

## 8. ProxyModel Plugin System

A **ProxyModel** is either a **Pipeline** (YAML) or a **Plugin** (TypeScript).
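The hook set can be sketched as a TypeScript interface. The hook names come from this section; the parameter and return types, and the `gateSketch` example plugin, are assumptions for illustration, not the real plugin API.

```typescript
// Illustrative shape only: hook names match the plugin system's hook table,
// but every signature here is an assumption.
type JsonRpcRequest = { method: string; params?: unknown };
type JsonRpcResult = unknown;

export interface ProxyModelPlugin {
  name: string;
  extends?: string[]; // parent plugins; conflicting hooks error at load time

  // Lifecycle hooks (chainable across parents, run sequentially)
  onSessionCreate?(sessionId: string): Promise<void>;
  onSessionDestroy?(sessionId: string): Promise<void>;

  // Interception hooks: return a value to replace/filter, undefined to pass through
  onInitialize?(req: JsonRpcRequest): Promise<JsonRpcResult | undefined>;
  onToolsList?(tools: unknown[]): Promise<unknown[]>;
  onToolCallBefore?(req: JsonRpcRequest): Promise<JsonRpcResult | undefined>;
  onToolCallAfter?(result: JsonRpcResult): Promise<JsonRpcResult>;
  onResourcesList?(resources: unknown[]): Promise<unknown[]>;
  onResourceRead?(req: JsonRpcRequest): Promise<JsonRpcResult | undefined>;
  onPromptsList?(prompts: unknown[]): Promise<unknown[]>;
  onPromptGet?(req: JsonRpcRequest): Promise<JsonRpcResult | undefined>;
}

// Hypothetical example: a gate-style plugin that hides everything except
// begin_session from tools/list while a session is gated.
export const gateSketch: ProxyModelPlugin = {
  name: "gate-sketch",
  async onToolsList(tools) {
    return tools.filter((t) => (t as { name: string }).name === "begin_session");
  },
};
```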
### Plugin Interface

| Hook | When it fires |
|------|--------------|
| `onSessionCreate` | New MCP session established |
| `onSessionDestroy` | Session ends |
| `onInitialize` | MCP initialize request — can inject instructions |
| `onToolsList` | tools/list — can filter/modify tool list |
| `onToolCallBefore` | Before forwarding a tool call — can intercept |
| `onToolCallAfter` | After receiving tool result — can transform |
| `onResourcesList` | resources/list — can filter resources |
| `onResourceRead` | resources/read — can intercept reads |
| `onPromptsList` | prompts/list — can filter prompts |
| `onPromptGet` | prompts/get — can intercept reads |

### Built-in Plugins

| Plugin | Extends | Gating | Content Pipeline | Use Case |
|--------|---------|:------:|:----------------:|----------|
| `gate` | — | Yes | No | Gating + prompt delivery only |
| `content-pipeline` | — | No | Yes | Content transformation only |
| `default` | gate + content-pipeline | Yes | Yes | Full pipeline (most common) |

**Inheritance:** Plugins can extend parents. Conflicting hooks from multiple parents cause load-time errors (except chainable lifecycle hooks, which run sequentially).

### Pipeline Configuration (YAML)

```yaml
name: default
spec:
  controller: gate
  controllerConfig: { byteBudget: 8192 }
  stages:
    - type: passthrough
    - type: paginate
      config: { pageSize: 8000 }
      appliesTo: [prompt, toolResult]
      cacheable: true
```

### Per-Session Context

Each session gets a `PluginSessionContext` providing:

- Session state (`Map`)
- LLM provider, cache provider, structured logger
- Virtual tool/server registration
- Upstream routing and tool discovery
- Content processing and notifications
- Audit event emission

---

## 9. Gated Sessions

### Problem

When Claude connects to an MCP server, it sees all tools immediately and starts using them. In a managed environment, you want to deliver relevant context (prompts/instructions) before granting tool access.

### Solution: Keyword-Driven Prompt Retrieval

1. **Initialize:** Instructions include prompt index + "call `begin_session` immediately"
2. **Gated `tools/list`:** Only `begin_session` visible
3. **Claude calls `begin_session`** with keywords describing the task
4. **Prompt matching:** Keywords matched against prompt summaries/chapters
5. **Ungating:** Matched prompts returned + `tools/list_changed` notification sent
6. **Full access:** All upstream tools now visible

### Prompt Scoring

**Formula:** `priority + (matchCount * priority)`

- Priority alone is the baseline — ensures global prompts compete for inclusion
- Tag matches multiply priority — relevant prompts score higher
- Priority 10 = always included (bypasses budget)
- 8KB byte budget cap; overflow prompts listed as index-only

### Critical Design Lessons

**What works:**

- One gate tool (`begin_session`), zero ambiguity
- Instructions say "check its input schema" (not naming specific parameters)
- "immediately" and "required" prevent Claude from exploring first
- Tool names listed as a preview in instructions (helps keyword generation)
- `tools/list_changed` notification mandatory after ungating
- Auto-ungate fallback if Claude bypasses the gate

**What fails:**

- Naming parameters that don't match the schema
- Complex conditional instructions (Claude prefers simple paths)
- Multiple tools in gated state (Claude skips the gate)
- Gate instructions only in the tool description (must be in the initialize response)
- Burying the call-to-action after 200 lines of context

### Complete Flow

```
Client                    mcplocal                       upstream
  │── initialize ────────>│
  │<── instructions ──────│  (gate instructions + prompt index + tool preview)
  │── tools/list ────────>│
  │<── [begin_session] ───│  (ONLY begin_session visible)
  │── tools/call ────────>│
  │   begin_session       │── match prompts ─────────>│
  │   {tags:[...]}        │<── prompt content ────────│
  │<── matched prompts ───│  (full content + encouragement)
  │<── notification ──────│  (tools/list_changed)
  │── tools/list ────────>│
  │<── [108+ tools] ──────│  (ALL tools now visible)
  │                       │
  │  Claude proceeds with full tool access
```

---

## 10. Content Pipeline & Stages

### How It Works

Tool results pass through an ordered sequence of stages before reaching Claude:

1. Each stage receives the previous stage's content
2. Returns `{content, sections?, metadata?}`
3. Sections enable drill-down navigation
4. Stage errors are caught — the pipeline continues with the previous content

### Built-in Stages

| Stage | Purpose |
|-------|---------|
| `passthrough` | Identity transform (testing/baseline) |
| `paginate` | Split large content into numbered pages (8KB default). LLM-generated page titles (cached). |
| `section-split` | Split by structure: JSON arrays/objects → elements/keys, YAML → keys, prose → `##` headers, code → function/class boundaries. Merges tiny sections, re-splits oversized ones. |
| `summarize-tree` | Hierarchical LLM-generated section summaries. Groups sections into trees. Cached. |

### Section Drill-Down

After the pipeline produces sections:

1. Full content is replaced with a compact table of contents + `_resultId`
2. Sections stored in a session-scoped store (5-minute TTL)
3. Client calls the same tool with `_resultId` + `_section` to retrieve a specific section
4. Supports hierarchical navigation (sections within sections)

### Custom Stages

Drop `.js` files in `~/.mcpctl/stages/`:

```javascript
export default async function myStage(input, context) {
  // context.llm, context.cache, context.log available
  return { content: transformedContent, sections: [...] };
}
```

Hot-reload with 300ms file watch debounce. Built-in stages take precedence.

---

## 11. LLM Provider Integration

### Supported Providers

| Provider | Type | Tier |
|----------|------|------|
| Gemini CLI | Local | Fast |
| Ollama | Local | Fast |
| DeepSeek | API | Fast/Heavy |
| OpenAI | API | Heavy |
| Anthropic | API | Heavy |
| vLLM | Local | Configurable |
| vLLM Managed | Auto-managed local | Configurable |

### Tier System

- **Fast tier:** Quick, cheap models for pipeline stages and keyword extraction
- **Heavy tier:** Full models for complex prompt selection and summarization
- **Legacy active:** Single default provider (fallback)

### LLM Adapter

Stages use a simple interface:

```typescript
interface LLMProvider {
  complete(prompt: string, options?: unknown): Promise<string>;
  available(): boolean;
}
```

Resolution order: named provider → fast tier → heavy tier → active provider.

### Multi-Provider Configuration

```json
{
  "llm": {
    "providers": [
      { "name": "fast-local", "type": "ollama", "model": "llama3", "tier": "fast" },
      { "name": "heavy-api", "type": "openai", "model": "gpt-4", "tier": "heavy" }
    ]
  }
}
```

---

## 12. Caching

### Architecture: L1 Memory + L2 Disk

- **L1 in-memory:** LRU map (default 500 entries) for fast lookups
- **L2 disk:** `~/.mcpctl/cache/<namespace>/<key>.dat`
- Namespace: `provider--model--proxymodel` (e.g., `openai--gpt-4o--content-pipeline`)
- Key: 16-char hex SHA256 prefix of content
- Value: raw content (no JSON wrapper)

### Configuration

| Option | Default | Examples |
|--------|---------|---------|
| `maxSize` | 256MB | `"1GB"`, `"10%"` (of partition), `536870912` (bytes) |
| `ttlMs` | 30 days | Any millisecond value |
| `maxMemoryEntries` | 500 | L1 LRU cap |
| `dir` | `~/.mcpctl/cache` | Custom path |

### Management

```bash
mcpctl cache stats                         # Per-namespace breakdown
mcpctl cache clear                         # Clear everything
mcpctl cache clear openai--gpt-4--default  # Clear specific namespace
mcpctl cache clear --older-than 7          # Clear entries older than 7 days
```

### HTTP API

```
GET    /cache/stats        # Per-namespace stats
DELETE /cache              # Clear all (or ?olderThan=N)
DELETE /cache/:namespace   # Clear specific namespace
```

---

## 13. Authentication & RBAC

### Auth Flow

1. `mcpctl login` → prompts for email/password
2. First login (no users in system) → bootstrap: creates admin user + admin group + bootstrap RBAC
3. POST `/auth/login` → returns 30-day bearer token
4. Token stored in `~/.mcpctl/credentials.json`
5. CLI passes token to mcplocal config
6. mcplocal attaches `Authorization: Bearer <token>` to all mcpd requests
7. mcpd validates the token against the Session table

### RBAC Model

**Subjects:** User (by email), Group, ServiceAccount

**Roles & Capabilities:**

| Role | Grants |
|------|--------|
| `edit` | view, create, delete, edit, expose |
| `view` | view |
| `create` | create |
| `delete` | delete |
| `run` | run (for operations) |
| `expose` | expose, view |

**Resources:** `*`, servers, instances, secrets, projects, templates, users, groups, rbac, prompts, promptrequests

**Binding Types:**

- **Resource binding:** `{role: 'edit', resource: 'servers', name?: 'my-server'}`
  - With `name`: user can only access that specific resource
  - Without `name`: user can access all resources of that type
- **Operation binding:** `{role: 'run', action: 'impersonate'}`
  - Grants permission for named operations (backup, restore, audit-purge, logs, impersonate)

### Resolution

1. CLI resolves name → CUID client-side before API calls
2. RBAC hook resolves CUID → name before checking bindings
3. List filtering: `getAllowedScope()` computes allowed names, `preSerialization` hook filters arrays
4. Wildcard scope: user has an unscoped binding → sees all resources
5. Named scope: user has only name-scoped bindings → filtered to allowed names

---

## 14. Audit Infrastructure & Trust Model

### Event Kinds

| Event Kind | Description |
|------------|-------------|
| `pipeline_execution` | Full pipeline run summary (duration, stage count, sizes) |
| `stage_execution` | Individual stage detail (duration, input/output size, error) |
| `gate_decision` | Gate open/close with client intent and matched prompts |
| `prompt_delivery` | Which prompts were sent, match scores |
| `tool_call_trace` | Tool call with server + timing + result size |
| `rbac_decision` | Access control decisions |
| `session_bind` | Session initialization |

### Trust Model

| Source | Verified | Meaning |
|--------|----------|---------|
| `client` | false | Client LLM claims (begin_session intent, tags) |
| `mcplocal` | true | Server-side data (prompt matches, pipeline transforms) |
| `mcpd` | true | mcpd-originated events |

### AuditCollector

Fire-and-forget batching: 50 events max, 5-second flush interval. POSTs to mcpd. Non-blocking — audit failures don't affect tool calls.

### Correlation & Causality

- `correlationId` links related events (all events from one tool call)
- `parentEventId` enables causal chains (gate_decision → pipeline_execution)
- `userName` tracks which user triggered the event
- Designed for future graphiti knowledge graph ingestion

### Per-Server Targeting

Different servers in a project can have different proxymodel configs via `serverOverrides` on the project resource. Resolution: server override → project default → null.

### Future (Designed, Not Implemented)

- **Virtual MCP Audit Server:** mcpd-hosted virtual server providing `query_audit_log`, `get_session_timeline` tools. Claude can directly query audit data.
- **Graphiti Integration:** Causal graph with entity types (Session, Tool, Server, ProxyModel, Prompt, Stage) and edges (`triggered_by`, `transformed_by`, `verified_by`).
- **Lab Parameter Simulation:** Select any pipeline event, retrieve original input, re-run with different proxyModel/LLM/stages, side-by-side diff.
- **Audit Level Config:** Per-server `auditLevel: 'full' | 'hash-only' | 'disabled'`. --- ## 15. Container Orchestration ### Orchestrator Interface `McpOrchestrator` abstracts container management (Docker/Podman today, Kubernetes in the future): - `pullImage`, `createContainer`, `stopContainer`, `removeContainer` - `inspectContainer`, `getContainerLogs`, `execInContainer`, `ping` ### Container Management - Labels: `mcpctl.managed=true` for filtering - Network: `mcp-servers` (configurable via `MCPD_MCP_NETWORK`) - Resource limits: 512MB RAM, 0.5 CPU (configurable) - Internal container IP exposed via inspect ### Runtime Spawn Commands | Runtime | Command | |---------|---------| | Node | `npx --prefer-offline -y ` | | Python | `uvx ` | | Custom | Explicit `command` field | ### Health Probes Periodic MCP tool-call probes (like K8s livenessProbe): - Default interval: 15 seconds - Dispatch by transport: STDIO (docker exec), HTTP (JSON-RPC) - Failure threshold: 3 consecutive failures → unhealthy - Updates instance `healthStatus` and `lastHealthCheck` ### Reconciliation Loop Maintains desired replica count: - If running < desired → start new instances - If running > desired → stop excess instances - Detects crashed containers → marks ERROR → triggers re-creation ### Persistent STDIO Connections For STDIO transport, mcpd maintains long-lived exec sessions (`PersistentStdioClient`) to avoid repeated `docker exec` overhead. Bidirectional streaming for interactive sessions. --- ## 16. Deployment & Distribution ### Production Deployment **mcpd runs on 10.0.0.194** (NAS, managed via Portainer), NOT on the dev machine. ```bash # Full deploy (preferred after merging) bash fulldeploy.sh ``` `fulldeploy.sh` runs three steps: 1. `scripts/build-mcpd.sh` — build + push Docker image to `mysources.co.uk/michal/mcpctl-mcpd` 2. `deploy.sh` — deploy stack to production via Portainer API at `http://10.0.0.194:9000` 3. 
`scripts/release.sh` — build RPM + publish to Gitea + install locally + smoke tests ### Docker Images | Image | Purpose | |-------|---------| | `mcpctl-mcpd` | Multi-stage build: Node 20 Alpine, includes git/ssh, Prisma | | `mcpctl-node-runner` | Node 20 slim, runs `npx -y` for npm packages | | `mcpctl-python-runner` | Python 3.12 slim, uses `uv` for Python packages | All pushed to `mysources.co.uk/michal/` registry. ### Stack Services (Production) - `postgres` — PostgreSQL 16 (port 5432) - `mcpd` — Daemon (port 3100) - `node-runner`, `python-runner` — Base images - Networks: `mcpctl` (management), `mcp-servers` (container communication) ### RPM/DEB Distribution ```bash source .env && bash scripts/release.sh ``` Installs via nfpm: - `/usr/bin/mcpctl` — CLI binary (bun compiled) - `/usr/bin/mcpctl-local` — Local proxy binary (bun compiled) - `/usr/share/fish/vendor_completions.d/mcpctl.fish` — Fish completions - `/usr/share/bash-completion/completions/mcpctl` — Bash completions - `/usr/lib/systemd/user/mcplocal.service` — Systemd user service User install: ```bash dnf config-manager --add-repo https://mysources.co.uk/api/packages/michal/rpm.repo dnf install mcpctl ``` ### Git & PR Workflow - Gitea at `http://10.0.0.194:3012` (internal) / `https://mysources.co.uk/michal/mcpctl` (public) - `pr.sh` in project root creates PRs via Gitea API - `gh` CLI not installed — use `pr.sh` or direct API calls --- ## 17. 
Testing Strategy

### Test Tiers

| Tier | Tool | Scope | When |
|------|------|-------|------|
| Unit tests | Vitest | Package-level, mocked dependencies | `pnpm test:run` |
| DB tests | Vitest | Full Prisma + test PostgreSQL | `pnpm --filter db exec vitest run` (separate) |
| Smoke tests | Vitest | Live mcplocal + mcpd (not mocked) | `pnpm test:smoke` (post-deploy) |

### Convention

- Every new feature MUST include smoke tests
- Smoke tests live in `src/mcplocal/tests/smoke/`
- Use `SmokeMcpSession` from `tests/smoke/mcp-client.ts` for MCP protocol interactions
- Smoke tests run automatically in the build/deploy pipeline

### Critical Rules

- **NEVER pipe pnpm test output** to `tail`, `grep`, or `head` — pnpm hangs when it detects a non-TTY
- Always capture full output with `2>&1` and read it directly
- DB tests are excluded from the workspace-root vitest run (they need a test database)
- Tests are integrated into the pipeline: `build-rpm.sh` runs unit tests; `release.sh` runs smoke tests

---

## 18. Technology Stack

| Layer | Technology |
|-------|-----------|
| CLI Framework | Commander.js, Ink/React (TUI), Inquirer |
| API Server | Fastify 5, TypeScript strict mode |
| Database | PostgreSQL 16, Prisma ORM v6 |
| Container Runtime | Docker/Podman via dockerode |
| MCP Protocol | @modelcontextprotocol/sdk |
| Validation | Zod schemas everywhere |
| LLM Providers | OpenAI, Anthropic, Google Gemini, Ollama, DeepSeek, Groq, Mistral, OpenRouter, Azure |
| Testing | Vitest, coverage via v8 |
| Build | TypeScript project references, pnpm workspaces |
| Compilation | Bun (binary compilation for RPM) |
| Packaging | nfpm (RPM/DEB), Docker multi-stage |
| CI/CD | fulldeploy.sh → Portainer API + Gitea packages |
| Shell Completions | Fish + Bash (auto-generated via `scripts/generate-completions.ts`) |

### Design Patterns

1. **Monorepo** — pnpm workspaces with shared base TypeScript config
2. **Layered architecture** — Routes → Services → Repositories (Prisma)
3.
   **Interface-based repositories** — all data access through interfaces for testability
4. **Dependency injection** — services receive dependencies via constructor
5. **Zod validation** — all input validated at the API boundary
6. **Plugin inheritance** — composable ProxyModel plugins with conflict detection
7. **Content-addressed caching** — SHA256 hash keys for deduplication
8. **TTL-based stores** — prompt index (60s), system prompts (5min), sections (5min)
9. **Fire-and-forget audit** — non-blocking event collection
10. **Declarative config** — kubectl-style YAML/JSON for all resource management

---

## 19. Project Structure

```
mcpctl/
├── src/
│   ├── cli/                 @mcpctl/cli       CLI (Commander.js)
│   │   ├── src/commands/      22 command handlers
│   │   ├── src/registry/      MCP server registry client
│   │   ├── src/formatters/    Output formatting (table/json/yaml)
│   │   └── src/auth/          Credential storage
│   ├── mcpd/                @mcpctl/mcpd      Daemon (Fastify 5)
│   │   ├── src/routes/        18 route handlers
│   │   ├── src/services/      13 services
│   │   ├── src/repositories/  Data access layer
│   │   ├── src/middleware/    Auth, logging, error handling
│   │   └── src/validation/    Zod schemas, RBAC rules
│   ├── mcplocal/            @mcpctl/mcplocal  Local proxy
│   │   ├── src/gate/          Session gating + tag matching
│   │   ├── src/proxymodel/    Plugin system + stages + cache
│   │   ├── src/providers/     6 LLM providers
│   │   ├── src/upstream/      STDIO + HTTP upstream connections
│   │   ├── src/audit/         Event collection + batching
│   │   └── src/health/        Health monitoring
│   ├── db/                  @mcpctl/db        Database (Prisma)
│   │   ├── prisma/schema.prisma   22 models
│   │   └── prisma/migrations/     11 migrations
│   └── shared/              @mcpctl/shared    Constants, types, validation
├── deploy/              Dockerfiles + entrypoint
├── stack/               Production docker-compose + env
├── scripts/             Build, release, deploy scripts
├── completions/         Fish + Bash completions
├── templates/           MCP server YAML templates
├── docs/                Architecture + design docs
├── fulldeploy.sh        Full build → deploy → release
├── deploy.sh            Portainer stack deploy
├── pr.sh                Gitea PR creation
├── nfpm.yaml            RPM/DEB package metadata
├── vitest.config.ts     Root test config
├── vitest.workspace.ts  Workspace test config
├── tsconfig.base.json   Base TypeScript config (strict)
└── pnpm-workspace.yaml  Monorepo workspace definition
```

---

## 20. Deferred & Future Work

### Deferred Tasks

| ID | Description | Status |
|----|-------------|--------|
| 88 | Rename proxyMode: filtered → proxy | Deferred |
| 105-109 | Model Studio TUI | Deferred |
| 110 | RBAC for ProxyModels | Deferred |
| 113 | Model Studio docs | Deferred |

### Future Architecture

- **Virtual MCP Audit Server** — Claude-queryable audit tools
- **Graphiti Knowledge Graph** — causal graph from audit events
- **Lab Parameter Simulation** — re-run pipelines with different configs
- **Kubernetes Orchestrator** — beyond Docker/Podman
- **ConfigMaps** — non-sensitive config separate from Secrets
- **Multi-provider failover** — automatic LLM provider cascading

### Completed Major Features

- Project structure + monorepo setup
- MCP Registry Client (official, glama, smithery — 53 tests)
- Health Probe Runner (STDIO, SSE, Streamable HTTP — 12 tests)
- Container orchestration with reconciliation
- Full RBAC with name-scoped bindings
- Gated sessions with prompt scoring
- ProxyModel plugin system with inheritance
- Content pipeline with 4 built-in stages
- Pipeline cache (L1 memory + L2 disk)
- Audit infrastructure with trust model
- Git-based backup and restore
- Shell completions (Fish + Bash)
- RPM/DEB packaging and distribution
- Smoke test framework
- Console inspector for debugging