# mcpctl — Comprehensive Project Summary
**kubectl for Model Context Protocol servers.**
mcpctl is a production-grade management system for MCP servers, providing a Kubernetes-inspired declarative interface for deploying, orchestrating, and observing MCP servers that connect to Claude and other LLM clients.
---
## Table of Contents
1. [System Architecture](#1-system-architecture)
2. [Component Overview](#2-component-overview)
3. [Resource Model & Design Decisions](#3-resource-model--design-decisions)
4. [CLI Reference](#4-cli-reference)
5. [API Surface (mcpd)](#5-api-surface-mcpd)
6. [Database Schema](#6-database-schema)
7. [Local Proxy (mcplocal)](#7-local-proxy-mcplocal)
8. [ProxyModel Plugin System](#8-proxymodel-plugin-system)
9. [Gated Sessions](#9-gated-sessions)
10. [Content Pipeline & Stages](#10-content-pipeline--stages)
11. [LLM Provider Integration](#11-llm-provider-integration)
12. [Caching](#12-caching)
13. [Authentication & RBAC](#13-authentication--rbac)
14. [Audit Infrastructure & Trust Model](#14-audit-infrastructure--trust-model)
15. [Container Orchestration](#15-container-orchestration)
16. [Deployment & Distribution](#16-deployment--distribution)
17. [Testing Strategy](#17-testing-strategy)
18. [Technology Stack](#18-technology-stack)
19. [Project Structure](#19-project-structure)
20. [Deferred & Future Work](#20-deferred--future-work)
---
## 1. System Architecture
### Three-Tier Design
```
Claude Code / LLM Client
| (STDIO — MCP JSON-RPC protocol)
v
mcplocal (Local Daemon — developer machine)
| (HTTP REST)
v
mcpd (Remote Daemon — server/NAS, e.g. 10.0.0.194)
| (Docker/Podman API)
v
MCP Server Containers (isolated network)
```
```
┌────────────┐
┌─────────────────┐ HTTP ┌──────────────┐ │ PostgreSQL │
│ mcpctl CLI │──────────────>│ mcpd │──>│ │
│ (Commander.js) │ │ (Fastify 5) │ └────────────┘
└─────────────────┘ └──────┬───────┘
│ Docker/Podman API
v
┌──────────────┐
│ Containers │
│ (MCP servers)│
└──────────────┘
┌─────────────────┐ STDIO ┌──────────────┐ STDIO/HTTP ┌────────────┐
│ Claude / LLM │────────────>│ mcplocal │───────────────>│ MCP Servers│
│ │ │ (McpRouter) │ │ │
└─────────────────┘ └──────────────┘ └────────────┘
```
### Key Principle
- **mcpd owns the database** (PostgreSQL) — the only component that talks to the DB
- **mcplocal is stateless** — config-only, no database, acts as intelligent proxy
- **mcpctl stores only credentials** — `~/.mcpctl/config.json` and `~/.mcpctl/credentials.json`
- **All MCP servers run inside the mcpd container** on the NAS via podman (container-in-container)
---
## 2. Component Overview
### mcpctl (CLI)
- kubectl-like interface for managing the entire system
- Talks to mcplocal (local daemon) via HTTP REST, or directly to mcpd with `--direct`
- Distributed as RPM/DEB package via Gitea registry
- Built with Commander.js, Ink/React for TUI, Inquirer for prompts
### mcplocal (Local Daemon)
- Runs on developer machine as a systemd user service
- Exposes MCP protocol via STDIO to Claude
- Exposes HTTP REST API for mcpctl management commands
- Core responsibilities:
- Tool namespacing and routing (`server/tool` format)
- Gated sessions and prompt delivery
- Content pipeline (transformation stages)
- LLM integration for intelligent prompt selection
- Pipeline result caching
- Audit event collection
### mcpd (Remote Daemon)
- Server-side daemon on NAS/cloud (Fastify 5)
- Manages MCP server containers (Docker/Podman via dockerode)
- PostgreSQL for state, audit logs, access control
- Owns credentials (never exposed to mcplocal)
- REST API for all management operations
- MCP proxy endpoint for direct tool invocation
- Health probe runner for container monitoring
- Git-based backup system
### @mcpctl/db (Database Layer)
- Prisma ORM with PostgreSQL
- 22 models, 11 migrations
- Template seeding from YAML files at startup
### @mcpctl/shared (Shared Utilities)
- Constants, types, validation schemas (Zod)
- Secret encryption/decryption utilities
- Zero external dependencies beyond Zod
---
## 3. Resource Model & Design Decisions
### ADR-001: Kubernetes-Style Resource Model
| mcpctl Resource | K8s Analogy | Behavior |
|----------------|-------------|----------|
| **Server** | Deployment | Self-contained, complete definition. Contains image, command, transport, env refs, replicas. No external template dependencies at runtime. |
| **Instance** | Pod | Immutable, ephemeral, auto-managed by reconciliation loop. No `create instance` or `edit instance`. Delete triggers re-creation. |
| **Secret** | Secret | Holds sensitive key-value pairs. Servers reference via `env[].valueFrom.secretRef`. |
| **Project** | Namespace | Groups servers, configures ProxyModel and LLM provider. Generates `.mcp.json`. |
| **Prompt** | ConfigMap (sort of) | Instruction text delivered to Claude. Global or project-scoped. Priority-ranked. |
| **Template** | — | Blueprints for server creation. Used at create-time only, not runtime. |
| **RbacDefinition** | ClusterRoleBinding | Named policies with subjects and roleBindings. |
### ADR-002: Profiles Replaced with Secrets
The original `McpProfile` resource tried to be secrets, configmaps, and project-server links simultaneously. Environment variables declared in profiles were never actually passed to running containers.
**Decision:** Replace with dedicated `Secret` resource following Kubernetes conventions:
```yaml
servers:
  - name: ha-mcp
    env:
      - name: HOMEASSISTANT_TOKEN
        valueFrom:
          secretRef:
            name: ha-credentials
            key: HOMEASSISTANT_TOKEN
```
### ADR-003: Self-Contained Servers with Source Tracking
Servers store complete definitions (no runtime template dependencies). Optional `source` metadata enables registry-based upgrades via 3-way diff (old snapshot vs current server vs new template).
**Rationale:** Matches kubectl mental model. `get server X -o yaml > new.yaml && edit && apply` works naturally. Duplication is minimal (~10 lines YAML).
### ADR-004: ConfigMaps Deferred
Only Secrets implemented. ConfigMap separation can be added later if needed. Keeps the model simple.
### ADR-005: Apply-Compatible YAML Round-Trip
`mcpctl get server ha-mcp -o yaml > s.yaml && mcpctl apply -f s.yaml` must work:
- `get -o yaml/json` strips internal fields (id, createdAt, updatedAt, version, ownerId)
- Output wrapped in resource key: `{ servers: [...] }`
- `describe -o yaml/json` keeps full raw output (for debugging)
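The round-trip contract can be sketched as a small stripping-and-wrapping helper. Field names come from ADR-005; the helper itself is illustrative, not mcpctl's actual implementation:

```typescript
// Internal fields removed on `get -o yaml/json` export (per ADR-005).
const INTERNAL_FIELDS = ["id", "createdAt", "updatedAt", "version", "ownerId"];

function toApplyFormat(resourceKey: string, records: Record<string, unknown>[]) {
  const cleaned = records.map((r) => {
    const copy: Record<string, unknown> = { ...r };
    for (const f of INTERNAL_FIELDS) delete copy[f];
    return copy;
  });
  // Wrap in the resource key so the output is directly `apply -f`-able.
  return { [resourceKey]: cleaned };
}

const out = toApplyFormat("servers", [
  { id: "cku123", name: "ha-mcp", transport: "STDIO", createdAt: "2024-01-01" },
]);
// out = { servers: [{ name: "ha-mcp", transport: "STDIO" }] }
```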
### ADR-006: CLI Design Principles
1. Everything possible via `apply -f` MUST also be possible via `create` CLI flags
2. Support `-o yaml` and `-o json` like kubectl
3. `describe` shows visually clean sectioned output with tables
4. Name resolution works everywhere (not just IDs)
5. Instances are immutable (like pods) — no create/edit
---
## 4. CLI Reference
### Global Options
```
--daemon-url <url>     mcplocal daemon URL
--direct               Bypass mcplocal, connect directly to mcpd
-p, --project <name>   Target project
-o, --output <format>  table | json | yaml
-v, --version          Show version
```
### Resource Operations
| Command | Description |
|---------|-------------|
| `mcpctl get <resource> [name]` | List resources or fetch by name/ID. Supports glob patterns (`graf*`). |
| `mcpctl describe <resource> <name>` | Detailed view with sections and tables. |
| `mcpctl create <resource> <name> [opts]` | Create resource. Mirrors `apply -f` capabilities. |
| `mcpctl edit <resource> <name>` | Open in `$EDITOR` as YAML, apply on save. |
| `mcpctl patch <resource> <name> key=val...` | Patch individual fields without editor. |
| `mcpctl delete <resource> <name>` | Delete resource. |
| `mcpctl apply -f <file>` | Declarative YAML/JSON application (like `kubectl apply`). Supports `--dry-run`. |
### Supported Resources
servers, projects, instances, secrets, templates, users, groups, rbac, prompts, promptrequests, serverattachments (virtual), proxymodels (virtual, from mcplocal), all (project export)
### Resource Aliases
```
server/srv → servers
project/proj → projects
instance/inst → instances
secret/sec → secrets
template/tpl → templates
prompt → prompts
user → users
group → groups
rbac/rbac-definition → rbac
promptrequest/pr → promptrequests
serverattachment/sa → serverattachments
proxymodel/pm → proxymodels
```
### Lifecycle & Diagnostics
| Command | Description |
|---------|-------------|
| `mcpctl status` | Show connectivity, auth status, LLM provider health, available models. |
| `mcpctl login` | Authenticate with mcpd (first login bootstraps initial user). |
| `mcpctl logout` | Clear stored credentials. |
| `mcpctl logs <name> [-t N] [-i index]` | Stream container logs. Resolves server name → running instance. |
| `mcpctl cache stats` | Show pipeline cache statistics per namespace. |
| `mcpctl cache clear [ns] [--older-than N]` | Clear pipeline cache. |
| `mcpctl backup` | Show git backup status, public SSH key. |
| `mcpctl backup log [-n N]` | Show backup commit history. |
| `mcpctl backup restore list/diff/to` | Restore to specific backup commit. |
### Console & Inspection
| Command | Description |
|---------|-------------|
| `mcpctl console [project]` | Interactive TUI — request/response timeline, tool inspection. |
| `mcpctl console --stdin-mcp` | MCP server mode over stdin/stdout (for Claude integration). |
| `mcpctl console --audit` | Browse audit events from mcpd interactively. |
### Configuration
| Command | Description |
|---------|-------------|
| `mcpctl config view` | Show current configuration. |
| `mcpctl config set <key> <value>` | Set config value (mcplocalUrl, mcpdUrl, registries, outputFormat, etc.). |
| `mcpctl config path` | Show config file path. |
| `mcpctl config setup` | Interactive configuration wizard. |
| `mcpctl config claude -p <project>` | Generate `.mcp.json` for Claude Code. |
### Create Subcommands
```bash
mcpctl create server <name> [--package-name X] [--docker-image X] [--transport STDIO|SSE|STREAMABLE_HTTP] \
    [--runtime node|python] [--replicas N] [--env KEY=val] [--from-template name:version]
mcpctl create secret <name> [--data key=val ...] [--data-file path.json]
mcpctl create project <name> [-d desc] [--proxy-model default|gate|content-pipeline] [--server name ...]
mcpctl create user <email> [--password pass] [--name name]
mcpctl create group <name> [-d desc] [--member email ...]
mcpctl create rbac <name> [--subject kind:name] [--role-binding role:resource[:name]]
mcpctl create prompt <name> [--content text] [--project name] [--priority 1-10] [--link url]
```
### Apply File Format
```yaml
secrets:
  - name: my-secret
    data:
      KEY: value
servers:
  - name: my-server
    transport: STDIO
    packageName: "@modelcontextprotocol/server-example"
    env:
      - name: API_KEY
        valueFrom:
          secretRef:
            name: my-secret
            key: KEY
projects:
  - name: my-project
    proxyModel: default
    servers:
      - my-server
serverattachments:
  - server: my-server
    project: my-project
prompts:
  - name: my-prompt
    project: my-project
    content: "Instruction text..."
    priority: 5
```
---
## 5. API Surface (mcpd)
All endpoints under `/api/v1/` require Bearer token auth except `/auth/*` and `/health*`.
### Authentication
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/auth/bootstrap` | POST | First-user setup (creates admin + bootstrap RBAC) |
| `/auth/status` | GET | `{hasUsers: boolean}` (unauthenticated) |
| `/auth/login` | POST | Returns token + user info |
| `/auth/logout` | POST | Invalidate session |
| `/auth/me` | GET | Current user identity |
| `/auth/impersonate` | POST | Create session for another user (requires `run:impersonate`) |
### Servers
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/servers` | GET | List all servers |
| `/servers/:id` | GET | Get server by CUID |
| `/servers` | POST | Create server (validates name uniqueness, image/package) |
| `/servers/:id` | PUT | Update server, re-reconciles replicas |
| `/servers/:id` | DELETE | Delete server + cascade-delete all instances |
### Instances
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/instances` | GET | List (optional `?serverId=` filter) |
| `/instances/:id` | GET | Get instance |
| `/instances/:id` | DELETE | Delete instance, triggers reconciliation |
| `/instances/:id/inspect` | GET | Docker inspect output (state, port, IP) |
| `/instances/:id/logs` | GET | Container logs (`?tail=N`) |
### Projects
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/projects` | GET | List (RBAC-filtered) |
| `/projects/:id` | GET/POST/PUT/DELETE | CRUD by CUID or name |
| `/projects/:id/mcp-config` | GET | Generate `.mcp.json` |
| `/projects/:id/instructions` | GET | Get prompt + attached servers for system message |
| `/projects/:id/servers` | GET/POST | List/attach servers |
| `/projects/:id/servers/:name` | DELETE | Detach server |
### Prompts & Prompt Requests
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/prompts` | GET/POST | List/create approved prompts |
| `/prompts/:id` | PUT/DELETE | Update/delete (system prompts reset to default) |
| `/prompts/:id/regenerate-summary` | POST | Force re-generate summary/chapters |
| `/promptrequests` | GET/POST | List/create pending requests |
| `/promptrequests/:id/approve` | POST | Atomic delete request → create prompt |
| `/projects/:name/prompts/visible` | GET | Approved + session's pending |
| `/projects/:name/prompt-index` | GET | Compact index for gating |
### Secrets, Users, Groups, RBAC
Standard CRUD on `/secrets`, `/users`, `/groups`, `/rbac-definitions`.
### Health & Monitoring
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health/overview` | GET | System health, instance counts, error rate |
| `/health/instances/:id` | GET | Instance-specific health, uptime, latency |
| `/metrics` | GET | Request counts, error counts, last request time |
| `/healthz` | GET | Liveness probe |
### Backup, Restore, Audit
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/backup` | POST | Create encrypted bundle (servers, secrets, projects, users, groups, rbac) |
| `/restore` | POST | Restore bundle (merge/skip/overwrite strategy) |
| `/audit/events` | POST/GET | Batch insert from mcplocal / query with filters |
| `/audit/sessions` | GET | Session aggregates (first/last seen, event counts) |
| `/git/backup/init` | POST | Initialize git backup with SSH credentials |
| `/git/backup/status` | GET | Backup sync status |
| `/git/backup/sync` | POST | Manual trigger sync |
### MCP Proxy
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/mcp/proxy` | POST | Forward JSON-RPC to running MCP server instance. Dispatches by transport (STDIO via docker exec, SSE/HTTP via direct HTTP). Maintains persistent STDIO connections. |
---
## 6. Database Schema
PostgreSQL via Prisma ORM. 22 models across 11 migrations.
### Core Models
**User & Auth:**
- `User` — email/password (bcrypt), role (USER/ADMIN), optional OAuth
- `Session` — Bearer token with 30-day TTL
- `Group` / `GroupMember` — user groups for RBAC
**MCP Infrastructure:**
- `McpServer` — transport (STDIO/SSE/STREAMABLE_HTTP), docker image, package name, runtime (node/python), env vars (JSON), health check config, replicas, external URL
- `McpTemplate` — reusable blueprints for server creation (mirrors McpServer fields)
- `McpInstance` — running containers, status (STARTING/RUNNING/STOPPING/STOPPED/ERROR), container ID, port, health status, events
**Organization:**
- `Project` — LLM config (provider, model), proxy model, gated flag, prompt instructions, server overrides
- `ProjectServer` — junction table linking projects to servers
- `Secret` — named secret bundles (data as encrypted JSON), versioned
**Content:**
- `Prompt` — approved system prompts (global or project-scoped), priority, summary/chapters, optional link target
- `PromptRequest` — pending prompt proposals from LLM sessions
**Audit & Backup:**
- `AuditLog` — user action trail (action, resource, resourceId, details)
- `AuditEvent` — pipeline/gate/tool trace events from mcplocal (sessionId, projectName, eventKind, correlationId, userName)
- `BackupPending` — queue for git-based backup sync
- `RbacDefinition` — named RBAC policies
---
## 7. Local Proxy (mcplocal)
### Request Flow
```
Claude (STDIO JSON-RPC)
  ↓
StdioProxyServer (reads from stdin)
  ↓
McpRouter.route(request)
  ├→ PluginSessionContext (per-session state)
  ├→ ProxyModelPlugin hooks (intercept/transform)
  ├→ Upstream lookup (tool name prefix → server)
  └→ Response (with optional drill-down sections)
```
### Router Responsibilities
- Manages upstream connections (STDIO child processes, HTTP)
- Maps tools/resources/prompts to servers via name prefix (`servername/toolname`)
- Maintains prompt index + system prompt cache (TTL-based)
- Dispatches plugin hooks via `getOrCreatePluginContext`
- Section storage for drill-down navigation
- Audit event collection and batching
- Link resolution (relative → absolute URLs)
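The prefix-based routing step can be sketched as follows. The `Upstream` shape and the helper name are hypothetical; the actual McpRouter internals may differ:

```typescript
// Resolve a namespaced tool name ("servername/toolname") to its upstream.
interface Upstream {
  name: string;
  callTool(tool: string, args: unknown): Promise<unknown>;
}

function resolveUpstream(namespacedTool: string, upstreams: Map<string, Upstream>) {
  const slash = namespacedTool.indexOf("/");
  if (slash < 0) throw new Error(`tool name lacks server prefix: ${namespacedTool}`);
  const server = namespacedTool.slice(0, slash);
  const tool = namespacedTool.slice(slash + 1);
  const upstream = upstreams.get(server);
  if (!upstream) throw new Error(`unknown upstream server: ${server}`);
  return { upstream, tool };
}
```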
### Upstream Transports
| Transport | Implementation |
|-----------|---------------|
| STDIO | Spawns child process, bidirectional pipe, JSON-RPC over newline-delimited JSON |
| SSE | HTTP GET for event stream, POST for messages |
| Streamable HTTP | HTTP POST with JSON-RPC payloads |
### HTTP Endpoints (mcplocal)
| Endpoint | Description |
|----------|-------------|
| `GET /proxymodels` | List all models (YAML pipelines + TS plugins) |
| `GET /proxymodels/:name` | Single model details |
| `GET /proxymodels/stages` | List available stages |
| `POST /proxymodels/reload` | Force reload stages from disk |
| `GET /cache/stats` | Per-namespace cache statistics |
| `DELETE /cache` | Clear all or by age |
| `DELETE /cache/:namespace` | Clear specific namespace |
| `POST /mcp` | JSON-RPC request forwarding |
---
## 8. ProxyModel Plugin System
A **ProxyModel** is either a **Pipeline** (YAML) or a **Plugin** (TypeScript).
### Plugin Interface
| Hook | When it fires |
|------|--------------|
| `onSessionCreate` | New MCP session established |
| `onSessionDestroy` | Session ends |
| `onInitialize` | MCP initialize request — can inject instructions |
| `onToolsList` | tools/list — can filter/modify tool list |
| `onToolCallBefore` | Before forwarding a tool call — can intercept |
| `onToolCallAfter` | After receiving tool result — can transform |
| `onResourcesList` | resources/list — can filter resources |
| `onResourceRead` | resources/read — can intercept reads |
| `onPromptsList` | prompts/list — can filter prompts |
| `onPromptGet` | prompts/get — can intercept reads |
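The hook names above suggest an interface roughly like the sketch below. The exact signatures in mcplocal may differ; the `Tool` shape and the example plugin are illustrative:

```typescript
type Tool = { name: string; description?: string };

// Illustrative plugin shape (hook names from the table above).
interface ProxyModelPlugin {
  name: string;
  extends?: string[];                       // parent plugins (inheritance)
  onSessionCreate?(sessionId: string): void;
  onToolsList?(tools: Tool[]): Tool[];      // filter/modify the tool list
  onToolCallBefore?(tool: string, args: unknown): { intercept?: unknown } | void;
  onToolCallAfter?(tool: string, result: unknown): unknown;
}

// Example: a plugin that hides all tools from a hypothetical "admin" server.
const hideAdmin: ProxyModelPlugin = {
  name: "hide-admin",
  onToolsList: (tools) => tools.filter((t) => !t.name.startsWith("admin/")),
};
```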
### Built-in Plugins
| Plugin | Extends | Gating | Content Pipeline | Use Case |
|--------|---------|:------:|:----------------:|----------|
| `gate` | — | Yes | No | Gating + prompt delivery only |
| `content-pipeline` | — | No | Yes | Content transformation only |
| `default` | gate + content-pipeline | Yes | Yes | Full pipeline (most common) |
**Inheritance:** Plugins can extend parents. Conflicting hooks from multiple parents cause load-time errors, except chainable lifecycle hooks, which run sequentially.
### Pipeline Configuration (YAML)
```yaml
name: default
spec:
  controller: gate
  controllerConfig: { byteBudget: 8192 }
  stages:
    - type: passthrough
    - type: paginate
      config: { pageSize: 8000 }
  appliesTo: [prompt, toolResult]
  cacheable: true
```
### Per-Session Context
Each session gets a `PluginSessionContext` providing:
- Session state (`Map<string, unknown>`)
- LLM provider, cache provider, structured logger
- Virtual tool/server registration
- Upstream routing and tool discovery
- Content processing and notifications
- Audit event emission
---
## 9. Gated Sessions
### Problem
When Claude connects to an MCP server, it sees all tools immediately and starts using them. In a managed environment, you want to deliver relevant context (prompts/instructions) before granting tool access.
### Solution: Keyword-Driven Prompt Retrieval
1. **Initialize:** Instructions include prompt index + "call `begin_session` immediately"
2. **Gated `tools/list`:** Only `begin_session` visible
3. **Claude calls `begin_session`** with keywords describing the task
4. **Prompt matching:** Keywords matched against prompt summaries/chapters
5. **Ungating:** Matched prompts returned + `tools/list_changed` notification sent
6. **Full access:** All upstream tools now visible
### Prompt Scoring
**Formula:** `priority + (matchCount * priority)`
- Priority alone is baseline — ensures global prompts compete for inclusion
- Tag matches multiply priority — relevant prompts score higher
- Priority 10 = always included (bypasses budget)
- 8KB byte budget cap; overflow prompts listed as index-only
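The scoring and budgeting rules above can be sketched in one function. The `PromptMeta` shape is illustrative, and whether a priority-10 prompt's bytes still count against the budget is an assumption here:

```typescript
interface PromptMeta { name: string; priority: number; matchCount: number; bytes: number }

const BYTE_BUDGET = 8 * 1024; // 8KB cap from the rules above

function selectPrompts(prompts: PromptMeta[]) {
  // Score: priority + (matchCount * priority), then rank descending.
  const scored = prompts
    .map((p) => ({ ...p, score: p.priority + p.matchCount * p.priority }))
    .sort((a, b) => b.score - a.score);
  const included: string[] = [];
  const indexOnly: string[] = [];
  let used = 0;
  for (const p of scored) {
    if (p.priority === 10 || used + p.bytes <= BYTE_BUDGET) {
      included.push(p.name); // priority 10 bypasses the budget
      used += p.bytes;
    } else {
      indexOnly.push(p.name); // overflow: listed as index-only
    }
  }
  return { included, indexOnly };
}
```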
### Critical Design Lessons
**What works:**
- One gate tool (`begin_session`), zero ambiguity
- Instructions say "check its input schema" (not naming specific parameters)
- "immediately" and "required" prevent Claude from exploring first
- Tool names listed as preview in instructions (helps keyword generation)
- `tools/list_changed` notification mandatory after ungating
- Auto-ungate fallback if Claude bypasses gate
**What fails:**
- Naming parameters that don't match the schema
- Complex conditional instructions (Claude prefers simple paths)
- Multiple tools in gated state (Claude skips the gate)
- Gate instructions only in tool description (must be in initialize response)
- Burying the call-to-action after 200 lines of context
### Complete Flow
```
Client mcplocal upstream
│── initialize ────────>│
│<── instructions ──────│ (gate instructions + prompt index + tool preview)
│── tools/list ────────>│
│<── [begin_session] ───│ (ONLY begin_session visible)
│── tools/call ────────>│
│ begin_session │── match prompts ─────────>│
│ {tags:[...]} │<── prompt content ────────│
│<── matched prompts ───│ (full content + encouragement)
│<── notification ──────│ (tools/list_changed)
│── tools/list ────────>│
│<── [108+ tools] ──────│ (ALL tools now visible)
│ │
│ Claude proceeds with full tool access
```
---
## 10. Content Pipeline & Stages
### How It Works
Tool results pass through an ordered sequence of stages before reaching Claude:
1. Each stage receives previous stage's content
2. Returns `{content, sections?, metadata?}`
3. Sections enable drill-down navigation
4. Stage errors are caught — pipeline continues with previous content
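The chaining and error-isolation behavior can be sketched as a small runner, assuming a `StageResult` shape like the one described in step 2 (the actual mcplocal types may differ):

```typescript
type Section = { title: string; content: string };
type StageResult = { content: string; sections?: Section[]; metadata?: Record<string, unknown> };
type Stage = (input: StageResult) => Promise<StageResult>;

async function runPipeline(initial: string, stages: Stage[]): Promise<StageResult> {
  let current: StageResult = { content: initial };
  for (const stage of stages) {
    try {
      current = await stage(current); // each stage receives the previous result
    } catch {
      // Stage error: keep the previous content and continue the pipeline.
    }
  }
  return current;
}
```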
### Built-in Stages
| Stage | Purpose |
|-------|---------|
| `passthrough` | Identity transform (testing/baseline) |
| `paginate` | Split large content into numbered pages (8KB default). LLM-generated page titles (cached). |
| `section-split` | Split by structure: JSON arrays/objects → elements/keys, YAML → keys, prose → `##` headers, code → function/class boundaries. Merges tiny sections, re-splits oversized ones. |
| `summarize-tree` | Hierarchical LLM-generated section summaries. Groups sections into trees. Cached. |
### Section Drill-Down
After pipeline produces sections:
1. Full content replaced with compact table of contents + `_resultId`
2. Sections stored in session-scoped store (5-minute TTL)
3. Client calls same tool with `_resultId` + `_section` to retrieve specific section
4. Supports hierarchical navigation (sections within sections)
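The session-scoped store can be sketched as a TTL'd map keyed by `_resultId`. The class name and lookup-by-title shape are hypothetical; only the 5-minute TTL is from the text:

```typescript
type Section = { title: string; content: string };

class SectionStore {
  private store = new Map<string, { sections: Section[]; expires: number }>();
  constructor(private ttlMs = 5 * 60 * 1000) {} // 5-minute TTL

  put(resultId: string, sections: Section[], now = Date.now()) {
    this.store.set(resultId, { sections, expires: now + this.ttlMs });
  }

  get(resultId: string, title: string, now = Date.now()): Section | undefined {
    const entry = this.store.get(resultId);
    if (!entry || entry.expires < now) {
      this.store.delete(resultId); // expired or unknown resultId
      return undefined;
    }
    return entry.sections.find((s) => s.title === title);
  }
}
```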
### Custom Stages
Drop `.js` files in `~/.mcpctl/stages/`:
```javascript
export default async function myStage(input, context) {
  // context.llm, context.cache, context.log available
  const transformedContent = input.content; // transform the input here
  return { content: transformedContent, sections: [] };
}
```
Hot-reload with 300ms file watch debounce. Built-in stages take precedence.
---
## 11. LLM Provider Integration
### Supported Providers
| Provider | Type | Tier |
|----------|------|------|
| Gemini CLI | Local | Fast |
| Ollama | Local | Fast |
| DeepSeek | API | Fast/Heavy |
| OpenAI | API | Heavy |
| Anthropic | API | Heavy |
| vLLM | Local | Configurable |
| vLLM Managed | Auto-managed local | Configurable |
### Tier System
- **Fast tier:** Quick, cheap models for pipeline stages and keyword extraction
- **Heavy tier:** Full models for complex prompt selection and summarization
- **Legacy active:** Single default provider (fallback)
### LLM Adapter
Stages use a simple interface:
```typescript
interface LLMProvider {
  complete(prompt: string, options?: Record<string, unknown>): Promise<string>;
  available(): boolean;
}
```
Resolution order: named provider → fast tier → heavy tier → active provider.
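That resolution order can be sketched as a chained lookup. The `Provider` shape and `active` flag are illustrative stand-ins for the real config types:

```typescript
interface Provider { name: string; tier?: "fast" | "heavy"; active?: boolean }

// Resolution order: named provider → fast tier → heavy tier → active provider.
function resolveProvider(providers: Provider[], requested?: string): Provider | undefined {
  if (requested) {
    const named = providers.find((p) => p.name === requested);
    if (named) return named;
  }
  return (
    providers.find((p) => p.tier === "fast") ??
    providers.find((p) => p.tier === "heavy") ??
    providers.find((p) => p.active)
  );
}
```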
### Multi-Provider Configuration
```json
{
  "llm": {
    "providers": [
      { "name": "fast-local", "type": "ollama", "model": "llama3", "tier": "fast" },
      { "name": "heavy-api", "type": "openai", "model": "gpt-4", "tier": "heavy" }
    ]
  }
}
```
---
## 12. Caching
### Architecture: L1 Memory + L2 Disk
- **L1 in-memory:** LRU map (default 500 entries) for fast lookups
- **L2 disk:** `~/.mcpctl/cache/<namespace>/<key>.dat`
- Namespace: `provider--model--proxymodel` (e.g., `openai--gpt-4o--content-pipeline`)
- Key: 16-char hex SHA256 prefix of content
- Value: raw content (no JSON wrapper)
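The addressing scheme above can be sketched directly (the helper names are illustrative; the namespace format and 16-char SHA-256 prefix are from the text):

```typescript
import { createHash } from "node:crypto";

// Namespace: provider--model--proxymodel (directory under ~/.mcpctl/cache/).
function cacheNamespace(provider: string, model: string, proxyModel: string): string {
  return `${provider}--${model}--${proxyModel}`;
}

// Key: first 16 hex chars of the content's SHA-256.
function cacheKey(content: string): string {
  return createHash("sha256").update(content).digest("hex").slice(0, 16);
}

// e.g. ~/.mcpctl/cache/openai--gpt-4o--content-pipeline/<cacheKey(content)>.dat
```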
### Configuration
| Option | Default | Examples |
|--------|---------|---------|
| `maxSize` | 256MB | `"1GB"`, `"10%"` (of partition), `536870912` (bytes) |
| `ttlMs` | 30 days | Any millisecond value |
| `maxMemoryEntries` | 500 | L1 LRU cap |
| `dir` | `~/.mcpctl/cache` | Custom path |
### Management
```bash
mcpctl cache stats # Per-namespace breakdown
mcpctl cache clear # Clear everything
mcpctl cache clear openai--gpt-4--default # Clear specific namespace
mcpctl cache clear --older-than 7 # Clear entries older than 7 days
```
### HTTP API
```
GET /cache/stats # Per-namespace stats
DELETE /cache # Clear all (or ?olderThan=N)
DELETE /cache/:namespace # Clear specific namespace
```
---
## 13. Authentication & RBAC
### Auth Flow
1. `mcpctl login` → prompts for email/password
2. First login (no users in system) → bootstrap: creates admin user + admin group + bootstrap RBAC
3. POST `/auth/login` → returns 30-day bearer token
4. Token stored in `~/.mcpctl/credentials.json`
5. CLI passes token to mcplocal config
6. mcplocal attaches `Authorization: Bearer <token>` to all mcpd requests
7. mcpd validates token against Session table
### RBAC Model
**Subjects:** User (by email), Group, ServiceAccount
**Roles & Capabilities:**
| Role | Grants |
|------|--------|
| `edit` | view, create, delete, edit, expose |
| `view` | view |
| `create` | create |
| `delete` | delete |
| `run` | run (for operations) |
| `expose` | expose, view |
**Resources:** `*`, servers, instances, secrets, projects, templates, users, groups, rbac, prompts, promptrequests
**Binding Types:**
- **Resource binding:** `{role: 'edit', resource: 'servers', name?: 'my-server'}`
- With `name`: user can only access that specific resource
- Without `name`: user can access all resources of that type
- **Operation binding:** `{role: 'run', action: 'impersonate'}`
- Grants permission for named operations (backup, restore, audit-purge, logs, impersonate)
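A single resource-binding check, following the semantics above, might look like this sketch (role-to-capability grants copied from the table; the function itself is not the actual mcpd RBAC hook):

```typescript
type Capability = "view" | "create" | "delete" | "edit" | "expose" | "run";

interface ResourceBinding { role: string; resource: string; name?: string }

const ROLE_GRANTS: Record<string, Capability[]> = {
  edit: ["view", "create", "delete", "edit", "expose"],
  view: ["view"],
  create: ["create"],
  delete: ["delete"],
  run: ["run"],
  expose: ["expose", "view"],
};

function allows(b: ResourceBinding, cap: Capability, resource: string, name: string): boolean {
  if (b.resource !== "*" && b.resource !== resource) return false;
  if (b.name && b.name !== name) return false; // name-scoped: only that resource
  return (ROLE_GRANTS[b.role] ?? []).includes(cap);
}
```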
### Resolution
1. CLI resolves name → CUID client-side before API calls
2. RBAC hook resolves CUID → name before checking bindings
3. List filtering: `getAllowedScope()` computes allowed names, `preSerialization` hook filters arrays
4. Wildcard scope: user has unscoped binding → sees all resources
5. Named scope: user has only name-scoped bindings → filtered to allowed names
---
## 14. Audit Infrastructure & Trust Model
### Event Kinds
| Event Kind | Description |
|------------|-------------|
| `pipeline_execution` | Full pipeline run summary (duration, stage count, sizes) |
| `stage_execution` | Individual stage detail (duration, input/output size, error) |
| `gate_decision` | Gate open/close with client intent and matched prompts |
| `prompt_delivery` | Which prompts were sent, match scores |
| `tool_call_trace` | Tool call with server + timing + result size |
| `rbac_decision` | Access control decisions |
| `session_bind` | Session initialization |
### Trust Model
| Source | Verified | Meaning |
|--------|----------|---------|
| `client` | false | Client LLM claims (begin_session intent, tags) |
| `mcplocal` | true | Server-side data (prompt matches, pipeline transforms) |
| `mcpd` | true | mcpd-originated events |
### AuditCollector
Fire-and-forget batching: 50 events max, 5-second flush interval. POSTs to mcpd. Non-blocking — audit failures don't affect tool calls.
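A minimal sketch of that batching behavior (the timer-based 5-second flush is omitted for brevity; the class shape is illustrative, not the actual AuditCollector):

```typescript
type AuditEvent = Record<string, unknown>;

class AuditCollector {
  private buffer: AuditEvent[] = [];
  constructor(
    private post: (events: AuditEvent[]) => Promise<void>, // POST to mcpd
    private maxBatch = 50,
  ) {}

  emit(event: AuditEvent): void {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxBatch) void this.flush(); // fire-and-forget
  }

  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const batch = this.buffer.splice(0, this.buffer.length);
    try {
      await this.post(batch);
    } catch {
      // Non-blocking: audit failures are swallowed, tool calls proceed.
    }
  }
}
```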
### Correlation & Causality
- `correlationId` links related events (all events from one tool call)
- `parentEventId` enables causal chains (gate_decision → pipeline_execution)
- `userName` tracks which user triggered the event
- Designed for future graphiti knowledge graph ingestion
### Per-Server Targeting
Different servers in a project can have different proxymodel configs via `serverOverrides` on the project resource. Resolution: server override → project default → null.
### Future (Designed, Not Implemented)
- **Virtual MCP Audit Server:** mcpd-hosted virtual server providing `query_audit_log`, `get_session_timeline` tools. Claude can directly query audit data.
- **Graphiti Integration:** Causal graph with entity types (Session, Tool, Server, ProxyModel, Prompt, Stage) and edges (`triggered_by`, `transformed_by`, `verified_by`).
- **Lab Parameter Simulation:** Select any pipeline event, retrieve original input, re-run with different proxyModel/LLM/stages, side-by-side diff.
- **Audit Level Config:** Per-server `auditLevel: 'full' | 'hash-only' | 'disabled'`.
---
## 15. Container Orchestration
### Orchestrator Interface
`McpOrchestrator` abstracts container management (Docker/Podman today, Kubernetes in the future):
- `pullImage`, `createContainer`, `stopContainer`, `removeContainer`
- `inspectContainer`, `getContainerLogs`, `execInContainer`, `ping`
### Container Management
- Labels: `mcpctl.managed=true` for filtering
- Network: `mcp-servers` (configurable via `MCPD_MCP_NETWORK`)
- Resource limits: 512MB RAM, 0.5 CPU (configurable)
- Internal container IP exposed via inspect
### Runtime Spawn Commands
| Runtime | Command |
|---------|---------|
| Node | `npx --prefer-offline -y <packageName>` |
| Python | `uvx <packageName>` |
| Custom | Explicit `command` field |
### Health Probes
Periodic MCP tool-call probes (like K8s livenessProbe):
- Default interval: 15 seconds
- Dispatch by transport: STDIO (docker exec), HTTP (JSON-RPC)
- Failure threshold: 3 consecutive failures → unhealthy
- Updates instance `healthStatus` and `lastHealthCheck`
### Reconciliation Loop
Maintains desired replica count:
- If running < desired → start new instances
- If running > desired → stop excess instances
- Detects crashed containers → marks ERROR → triggers re-creation
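One reconciliation pass over those rules can be sketched as a pure function (the `Instance` shape and return format are illustrative; the real loop performs the container operations itself):

```typescript
interface Instance { id: string; status: "RUNNING" | "ERROR" | "STOPPED" }

function reconcile(desired: number, instances: Instance[]) {
  const running = instances.filter((i) => i.status === "RUNNING");
  const crashed = instances.filter((i) => i.status === "ERROR");
  return {
    start: Math.max(0, desired - running.length), // deficit (also covers crash replacement)
    stop: running.slice(desired).map((i) => i.id), // excess instances to stop
    remove: crashed.map((i) => i.id),              // ERROR instances cleaned up, then re-created
  };
}
```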
### Persistent STDIO Connections
For STDIO transport, mcpd maintains long-lived exec sessions (`PersistentStdioClient`) to avoid repeated `docker exec` overhead. Bidirectional streaming for interactive sessions.
---
## 16. Deployment & Distribution
### Production Deployment
**mcpd runs on 10.0.0.194** (NAS, managed via Portainer), NOT on the dev machine.
```bash
# Full deploy (preferred after merging)
bash fulldeploy.sh
```
`fulldeploy.sh` runs three steps:
1. `scripts/build-mcpd.sh` — build + push Docker image to `mysources.co.uk/michal/mcpctl-mcpd`
2. `deploy.sh` — deploy stack to production via Portainer API at `http://10.0.0.194:9000`
3. `scripts/release.sh` — build RPM + publish to Gitea + install locally + smoke tests
### Docker Images
| Image | Purpose |
|-------|---------|
| `mcpctl-mcpd` | Multi-stage build: Node 20 Alpine, includes git/ssh, Prisma |
| `mcpctl-node-runner` | Node 20 slim, runs `npx -y` for npm packages |
| `mcpctl-python-runner` | Python 3.12 slim, uses `uv` for Python packages |
All pushed to `mysources.co.uk/michal/` registry.
### Stack Services (Production)
- `postgres` — PostgreSQL 16 (port 5432)
- `mcpd` — Daemon (port 3100)
- `node-runner`, `python-runner` — Base images
- Networks: `mcpctl` (management), `mcp-servers` (container communication)
### RPM/DEB Distribution
```bash
source .env && bash scripts/release.sh
```
Installs via nfpm:
- `/usr/bin/mcpctl` — CLI binary (bun compiled)
- `/usr/bin/mcpctl-local` — Local proxy binary (bun compiled)
- `/usr/share/fish/vendor_completions.d/mcpctl.fish` — Fish completions
- `/usr/share/bash-completion/completions/mcpctl` — Bash completions
- `/usr/lib/systemd/user/mcplocal.service` — Systemd user service
User install:
```bash
dnf config-manager --add-repo https://mysources.co.uk/api/packages/michal/rpm.repo
dnf install mcpctl
```
### Git & PR Workflow
- Gitea at `http://10.0.0.194:3012` (internal) / `https://mysources.co.uk/michal/mcpctl` (public)
- `pr.sh` in project root creates PRs via Gitea API
- `gh` CLI not installed — use `pr.sh` or direct API calls
---
## 17. Testing Strategy
### Test Tiers
| Tier | Tool | Scope | When |
|------|------|-------|------|
| Unit tests | Vitest | Package-level, mocked dependencies | `pnpm test:run` |
| DB tests | Vitest | Full Prisma + test PostgreSQL | `pnpm --filter db exec vitest run` (separate) |
| Smoke tests | Vitest | Live mcplocal + mcpd (not mocked) | `pnpm test:smoke` (post-deploy) |
### Convention
- Every new feature MUST include smoke tests
- Smoke tests live in `src/mcplocal/tests/smoke/`
- Use `SmokeMcpSession` from `tests/smoke/mcp-client.ts` for MCP protocol interactions
- Smoke tests run automatically in the build/deploy pipeline
### Critical Rules
- **NEVER pipe pnpm test output** to `tail`, `grep`, `head` — pnpm hangs when it detects a non-TTY output stream
- Always capture full output with `2>&1` and read directly
- DB tests excluded from workspace-root vitest (need test database)
- Tests integrated into pipeline: `build-rpm.sh` runs unit tests; `release.sh` runs smoke tests
---
## 18. Technology Stack
| Layer | Technology |
|-------|-----------|
| CLI Framework | Commander.js, Ink/React (TUI), Inquirer |
| API Server | Fastify 5, TypeScript strict mode |
| Database | PostgreSQL 16, Prisma ORM v6 |
| Container Runtime | Docker/Podman via dockerode |
| MCP Protocol | @modelcontextprotocol/sdk |
| Validation | Zod schemas everywhere |
| LLM Providers | OpenAI, Anthropic, Google Gemini, Ollama, DeepSeek, Groq, Mistral, OpenRouter, Azure |
| Testing | Vitest, coverage via v8 |
| Build | TypeScript project references, pnpm workspaces |
| Compilation | Bun (binary compilation for RPM) |
| Packaging | nfpm (RPM/DEB), Docker multi-stage |
| CI/CD | fulldeploy.sh → Portainer API + Gitea packages |
| Shell Completions | Fish + Bash (auto-generated via `scripts/generate-completions.ts`) |
### Design Patterns
1. **Monorepo** — pnpm workspaces with shared base TypeScript config
2. **Layered architecture** — Routes → Services → Repositories (Prisma)
3. **Interface-based repositories** — all data access through interfaces for testability
4. **Dependency injection** — services receive dependencies via constructor
5. **Zod validation** — all input validated at API boundary
6. **Plugin inheritance** — composable ProxyModel plugins with conflict detection
7. **Content-addressed caching** — SHA256 hash keys for deduplication
8. **TTL-based stores** — prompt index (60s), system prompts (5min), sections (5min)
9. **Fire-and-forget audit** — non-blocking event collection
10. **Declarative config** — kubectl-style YAML/JSON for all resource management
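Pattern 7 (content-addressed caching) can be sketched in a few lines; `cacheKey` and its shape are illustrative, not the actual mcplocal cache code:

```typescript
import { createHash } from "node:crypto";

// Illustrative content-addressed cache key: identical stage inputs map to the
// same SHA256 key, so cached pipeline results deduplicate automatically.
function cacheKey(stage: string, input: unknown): string {
  // Note: JSON.stringify is order-sensitive; a real implementation would
  // canonicalize object key order before hashing.
  const canonical = JSON.stringify({ stage, input });
  return createHash("sha256").update(canonical).digest("hex");
}

const a = cacheKey("summarize", { text: "hello" });
const b = cacheKey("summarize", { text: "hello" });
const c = cacheKey("summarize", { text: "world" });
console.log(a === b, a === c); // → true false
```

The same-input/same-key property is what makes the L1 memory and L2 disk tiers safe to share: a key never maps to stale content for a different input.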
---
## 19. Project Structure
```
mcpctl/
├── src/
│ ├── cli/ @mcpctl/cli CLI (Commander.js)
│ │ ├── src/commands/ 22 command handlers
│ │ ├── src/registry/ MCP server registry client
│ │ ├── src/formatters/ Output formatting (table/json/yaml)
│ │ └── src/auth/ Credential storage
│ ├── mcpd/ @mcpctl/mcpd Daemon (Fastify 5)
│ │ ├── src/routes/ 18 route handlers
│ │ ├── src/services/ 13 services
│ │ ├── src/repositories/ Data access layer
│ │ ├── src/middleware/ Auth, logging, error handling
│ │ └── src/validation/ Zod schemas, RBAC rules
│ ├── mcplocal/ @mcpctl/mcplocal Local proxy
│ │ ├── src/gate/ Session gating + tag matching
│ │ ├── src/proxymodel/ Plugin system + stages + cache
│ │ ├── src/providers/ 6 LLM providers
│ │ ├── src/upstream/ STDIO + HTTP upstream connections
│ │ ├── src/audit/ Event collection + batching
│ │ └── src/health/ Health monitoring
│ ├── db/ @mcpctl/db Database (Prisma)
│ │ ├── prisma/schema.prisma 22 models
│ │ └── prisma/migrations/ 11 migrations
│ └── shared/ @mcpctl/shared Constants, types, validation
├── deploy/ Dockerfiles + entrypoint
├── stack/ Production docker-compose + env
├── scripts/ Build, release, deploy scripts
├── completions/ Fish + Bash completions
├── templates/ MCP server YAML templates
├── docs/ Architecture + design docs
├── fulldeploy.sh Full build → deploy → release
├── deploy.sh Portainer stack deploy
├── pr.sh Gitea PR creation
├── nfpm.yaml RPM/DEB package metadata
├── vitest.config.ts Root test config
├── vitest.workspace.ts Workspace test config
├── tsconfig.base.json Base TypeScript config (strict)
└── pnpm-workspace.yaml Monorepo workspace definition
```
---
## 20. Deferred & Future Work
### Deferred Tasks
| ID | Description | Status |
|----|-------------|--------|
| 88 | Rename proxyMode: filtered → proxy | Deferred |
| 105-109 | Model Studio TUI | Deferred |
| 110 | RBAC for ProxyModels | Deferred |
| 113 | Model Studio docs | Deferred |
### Future Architecture
- **Virtual MCP Audit Server** — Claude-queryable audit tools
- **Graphiti Knowledge Graph** — causal graph from audit events
- **Lab Parameter Simulation** — re-run pipelines with different configs
- **Kubernetes Orchestrator** — beyond Docker/Podman
- **ConfigMaps** — non-sensitive config separate from Secrets
- **Multi-provider failover** — automatic LLM provider cascading
### Completed Major Features
- Project structure + monorepo setup
- MCP Registry Client (official, glama, smithery — 53 tests)
- Health Probe Runner (STDIO, SSE, Streamable HTTP — 12 tests)
- Container orchestration with reconciliation
- Full RBAC with name-scoped bindings
- Gated sessions with prompt scoring
- ProxyModel plugin system with inheritance
- Content pipeline with 4 built-in stages
- Pipeline cache (L1 memory + L2 disk)
- Audit infrastructure with trust model
- Git-based backup and restore
- Shell completions (Fish + Bash)
- RPM/DEB packaging and distribution
- Smoke test framework
- Console inspector for debugging