
# mcpctl — Comprehensive Project Summary

**kubectl for Model Context Protocol servers.**

mcpctl is a production-grade management system for MCP servers, providing a Kubernetes-inspired declarative interface for deploying, orchestrating, and observing MCP servers that connect to Claude and other LLM clients.

---

## Table of Contents

1. [System Architecture](#1-system-architecture)
2. [Component Overview](#2-component-overview)
3. [Resource Model & Design Decisions](#3-resource-model--design-decisions)
4. [CLI Reference](#4-cli-reference)
5. [API Surface (mcpd)](#5-api-surface-mcpd)
6. [Database Schema](#6-database-schema)
7. [Local Proxy (mcplocal)](#7-local-proxy-mcplocal)
8. [ProxyModel Plugin System](#8-proxymodel-plugin-system)
9. [Gated Sessions](#9-gated-sessions)
10. [Content Pipeline & Stages](#10-content-pipeline--stages)
11. [LLM Provider Integration](#11-llm-provider-integration)
12. [Caching](#12-caching)
13. [Authentication & RBAC](#13-authentication--rbac)
14. [Audit Infrastructure & Trust Model](#14-audit-infrastructure--trust-model)
15. [Container Orchestration](#15-container-orchestration)
16. [Deployment & Distribution](#16-deployment--distribution)
17. [Testing Strategy](#17-testing-strategy)
18. [Technology Stack](#18-technology-stack)
19. [Project Structure](#19-project-structure)
20. [Deferred & Future Work](#20-deferred--future-work)

---

## 1. System Architecture

### Three-Tier Design

```
Claude Code / LLM Client
    | (STDIO — MCP JSON-RPC protocol)
    v
mcplocal (Local Daemon — developer machine)
    | (HTTP REST)
    v
mcpd (Remote Daemon — server/NAS, e.g. 10.0.0.194)
    | (Docker/Podman API)
    v
MCP Server Containers (isolated network)
```

```
                                                     ┌────────────┐
┌─────────────────┐     HTTP      ┌──────────────┐   │ PostgreSQL │
│   mcpctl CLI    │──────────────>│    mcpd      │──>│            │
│  (Commander.js) │               │  (Fastify 5) │   └────────────┘
└─────────────────┘               └──────┬───────┘
                                         │ Docker/Podman API
                                         v
                                  ┌──────────────┐
                                  │  Containers  │
                                  │ (MCP servers)│
                                  └──────────────┘

┌─────────────────┐    STDIO     ┌──────────────┐    STDIO/HTTP   ┌────────────┐
│  Claude / LLM   │─────────────>│   mcplocal   │────────────────>│ MCP Servers│
│                 │              │ (McpRouter)  │                 │            │
└─────────────────┘              └──────────────┘                 └────────────┘
```

### Key Principles

- **mcpd owns the database** (PostgreSQL) — the only component that talks to the DB
- **mcplocal is stateless** — config-only, no database, acts as intelligent proxy
- **mcpctl stores only local config and credentials** in `~/.mcpctl/config.json` and `~/.mcpctl/credentials.json`
- **All MCP servers run inside the mcpd container** on the NAS via podman (container-in-container)

---

## 2. Component Overview

### mcpctl (CLI)
- kubectl-like interface for managing the entire system
- Talks to mcplocal (local daemon) via HTTP REST, or directly to mcpd with `--direct`
- Distributed as RPM/DEB package via Gitea registry
- Built with Commander.js, Ink/React for TUI, Inquirer for prompts

### mcplocal (Local Daemon)
- Runs on developer machine as a systemd user service
- Exposes MCP protocol via STDIO to Claude
- Exposes HTTP REST API for mcpctl management commands
- Core responsibilities:
  - Tool namespacing and routing (`server/tool` format)
  - Gated sessions and prompt delivery
  - Content pipeline (transformation stages)
  - LLM integration for intelligent prompt selection
  - Pipeline result caching
  - Audit event collection

### mcpd (Remote Daemon)
- Server-side daemon on NAS/cloud (Fastify 5)
- Manages MCP server containers (Docker/Podman via dockerode)
- PostgreSQL for state, audit logs, access control
- Owns credentials (never exposed to mcplocal)
- REST API for all management operations
- MCP proxy endpoint for direct tool invocation
- Health probe runner for container monitoring
- Git-based backup system

### @mcpctl/db (Database Layer)
- Prisma ORM with PostgreSQL
- 22 models, 11 migrations
- Template seeding from YAML files at startup

### @mcpctl/shared (Shared Utilities)
- Constants, types, validation schemas (Zod)
- Secret encryption/decryption utilities
- Zero external dependencies beyond Zod

---

## 3. Resource Model & Design Decisions

### ADR-001: Kubernetes-Style Resource Model

| mcpctl Resource | K8s Analogy | Behavior |
|----------------|-------------|----------|
| **Server** | Deployment | Self-contained, complete definition. Contains image, command, transport, env refs, replicas. No external template dependencies at runtime. |
| **Instance** | Pod | Immutable, ephemeral, auto-managed by reconciliation loop. No `create instance` or `edit instance`. Delete triggers re-creation. |
| **Secret** | Secret | Holds sensitive key-value pairs. Servers reference via `env[].valueFrom.secretRef`. |
| **Project** | Namespace | Groups servers, configures ProxyModel and LLM provider. Generates `.mcp.json`. |
| **Prompt** | ConfigMap (sort of) | Instruction text delivered to Claude. Global or project-scoped. Priority-ranked. |
| **Template** | — | Blueprints for server creation. Used at create-time only, not runtime. |
| **RbacDefinition** | ClusterRoleBinding | Named policies with subjects and roleBindings. |

### ADR-002: Profiles Replaced with Secrets

The original `McpProfile` resource tried to be secrets, configmaps, and project-server links simultaneously. Environment variables declared in profiles were never actually passed to running containers.

**Decision:** Replace with dedicated `Secret` resource following Kubernetes conventions:
```yaml
servers:
  - name: ha-mcp
    env:
      - name: HOMEASSISTANT_TOKEN
        valueFrom:
          secretRef:
            name: ha-credentials
            key: HOMEASSISTANT_TOKEN
```
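
A minimal sketch of how `secretRef` entries like the one above could be resolved into a flat environment map before a container starts. The names `EnvEntry`, `SecretStore`, and `resolveEnv` are hypothetical, and the literal `value` field is an assumption borrowed from the Kubernetes convention; none of them are taken from the mcpctl source.

```typescript
// Shapes mirror the YAML above; `value` for literal entries is an assumption.
interface EnvEntry {
  name: string;
  value?: string;
  valueFrom?: { secretRef: { name: string; key: string } };
}

// Hypothetical in-memory view of secrets: secret name -> key -> value.
type SecretStore = Record<string, Record<string, string>>;

// Turn env entries into the flat map that would be passed to a container.
function resolveEnv(env: EnvEntry[], secrets: SecretStore): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const entry of env) {
    const ref = entry.valueFrom?.secretRef;
    if (ref) {
      const value = secrets[ref.name]?.[ref.key];
      if (value === undefined) {
        // Fail loudly so a missing secret never becomes a silently empty variable.
        throw new Error(`secret ${ref.name}/${ref.key} not found for ${entry.name}`);
      }
      resolved[entry.name] = value;
    } else if (entry.value !== undefined) {
      resolved[entry.name] = entry.value;
    }
  }
  return resolved;
}
```

Failing on a missing key is a deliberate choice in this sketch: it surfaces the class of bug described above, where declared variables never reached the container.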

### ADR-003: Self-Contained Servers with Source Tracking

Servers store complete definitions (no runtime template dependencies). Optional `source` metadata enables registry-based upgrades via 3-way diff (old snapshot vs current server vs new template).

**Rationale:** Matches kubectl mental model. `get server X -o yaml > new.yaml && edit && apply` works naturally. Duplication is minimal (~10 lines YAML).

### ADR-004: ConfigMaps Deferred

Only Secrets are implemented. A separate ConfigMap resource can be added later if needed; deferring it keeps the model simple.

### ADR-005: Apply-Compatible YAML Round-Trip

`mcpctl get server ha-mcp -o yaml > s.yaml && mcpctl apply -f s.yaml` must work:
- `get -o yaml/json` strips internal fields (id, createdAt, updatedAt, version, ownerId)
- Output wrapped in resource key: `{ servers: [...] }`
- `describe -o yaml/json` keeps full raw output (for debugging)
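
The stripping and wrapping rules above can be sketched as follows. This assumes resources are plain objects; `INTERNAL_FIELDS` and `toApplyShape` are hypothetical names used only for illustration.

```typescript
// Fields that `get -o yaml/json` strips, per the list above.
const INTERNAL_FIELDS = ["id", "createdAt", "updatedAt", "version", "ownerId"] as const;

type Resource = Record<string, unknown>;

// Hypothetical helper: drop internal fields and wrap the result under the resource key.
function toApplyShape(resourceKey: string, items: Resource[]): Record<string, Resource[]> {
  const internal = new Set<string>(INTERNAL_FIELDS);
  const cleaned = items.map((item) =>
    Object.fromEntries(Object.entries(item).filter(([key]) => !internal.has(key))),
  );
  return { [resourceKey]: cleaned };
}

const applyReady = toApplyShape("servers", [
  { id: "abc123", name: "ha-mcp", transport: "STDIO", version: 3 },
]);
console.log(JSON.stringify(applyReady));
// prints {"servers":[{"name":"ha-mcp","transport":"STDIO"}]}
```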

### ADR-006: CLI Design Principles

1. Everything possible via `apply -f` MUST also be possible via `create` CLI flags
2. Support `-o yaml` and `-o json` like kubectl
3. `describe` shows visually clean sectioned output with tables
4. Name resolution works everywhere (not just IDs)
5. Instances are immutable (like pods) — no create/edit

---

## 4. CLI Reference

### Global Options
```
--daemon-url <url>       mcplocal daemon URL
--direct                 Bypass mcplocal, connect directly to mcpd
-p, --project <name>     Target project
-o, --output <format>    table | json | yaml
-v, --version            Show version
```

### Resource Operations

| Command | Description |
|---------|-------------|
| `mcpctl get <resource> [name]` | List resources or fetch by name/ID. Supports glob patterns (`graf*`). |
| `mcpctl describe <resource> <name>` | Detailed view with sections and tables. |
| `mcpctl create <resource> <name> [opts]` | Create resource. Mirrors `apply -f` capabilities. |
| `mcpctl edit <resource> <name>` | Open in `$EDITOR` as YAML, apply on save. |
| `mcpctl patch <resource> <name> key=val...` | Patch individual fields without editor. |
| `mcpctl delete <resource> <name>` | Delete resource. |
| `mcpctl apply -f <file>` | Declarative YAML/JSON application (like `kubectl apply`). Supports `--dry-run`. |

### Supported Resources

servers, projects, instances, secrets, templates, users, groups, rbac, prompts, promptrequests, serverattachments (virtual), proxymodels (virtual, from mcplocal), all (project export)

### Resource Aliases
```
server/srv → servers                     project/proj → projects
instance/inst → instances                secret/sec → secrets
template/tpl → templates                 prompt → prompts
user → users                             group → groups
rbac/rbac-definition → rbac              promptrequest/pr → promptrequests
serverattachment/sa → serverattachments  proxymodel/pm → proxymodels
```

### Lifecycle & Diagnostics

| Command | Description |
|---------|-------------|
| `mcpctl status` | Show connectivity, auth status, LLM provider health, available models. |
| `mcpctl login` | Authenticate with mcpd (first login bootstraps initial user). |
| `mcpctl logout` | Clear stored credentials. |
| `mcpctl logs <name> [-t N] [-i index]` | Stream container logs. Resolves server name → running instance. |
| `mcpctl cache stats` | Show pipeline cache statistics per namespace. |
| `mcpctl cache clear [ns] [--older-than N]` | Clear pipeline cache. |
| `mcpctl backup` | Show git backup status, public SSH key. |
| `mcpctl backup log [-n N]` | Show backup commit history. |
| `mcpctl backup restore list/diff/to` | List, diff, or restore to a specific backup commit. |

### Console & Inspection

| Command | Description |
|---------|-------------|
| `mcpctl console [project]` | Interactive TUI — request/response timeline, tool inspection. |
| `mcpctl console --stdin-mcp` | MCP server mode over stdin/stdout (for Claude integration). |
| `mcpctl console --audit` | Browse audit events from mcpd interactively. |

### Configuration

| Command | Description |
|---------|-------------|
| `mcpctl config view` | Show current configuration. |
| `mcpctl config set <key> <value>` | Set config value (mcplocalUrl, mcpdUrl, registries, outputFormat, etc.). |
| `mcpctl config path` | Show config file path. |
| `mcpctl config setup` | Interactive configuration wizard. |
| `mcpctl config claude -p <project>` | Generate `.mcp.json` for Claude Code. |

### Create Subcommands

```bash
mcpctl create server <name> [--package-name X] [--docker-image X] [--transport STDIO|SSE|STREAMABLE_HTTP]
  [--runtime node|python] [--replicas N] [--env KEY=val] [--from-template name:version]

mcpctl create secret <name> [--data key=val ...] [--data-file path.json]

mcpctl create project <name> [-d desc] [--proxy-model default|gate|content-pipeline] [--server name ...]

mcpctl create user <email> [--password pass] [--name name]

mcpctl create group <name> [-d desc] [--member email ...]

mcpctl create rbac <name> [--subject kind:name] [--role-binding role:resource[:name]]

mcpctl create prompt <name> [--content text] [--project name] [--priority 1-10] [--link url]
```

### Apply File Format

```yaml
secrets:
  - name: my-secret
    data:
      KEY: value

servers:
  - name: my-server
    transport: STDIO
    packageName: "@modelcontextprotocol/server-example"
    env:
      - name: API_KEY
        valueFrom:
          secretRef:
            name: my-secret
            key: KEY

projects:
  - name: my-project
    proxyModel: default
    servers:
      - my-server

serverattachments:
  - server: my-server
    project: my-project

prompts:
  - name: my-prompt
    project: my-project
    content: "Instruction text..."
    priority: 5
```

---

## 5. API Surface (mcpd)

All endpoints under `/api/v1/` require Bearer token auth except `/auth/*` and `/health*`.

### Authentication
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/auth/bootstrap` | POST | First-user setup (creates admin + bootstrap RBAC) |
| `/auth/status` | GET | `{hasUsers: boolean}` (unauthenticated) |
| `/auth/login` | POST | Returns token + user info |
| `/auth/logout` | POST | Invalidate session |
| `/auth/me` | GET | Current user identity |
| `/auth/impersonate` | POST | Create session for another user (requires `run:impersonate`) |

### Servers
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/servers` | GET | List all servers |
| `/servers/:id` | GET | Get server by CUID |
| `/servers` | POST | Create server (validates name uniqueness, image/package) |
| `/servers/:id` | PUT | Update server, re-reconciles replicas |
| `/servers/:id` | DELETE | Delete server + cascade-delete all instances |

### Instances
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/instances` | GET | List (optional `?serverId=` filter) |
| `/instances/:id` | GET | Get instance |
| `/instances/:id` | DELETE | Delete instance, triggers reconciliation |
| `/instances/:id/inspect` | GET | Docker inspect output (state, port, IP) |
| `/instances/:id/logs` | GET | Container logs (`?tail=N`) |

### Projects
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/projects` | GET | List (RBAC-filtered) |
| `/projects/:id` | GET/POST/PUT/DELETE | CRUD by CUID or name |
| `/projects/:id/mcp-config` | GET | Generate `.mcp.json` |
| `/projects/:id/instructions` | GET | Get prompt + attached servers for system message |
| `/projects/:id/servers` | GET/POST | List/attach servers |
| `/projects/:id/servers/:name` | DELETE | Detach server |

### Prompts & Prompt Requests
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/prompts` | GET/POST | List/create approved prompts |
| `/prompts/:id` | PUT/DELETE | Update/delete (system prompts reset to default) |
| `/prompts/:id/regenerate-summary` | POST | Force re-generate summary/chapters |
| `/promptrequests` | GET/POST | List/create pending requests |
| `/promptrequests/:id/approve` | POST | Atomic delete request → create prompt |
| `/projects/:name/prompts/visible` | GET | Approved + session's pending |
| `/projects/:name/prompt-index` | GET | Compact index for gating |

### Secrets, Users, Groups, RBAC
Standard CRUD on `/secrets`, `/users`, `/groups`, `/rbac-definitions`.

### Health & Monitoring
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health/overview` | GET | System health, instance counts, error rate |
| `/health/instances/:id` | GET | Instance-specific health, uptime, latency |
| `/metrics` | GET | Request counts, error counts, last request time |
| `/healthz` | GET | Liveness probe |

### Backup, Restore, Audit
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/backup` | POST | Create encrypted bundle (servers, secrets, projects, users, groups, rbac) |
| `/restore` | POST | Restore bundle (merge/skip/overwrite strategy) |
| `/audit/events` | POST/GET | Batch insert from mcplocal / query with filters |
| `/audit/sessions` | GET | Session aggregates (first/last seen, event counts) |
| `/git/backup/init` | POST | Initialize git backup with SSH credentials |
| `/git/backup/status` | GET | Backup sync status |
| `/git/backup/sync` | POST | Manually trigger a sync |

### MCP Proxy
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/mcp/proxy` | POST | Forward JSON-RPC to running MCP server instance. Dispatches by transport (STDIO via docker exec, SSE/HTTP via direct HTTP). Maintains persistent STDIO connections. |

---

## 6. Database Schema

PostgreSQL via Prisma ORM. 22 models across 11 migrations.

### Core Models

**User & Auth:**
- `User` — email/password (bcrypt), role (USER/ADMIN), optional OAuth
- `Session` — Bearer token with 30-day TTL
- `Group` / `GroupMember` — user groups for RBAC

**MCP Infrastructure:**
- `McpServer` — transport (STDIO/SSE/STREAMABLE_HTTP), docker image, package name, runtime (node/python), env vars (JSON), health check config, replicas, external URL
- `McpTemplate` — reusable blueprints for server creation (mirrors McpServer fields)
- `McpInstance` — running containers, status (STARTING/RUNNING/STOPPING/STOPPED/ERROR), container ID, port, health status, events

**Organization:**
- `Project` — LLM config (provider, model), proxy model, gated flag, prompt instructions, server overrides
- `ProjectServer` — junction table linking projects to servers
- `Secret` — named secret bundles (data as encrypted JSON), versioned

**Content:**
- `Prompt` — approved system prompts (global or project-scoped), priority, summary/chapters, optional link target
- `PromptRequest` — pending prompt proposals from LLM sessions

**Audit, Backup & RBAC:**
- `AuditLog` — user action trail (action, resource, resourceId, details)
- `AuditEvent` — pipeline/gate/tool trace events from mcplocal (sessionId, projectName, eventKind, correlationId, userName)
- `BackupPending` — queue for git-based backup sync
- `RbacDefinition` — named RBAC policies

---

## 7. Local Proxy (mcplocal)

### Request Flow

```
Claude (STDIO JSON-RPC)
    ↓
StdioProxyServer (reads from stdin)
    ↓
McpRouter.route(request)
    ├→ PluginSessionContext (per-session state)
    ├→ ProxyModelPlugin hooks (intercept/transform)
    ├→ Upstream lookup (tool name prefix → server)
    └→ Response (with optional drill-down sections)
```

### Router Responsibilities

- Manages upstream connections (STDIO child processes, HTTP)
- Maps tools/resources/prompts to servers via name prefix (`servername/toolname`)
- Maintains prompt index + system prompt cache (TTL-based)
- Dispatches plugin hooks via `getOrCreatePluginContext`
- Section storage for drill-down navigation
- Audit event collection and batching
- Link resolution (relative → absolute URLs)
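
As an illustration of the name-prefix mapping, the sketch below splits a namespaced tool name into its server and tool parts. `splitToolName` is a hypothetical helper, and treating the first `/` as the separator is an assumption.

```typescript
// Split a namespaced tool name ("servername/toolname") into its parts.
// Assumption: the first "/" is the separator, so the tool part may itself contain "/".
function splitToolName(qualified: string): { server: string; tool: string } | null {
  const slash = qualified.indexOf("/");
  // Reject names with no prefix, an empty prefix, or an empty tool part.
  if (slash <= 0 || slash === qualified.length - 1) return null;
  return { server: qualified.slice(0, slash), tool: qualified.slice(slash + 1) };
}
```

In this sketch an un-prefixed name such as `begin_session` returns `null`, which a router could treat as a virtual tool handled locally rather than forwarded upstream.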

### Upstream Transports

| Transport | Implementation |
|-----------|---------------|
| STDIO | Spawns child process, bidirectional pipe, JSON-RPC over newline-delimited JSON |
| SSE | HTTP GET for event stream, POST for messages |
| Streamable HTTP | HTTP POST with JSON-RPC payloads |

### HTTP Endpoints (mcplocal)

| Endpoint | Description |
|----------|-------------|
| `GET /proxymodels` | List all models (YAML pipelines + TS plugins) |
| `GET /proxymodels/:name` | Single model details |
| `GET /proxymodels/stages` | List available stages |
| `POST /proxymodels/reload` | Force reload stages from disk |
| `GET /cache/stats` | Per-namespace cache statistics |
| `DELETE /cache` | Clear all or by age |
| `DELETE /cache/:namespace` | Clear specific namespace |
| `POST /mcp` | JSON-RPC request forwarding |

---

## 8. ProxyModel Plugin System

A **ProxyModel** is either a **Pipeline** (YAML) or a **Plugin** (TypeScript).

### Plugin Interface

| Hook | When it fires |
|------|--------------|
| `onSessionCreate` | New MCP session established |
| `onSessionDestroy` | Session ends |
| `onInitialize` | MCP initialize request — can inject instructions |
| `onToolsList` | tools/list — can filter/modify tool list |
| `onToolCallBefore` | Before forwarding a tool call — can intercept |
| `onToolCallAfter` | After receiving tool result — can transform |
| `onResourcesList` | resources/list — can filter resources |
| `onResourceRead` | resources/read — can intercept reads |
| `onPromptsList` | prompts/list — can filter prompts |
| `onPromptGet` | prompts/get — can intercept reads |

### Built-in Plugins

| Plugin | Extends | Gating | Content Pipeline | Use Case |
|--------|---------|:------:|:----------------:|----------|
| `gate` | — | Yes | No | Gating + prompt delivery only |
| `content-pipeline` | — | No | Yes | Content transformation only |
| `default` | gate + content-pipeline | Yes | Yes | Full pipeline (most common) |

**Inheritance:** Plugins can extend parents. Conflicting hooks from multiple parents cause load-time errors (except chainable lifecycle hooks which run sequentially).

### Pipeline Configuration (YAML)

```yaml
name: default
spec:
  controller: gate
  controllerConfig: { byteBudget: 8192 }
  stages:
    - type: passthrough
    - type: paginate
      config: { pageSize: 8000 }
  appliesTo: [prompt, toolResult]
  cacheable: true
```

### Per-Session Context

Each session gets a `PluginSessionContext` providing:
- Session state (`Map<string, unknown>`)
- LLM provider, cache provider, structured logger
- Virtual tool/server registration
- Upstream routing and tool discovery
- Content processing and notifications
- Audit event emission

---

## 9. Gated Sessions

### Problem

When Claude connects to an MCP server, it sees all tools immediately and starts using them. In a managed environment, you want to deliver relevant context (prompts/instructions) before granting tool access.

### Solution: Keyword-Driven Prompt Retrieval

1. **Initialize:** Instructions include prompt index + "call `begin_session` immediately"
2. **Gated `tools/list`:** Only `begin_session` visible
3. **Claude calls `begin_session`** with keywords describing the task
4. **Prompt matching:** Keywords matched against prompt summaries/chapters
5. **Ungating:** Matched prompts returned + `tools/list_changed` notification sent
6. **Full access:** All upstream tools now visible

### Prompt Scoring

**Formula:** `priority + (matchCount * priority)`

- Priority alone is baseline — ensures global prompts compete for inclusion
- Tag matches multiply priority — relevant prompts score higher
- Priority 10 = always included (bypasses budget)
- Byte budget capped at 8KB; prompts that overflow it are listed index-only
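
A minimal sketch of the scoring and budget rules above. The `IndexedPrompt` shape, the `tags` field, and the choice that priority-10 prompts do not consume budget are assumptions made for illustration, not confirmed behaviour.

```typescript
interface IndexedPrompt {
  name: string;
  priority: number; // 1-10
  tags: string[];   // assumed: keywords derived from the prompt's summary/chapters
  content: string;
}

const BYTE_BUDGET = 8192;

// Formula from the text: priority + (matchCount * priority).
function score(prompt: IndexedPrompt, keywords: string[]): number {
  const wanted = new Set(keywords.map((k) => k.toLowerCase()));
  const matchCount = prompt.tags.filter((t) => wanted.has(t.toLowerCase())).length;
  return prompt.priority + matchCount * prompt.priority;
}

function selectPrompts(prompts: IndexedPrompt[], keywords: string[]) {
  const ranked = [...prompts].sort((a, b) => score(b, keywords) - score(a, keywords));
  const full: string[] = [];
  const indexOnly: string[] = [];
  let used = 0;
  for (const p of ranked) {
    const size = new TextEncoder().encode(p.content).length; // UTF-8 byte length
    if (p.priority === 10) {
      full.push(p.name); // always included; assumed not to count against the budget
    } else if (used + size <= BYTE_BUDGET) {
      full.push(p.name);
      used += size;
    } else {
      indexOnly.push(p.name); // overflow: listed by name only
    }
  }
  return { full, indexOnly };
}
```

With this formula a priority-5 prompt matching one keyword scores 10, the same as an unmatched priority-10 prompt, which is why the baseline term keeps global prompts competitive.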

### Critical Design Lessons

**What works:**
- One gate tool (`begin_session`), zero ambiguity
- Instructions say "check its input schema" (not naming specific parameters)
- "immediately" and "required" prevent Claude from exploring first
- Tool names listed as preview in instructions (helps keyword generation)
- `tools/list_changed` notification mandatory after ungating
- Auto-ungate fallback if Claude bypasses gate

**What fails:**
- Naming parameters that don't match the schema
- Complex conditional instructions (Claude prefers simple paths)
- Multiple tools in gated state (Claude skips the gate)
- Gate instructions only in tool description (must be in initialize response)
- Burying the call-to-action after 200 lines of context

### Complete Flow

```
Client                    mcplocal                    upstream
  │── initialize ────────>│
  │<── instructions ──────│  (gate instructions + prompt index + tool preview)
  │── tools/list ────────>│
  │<── [begin_session] ───│  (ONLY begin_session visible)
  │── tools/call ────────>│
  │   begin_session        │── match prompts ─────────>│
  │   {tags:[...]}         │<── prompt content ────────│
  │<── matched prompts ───│  (full content + encouragement)
  │<── notification ──────│  (tools/list_changed)
  │── tools/list ────────>│
  │<── [108+ tools] ──────│  (ALL tools now visible)
  │                        │
  │  Claude proceeds with full tool access
```

---

## 10. Content Pipeline & Stages

### How It Works

Tool results pass through an ordered sequence of stages before reaching Claude:

1. Each stage receives previous stage's content
2. Returns `{content, sections?, metadata?}`
3. Sections enable drill-down navigation
4. Stage errors are caught — pipeline continues with previous content
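
The four rules above can be sketched as a small runner. The `Section` shape and the single-argument `Stage` signature are simplifications assumed here (real stages also receive a context object), and `runPipeline` is a hypothetical name.

```typescript
// Assumed shapes; only `content`, `sections`, and `metadata` come from the text.
interface Section { id: string; title: string; content: string }
interface StageResult {
  content: string;
  sections?: Section[];
  metadata?: Record<string, unknown>;
}
type Stage = (input: StageResult) => Promise<StageResult>;

async function runPipeline(initial: string, stages: Stage[]): Promise<StageResult> {
  let current: StageResult = { content: initial };
  for (const stage of stages) {
    try {
      // Each stage receives the previous stage's result.
      current = await stage(current);
    } catch (err) {
      // A failing stage is skipped; the pipeline continues with the previous content.
      console.warn(`stage failed, keeping previous content: ${String(err)}`);
    }
  }
  return current;
}
```

Catching per stage rather than around the whole loop is what lets later stages still run after an earlier one fails.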

### Built-in Stages

| Stage | Purpose |
|-------|---------|
| `passthrough` | Identity transform (testing/baseline) |
| `paginate` | Split large content into numbered pages (8KB default). LLM-generated page titles (cached). |
| `section-split` | Split by structure: JSON arrays/objects → elements/keys, YAML → keys, prose → `##` headers, code → function/class boundaries. Merges tiny sections, re-splits oversized ones. |
| `summarize-tree` | Hierarchical LLM-generated section summaries. Groups sections into trees. Cached. |

### Section Drill-Down

After pipeline produces sections:
1. Full content replaced with compact table of contents + `_resultId`
2. Sections stored in session-scoped store (5-minute TTL)
3. Client calls same tool with `_resultId` + `_section` to retrieve specific section
4. Supports hierarchical navigation (sections within sections)
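
One way to picture the session-scoped store is a map keyed by `_resultId` whose entries expire after the TTL. This is a sketch under stated assumptions: `SectionStore` is a hypothetical class, sections are flattened to strings, and the clock is injectable only to make expiry easy to test.

```typescript
interface StoredResult {
  sections: Map<string, string>;
  expiresAt: number; // epoch milliseconds
}

class SectionStore {
  private results = new Map<string, StoredResult>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Store all sections of one pipeline result under its result id.
  put(resultId: string, sections: Record<string, string>): void {
    this.results.set(resultId, {
      sections: new Map(Object.entries(sections)),
      expiresAt: this.now() + this.ttlMs,
    });
  }

  // Look up a single section; expired results are evicted lazily on access.
  get(resultId: string, section: string): string | undefined {
    const entry = this.results.get(resultId);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.results.delete(resultId);
      return undefined;
    }
    return entry.sections.get(section);
  }
}
```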

### Custom Stages

Drop `.js` files in `~/.mcpctl/stages/`:
```javascript
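export default async function myStage(input, context) {
  // context.llm, context.cache, context.log available
  // Assumption: input is the previous stage's text, or an object carrying it in `content`.
  const text = typeof input === 'string' ? input : input.content;
  const transformedContent = text.trim(); // replace with your own transform
  return { content: transformedContent, sections: [] };
}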
```

Hot-reload with 300ms file watch debounce. Built-in stages take precedence.

---

## 11. LLM Provider Integration

### Supported Providers

| Provider | Type | Tier |
|----------|------|------|
| Gemini CLI | Local | Fast |
| Ollama | Local | Fast |
| DeepSeek | API | Fast/Heavy |
| OpenAI | API | Heavy |
| Anthropic | API | Heavy |
| vLLM | Local | Configurable |
| vLLM Managed | Auto-managed local | Configurable |

### Tier System

- **Fast tier:** Quick, cheap models for pipeline stages and keyword extraction
- **Heavy tier:** Full models for complex prompt selection and summarization
- **Legacy active:** Single default provider (fallback)

### LLM Adapter

Stages use a simple interface:
```typescript
interface LLMProvider {
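  // The shape of `options` is not specified here; a generic record is assumed.
  complete(prompt: string, options?: Record<string, unknown>): Promise<string>;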
  available(): boolean;
}
```

Resolution order: named provider → fast tier → heavy tier → active provider.
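
The resolution order can be sketched as a chain of lookups. `ProviderEntry` and `resolveProvider` are hypothetical names, and skipping providers whose `available()` returns false is an assumption about how fallback works.

```typescript
type Tier = "fast" | "heavy";

interface ProviderEntry {
  name: string;
  tier?: Tier;
  available(): boolean;
}

// Named provider, then fast tier, then heavy tier, then the legacy active provider.
function resolveProvider(
  providers: ProviderEntry[],
  opts: { named?: string; active?: string } = {},
): ProviderEntry | undefined {
  const usable = providers.filter((p) => p.available());
  return (
    usable.find((p) => p.name === opts.named) ??
    usable.find((p) => p.tier === "fast") ??
    usable.find((p) => p.tier === "heavy") ??
    usable.find((p) => p.name === opts.active)
  );
}
```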

### Multi-Provider Configuration

```json
{
  "llm": {
    "providers": [
      { "name": "fast-local", "type": "ollama", "model": "llama3", "tier": "fast" },
      { "name": "heavy-api", "type": "openai", "model": "gpt-4", "tier": "heavy" }
    ]
  }
}
```

---

## 12. Caching

### Architecture: L1 Memory + L2 Disk

- **L1 in-memory:** LRU map (default 500 entries) for fast lookups
- **L2 disk:** `~/.mcpctl/cache/<namespace>/<key>.dat`
  - Namespace: `provider--model--proxymodel` (e.g., `openai--gpt-4o--content-pipeline`)
  - Key: 16-char hex SHA256 prefix of content
  - Value: raw content (no JSON wrapper)
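
A sketch of how the namespace, key, and on-disk path described above could be derived with Node's built-in `crypto`, `os`, and `path` modules. The function names are hypothetical; the layout follows the bullets above.

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join } from "node:path";

// Namespace format: provider--model--proxymodel.
function cacheNamespace(provider: string, model: string, proxyModel: string): string {
  return [provider, model, proxyModel].join("--");
}

// Key: first 16 hex characters of the SHA-256 digest of the content.
function cacheKey(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex").slice(0, 16);
}

// Path: <dir>/<namespace>/<key>.dat, with <dir> defaulting to ~/.mcpctl/cache.
function cachePath(
  namespace: string,
  key: string,
  dir: string = join(homedir(), ".mcpctl", "cache"),
): string {
  return join(dir, namespace, `${key}.dat`);
}

console.log(cacheNamespace("openai", "gpt-4o", "content-pipeline"));
// prints openai--gpt-4o--content-pipeline
```

Because the key depends only on content, identical inputs under the same namespace map to the same file, while a change of provider, model, or proxymodel lands in a different directory.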

### Configuration

| Option | Default | Examples |
|--------|---------|---------|
| `maxSize` | 256MB | `"1GB"`, `"10%"` (of partition), `536870912` (bytes) |
| `ttlMs` | 30 days | Any millisecond value |
| `maxMemoryEntries` | 500 | L1 LRU cap |
| `dir` | `~/.mcpctl/cache` | Custom path |

### Management

```bash
mcpctl cache stats                        # Per-namespace breakdown
mcpctl cache clear                        # Clear everything
mcpctl cache clear openai--gpt-4--default # Clear specific namespace
mcpctl cache clear --older-than 7         # Clear entries older than 7 days
```

### HTTP API
```
GET  /cache/stats              # Per-namespace stats
DELETE /cache                  # Clear all (or ?olderThan=N)
DELETE /cache/:namespace       # Clear specific namespace
```

---

## 13. Authentication & RBAC

### Auth Flow

1. `mcpctl login` → prompts for email/password
2. First login (no users in system) → bootstrap: creates admin user + admin group + bootstrap RBAC
3. POST `/auth/login` → returns 30-day bearer token
4. Token stored in `~/.mcpctl/credentials.json`
5. CLI passes token to mcplocal config
6. mcplocal attaches `Authorization: Bearer <token>` to all mcpd requests
7. mcpd validates token against Session table

### RBAC Model

**Subjects:** User (by email), Group, ServiceAccount

**Roles & Capabilities:**

| Role | Grants |
|------|--------|
| `edit` | view, create, delete, edit, expose |
| `view` | view |
| `create` | create |
| `delete` | delete |
| `run` | run (for operations) |
| `expose` | expose, view |

**Resources:** `*`, servers, instances, secrets, projects, templates, users, groups, rbac, prompts, promptrequests

**Binding Types:**
- **Resource binding:** `{role: 'edit', resource: 'servers', name?: 'my-server'}`
  - With `name`: user can only access that specific resource
  - Without `name`: user can access all resources of that type
- **Operation binding:** `{role: 'run', action: 'impersonate'}`
  - Grants permission for named operations (backup, restore, audit-purge, logs, impersonate)
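
The role table and resource bindings above can be sketched as a single check. `ROLE_GRANTS` mirrors the table; `isAllowed` and the way the `*` wildcard is matched are assumptions for illustration.

```typescript
type Capability = "view" | "create" | "delete" | "edit" | "expose" | "run";

// Mirrors the Roles & Capabilities table.
const ROLE_GRANTS: Record<string, Capability[]> = {
  edit: ["view", "create", "delete", "edit", "expose"],
  view: ["view"],
  create: ["create"],
  delete: ["delete"],
  run: ["run"],
  expose: ["expose", "view"],
};

interface ResourceBinding {
  role: string;
  resource: string; // a resource type, or "*" for all types
  name?: string;    // when set, the binding covers only this named resource
}

function isAllowed(
  bindings: ResourceBinding[],
  capability: Capability,
  resource: string,
  name: string,
): boolean {
  return bindings.some(
    (b) =>
      (ROLE_GRANTS[b.role] ?? []).includes(capability) &&
      (b.resource === "*" || b.resource === resource) &&
      (b.name === undefined || b.name === name),
  );
}
```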

### Resolution

1. CLI resolves name → CUID client-side before API calls
2. RBAC hook resolves CUID → name before checking bindings
3. List filtering: `getAllowedScope()` computes allowed names, `preSerialization` hook filters arrays
4. Wildcard scope: user has unscoped binding → sees all resources
5. Named scope: user has only name-scoped bindings → filtered to allowed names

---

## 14. Audit Infrastructure & Trust Model

### Event Kinds

| Event Kind | Description |
|------------|-------------|
| `pipeline_execution` | Full pipeline run summary (duration, stage count, sizes) |
| `stage_execution` | Individual stage detail (duration, input/output size, error) |
| `gate_decision` | Gate open/close with client intent and matched prompts |
| `prompt_delivery` | Which prompts were sent, match scores |
| `tool_call_trace` | Tool call with server + timing + result size |
| `rbac_decision` | Access control decisions |
| `session_bind` | Session initialization |

### Trust Model

| Source | Verified | Meaning |
|--------|----------|---------|
| `client` | false | Client LLM claims (begin_session intent, tags) |
| `mcplocal` | true | Server-side data (prompt matches, pipeline transforms) |
| `mcpd` | true | mcpd-originated events |

### AuditCollector

Fire-and-forget batching: 50 events max, 5-second flush interval. POSTs to mcpd. Non-blocking — audit failures don't affect tool calls.
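
A minimal sketch of this batching behaviour, assuming "50 events max" means a batch is flushed as soon as it reaches 50 events. The class body, the `Sender` type, and the `stop()` method are illustrative and not taken from the actual `AuditCollector`.

```typescript
interface AuditEvent {
  eventKind: string;
  sessionId: string;
  [key: string]: unknown;
}

// Assumed transport: something that POSTs a batch to mcpd.
type Sender = (batch: AuditEvent[]) => Promise<void>;

class AuditCollector {
  private queue: AuditEvent[] = [];
  private timer: ReturnType<typeof setInterval>;
  private send: Sender;
  private maxBatch: number;

  constructor(send: Sender, maxBatch: number = 50, flushMs: number = 5000) {
    this.send = send;
    this.maxBatch = maxBatch;
    this.timer = setInterval(() => this.flush(), flushMs);
  }

  emit(event: AuditEvent): void {
    this.queue.push(event);
    if (this.queue.length >= this.maxBatch) this.flush();
  }

  flush(): void {
    if (this.queue.length === 0) return;
    const batch = this.queue.splice(0, this.queue.length);
    // Fire-and-forget: errors are swallowed so audit failures never block tool calls.
    this.send(batch).catch(() => {});
  }

  stop(): void {
    clearInterval(this.timer);
    this.flush();
  }
}
```

Swallowing send errors means events can be lost when mcpd is unreachable; that is the trade-off of keeping audit off the tool-call path.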

### Correlation & Causality

- `correlationId` links related events (all events from one tool call)
- `parentEventId` enables causal chains (gate_decision → pipeline_execution)
- `userName` tracks which user triggered the event
- Designed for future graphiti knowledge graph ingestion

### Per-Server Targeting

Different servers in a project can have different proxymodel configs via `serverOverrides` on the project resource. Resolution: server override → project default → null.
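
The fallback chain can be written as one expression. The shape of `serverOverrides` (server name mapped to a partial config with a `proxyModel` field) is an assumption, as is the helper name `resolveProxyModel`.

```typescript
// Assumed shape: `serverOverrides` maps a server name to a partial config.
interface ProjectConfig {
  proxyModel?: string;
  serverOverrides?: Record<string, { proxyModel?: string }>;
}

// Server override, then project default, then null.
function resolveProxyModel(project: ProjectConfig, serverName: string): string | null {
  return project.serverOverrides?.[serverName]?.proxyModel ?? project.proxyModel ?? null;
}
```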

### Future (Designed, Not Implemented)

- **Virtual MCP Audit Server:** mcpd-hosted virtual server providing `query_audit_log`, `get_session_timeline` tools. Claude can directly query audit data.
- **Graphiti Integration:** Causal graph with entity types (Session, Tool, Server, ProxyModel, Prompt, Stage) and edges (`triggered_by`, `transformed_by`, `verified_by`).
- **Lab Parameter Simulation:** Select any pipeline event, retrieve original input, re-run with different proxyModel/LLM/stages, side-by-side diff.
- **Audit Level Config:** Per-server `auditLevel: 'full' | 'hash-only' | 'disabled'`.

---

## 15. Container Orchestration

### Orchestrator Interface

`McpOrchestrator` abstracts container management (Docker/Podman today, Kubernetes in the future):
- `pullImage`, `createContainer`, `stopContainer`, `removeContainer`
- `inspectContainer`, `getContainerLogs`, `execInContainer`, `ping`

### Container Management

- Labels: `mcpctl.managed=true` for filtering
- Network: `mcp-servers` (configurable via `MCPD_MCP_NETWORK`)
- Resource limits: 512MB RAM, 0.5 CPU (configurable)
- Internal container IP exposed via inspect

### Runtime Spawn Commands

| Runtime | Command |
|---------|---------|
| Node | `npx --prefer-offline -y <packageName>` |
| Python | `uvx <packageName>` |
| Custom | Explicit `command` field |

### Health Probes

Periodic MCP tool-call probes (like K8s livenessProbe):
- Default interval: 15 seconds
- Dispatch by transport: STDIO (docker exec), HTTP (JSON-RPC)
- Failure threshold: 3 consecutive failures → unhealthy
- Updates instance `healthStatus` and `lastHealthCheck`
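
The failure-threshold rule can be sketched as a small state tracker. `HealthTracker` is a hypothetical name, the two-value status is a simplification, and resetting to healthy after a single successful probe is an assumption.

```typescript
type HealthStatus = "healthy" | "unhealthy";

class HealthTracker {
  healthStatus: HealthStatus = "healthy";
  lastHealthCheck: Date | null = null;
  private failures = 0;
  private threshold: number;

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Record one probe result and return the resulting status.
  record(probeSucceeded: boolean, at: Date = new Date()): HealthStatus {
    this.lastHealthCheck = at;
    if (probeSucceeded) {
      this.failures = 0; // assumed: one success clears the failure streak
      this.healthStatus = "healthy";
    } else {
      this.failures += 1;
      if (this.failures >= this.threshold) this.healthStatus = "unhealthy";
    }
    return this.healthStatus;
  }
}
```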

### Reconciliation Loop

Maintains desired replica count:
- If running < desired → start new instances
- If running > desired → stop excess instances
- Detects crashed containers → marks ERROR → triggers re-creation
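
A sketch of the planning half of such a loop: given the desired replica count and the current instances, compute what to start, stop, and clean up. `planReconcile` and the `Plan` shape are hypothetical, and counting `STARTING` instances as live (to avoid over-creating) is an assumption.

```typescript
type InstanceStatus = "STARTING" | "RUNNING" | "STOPPING" | "STOPPED" | "ERROR";

interface Instance {
  id: string;
  status: InstanceStatus;
}

interface Plan {
  start: number;    // how many new instances to create
  stop: string[];   // ids of excess instances to stop
  remove: string[]; // ids of crashed instances to clean up
}

function planReconcile(desired: number, instances: Instance[]): Plan {
  const crashed = instances.filter((i) => i.status === "ERROR").map((i) => i.id);
  // Assumed: STARTING counts toward the replica total so the loop does not over-create.
  const live = instances.filter((i) => i.status === "RUNNING" || i.status === "STARTING");
  return {
    start: Math.max(0, desired - live.length),
    stop: live.slice(desired).map((i) => i.id),
    remove: crashed,
  };
}
```

Keeping the plan a pure function of desired and observed state makes the loop idempotent: running it again after the plan is applied yields an empty plan.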

### Persistent STDIO Connections

For STDIO transport, mcpd maintains long-lived exec sessions (`PersistentStdioClient`) to avoid repeated `docker exec` overhead. Bidirectional streaming for interactive sessions.

---

## 16. Deployment & Distribution

### Production Deployment

**mcpd runs on 10.0.0.194** (NAS, managed via Portainer), NOT on the dev machine.

```bash
# Full deploy (preferred after merging)
bash fulldeploy.sh
```

`fulldeploy.sh` runs three steps:
1. `scripts/build-mcpd.sh` — build + push Docker image to `mysources.co.uk/michal/mcpctl-mcpd`
2. `deploy.sh` — deploy stack to production via Portainer API at `http://10.0.0.194:9000`
3. `scripts/release.sh` — build RPM + publish to Gitea + install locally + smoke tests

### Docker Images

| Image | Purpose |
|-------|---------|
| `mcpctl-mcpd` | Multi-stage build: Node 20 Alpine, includes git/ssh, Prisma |
| `mcpctl-node-runner` | Node 20 slim, runs `npx -y` for npm packages |
| `mcpctl-python-runner` | Python 3.12 slim, uses `uv` for Python packages |

All pushed to `mysources.co.uk/michal/` registry.

### Stack Services (Production)

- `postgres` — PostgreSQL 16 (port 5432)
- `mcpd` — Daemon (port 3100)
- `node-runner`, `python-runner` — Base images
- Networks: `mcpctl` (management), `mcp-servers` (container communication)

### RPM/DEB Distribution

```bash
source .env && bash scripts/release.sh
```

Packages are built with nfpm and install the following files:
- `/usr/bin/mcpctl` — CLI binary (bun compiled)
- `/usr/bin/mcpctl-local` — Local proxy binary (bun compiled)
- `/usr/share/fish/vendor_completions.d/mcpctl.fish` — Fish completions
- `/usr/share/bash-completion/completions/mcpctl` — Bash completions
- `/usr/lib/systemd/user/mcplocal.service` — Systemd user service

User install:
```bash
dnf config-manager --add-repo https://mysources.co.uk/api/packages/michal/rpm.repo
dnf install mcpctl
```

### Git & PR Workflow

- Gitea at `http://10.0.0.194:3012` (internal) / `https://mysources.co.uk/michal/mcpctl` (public)
- `pr.sh` in project root creates PRs via Gitea API
- `gh` CLI not installed — use `pr.sh` or direct API calls

---

## 17. Testing Strategy

### Test Tiers

| Tier | Tool | Scope | When |
|------|------|-------|------|
| Unit tests | Vitest | Package-level, mocked dependencies | `pnpm test:run` |
| DB tests | Vitest | Full Prisma + test PostgreSQL | `pnpm --filter db exec vitest run` (separate) |
| Smoke tests | Vitest | Live mcplocal + mcpd (not mocked) | `pnpm test:smoke` (post-deploy) |

### Convention

- Every new feature MUST include smoke tests
- Smoke tests live in `src/mcplocal/tests/smoke/`
- Use `SmokeMcpSession` from `tests/smoke/mcp-client.ts` for MCP protocol interactions
- Smoke tests run automatically in the build/deploy pipeline

### Critical Rules

- **NEVER pipe pnpm test output** to `tail`, `grep`, or `head`: pnpm hangs when it detects a non-TTY stdout
- Always capture the full output, stdout and stderr together (`2>&1`), and read it directly
- DB tests are excluded from the workspace-root Vitest run because they need a test database
- Tests integrated into pipeline: `build-rpm.sh` runs unit tests; `release.sh` runs smoke tests

---

## 18. Technology Stack

| Layer | Technology |
|-------|-----------|
| CLI Framework | Commander.js, Ink/React (TUI), Inquirer |
| API Server | Fastify 5, TypeScript strict mode |
| Database | PostgreSQL 16, Prisma ORM v6 |
| Container Runtime | Docker/Podman via dockerode |
| MCP Protocol | @modelcontextprotocol/sdk |
| Validation | Zod schemas everywhere |
| LLM Providers | OpenAI, Anthropic, Google Gemini, Ollama, DeepSeek, Groq, Mistral, OpenRouter, Azure |
| Testing | Vitest, coverage via v8 |
| Build | TypeScript project references, pnpm workspaces |
| Compilation | Bun (binary compilation for RPM) |
| Packaging | nfpm (RPM/DEB), Docker multi-stage |
| CI/CD | fulldeploy.sh → Portainer API + Gitea packages |
| Shell Completions | Fish + Bash (auto-generated via `scripts/generate-completions.ts`) |

### Design Patterns

1. **Monorepo** — pnpm workspaces with shared base TypeScript config
2. **Layered architecture** — Routes → Services → Repositories (Prisma)
3. **Interface-based repositories** — all data access through interfaces for testability
4. **Dependency injection** — services receive dependencies via constructor
5. **Zod validation** — all input validated at API boundary
6. **Plugin inheritance** — composable ProxyModel plugins with conflict detection
7. **Content-addressed caching** — SHA256 hash keys for deduplication
8. **TTL-based stores** — prompt index (60s), system prompts (5min), sections (5min)
9. **Fire-and-forget audit** — non-blocking event collection
10. **Declarative config** — kubectl-style YAML/JSON for all resource management
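Patterns 7 and 8 can be illustrated together. The sketch below derives a cache key from content with SHA-256 and keeps entries in a store that expires them after a fixed TTL. The names `contentKey` and `TtlStore`, and the injectable clock, are hypothetical; the project's actual cache and store classes are not shown in this summary.

```typescript
import { createHash } from "node:crypto";

// Content-addressed key: identical inputs always map to the same key,
// so repeated work on the same content is deduplicated.
function contentKey(...parts: string[]): string {
  // JSON-encoding the parts keeps ["ab", "c"] distinct from ["a", "bc"].
  return createHash("sha256").update(JSON.stringify(parts)).digest("hex");
}

// Hypothetical TTL store; entries expire lazily on read.
class TtlStore<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, mainly for tests
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Example: a 60-second store, matching the prompt index TTL listed above.
const promptIndex = new TtlStore<string[]>(60_000);
promptIndex.set(contentKey("project-a", "prompts"), ["prompt-1"]);
```

Expiring on read keeps the store free of timers, at the cost of stale entries occupying memory until they are next requested.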

---

## 19. Project Structure

```
mcpctl/
├── src/
│   ├── cli/            @mcpctl/cli          CLI (Commander.js)
│   │   ├── src/commands/                    22 command handlers
│   │   ├── src/registry/                    MCP server registry client
│   │   ├── src/formatters/                  Output formatting (table/json/yaml)
│   │   └── src/auth/                        Credential storage
│   ├── mcpd/           @mcpctl/mcpd         Daemon (Fastify 5)
│   │   ├── src/routes/                      18 route handlers
│   │   ├── src/services/                    13 services
│   │   ├── src/repositories/                Data access layer
│   │   ├── src/middleware/                  Auth, logging, error handling
│   │   └── src/validation/                  Zod schemas, RBAC rules
│   ├── mcplocal/       @mcpctl/mcplocal     Local proxy
│   │   ├── src/gate/                        Session gating + tag matching
│   │   ├── src/proxymodel/                  Plugin system + stages + cache
│   │   ├── src/providers/                   6 LLM providers
│   │   ├── src/upstream/                    STDIO + HTTP upstream connections
│   │   ├── src/audit/                       Event collection + batching
│   │   └── src/health/                      Health monitoring
│   ├── db/             @mcpctl/db           Database (Prisma)
│   │   ├── prisma/schema.prisma             22 models
│   │   └── prisma/migrations/               11 migrations
│   └── shared/         @mcpctl/shared       Constants, types, validation
├── deploy/                                  Dockerfiles + entrypoint
├── stack/                                   Production docker-compose + env
├── scripts/                                 Build, release, deploy scripts
├── completions/                             Fish + Bash completions
├── templates/                               MCP server YAML templates
├── docs/                                    Architecture + design docs
├── fulldeploy.sh                            Full build → deploy → release
├── deploy.sh                                Portainer stack deploy
├── pr.sh                                    Gitea PR creation
├── nfpm.yaml                                RPM/DEB package metadata
├── vitest.config.ts                         Root test config
├── vitest.workspace.ts                      Workspace test config
├── tsconfig.base.json                       Base TypeScript config (strict)
└── pnpm-workspace.yaml                      Monorepo workspace definition
```

---

## 20. Deferred & Future Work

### Deferred Tasks

| ID | Description | Status |
|----|-------------|--------|
| 88 | Rename proxyMode: filtered → proxy | Deferred |
| 105-109 | Model Studio TUI | Deferred |
| 110 | RBAC for ProxyModels | Deferred |
| 113 | Model Studio docs | Deferred |

### Future Architecture

- **Virtual MCP Audit Server** — Claude-queryable audit tools
- **Graphiti Knowledge Graph** — causal graph from audit events
- **Lab Parameter Simulation** — re-run pipelines with different configs
- **Kubernetes Orchestrator** — beyond Docker/Podman
- **ConfigMaps** — non-sensitive config separate from Secrets
- **Multi-provider failover** — automatic LLM provider cascading

### Completed Major Features

- Project structure + monorepo setup
- MCP Registry Client (official, glama, smithery — 53 tests)
- Health Probe Runner (STDIO, SSE, Streamable HTTP — 12 tests)
- Container orchestration with reconciliation
- Full RBAC with name-scoped bindings
- Gated sessions with prompt scoring
- ProxyModel plugin system with inheritance
- Content pipeline with 4 built-in stages
- Pipeline cache (L1 memory + L2 disk)
- Audit infrastructure with trust model
- Git-based backup and restore
- Shell completions (Fish + Bash)
- RPM/DEB packaging and distribution
- Smoke test framework
- Console inspector for debugging