# mcpctl v2 - Corrected 3-Tier Architecture PRD
## Overview
mcpctl is a kubectl-inspired system for managing MCP (Model Context Protocol) servers. It consists of 4 components arranged in a 3-tier architecture:

```
Claude Code
    |
    v (stdio - MCP protocol)
mcplocal (Local Daemon - runs on developer machine)
    |
    v (HTTP REST)
mcpd (External Daemon - runs on server/NAS)
    |
    v (Docker API / K8s API)
mcp_servers (MCP server containers)
```
## Components

### 1. mcpctl (CLI Tool)
- **Package**: `src/cli/` (`@mcpctl/cli`)
- **What it is**: kubectl-like CLI for managing the entire system
- **Talks to**: mcplocal (local daemon) via HTTP REST
- **Key point**: mcpctl does NOT talk to mcpd directly. It always goes through mcplocal.
- **Distributed as**: RPM package via Gitea registry (bun compile + nfpm)
- **Commands**: get, describe, apply, setup, instance, claude, project, backup, restore, config, status

### 2. mcplocal (Local Daemon)
- **Package**: `src/local-proxy/` (to be renamed `src/mcplocal/`)
- **What it is**: local daemon running on the developer's machine
- **Talks to**: mcpd (external daemon) via HTTP REST
- **Exposes to Claude**: MCP protocol via stdio (tools, resources, prompts)
- **Exposes to mcpctl**: HTTP REST API for management commands

**Core responsibility: LLM Pre-processing**

This is the intelligence layer. When Claude asks for data from MCP servers, mcplocal:

1. Receives Claude's request (e.g., "get Slack messages about security")
2. Uses a local/cheap LLM (Gemini CLI binary, Ollama, vLLM, DeepSeek API) to interpret what Claude actually wants
3. Sends narrow, filtered requests to mcpd, which forwards them to the actual MCP servers
4. Receives raw results from the MCP servers (via mcpd)
5. Uses the local LLM again to filter/summarize the results, extracting only what's relevant
6. Returns the smallest, most comprehensive response to Claude

**Why**: Claude Code tokens are expensive. Instead of dumping 500 Slack messages into Claude's context window, mcplocal uses a cheap LLM to pre-filter them down to the 12 relevant ones.
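
The six steps above can be sketched as a two-pass pipeline. This is a minimal illustration, not the existing provider code: `CheapLlm`, `narrowRequest`, and `filterResults` are hypothetical names.

```typescript
// Sketch of the two-pass LLM pipeline (hypothetical names throughout).
interface CheapLlm {
  complete(prompt: string): Promise<string>;
}

interface ToolCall {
  tool: string;                          // e.g. "slack/search_messages"
  params: Record<string, unknown>;
}

// Pass 1: interpret Claude's request and narrow it before it reaches mcpd.
async function narrowRequest(llm: CheapLlm, call: ToolCall): Promise<ToolCall> {
  const query = await llm.complete(
    `Rewrite these tool params as the narrowest equivalent query: ${JSON.stringify(call.params)}`,
  );
  return { ...call, params: { ...call.params, query } };
}

// Pass 2: keep only the results the cheap LLM judges relevant to the intent.
async function filterResults(llm: CheapLlm, raw: string[], intent: string): Promise<string[]> {
  const answer = await llm.complete(
    `From ${raw.length} items, return comma-separated indices relevant to: ${intent}`,
  );
  const keep = answer.split(",").map((s) => Number(s.trim()));
  return raw.filter((_, i) => keep.includes(i));
}
```

Both passes call the same cheap model; only the prompts differ, which keeps the provider abstraction unchanged.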

**LLM Provider Strategy** (already partially exists):
- Gemini CLI binary (local, free)
- Ollama (local, free)
- vLLM (local, free)
- DeepSeek API (cheap)
- OpenAI API (fallback)
- Anthropic API (fallback)
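
The list above implies a preference order: local and free first, cheap APIs next, paid APIs last. A sketch of that selection, where the `ProviderDef` shape and the availability probe are assumptions rather than the existing adapter interfaces:

```typescript
// Sketch of provider selection: first available provider in preference order
// wins. Names and the availability probes are illustrative.
interface ProviderDef {
  name: string;
  cost: "free" | "cheap" | "paid";
  available: () => boolean;
}

const PREFERENCE: ProviderDef[] = [
  { name: "gemini-cli", cost: "free", available: () => false },  // e.g. binary not on PATH
  { name: "ollama", cost: "free", available: () => false },
  { name: "vllm", cost: "free", available: () => false },
  { name: "deepseek", cost: "cheap", available: () => true },
  { name: "openai", cost: "paid", available: () => true },
  { name: "anthropic", cost: "paid", available: () => true },
];

function selectProvider(providers: ProviderDef[]): ProviderDef | undefined {
  return providers.find((p) => p.available());
}
```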

**Additional mcplocal responsibilities**:
- MCP protocol routing (namespaced tools: `slack/send_message`, `jira/create_issue`)
- Connection health monitoring for upstream MCP servers
- Caching frequently requested data
- Proxying mcpctl management commands to mcpd
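
The routing bullet relies on tool names of the form `<server>/<tool>`. A sketch of the split/join helpers (the function names are illustrative, not the existing router API):

```typescript
// Sketch of the namespacing scheme: "<server>/<tool>" lets the router dispatch
// a tools/call to the right upstream server.
function namespaceTool(server: string, tool: string): string {
  return `${server}/${tool}`;
}

function splitNamespacedTool(name: string): { server: string; tool: string } {
  const slash = name.indexOf("/");
  if (slash <= 0 || slash === name.length - 1) {
    throw new Error(`not a namespaced tool name: ${name}`);
  }
  return { server: name.slice(0, slash), tool: name.slice(slash + 1) };
}
```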

### 3. mcpd (External Daemon)
- **Package**: `src/mcpd/` (`@mcpctl/mcpd`)
- **What it is**: server-side daemon that runs on centralized infrastructure (Synology NAS, cloud server, etc.)
- **Deployed via**: Docker Compose (Dockerfile + docker-compose.yml)
- **Database**: PostgreSQL for state, audit logs, and access control

**Core responsibilities**:
- **Deploy and run MCP server containers** (Docker now, Kubernetes later)
- **Instance lifecycle management**: start, stop, restart, logs, inspect
- **MCP server registry**: store server definitions, configuration templates, profiles
- **Project management**: group MCP profiles into projects for Claude sessions
- **Auditing**: log every operation - who ran what, when, with what result
- **Access management**: users, sessions, permissions - who can access which MCP servers
- **Credential storage**: MCP servers often need API tokens (Slack, Jira, GitHub) - stored securely on the server side, never exposed to the local machine
- **Backup/restore**: export and import configuration

**Key point**: mcpd holds the credentials. When mcplocal asks mcpd to query Slack, mcpd runs the Slack MCP server container with the proper SLACK_TOKEN injected - mcplocal never sees the token.
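
The credential-isolation property can be sketched as follows. The secret store, field names, and the example token value are all illustrative; the point is that secrets only flow into the container environment, while callers get a redacted view:

```typescript
// Sketch of server-side credential injection (all names and values illustrative).
const secretStore = new Map<string, Record<string, string>>([
  ["slack", { SLACK_TOKEN: "xoxb-example-secret" }],
]);

function buildContainerEnv(serverId: string, baseEnv: Record<string, string>) {
  const secrets = secretStore.get(serverId) ?? {};
  // `env` is handed to the container runtime only; it never crosses the API boundary.
  const env: Record<string, string> = { ...baseEnv, ...secrets };
  // Callers (mcplocal, mcpctl) only ever see the redacted view.
  const redacted: Record<string, string> = Object.fromEntries(
    Object.entries(env).map(([k, v]) => [k, k in secrets ? "***" : v]),
  );
  return { env, redacted };
}
```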

### 4. mcp_servers (MCP Server Containers)
- **What they are**: the actual MCP server processes (Slack, Jira, GitHub, Terraform, filesystem, postgres, etc.)
- **Managed by**: mcpd via the Docker/Podman API
- **Network**: isolated network, only accessible by mcpd
- **Credentials**: injected by mcpd as environment variables
- **Communication**: MCP protocol (stdio or SSE/HTTP) between mcpd and the containers
## Data Flow Examples

### Example 1: Claude asks for Slack messages
```
Claude: "Get messages about security incidents from the last week"
    |
    v (MCP tools/call: slack/search_messages)
mcplocal:
  1. Intercepts the tool call
  2. Calls local Gemini: "User wants security incident messages from last week.
     Generate optimal Slack search query and date filters."
  3. Gemini returns: query="security incident OR vulnerability OR CVE", after="2024-01-15"
  4. Sends filtered request to mcpd
    |
    v (HTTP POST /api/v1/mcp/proxy)
mcpd:
  1. Looks up the Slack MCP instance (injects SLACK_TOKEN)
  2. Forwards the narrowed query to the Slack MCP server container
  3. Returns raw results (200 messages)
    |
    v (response)
mcplocal:
  1. Receives 200 messages
  2. Calls local Gemini: "Filter these 200 Slack messages. Keep only those
     directly about security incidents. Return message IDs and 1-line summaries."
  3. Gemini returns: 15 relevant messages with summaries
  4. Returns filtered result to Claude
    |
    v (MCP response: 15 messages instead of 200)
Claude: processes only the relevant 15 messages
```

### Example 2: mcpctl management command
```
$ mcpctl get servers
    |
    v (HTTP GET)
mcplocal:
  1. Recognizes this is a management command (not MCP data)
  2. Proxies directly to mcpd (no LLM processing needed)
    |
    v (HTTP GET /api/v1/servers)
mcpd:
  1. Queries PostgreSQL for server definitions
  2. Returns list
    |
    v (proxied response)
mcplocal -> mcpctl -> formatted table output
```

### Example 3: mcpctl instance management
```
$ mcpctl instance start slack
    |
    v
mcplocal -> mcpd:
  1. Creates Docker container for the Slack MCP server
  2. Injects SLACK_TOKEN from secure storage
  3. Connects it to the isolated mcp-servers network
  4. Logs audit entry: "user X started slack instance"
  5. Returns instance status
```
## What Already Exists (completed work)

### Done and reusable as-is:
- Project structure: pnpm monorepo, TypeScript strict mode, Vitest, ESLint
- Database schema: Prisma + PostgreSQL (User, McpServer, McpProfile, Project, McpInstance, AuditLog)
- mcpd server framework: Fastify 5, routes, services, repositories, middleware
- mcpd MCP server CRUD: registration, profiles, projects
- mcpd Docker container management: dockerode, instance lifecycle
- mcpd audit logging, health monitoring, metrics, backup/restore
- mcpctl CLI framework: Commander.js, commands, config, API client, formatters
- mcpctl RPM distribution: bun compile, nfpm, Gitea publishing, shell completions
- MCP protocol routing in local-proxy: namespaced tools, resources, prompts
- LLM provider abstractions: OpenAI, Anthropic, Ollama adapters (defined but unused)
- Shared types and profile templates

### Needs rework:
- mcpctl currently talks to mcpd directly -> must talk to mcplocal instead
- local-proxy is just a dumb router -> needs LLM pre-processing intelligence
- local-proxy has no HTTP API for mcpctl -> needs REST endpoints for management proxying
- mcpd has no MCP proxy endpoint -> needs an endpoint that mcplocal can call to execute MCP tool calls on managed instances
- No integration between the LLM providers and the MCP request/response pipeline
## New Tasks Needed

### Phase 1: Rename and restructure local-proxy -> mcplocal
- Rename `src/local-proxy/` to `src/mcplocal/`
- Update all package references and imports
- Add an HTTP REST server (Fastify) alongside the existing stdio server
- mcplocal needs TWO interfaces: stdio for Claude, HTTP for mcpctl

### Phase 2: mcplocal management proxy
- Add REST endpoints that mirror mcpd's API (get servers, instances, projects, etc.)
- mcpctl config changes: `daemonUrl` now points to mcplocal (e.g., localhost:3200) instead of mcpd
- mcplocal proxies management requests to mcpd (configurable `mcpdUrl`, e.g., http://nas:3100)
- Pass-through with no LLM processing for management commands
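
The pass-through amounts to rewriting the request origin while keeping path and query intact. A sketch, assuming the `mcpdUrl` setting named above (the helper name is illustrative):

```typescript
// Sketch of the management pass-through: same path and query, different origin,
// no LLM involvement anywhere on this path.
function rewriteToUpstream(mcpdUrl: string, localPath: string): string {
  const base = mcpdUrl.replace(/\/+$/, "");   // tolerate a trailing slash in config
  return `${base}${localPath}`;
}
```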

### Phase 3: mcpd MCP proxy endpoint
- Add a `/api/v1/mcp/proxy` endpoint to mcpd
- Accepts `{ serverId, method, params }` and executes an MCP tool call on a managed instance
- mcpd looks up the instance, connects to the container, executes the MCP call, and returns the result
- This is how mcplocal talks to MCP servers without needing direct Docker access
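
A sketch of validating that request body. The `{ serverId, method, params }` shape comes from the bullet above; the validation details and the defaulting of `params` are assumptions:

```typescript
// Sketch of parsing the /api/v1/mcp/proxy request body (details assumed).
interface McpProxyRequest {
  serverId: string;
  method: string;                        // e.g. "tools/call"
  params: Record<string, unknown>;
}

function parseProxyRequest(body: unknown): McpProxyRequest {
  const b = body as Partial<McpProxyRequest> | null;
  if (!b || typeof b.serverId !== "string" || typeof b.method !== "string") {
    throw new Error("serverId and method are required");
  }
  return { serverId: b.serverId, method: b.method, params: b.params ?? {} };
}
```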

### Phase 4: LLM pre-processing pipeline in mcplocal
- Create a request interceptor in mcplocal's MCP router
- Before forwarding `tools/call` to mcpd, run the request through the LLM for interpretation
- After receiving the response from mcpd, run it through the LLM for filtering/summarization
- LLM provider selection based on config (prefer local/cheap models)
- Configurable: enable/disable pre-processing per server or per tool
- Bypass for simple operations (list, create, delete - no filtering needed)
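
The per-tool bypass rule might look like the following. The config shape and the verb heuristic are assumptions for illustration:

```typescript
// Sketch: decide whether a tool call goes through the LLM pipeline or bypasses it.
interface PreprocessConfig {
  enabled: boolean;
  disabledTools: Set<string>;          // per-tool opt-out, e.g. "slack/send_message"
}

const SIMPLE_VERBS = ["list", "create", "delete"]; // filtering adds no value here

function shouldPreprocess(cfg: PreprocessConfig, tool: string): boolean {
  if (!cfg.enabled || cfg.disabledTools.has(tool)) return false;
  const bare = tool.split("/").pop() ?? tool;      // strip the server namespace
  return !SIMPLE_VERBS.some((v) => bare.startsWith(v));
}
```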

### Phase 5: Smart context optimization
- Token counting: estimate how many tokens the raw response would consume
- Decision logic: if the raw response is below a threshold, skip LLM filtering (not worth the latency)
- If the raw response exceeds the threshold, filter with the LLM
- Cache LLM filtering decisions for repeated similar queries
- Metrics: track tokens saved and latency added by filtering
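
The threshold decision can be sketched with a cheap character-based token estimate (roughly 4 characters per token is a common heuristic; the threshold value is illustrative):

```typescript
// Sketch of the skip-below-threshold rule from Phase 5.
const FILTER_THRESHOLD_TOKENS = 2000;   // illustrative, would come from config

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);    // rough ~4-chars-per-token heuristic
}

function needsLlmFilter(rawResponse: string): boolean {
  // Small responses are cheaper to pass through than to re-summarize.
  return estimateTokens(rawResponse) > FILTER_THRESHOLD_TOKENS;
}
```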

### Phase 6: mcpctl -> mcplocal migration
- Update mcpctl's default daemonUrl to point to mcplocal (localhost:3200)
- Update all CLI commands to work through the mcplocal proxy
- Add `mcpctl config set mcpd-url <url>` for configuring the upstream mcpd
- Add `mcpctl config set mcplocal-url <url>` for configuring the local daemon
- Health check: `mcpctl status` shows both mcplocal and mcpd connectivity
- Update shell completions if needed
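
After this phase the CLI carries two URLs. A sketch of the resolved config, where the field names mirror the settings above and the default values are illustrative:

```typescript
// Sketch of the dual-URL config after Phase 6 (defaults illustrative).
interface CliConfig {
  mcplocalUrl: string;   // where mcpctl sends every request
  mcpdUrl: string;       // upstream daemon, used by mcplocal only
}

const DEFAULTS: CliConfig = {
  mcplocalUrl: "http://localhost:3200",
  mcpdUrl: "http://localhost:3100",
};

function resolveConfig(overrides: Partial<CliConfig>): CliConfig {
  return { ...DEFAULTS, ...overrides };
}
```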

### Phase 7: End-to-end integration testing
- Test the full flow: mcpctl -> mcplocal -> mcpd -> mcp_server -> response -> LLM filter -> Claude
- Test that management commands pass through correctly
- Test that LLM pre-processing reduces context window size
- Test credential isolation (mcplocal never sees MCP server credentials)
- Test health monitoring across all tiers
## Authentication & Authorization

### Database ownership
- **mcpd owns the database** (PostgreSQL). It is the only component that talks to the DB.
- mcplocal has NO database. It is stateless (config file only).
- mcpctl has NO database. It stores user credentials locally in `~/.mcpctl/config.yaml`.

### Auth flow
```
mcpctl login
    |
    v (user enters mcpd URL + credentials)
mcpctl stores API token in ~/.mcpctl/config.yaml
    |
    v (passes token to mcplocal config)
mcplocal authenticates to mcpd using Bearer token on every request
    |
    v (Authorization: Bearer <token>)
mcpd validates token against Session table in PostgreSQL
    |
    v (authenticated request proceeds)
```

### mcpctl responsibilities
- `mcpctl login`: prompts the user for the mcpd URL and credentials (username/password or API token)
- `mcpctl login` calls mcpd's auth endpoint to get a session token
- Stores the token in `~/.mcpctl/config.yaml` (or `~/.mcpctl/credentials` with restricted permissions)
- Passes the token to mcplocal (either via config or as a startup argument)
- `mcpctl logout`: invalidates the session token

### mcplocal responsibilities
- Reads the auth token from its config (set by mcpctl)
- Attaches an `Authorization: Bearer <token>` header to ALL requests to mcpd
- If mcpd returns 401, mcplocal returns an appropriate error to mcpctl/Claude
- Does NOT store credentials itself - they come from mcpctl's config
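
The header-attach and 401-handling bullets can be sketched as follows; the class and function names are illustrative, not the existing client code:

```typescript
// Sketch: every mcplocal -> mcpd request carries the Bearer token from config,
// and a 401 surfaces as a distinct auth error (names illustrative).
class UpstreamAuthError extends Error {}

function buildAuthHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}

function checkUpstreamStatus(status: number): void {
  if (status === 401) {
    throw new UpstreamAuthError("mcpd rejected the token; run `mcpctl login` again");
  }
}
```

Keeping the auth error as its own type lets the stdio and HTTP front ends translate it differently (MCP error vs. HTTP status) without string matching.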

### mcpd responsibilities
- Owns the User and Session tables
- Provides auth endpoints: `POST /api/v1/auth/login`, `POST /api/v1/auth/logout`
- Validates Bearer tokens on every request via auth middleware (already exists)
- Returns 401 for invalid/expired tokens
- Audit logs include the authenticated user

## Non-functional Requirements
- mcplocal must start fast (it runs on the developer's machine, per-session or as a daemon)
- LLM pre-processing must not add more than 2-3 seconds of latency
- If the local LLM is unavailable, fall back to passing data through unfiltered
- All components must be independently deployable and testable
- mcpd must remain stateless (outside of the DB) and horizontally scalable