feat: implement v2 3-tier architecture (mcpctl → mcplocal → mcpd)
- Rename local-proxy to mcplocal with HTTP server, LLM pipeline, mcpd discovery
- Add LLM pre-processing: token estimation, filter cache, metrics, Gemini CLI + DeepSeek providers
- Add mcpd auth (login/logout) and MCP proxy endpoints
- Update CLI: dual URLs (mcplocalUrl/mcpdUrl), auth commands, --direct flag
- Add tiered health monitoring, shell completions, e2e integration tests
- 57 test files, 597 tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New file: .taskmaster/docs/prd-v2-architecture.md (272 lines)
# mcpctl v2 - Corrected 3-Tier Architecture PRD

## Overview

mcpctl is a kubectl-inspired system for managing MCP (Model Context Protocol) servers. It consists of 4 components arranged in a 3-tier architecture:

```
Claude Code
    |
    v (stdio - MCP protocol)
mcplocal (Local Daemon - runs on developer machine)
    |
    v (HTTP REST)
mcpd (External Daemon - runs on server/NAS)
    |
    v (Docker API / K8s API)
mcp_servers (MCP server containers)
```
## Components

### 1. mcpctl (CLI Tool)

- **Package**: `src/cli/` (`@mcpctl/cli`)
- **What it is**: kubectl-like CLI for managing the entire system
- **Talks to**: mcplocal (local daemon) via HTTP REST
- **Key point**: mcpctl does NOT talk to mcpd directly. It always goes through mcplocal.
- **Distributed as**: RPM package via Gitea registry (bun compile + nfpm)
- **Commands**: get, describe, apply, setup, instance, claude, project, backup, restore, config, status
### 2. mcplocal (Local Daemon)

- **Package**: `src/local-proxy/` (rename to `src/mcplocal/`)
- **What it is**: Local daemon running on the developer's machine
- **Talks to**: mcpd (external daemon) via HTTP REST
- **Exposes to Claude**: MCP protocol via stdio (tools, resources, prompts)
- **Exposes to mcpctl**: HTTP REST API for management commands

**Core responsibility: LLM Pre-processing**

This is the intelligence layer. When Claude asks for data from MCP servers, mcplocal:

1. Receives Claude's request (e.g., "get Slack messages about security")
2. Uses a local/cheap LLM (Gemini CLI binary, Ollama, vLLM, DeepSeek API) to interpret what Claude actually wants
3. Sends narrow, filtered requests to mcpd, which forwards them to the actual MCP servers
4. Receives raw results from the MCP servers (via mcpd)
5. Uses the local LLM again to filter/summarize the results, extracting only what is relevant
6. Returns the smallest response to Claude that still covers everything relevant

**Why**: Claude Code tokens are expensive. Instead of dumping 500 Slack messages into Claude's context window, mcplocal uses a cheap LLM to pre-filter them down to the 12 relevant ones.

**LLM Provider Strategy** (already partially exists):
- Gemini CLI binary (local, free)
- Ollama (local, free)
- vLLM (local, free)
- DeepSeek API (cheap)
- OpenAI API (fallback)
- Anthropic API (fallback)

**Additional mcplocal responsibilities**:
- MCP protocol routing (namespaced tools: `slack/send_message`, `jira/create_issue`)
- Connection health monitoring for upstream MCP servers
- Caching frequently requested data
- Proxying mcpctl management commands to mcpd
### 3. mcpd (External Daemon)

- **Package**: `src/mcpd/` (`@mcpctl/mcpd`)
- **What it is**: Server-side daemon that runs on centralized infrastructure (Synology NAS, cloud server, etc.)
- **Deployed via**: Docker Compose (Dockerfile + docker-compose.yml)
- **Database**: PostgreSQL for state, audit logs, access control

**Core responsibilities**:
- **Deploy and run MCP server containers** (Docker now, Kubernetes later)
- **Instance lifecycle management**: start, stop, restart, logs, inspect
- **MCP server registry**: Store server definitions, configuration templates, profiles
- **Project management**: Group MCP profiles into projects for Claude sessions
- **Auditing**: Log every operation - who ran what, when, with what result
- **Access management**: Users, sessions, permissions - who can access which MCP servers
- **Credential storage**: MCP servers often need API tokens (Slack, Jira, GitHub) - stored securely on the server side, never exposed to the local machine
- **Backup/restore**: Export and import configuration

**Key point**: mcpd holds the credentials. When mcplocal asks mcpd to query Slack, mcpd runs the Slack MCP server container with the proper SLACK_TOKEN injected - mcplocal never sees the token.
### 4. mcp_servers (MCP Server Containers)

- **What they are**: The actual MCP server processes (Slack, Jira, GitHub, Terraform, filesystem, postgres, etc.)
- **Managed by**: mcpd via the Docker/Podman API
- **Network**: Isolated network, only accessible by mcpd
- **Credentials**: Injected by mcpd as environment variables
- **Communication**: MCP protocol (stdio or SSE/HTTP) between mcpd and the containers
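A minimal sketch of the credential-injection point: the options mcpd might hand to dockerode's `createContainer`. `buildCreateOptions`, `ServerDef`, and the concrete values are assumptions for illustration; `Image`, `Env`, and `HostConfig` are standard Docker Engine create-container fields:

```typescript
// Hypothetical builder for the dockerode createContainer options of an MCP
// server instance. Credentials are injected here, server-side, so mcplocal
// never sees them.
interface ServerDef {
  name: string;
  image: string;
  credentials: Record<string, string>; // e.g. { SLACK_TOKEN: "..." } from secure storage
}

function buildCreateOptions(def: ServerDef) {
  return {
    name: `mcp-${def.name}`,
    Image: def.image,
    // Credentials become container env vars on the server side only.
    Env: Object.entries(def.credentials).map(([k, v]) => `${k}=${v}`),
    HostConfig: {
      // Attach only to the isolated network reachable by mcpd.
      NetworkMode: "mcp-servers",
      RestartPolicy: { Name: "unless-stopped" },
    },
  };
}
```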
## Data Flow Examples

### Example 1: Claude asks for Slack messages

```
Claude: "Get messages about security incidents from the last week"
    |
    v (MCP tools/call: slack/search_messages)
mcplocal:
  1. Intercepts the tool call
  2. Calls local Gemini: "User wants security incident messages from last week.
     Generate optimal Slack search query and date filters."
  3. Gemini returns: query="security incident OR vulnerability OR CVE", after="2024-01-15"
  4. Sends filtered request to mcpd
    |
    v (HTTP POST /api/v1/mcp/proxy)
mcpd:
  1. Looks up Slack MCP instance (injects SLACK_TOKEN)
  2. Forwards narrowed query to Slack MCP server container
  3. Returns raw results (200 messages)
    |
    v (response)
mcplocal:
  1. Receives 200 messages
  2. Calls local Gemini: "Filter these 200 Slack messages. Keep only those
     directly about security incidents. Return message IDs and 1-line summaries."
  3. Gemini returns: 15 relevant messages with summaries
  4. Returns filtered result to Claude
    |
    v (MCP response: 15 messages instead of 200)
Claude: processes only the relevant 15 messages
```
### Example 2: mcpctl management command

```
$ mcpctl get servers
    |
    v (HTTP GET)
mcplocal:
  1. Recognizes this is a management command (not MCP data)
  2. Proxies directly to mcpd (no LLM processing needed)
    |
    v (HTTP GET /api/v1/servers)
mcpd:
  1. Queries PostgreSQL for server definitions
  2. Returns list
    |
    v (proxied response)
mcplocal -> mcpctl -> formatted table output
```
### Example 3: mcpctl instance management

```
$ mcpctl instance start slack
    |
    v
mcplocal -> mcpd:
  1. Creates Docker container for Slack MCP server
  2. Injects SLACK_TOKEN from secure storage
  3. Connects it to the isolated mcp-servers network
  4. Logs audit entry: "user X started slack instance"
  5. Returns instance status
```
## What Already Exists (completed work)

### Done and reusable as-is:
- Project structure: pnpm monorepo, TypeScript strict mode, Vitest, ESLint
- Database schema: Prisma + PostgreSQL (User, McpServer, McpProfile, Project, McpInstance, AuditLog)
- mcpd server framework: Fastify 5, routes, services, repositories, middleware
- mcpd MCP server CRUD: registration, profiles, projects
- mcpd Docker container management: dockerode, instance lifecycle
- mcpd audit logging, health monitoring, metrics, backup/restore
- mcpctl CLI framework: Commander.js, commands, config, API client, formatters
- mcpctl RPM distribution: bun compile, nfpm, Gitea publishing, shell completions
- MCP protocol routing in local-proxy: namespaced tools, resources, prompts
- LLM provider abstractions: OpenAI, Anthropic, Ollama adapters (defined but unused)
- Shared types and profile templates

### Needs rework:
- mcpctl currently talks to mcpd directly -> must talk to mcplocal instead
- local-proxy is just a dumb router -> needs LLM pre-processing intelligence
- local-proxy has no HTTP API for mcpctl -> needs REST endpoints for management proxying
- mcpd has no MCP proxy endpoint -> needs an endpoint that mcplocal can call to execute MCP tool calls on managed instances
- No integration between the LLM providers and the MCP request/response pipeline
## New Tasks Needed

### Phase 1: Rename and restructure local-proxy -> mcplocal
- Rename `src/local-proxy/` to `src/mcplocal/`
- Update all package references and imports
- Add an HTTP REST server (Fastify) alongside the existing stdio server
- mcplocal needs TWO interfaces: stdio for Claude, HTTP for mcpctl
### Phase 2: mcplocal management proxy
- Add REST endpoints that mirror mcpd's API (get servers, instances, projects, etc.)
- mcpctl config changes: `daemonUrl` now points to mcplocal (e.g., localhost:3200) instead of mcpd
- mcplocal proxies management requests to mcpd (configurable `mcpdUrl`, e.g., http://nas:3100)
- Pass-through with no LLM processing for management commands
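The pass-through split can be sketched with a small routing helper. The path list and helper names are assumptions for illustration, not the actual mcplocal route table:

```typescript
// Hypothetical sketch: mcplocal decides whether a request is a management
// command (proxied verbatim to mcpd, no LLM) or MCP data traffic.
const MANAGEMENT_PREFIXES = ["/api/v1/servers", "/api/v1/instances", "/api/v1/projects"];

function isManagementRequest(path: string): boolean {
  return MANAGEMENT_PREFIXES.some((p) => path === p || path.startsWith(p + "/"));
}

// Management requests are forwarded unchanged to the configured upstream mcpd.
function upstreamUrl(mcpdUrl: string, path: string): string {
  return new URL(path, mcpdUrl).toString();
}
```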
### Phase 3: mcpd MCP proxy endpoint
- Add a `/api/v1/mcp/proxy` endpoint to mcpd
- Accepts `{ serverId, method, params }` - executes an MCP tool call on a managed instance
- mcpd looks up the instance, connects to the container, executes the MCP call, and returns the result
- This is how mcplocal talks to MCP servers without needing direct Docker access
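The `{ serverId, method, params }` shape comes from the PRD; the handler below is a minimal sketch of the lookup-and-execute step, with the instance registry stubbed as a Map standing in for mcpd's instance manager:

```typescript
// Sketch of the /api/v1/mcp/proxy handler. The registry and executor are
// illustrative stubs; the real mcpd resolves serverId to a live container
// connection.
interface ProxyRequest {
  serverId: string;
  method: string; // e.g. "tools/call"
  params: unknown;
}

type McpExecutor = (method: string, params: unknown) => Promise<unknown>;

async function handleProxy(
  registry: Map<string, McpExecutor>, // serverId -> live MCP connection
  req: ProxyRequest,
): Promise<{ ok: boolean; result?: unknown; error?: string }> {
  const exec = registry.get(req.serverId);
  if (!exec) return { ok: false, error: `unknown server: ${req.serverId}` };
  // mcpd executes the MCP call against the managed container and returns the
  // raw result; mcplocal never needs Docker access.
  const result = await exec(req.method, req.params);
  return { ok: true, result };
}
```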
### Phase 4: LLM pre-processing pipeline in mcplocal
- Create a request interceptor in mcplocal's MCP router
- Before forwarding `tools/call` to mcpd, run the request through the LLM for interpretation
- After receiving the response from mcpd, run it through the LLM for filtering/summarization
- LLM provider selection based on config (prefer local/cheap models)
- Configurable: enable/disable pre-processing per server or per tool
- Bypass for simple operations (list, create, delete - no filtering needed)
### Phase 5: Smart context optimization
- Token counting: estimate how many tokens the raw response would consume
- Decision logic: if the raw response is below a threshold, skip LLM filtering (not worth the latency)
- If the raw response exceeds the threshold, filter with the LLM
- Cache LLM filtering decisions for repeated similar queries
- Metrics: track tokens saved and latency added by filtering
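The skip-or-filter decision can be sketched in a few lines. The 4-characters-per-token heuristic and the default threshold are assumptions for illustration; a real implementation would use the target model's tokenizer:

```typescript
// Hypothetical token estimate + threshold decision for Phase 5.
function estimateTokens(text: string): number {
  // Rough heuristic: ~4 characters per token for English text.
  return Math.ceil(text.length / 4);
}

function shouldFilter(rawResponse: string, thresholdTokens = 2000): boolean {
  // Below the threshold, LLM filtering costs more latency than it saves.
  return estimateTokens(rawResponse) > thresholdTokens;
}
```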
### Phase 6: mcpctl -> mcplocal migration
- Update mcpctl's default daemonUrl to point to mcplocal (localhost:3200)
- Update all CLI commands to work through the mcplocal proxy
- Add `mcpctl config set mcpd-url <url>` for configuring the upstream mcpd
- Add `mcpctl config set mcplocal-url <url>` for configuring the local daemon
- Health check: `mcpctl status` shows both mcplocal and mcpd connectivity
- Update shell completions if needed
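A sketch of the dual-URL resolution the migration implies. The field names `mcplocalUrl`/`mcpdUrl` follow the PRD; the defaults (3200 for mcplocal, 3100 for mcpd) match the example ports used elsewhere in this document but are otherwise assumptions:

```typescript
// Hypothetical mcpctl config resolution with the two URLs from Phase 6.
interface CliConfig {
  mcplocalUrl?: string; // local daemon: default target of all commands
  mcpdUrl?: string;     // upstream daemon: passed through to mcplocal
}

function resolveUrls(cfg: CliConfig): { mcplocalUrl: string; mcpdUrl: string } {
  return {
    mcplocalUrl: cfg.mcplocalUrl ?? "http://localhost:3200",
    mcpdUrl: cfg.mcpdUrl ?? "http://localhost:3100",
  };
}
```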
### Phase 7: End-to-end integration testing
- Test the full flow: mcpctl -> mcplocal -> mcpd -> mcp_server -> response -> LLM filter -> Claude
- Test that management commands pass through correctly
- Test that LLM pre-processing reduces context window size
- Test credential isolation (mcplocal never sees MCP server credentials)
- Test health monitoring across all tiers
## Authentication & Authorization

### Database ownership
- **mcpd owns the database** (PostgreSQL). It is the only component that talks to the DB.
- mcplocal has NO database. It is stateless (config file only).
- mcpctl has NO database. It stores user credentials locally in `~/.mcpctl/config.yaml`.

### Auth flow
```
mcpctl login
    |
    v (user enters mcpd URL + credentials)
mcpctl stores API token in ~/.mcpctl/config.yaml
    |
    v (passes token to mcplocal config)
mcplocal authenticates to mcpd using the Bearer token on every request
    |
    v (Authorization: Bearer <token>)
mcpd validates the token against the Session table in PostgreSQL
    |
    v (authenticated request proceeds)
```
### mcpctl responsibilities
- `mcpctl login` command: prompts the user for the mcpd URL and credentials (username/password or API token)
- `mcpctl login` calls mcpd's auth endpoint to get a session token
- Stores the token in `~/.mcpctl/config.yaml` (or `~/.mcpctl/credentials` with restricted permissions)
- Passes the token to mcplocal (either via config or as a startup argument)
- `mcpctl logout` command: invalidates the session token

### mcplocal responsibilities
- Reads the auth token from its config (set by mcpctl)
- Attaches an `Authorization: Bearer <token>` header to ALL requests to mcpd
- If mcpd returns 401, mcplocal returns an appropriate error to mcpctl/Claude
- Does NOT store credentials itself - they come from mcpctl's config

### mcpd responsibilities
- Owns the User and Session tables
- Provides auth endpoints: `POST /api/v1/auth/login`, `POST /api/v1/auth/logout`
- Validates Bearer tokens on every request via auth middleware (already exists)
- Returns 401 for invalid/expired tokens
- Audit logs include the authenticated user
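The Bearer check mcpd's middleware performs can be sketched as a pure function. The session lookup is stubbed with a Map here; the real implementation queries the Session table in PostgreSQL:

```typescript
// Hypothetical sketch of mcpd's Bearer-token validation.
interface Session {
  userId: string;
  expiresAt: number; // epoch ms
}

function authenticate(
  sessions: Map<string, Session>, // stand-in for the Session table
  authHeader: string | undefined,
  now = Date.now(),
): { status: 200; userId: string } | { status: 401 } {
  if (!authHeader?.startsWith("Bearer ")) return { status: 401 };
  const token = authHeader.slice("Bearer ".length);
  const session = sessions.get(token);
  // Invalid or expired tokens yield 401, as specified above.
  if (!session || session.expiresAt <= now) return { status: 401 };
  return { status: 200, userId: session.userId };
}
```

The authenticated `userId` is what the audit log records alongside each operation.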
## Non-functional Requirements
- mcplocal must start fast (it runs on the developer's machine, per-session or as a daemon)
- LLM pre-processing must not add more than 2-3 seconds of latency
- If the local LLM is unavailable, fall back to passing data through unfiltered
- All components must be independently deployable and testable
- mcpd must remain stateless (outside of the DB) and horizontally scalable
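The LLM-unavailable fallback requirement amounts to a small wrapper: if the filtering call fails, pass the raw data through unfiltered. The names are illustrative, not the shipped API:

```typescript
// Hypothetical sketch of the "fall back to unfiltered" requirement above.
async function filterWithFallback(
  llmFilter: (raw: string) => Promise<string>,
  raw: string,
): Promise<string> {
  try {
    return await llmFilter(raw);
  } catch {
    // Local LLM down or errored: better to spend Claude tokens than drop data.
    return raw;
  }
}
```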