mcpctl/.taskmaster/docs/prd-v2-architecture.md
Michal b8c5cf718a
feat: implement v2 3-tier architecture (mcpctl → mcplocal → mcpd)
- Rename local-proxy to mcplocal with HTTP server, LLM pipeline, mcpd discovery
- Add LLM pre-processing: token estimation, filter cache, metrics, Gemini CLI + DeepSeek providers
- Add mcpd auth (login/logout) and MCP proxy endpoints
- Update CLI: dual URLs (mcplocalUrl/mcpdUrl), auth commands, --direct flag
- Add tiered health monitoring, shell completions, e2e integration tests
- 57 test files, 597 tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 11:42:06 +00:00


mcpctl v2 - Corrected 3-Tier Architecture PRD

Overview

mcpctl is a kubectl-inspired system for managing MCP (Model Context Protocol) servers. It consists of 4 components arranged in a 3-tier architecture:

Claude Code
    |
    v (stdio - MCP protocol)
mcplocal (Local Daemon - runs on developer machine)
    |
    v (HTTP REST)
mcpd (External Daemon - runs on server/NAS)
    |
    v (Docker API / K8s API)
mcp_servers (MCP server containers)

Components

1. mcpctl (CLI Tool)

  • Package: src/cli/ (@mcpctl/cli)
  • What it is: kubectl-like CLI for managing the entire system
  • Talks to: mcplocal (local daemon) via HTTP REST
  • Key point: mcpctl does NOT talk to mcpd directly. It always goes through mcplocal.
  • Distributed as: RPM package via Gitea registry (bun compile + nfpm)
  • Commands: get, describe, apply, setup, instance, claude, project, backup, restore, config, status

2. mcplocal (Local Daemon)

  • Package: src/local-proxy/ (rename to src/mcplocal/)
  • What it is: Local daemon running on the developer's machine
  • Talks to: mcpd (external daemon) via HTTP REST
  • Exposes to Claude: MCP protocol via stdio (tools, resources, prompts)
  • Exposes to mcpctl: HTTP REST API for management commands

Core responsibility: LLM Pre-processing

This is the intelligence layer. When Claude asks for data from MCP servers, mcplocal:

  1. Receives Claude's request (e.g., "get Slack messages about security")
  2. Uses a local/cheap LLM (Gemini CLI binary, Ollama, vLLM, DeepSeek API) to interpret what Claude actually wants
  3. Sends narrow, filtered requests to mcpd which forwards to the actual MCP servers
  4. Receives raw results from MCP servers (via mcpd)
  5. Uses the local LLM again to filter/summarize results - extracting only what's relevant
  6. Returns the smallest response that still covers everything relevant to Claude

Why: Claude Code tokens are expensive. Instead of dumping 500 Slack messages into Claude's context window, mcplocal uses a cheap LLM to pre-filter to the 12 relevant ones.
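The two-stage pipeline above can be sketched as a single function with the cheap LLM and the mcpd forwarder injected as callbacks. This is a minimal illustration, not the actual mcplocal implementation: the prompt wording, function names, and the comma-separated-indices response format are all assumptions. It also shows the fallback behavior from the non-functional requirements: if the local LLM is unavailable, data passes through unfiltered.

```typescript
// Hypothetical sketch of mcplocal's two-stage LLM pre-processing pipeline.
// `callCheapLlm` and `forwardToMcpd` are injected so both stages stay testable.

type LlmFn = (prompt: string) => Promise<string>;
type ForwardFn = (request: object) => Promise<string[]>;

async function preprocessToolCall(
  userIntent: string,
  callCheapLlm: LlmFn,
  forwardToMcpd: ForwardFn,
): Promise<string[]> {
  // Stage 1: interpret the intent into a narrow query (prompt is illustrative)
  let query = userIntent;
  try {
    query = await callCheapLlm(`Turn this request into a narrow search query: ${userIntent}`);
  } catch {
    // LLM unavailable: forward the request as-is (see non-functional requirements)
  }

  const rawResults = await forwardToMcpd({ query });

  // Stage 2: filter raw results down to the relevant subset
  try {
    const kept = await callCheapLlm(
      `From these ${rawResults.length} results, return the comma-separated indices ` +
        `of the relevant ones:\n` + rawResults.join("\n"),
    );
    const indices = kept.split(",").map((s) => Number(s.trim())).filter(Number.isInteger);
    return indices
      .map((i) => rawResults[i])
      .filter((r): r is string => r !== undefined);
  } catch {
    return rawResults; // fallback: pass everything through unfiltered
  }
}
```

The same structure works for the 500-messages example: stage 1 narrows the Slack query, stage 2 keeps only the relevant subset of what comes back.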

LLM Provider Strategy (already partially exists):

  • Gemini CLI binary (local, free)
  • Ollama (local, free)
  • vLLM (local, free)
  • DeepSeek API (cheap)
  • OpenAI API (fallback)
  • Anthropic API (fallback)
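The preference order above (local/free first, paid fallbacks last) suggests a simple selection helper. A minimal sketch, assuming a hypothetical `LlmProvider` interface with an injected availability check; the provider names mirror the list, but the config shape is illustrative:

```typescript
// Pick the first available LLM provider in preference order.
interface LlmProvider {
  name: string;
  isAvailable: () => boolean;
}

const PREFERENCE_ORDER = ["gemini-cli", "ollama", "vllm", "deepseek", "openai", "anthropic"];

function selectProvider(providers: LlmProvider[]): LlmProvider | undefined {
  const byName = new Map(providers.map((p) => [p.name, p]));
  for (const name of PREFERENCE_ORDER) {
    const candidate = byName.get(name);
    if (candidate?.isAvailable()) return candidate;
  }
  return undefined; // caller falls back to unfiltered pass-through
}
```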

Additional mcplocal responsibilities:

  • MCP protocol routing (namespace tools: slack/send_message, jira/create_issue)
  • Connection health monitoring for upstream MCP servers
  • Caching frequently requested data
  • Proxying mcpctl management commands to mcpd

3. mcpd (External Daemon)

  • Package: src/mcpd/ (@mcpctl/mcpd)
  • What it is: Server-side daemon that runs on centralized infrastructure (Synology NAS, cloud server, etc.)
  • Deployed via: Docker Compose (Dockerfile + docker-compose.yml)
  • Database: PostgreSQL for state, audit logs, access control

Core responsibilities:

  • Deploy and run MCP server containers (Docker now, Kubernetes later)
  • Instance lifecycle management: start, stop, restart, logs, inspect
  • MCP server registry: Store server definitions, configuration templates, profiles
  • Project management: Group MCP profiles into projects for Claude sessions
  • Auditing: Log every operation - who ran what, when, with what result
  • Access management: Users, sessions, permissions - who can access which MCP servers
  • Credential storage: MCP servers often need API tokens (Slack, Jira, GitHub) - stored securely on server side, never exposed to local machine
  • Backup/restore: Export and import configuration

Key point: mcpd holds the credentials. When mcplocal asks mcpd to query Slack, mcpd runs the Slack MCP server container with the proper SLACK_TOKEN injected - mcplocal never sees the token.
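The injection step can be sketched as a pure helper that assembles the container's `Env` array (the `["KEY=value", ...]` shape dockerode expects) from server-side secret storage. The `ServerDef` fields, the `name/KEY` secret-path convention, and the `SecretStore` callback are hypothetical; the point is that tokens are resolved inside mcpd's process and never cross to mcplocal:

```typescript
// Assemble container env vars from mcpd's server-side secret store.
interface ServerDef {
  name: string;
  requiredSecrets: string[]; // e.g. ["SLACK_TOKEN"]
  staticEnv?: Record<string, string>;
}

type SecretStore = (key: string) => string | undefined;

function buildContainerEnv(def: ServerDef, getSecret: SecretStore): string[] {
  const env: string[] = Object.entries(def.staticEnv ?? {}).map(([k, v]) => `${k}=${v}`);
  for (const key of def.requiredSecrets) {
    const value = getSecret(`${def.name}/${key}`); // illustrative key convention
    if (value === undefined) throw new Error(`missing secret ${key} for ${def.name}`);
    env.push(`${key}=${value}`); // dockerode accepts Env as ["KEY=value", ...]
  }
  return env;
}
```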

4. mcp_servers (MCP Server Containers)

  • What they are: The actual MCP server processes (Slack, Jira, GitHub, Terraform, filesystem, postgres, etc.)
  • Managed by: mcpd via Docker/Podman API
  • Network: Isolated network, only accessible by mcpd
  • Credentials: Injected by mcpd as environment variables
  • Communication: MCP protocol (stdio or SSE/HTTP) between mcpd and the containers

Data Flow Examples

Example 1: Claude asks for Slack messages

Claude: "Get messages about security incidents from the last week"
    |
    v (MCP tools/call: slack/search_messages)
mcplocal:
    1. Intercepts the tool call
    2. Calls local Gemini: "User wants security incident messages from last week.
       Generate optimal Slack search query and date filters."
    3. Gemini returns: query="security incident OR vulnerability OR CVE", after="2024-01-15"
    4. Sends filtered request to mcpd
    |
    v (HTTP POST /api/v1/mcp/proxy)
mcpd:
    1. Looks up Slack MCP instance (injects SLACK_TOKEN)
    2. Forwards narrowed query to Slack MCP server container
    3. Returns raw results (200 messages)
    |
    v (response)
mcplocal:
    1. Receives 200 messages
    2. Calls local Gemini: "Filter these 200 Slack messages. Keep only those
       directly about security incidents. Return message IDs and 1-line summaries."
    3. Gemini returns: 15 relevant messages with summaries
    4. Returns filtered result to Claude
    |
    v (MCP response: 15 messages instead of 200)
Claude: processes only the relevant 15 messages

Example 2: mcpctl management command

$ mcpctl get servers
    |
    v (HTTP GET)
mcplocal:
    1. Recognizes this is a management command (not MCP data)
    2. Proxies directly to mcpd (no LLM processing needed)
    |
    v (HTTP GET /api/v1/servers)
mcpd:
    1. Queries PostgreSQL for server definitions
    2. Returns list
    |
    v (proxied response)
mcplocal -> mcpctl -> formatted table output

Example 3: mcpctl instance management

$ mcpctl instance start slack
    |
    v
mcplocal -> mcpd:
    1. Creates Docker container for Slack MCP server
    2. Injects SLACK_TOKEN from secure storage
    3. Connects to isolated mcp-servers network
    4. Logs audit entry: "user X started slack instance"
    5. Returns instance status

What Already Exists (completed work)

Done and reusable as-is:

  • Project structure: pnpm monorepo, TypeScript strict mode, Vitest, ESLint
  • Database schema: Prisma + PostgreSQL (User, McpServer, McpProfile, Project, McpInstance, AuditLog)
  • mcpd server framework: Fastify 5, routes, services, repositories, middleware
  • mcpd MCP server CRUD: registration, profiles, projects
  • mcpd Docker container management: dockerode, instance lifecycle
  • mcpd audit logging, health monitoring, metrics, backup/restore
  • mcpctl CLI framework: Commander.js, commands, config, API client, formatters
  • mcpctl RPM distribution: bun compile, nfpm, Gitea publishing, shell completions
  • MCP protocol routing in local-proxy: namespace tools, resources, prompts
  • LLM provider abstractions: OpenAI, Anthropic, Ollama adapters (defined but unused)
  • Shared types and profile templates

Needs rework:

  • mcpctl currently talks to mcpd directly -> must talk to mcplocal instead
  • local-proxy is just a dumb router -> needs LLM pre-processing intelligence
  • local-proxy has no HTTP API for mcpctl -> needs REST endpoints for management proxying
  • mcpd has no MCP proxy endpoint -> needs endpoint that mcplocal can call to execute MCP tool calls on managed instances
  • No integration between LLM providers and MCP request/response pipeline

New Tasks Needed

Phase 1: Rename and restructure local-proxy -> mcplocal

  • Rename src/local-proxy/ to src/mcplocal/
  • Update all package references and imports
  • Add HTTP REST server (Fastify) alongside existing stdio server
  • mcplocal needs TWO interfaces: stdio for Claude, HTTP for mcpctl

Phase 2: mcplocal management proxy

  • Add REST endpoints that mirror mcpd's API (get servers, instances, projects, etc.)
  • mcpctl config changes: daemonUrl now points to mcplocal (e.g., localhost:3200) instead of mcpd
  • mcplocal proxies management requests to mcpd (configurable mcpdUrl e.g., http://nas:3100)
  • Pass-through with no LLM processing for management commands

Phase 3: mcpd MCP proxy endpoint

  • Add /api/v1/mcp/proxy endpoint to mcpd
  • Accepts: { serverId, method, params } - execute an MCP tool call on a managed instance
  • mcpd looks up the instance, connects to the container, executes the MCP call, returns result
  • This is how mcplocal talks to MCP servers without needing direct Docker access
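The core of the proxy endpoint can be sketched independently of the Fastify wiring, using the `{ serverId, method, params }` request shape from above. The `InstanceRegistry` stand-in for mcpd's instance lookup and MCP client connection is an assumption, as are the chosen status codes for the error cases:

```typescript
// Core logic of POST /api/v1/mcp/proxy, separated from HTTP framework wiring.
interface ProxyRequest {
  serverId: string;
  method: string;
  params?: Record<string, unknown>;
}

type McpCall = (method: string, params?: Record<string, unknown>) => Promise<unknown>;
type InstanceRegistry = Map<string, McpCall>;

async function handleMcpProxy(req: ProxyRequest, registry: InstanceRegistry) {
  const call = registry.get(req.serverId);
  if (!call) return { status: 404, body: { error: `unknown server ${req.serverId}` } };
  try {
    const result = await call(req.method, req.params);
    return { status: 200, body: { result } };
  } catch (err) {
    // The instance exists but the MCP call failed: treat as an upstream error
    return { status: 502, body: { error: String(err) } };
  }
}
```

A Fastify route would then just parse the body, call `handleMcpProxy`, and reply with the returned status and body.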

Phase 4: LLM pre-processing pipeline in mcplocal

  • Create request interceptor in mcplocal's MCP router
  • Before forwarding tools/call to mcpd, run the request through LLM for interpretation
  • After receiving response from mcpd, run through LLM for filtering/summarization
  • LLM provider selection based on config (prefer local/cheap models)
  • Configurable: enable/disable pre-processing per server or per tool
  • Bypass for simple operations (list, create, delete - no filtering needed)
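The bypass decision above can be sketched as a small predicate over the namespaced tool name. The config shape, the verb-prefix heuristic, and the per-server opt-out set are all illustrative assumptions:

```typescript
// Decide whether a tools/call should go through the LLM pipeline or bypass it.
interface PreprocessConfig {
  enabled: boolean;
  bypassVerbs: string[];        // e.g. ["list", "create", "delete"]
  disabledServers: Set<string>; // per-server opt-out
}

function needsPreprocessing(namespacedTool: string, cfg: PreprocessConfig): boolean {
  if (!cfg.enabled) return false;
  // Namespaced tool names look like "slack/search_messages"
  const [server, tool = ""] = namespacedTool.split("/");
  if (cfg.disabledServers.has(server)) return false;
  // Simple CRUD-style verbs don't benefit from LLM filtering
  return !cfg.bypassVerbs.some((verb) => tool.startsWith(verb));
}
```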

Phase 5: Smart context optimization

  • Token counting: estimate how many tokens the raw response would consume
  • Decision logic: if raw response < threshold, skip LLM filtering (not worth the latency)
  • If raw response > threshold, filter with LLM
  • Cache LLM filtering decisions for repeated similar queries
  • Metrics: track tokens saved, latency added by filtering
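The threshold decision and cache key could look like the following sketch. The ~4-characters-per-token estimate is a rough rule of thumb (swap in a real tokenizer for accuracy), and the default threshold and key-normalization scheme are assumptions:

```typescript
// Rough token estimate: ~4 chars per token for English-like text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Below the threshold, LLM filtering costs more latency than it saves.
function shouldFilter(rawResponse: string, thresholdTokens = 2000): boolean {
  return estimateTokens(rawResponse) > thresholdTokens;
}

// Cache key for repeated similar queries (normalization is illustrative).
function filterCacheKey(serverId: string, method: string, query: string): string {
  return `${serverId}:${method}:${query.trim().toLowerCase()}`;
}
```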

Phase 6: mcpctl -> mcplocal migration

  • Update mcpctl's default daemonUrl to point to mcplocal (localhost:3200)
  • Update all CLI commands to work through mcplocal proxy
  • Add mcpctl config set mcpd-url <url> for configuring upstream mcpd
  • Add mcpctl config set mcplocal-url <url> for configuring local daemon
  • Health check: mcpctl status shows both mcplocal and mcpd connectivity
  • Shell completions update if needed

Phase 7: End-to-end integration testing

  • Test full flow: mcpctl -> mcplocal -> mcpd -> mcp_server -> response -> LLM filter -> Claude
  • Test management commands pass through correctly
  • Test LLM pre-processing reduces context window size
  • Test credential isolation (mcplocal never sees MCP server credentials)
  • Test health monitoring across all tiers

Authentication & Authorization

Database ownership

  • mcpd owns the database (PostgreSQL). It is the only component that talks to the DB.
  • mcplocal has NO database. It is stateless (config file only).
  • mcpctl has NO database. It stores user credentials locally in ~/.mcpctl/config.yaml.

Auth flow

mcpctl login
    |
    v (user enters mcpd URL + credentials)
mcpctl stores API token in ~/.mcpctl/config.yaml
    |
    v (passes token to mcplocal config)
mcplocal authenticates to mcpd using Bearer token on every request
    |
    v (Authorization: Bearer <token>)
mcpd validates token against Session table in PostgreSQL
    |
    v (authenticated request proceeds)

mcpctl responsibilities

  • mcpctl login command: prompts user for mcpd URL and credentials (username/password or API token)
  • mcpctl login calls mcpd's auth endpoint to get a session token
  • Stores the token in ~/.mcpctl/config.yaml (or ~/.mcpctl/credentials with restricted permissions)
  • Passes the token to mcplocal (either via config or as startup argument)
  • mcpctl logout command: invalidates the session token

mcplocal responsibilities

  • Reads auth token from its config (set by mcpctl)
  • Attaches Authorization: Bearer <token> header to ALL requests to mcpd
  • If mcpd returns 401, mcplocal returns appropriate error to mcpctl/Claude
  • Does NOT store credentials itself - they come from mcpctl's config

mcpd responsibilities

  • Owns User and Session tables
  • Provides auth endpoints: POST /api/v1/auth/login, POST /api/v1/auth/logout
  • Validates Bearer tokens on every request via auth middleware (already exists)
  • Returns 401 for invalid/expired tokens
  • Audit logs include the authenticated user

Non-functional Requirements

  • mcplocal must start fast (developer's machine, runs per-session or as daemon)
  • LLM pre-processing must not add more than 2-3 seconds of latency
  • If local LLM is unavailable, fall back to passing data through unfiltered
  • All components must be independently deployable and testable
  • mcpd must hold no state outside its database, so it remains horizontally scalable