mcpctl/.taskmaster/docs/prd-gated-prompts.md
Michal ecc9c48597 feat: gated project experience & prompt intelligence
Implements the full gated session flow and prompt intelligence system:

- Prisma schema: add gated, priority, summary, chapters, linkTarget fields
- Session gate: state machine (gated → begin_session → ungated) with LLM-powered
  tool selection based on prompt index
- Tag matcher: intelligent prompt-to-tool matching with project/server/action tags
- LLM selector: tiered provider selection (fast for gating, heavy for complex tasks)
- Link resolver: cross-project MCP resource references (project/server:uri format)
- Prompt summary service: LLM-generated summaries and chapter extraction
- System project bootstrap: ensures default project exists on startup
- Structural link health checks: enrichWithLinkStatus on prompt GET endpoints
- CLI: create prompt --priority/--link, create project --gated/--no-gated,
  describe project shows prompts section, get prompts shows PRI/LINK/STATUS
- Apply/edit: priority, linkTarget, gated fields supported
- Shell completions: fish updated with new flags
- 1,253 tests passing across all packages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 23:22:42 +00:00


PRD: Gated Project Experience & Prompt Intelligence

Overview

When 300 developers connect their LLM clients (Claude Code, Cursor, etc.) to mcpctl projects, they need relevant context — security policies, architecture decisions, operational runbooks — without flooding the context window. This feature introduces a gated session flow where the client LLM drives its own context retrieval through keyword-based matching, with the proxy providing a prompt index and encouraging ongoing discovery.

Problem

  • Injecting all prompts into instructions doesn't scale (hundreds of pages of policies)
  • Exposing prompts only as MCP resources means LLMs never read them
  • An index-only approach works for small numbers but breaks down at scale
  • No mechanism to link external knowledge (Notion, Docmost) as prompts
  • LLMs tend to work with whatever they have rather than proactively seek more context

Core Concepts

Gated Experience

A project-level flag (gated: boolean, default: true) that controls whether sessions go through a keyword-driven prompt retrieval flow before accessing project tools and resources.

Flow (A + C):

  1. On initialize, instructions include the prompt index (names + summaries for all prompts, up to a reasonable cap) and tell client LLM: "Call begin_session with 5 keywords describing your task"
  2. If client obeys: begin_session({ tags: ["zigbee", "lights", "mqtt", "pairing", "automation"] }) → prompt selection (see below) → returns matched prompt content + full prompt index + encouragement to retrieve more → session ungated
  3. If client ignores: First tools/call is intercepted → keywords extracted from tool name + arguments → same prompt selection → briefing injected alongside tool result → session ungated
  4. Ongoing retrieval: Client can call read_prompts({ tags: ["security", "vpn"] }) at any point to retrieve more prompts. The prompt index is always visible so the client LLM can see what's available.

Prompt selection — tiered approach:

  • Primary (heavy LLM available): Tags + full prompt index (names, priorities, summaries, chapters) are sent to the heavy LLM (e.g. Gemini). The LLM understands synonyms, context, and intent — it knows "zigbee" relates to "Z2M" and "Zigbee2MQTT", and that someone working on "lights" probably needs the "common-mistakes" prompt about pairing. The LLM returns a ranked list of relevant prompt names with brief explanations of why each is relevant. The heavy LLM may use the fast LLM for preprocessing if needed (e.g. generating missing summaries on the fly).
  • Fallback (no LLM, or llmProvider=none): Deterministic keyword-based tag matching against summaries/chapters with byte-budget allocation (see "Tag Matching Algorithm" below). Same approach as ResponsePaginator's byte-based fallback. Triggered when: no LLM providers configured, project has llmProvider: "none", or local override sets provider: "none".
  • Hybrid (both paths always available): Even when heavy LLM does the initial selection, the read_prompts({ tags: [...] }) tool always uses keyword matching. This way the client LLM can retrieve specific prompts by keyword that the heavy LLM may have missed. The LLM is smart about context, keywords are precise about names — together they cover both fuzzy and exact retrieval.

LLM availability resolution (same chain as existing LLM features):

  • Project llmProvider: "none" → no LLM, keyword fallback only
  • Project llmProvider: null → inherit from global config
  • Local override provider: "none" → no LLM, keyword fallback only
  • No providers configured → keyword fallback only
  • Otherwise → use heavy LLM for begin_session, fast LLM for summary generation
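The resolution chain above can be sketched as a small helper. The type and field names here are illustrative assumptions, not the actual mcpctl config shape:

```typescript
// Sketch of the LLM availability chain. Field names are assumptions,
// not the real mcpctl config shape.
type ProviderSetting = string | null; // provider name, "none", or null (inherit)

interface LlmAvailability {
  projectProvider: ProviderSetting;  // project-level llmProvider
  localOverride?: ProviderSetting;   // local override, if present
  configuredProviders: string[];     // globally configured providers
}

// Returns the provider to use, or null for keyword-fallback-only mode.
function resolveLlmProvider(cfg: LlmAvailability): string | null {
  if (cfg.localOverride === "none") return null;   // explicit local opt-out
  if (cfg.projectProvider === "none") return null; // explicit project opt-out
  // null project setting inherits from global config; no providers → fallback
  return cfg.projectProvider ?? cfg.configuredProviders[0] ?? null;
}
```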

Encouraging Retrieval

LLMs tend to proceed with incomplete information rather than seek more context. The system must actively counter this at multiple points:

In initialize instructions:

You have access to project knowledge containing policies, architecture decisions,
and guidelines. Some may contain critical rules about what you're doing. After your
initial briefing, if you're unsure about conventions, security requirements, or
best practices — request more context using read_prompts. It's always better to
check than to guess wrong. The project may have specific rules you don't know about yet.

In begin_session response (after matched prompts):

Other prompts available that may become relevant as your work progresses:
- security-policies: Network segmentation, firewall rules, VPN access
- naming-conventions: Service and resource naming standards
- ...
If any of these seem related to what you're doing now or later, request them
with read_prompts({ tags: [...] }) or resources/read. Don't assume you have
all the context — check when in doubt.

In read_prompts response:

Remember: you can request more prompts at any time with read_prompts({ tags: [...] }).
The project may have additional guidelines relevant to your current approach.

The tone is not "here's optional reading" but "there are rules you might not know about, and violating them costs more than reading them."

Prompt Priority (1-10)

Every prompt has a priority level that influences selection order and byte-budget allocation:

Range  Meaning    Behavior
1-3    Reference  Low priority, included only on strong keyword match
4-6    Standard   Default priority, included on moderate keyword match
7-9    Important  High priority, lower match threshold
10     Critical   Always included in full, regardless of keyword match (guardrails, common mistakes)

Default priority for new prompts: 5.

Prompt Summaries & Chapters (Auto-generated)

Each prompt gets auto-generated metadata used for the prompt index and tag matching:

  • summary (string, ~20 words) — one-line description of what the prompt covers
  • chapters (string[]) — key sections/topics extracted from content

Generation pipeline:

  • Fast LLM available: Summarize content, extract key topics
  • No fast LLM: First sentence of content + markdown headings via regex
  • Regenerated on prompt create/update
  • Cached on the prompt record
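The no-LLM branch of the pipeline could look roughly like this. A minimal sketch, not the actual mcpctl implementation:

```typescript
// No-LLM fallback sketch: first sentence of the body as the summary,
// markdown headings as chapters. Not the actual mcpctl implementation.
function fallbackSummary(content: string): { summary: string; chapters: string[] } {
  // Strip heading lines so the "first sentence" comes from the body text.
  const body = content.replace(/^#{1,6}\s.*$/gm, "").trim();
  const firstSentence = body.split(/(?<=[.!?])\s/)[0] ?? "";
  // Collect markdown heading text as chapter names.
  const chapters = [...content.matchAll(/^#{1,6}\s+(.+)$/gm)].map(m => m[1].trim());
  return { summary: firstSentence.slice(0, 160), chapters };
}
```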

Tag Matching Algorithm (No-LLM Fallback)

When no local LLM is available, the system falls back to a deterministic retrieval algorithm:

  1. Client provides tags (5 keywords from begin_session, or extracted from tool call)
  2. For each prompt, compute a match score:
    • Check tags against prompt summary and chapters (case-insensitive substring match)
    • Score = number_of_matching_tags * base_priority
    • Priority 10 prompts: score = infinity (always included)
  3. Sort by score descending
  4. Fill a byte budget (configurable, default ~8KB) from top down:
    • Include full content until budget exhausted
    • Remaining matched prompts: include as index entries (name + summary)
    • Non-matched prompts: listed as names only in the "other prompts available" section
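Steps 2-4 above can be sketched as follows; the shapes and names (IndexedPrompt, selectPrompts) are illustrative, not the mcpctl API:

```typescript
// Illustrative sketch of the fallback scoring and byte-budget fill.
// Type and function names are assumptions, not the mcpctl API.
interface IndexedPrompt {
  name: string;
  priority: number; // 1-10
  summary: string;
  chapters: string[];
  content: string;
}

// Step 2: case-insensitive substring match, score = hits * priority.
function scorePrompt(p: IndexedPrompt, tags: string[]): number {
  if (p.priority === 10) return Infinity; // always included in full
  const haystack = `${p.summary} ${p.chapters.join(" ")}`.toLowerCase();
  const hits = tags.filter(t => haystack.includes(t.toLowerCase())).length;
  return hits * p.priority;
}

// Steps 3-4: sort descending, fill the byte budget (default ~8KB) from the top.
function selectPrompts(prompts: IndexedPrompt[], tags: string[], budget = 8192) {
  const ranked = prompts
    .map(p => ({ p, score: scorePrompt(p, tags) }))
    .sort((a, b) => b.score - a.score);
  const full: string[] = [];      // full content included
  const indexOnly: string[] = []; // matched but over budget: name + summary
  const namesOnly: string[] = []; // non-matched: names only
  let used = 0;
  for (const { p, score } of ranked) {
    if (score === 0) { namesOnly.push(p.name); continue; }
    const size = new TextEncoder().encode(p.content).length;
    if (score === Infinity || used + size <= budget) {
      full.push(p.name); // priority 10 bypasses the budget check entirely
      used += size;
    } else {
      indexOnly.push(p.name);
    }
  }
  return { full, indexOnly, namesOnly };
}
```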

When begin_session is skipped (intercept path):

  • Extract keywords from tool name + arguments (e.g., home-assistant/get_entities({ domain: "light" }) → tags: ["home-assistant", "entities", "light"])
  • Run same matching algorithm
  • Inject briefing alongside the real tool result
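A rough sketch of the extraction step. The splitting rules here are illustrative; the document's example yields ["home-assistant", "entities", "light"], so the real tokenizer presumably keeps hyphenated names whole and drops generic verbs like get:

```typescript
// Illustrative keyword extraction for the intercept path. The real rules
// likely differ (e.g. filtering generic verbs such as "get").
function extractTags(toolName: string, args: Record<string, unknown>): string[] {
  // Split the tool name on separators, keeping hyphenated names intact.
  const fromName = toolName.split(/[\/_.]+/).filter(Boolean);
  // Take short string argument values as additional keywords.
  const fromArgs = Object.values(args)
    .filter((v): v is string => typeof v === "string" && v.length <= 32);
  return [...new Set([...fromName, ...fromArgs].map(t => t.toLowerCase()))];
}
```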

read_prompts Tool (Ongoing Retrieval)

Available after session is ungated. Allows the client LLM to request more context at any point:

{
  "name": "read_prompts",
  "description": "Request additional project context by keywords. Use this whenever you need guidelines, policies, or conventions related to your current work. It's better to check than to guess.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "tags": {
        "type": "array",
        "items": { "type": "string" },
        "description": "Keywords describing what context you need (e.g. [\"security\", \"vpn\", \"firewall\"])"
      }
    },
    "required": ["tags"]
  }
}

Returns matched prompt content + the prompt index reminder.

Prompt Links (Cross-Project Resources)

A prompt can be a link to an MCP resource in another project's server. The linked content is fetched server-side (by the proxy, not the client), enforcing RBAC.

Format: project/server:resource-uri
Example: system-public/docmost-mcp:docmost://pages/architecture-overview
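A parser sketch for this format; the helper and type names are hypothetical:

```typescript
// Hypothetical parser for the project/server:resource-uri link format.
interface LinkTarget { project: string; server: string; uri: string; }

function parseLinkTarget(raw: string): LinkTarget | null {
  // project and server segments cannot contain "/" or ":"; the resource
  // URI may contain both (e.g. docmost://pages/...).
  const m = raw.match(/^([^\/:]+)\/([^\/:]+):(.+)$/);
  if (!m) return null;
  return { project: m[1], server: m[2], uri: m[3] };
}
```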

Properties:

  • The proxy fetches linked content using the source project's service account
  • Client LLM never gets direct access to the source MCP server
  • Dead links are detected and marked (health check on link resolution)
  • Dead links generate error log entries

RBAC for links:

  • Creating a link requires edit permission on RBAC in the target project
  • A service account permission is created on the source project for the linked resource
  • Default: admin group members can manage links

Schema Changes

Project

Add field:

  • gated: boolean (default: true)

Prompt

Add fields:

  • priority: integer (1-10, default: 5)
  • summary: string | null (auto-generated)
  • chapters: string[] | null (auto-generated, stored as JSON)
  • linkTarget: string | null (format: project/server:resource-uri, null for regular prompts)

PromptRequest

Add field:

  • priority: integer (1-10, default: 5)

API Changes

Modified Endpoints

  • POST /api/v1/prompts — accept priority, linkTarget
  • PUT /api/v1/prompts/:id — accept priority (not linkTarget — links are immutable, delete and recreate)
  • POST /api/v1/promptrequests — accept priority
  • GET /api/v1/prompts — return priority, summary, linkTarget, linkStatus (alive/dead/unknown)
  • GET /api/v1/projects/:name/prompts/visible — return priority, summary, chapters

New Endpoints

  • POST /api/v1/prompts/:id/regenerate-summary — force re-generation of summary/chapters
  • GET /api/v1/projects/:name/prompt-index — returns compact index (name, priority, summary, chapters)

MCP Protocol Changes (mcplocal router)

Session State

Router tracks per-session state:

  • gated: boolean — starts true if project is gated
  • tags: string[] — accumulated tags from begin_session + read_prompts calls
  • retrievedPrompts: Set<string> — prompts already sent to client (avoid re-sending)

Gated Session Flow

  1. On initialize: instructions include prompt index + gate message + retrieval encouragement
  2. tools/list while gated: only begin_session visible (progressive tool exposure)
  3. begin_session({ tags }): match tags → return briefing + prompt index + encouragement → ungate → send notifications/tools/list_changed
  4. On first tools/call while still gated: extract keywords → match → inject briefing alongside result → ungate
  5. After ungating: all tools work normally, read_prompts available for ongoing retrieval
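The per-session gate described in the steps above can be sketched as a tiny state holder; class and method names are illustrative:

```typescript
// Minimal sketch of the per-session gate; names are illustrative,
// not the mcplocal router internals.
type GateState = "gated" | "ungated";

class SessionGate {
  private state: GateState;

  constructor(projectGated: boolean) {
    // Non-gated projects skip the gate entirely.
    this.state = projectGated ? "gated" : "ungated";
  }

  // tools/list: progressive exposure (step 2 vs step 5).
  visibleTools(upstream: string[]): string[] {
    return this.state === "gated"
      ? ["begin_session"]
      : ["read_prompts", ...upstream];
  }

  // begin_session or an intercepted first tools/call both ungate the session.
  // Returns true on a gated→ungated transition, after which the caller
  // sends notifications/tools/list_changed.
  ungate(): boolean {
    const wasGated = this.state === "gated";
    this.state = "ungated";
    return wasGated;
  }
}
```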

begin_session Tool

{
  "name": "begin_session",
  "description": "Start your session by providing 5 keywords that describe your current task. You'll receive relevant project context, policies, and guidelines. Required before using other tools.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "tags": {
        "type": "array",
        "items": { "type": "string" },
        "maxItems": 10,
        "description": "5 keywords describing your current task (e.g. [\"zigbee\", \"automation\", \"lights\", \"mqtt\", \"pairing\"])"
      }
    },
    "required": ["tags"]
  }
}

Response structure:

[Priority 10 prompts — always, full content]

[Tag-matched prompts — full content, byte-budget-capped, priority-ordered]

Other prompts available that may become relevant as your work progresses:
- <name>: <summary>
- <name>: <summary>
- ...
If any of these seem related to what you're doing, request them with
read_prompts({ tags: [...] }). Don't assume you have all the context — check.

Prompt Index in Instructions

The initialize instructions include a compact prompt index so the client LLM can see what knowledge exists. Format per prompt: - <name>: <summary> (~100 chars max per entry).

Cap: if more than 50 prompts, include only priority 7+ in instructions index. Full index always available via resources/list.
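A sketch of the cap rule and per-entry formatting; the function names are assumed:

```typescript
// Sketch of the instructions-index cap: over 50 prompts → priority 7+ only.
// Function names are assumptions.
interface IndexEntry { name: string; priority: number; summary: string; }

function instructionsIndex(entries: IndexEntry[], cap = 50): IndexEntry[] {
  return entries.length > cap ? entries.filter(e => e.priority >= 7) : entries;
}

// "- <name>: <summary>", truncated to ~100 chars per entry.
function formatIndex(entries: IndexEntry[]): string {
  return entries.map(e => `- ${e.name}: ${e.summary}`.slice(0, 100)).join("\n");
}
```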

CLI Changes

New/Modified Commands

  • mcpctl create prompt <name> --priority <1-10> — create with priority
  • mcpctl create prompt <name> --link <project/server:uri> — create linked prompt
  • mcpctl get prompt -A — show all prompts across all projects, with link targets
  • mcpctl describe project <name> — show gated status, session greeting, prompt table
  • mcpctl edit project <name> — gated field editable

$ mcpctl get prompt -A
PROJECT           NAME                 PRIORITY  LINK                                          STATUS
homeautomation    security-policies    8         -                                             -
homeautomation    architecture-adr     6         system-public/docmost-mcp:docmost://pages/a1  alive
homeautomation    common-mistakes      10        -                                             -
system-public     onboarding           4         -                                             -

Describe Project Output

$ mcpctl describe project homeautomation
Name:         homeautomation
Gated:        true
LLM Provider: gemini-cli
...

Session greeting:
  You have access to project knowledge containing policies, architecture decisions,
  and guidelines. Call begin_session with 5 keywords describing your task to receive
  relevant context. Some prompts contain critical rules — it's better to check than guess.

Prompts:
  NAME                 PRIORITY  TYPE    LINK
  common-mistakes      10        local   -
  security-policies    8         local   -
  architecture-adr     6         link    system-public/docmost-mcp:docmost://pages/a1
  stack                5         local   -

Testing Strategy

Full test coverage is required. Every new module, service, route, and algorithm must have comprehensive tests. No feature ships without tests.

Unit Tests (mcpd)

  • Prompt priority CRUD: create/update/get with priority field, default value, validation (1-10 range)
  • Prompt link CRUD: create with linkTarget, immutability (can't update linkTarget), delete
  • Prompt summary generation: auto-generation on create/update, regex fallback when no LLM
  • GET /api/v1/prompts with priority, linkTarget, linkStatus fields
  • GET /api/v1/projects/:name/prompt-index returns compact index
  • POST /api/v1/prompts/:id/regenerate-summary triggers re-generation
  • Project gated field: CRUD, default value

Unit Tests (mcplocal — gating flow)

  • State machine: gated → begin_session → ungated (happy path)
  • State machine: gated → tools/call intercepted → ungated (fallback path)
  • State machine: non-gated project skips gate entirely
  • LLM selection path: tags + prompt index sent to heavy LLM, ranked results returned, priority 10 always included
  • LLM selection path: heavy LLM uses fast LLM for missing summary generation
  • No-LLM fallback: tag matching score calculation, priority weighting, substring matching
  • No-LLM fallback: byte-budget exhaustion, priority ordering, index fallback, edge cases
  • Keyword extraction from tool calls: tool name parsing, argument extraction
  • begin_session response: matched content + index + encouragement text (both LLM and fallback paths)
  • read_prompts response: additional matches, deduplication against already-sent prompts (both paths)
  • Tools blocked while gated: return error directing to begin_session
  • tools/list while gated: only begin_session visible
  • tools/list after ungating: begin_session replaced by read_prompts + all upstream tools
  • Priority 10 always included regardless of tag match or budget
  • Prompt index in instructions: cap at 50, priority 7+ when over cap
  • Notifications: tools/list_changed sent after ungating
  • Link resolution: fetch content from source project's MCP server via service account
  • Dead link detection: source server unavailable, resource not found, permission denied
  • Dead link marking: status field updated, error logged
  • RBAC enforcement: link creation requires edit permission on target project RBAC
  • Service account permission: auto-created on source project for linked resource
  • Content isolation: client LLM cannot access source server directly

Unit Tests (CLI)

  • create prompt with --priority flag, validation
  • create prompt with --link flag, format validation
  • get prompt -A output: all projects, link targets, status columns
  • describe project output: gated status, session greeting, prompt table
  • edit project with gated field
  • Shell completions for new flags and resources

Integration Tests

  • End-to-end gated session: connect → begin_session with tags → tools available → correct prompts returned
  • End-to-end intercept: connect → skip begin_session → call tool → keywords extracted → briefing injected
  • End-to-end read_prompts: after ungating → request more context → additional prompts returned → no duplicates
  • Prompt link resolution: create link → fetch content → verify content matches source
  • Dead link lifecycle: create link → kill source → verify dead detection → restore → verify recovery
  • Priority ordering: create prompts at various priorities → verify selection order and budget allocation
  • Encouragement text: verify retrieval encouragement present in begin_session, read_prompts, and instructions

System Prompts (mcpctl-system project)

All gate messages, encouragement text, and briefing templates are stored as prompts in a special mcpctl-system project. This makes them editable at runtime via mcpctl edit prompt without code changes or redeployment.

Required System Prompts

Name                     Priority  Purpose
gate-instructions        10        Text injected into initialize instructions for gated projects. Tells client to call begin_session with 5 keywords.
gate-encouragement       10        Appended after begin_session response. Lists remaining prompts and encourages further retrieval.
read-prompts-reminder    10        Appended after read_prompts response. Reminds client that more context is available.
gate-intercept-preamble  10        Prepended to briefing when injected via tool call intercept (Option C fallback).
session-greeting         10        Shown in mcpctl describe project as the "hello prompt" — what client LLMs see on connect.

Bootstrap

The mcpctl-system project and its system prompts are created automatically on first startup (seed migration). They can be edited afterward but not deleted — delete attempts return an error.

How mcplocal Uses Them

On router initialization, mcplocal fetches system prompts from mcpd via:

GET /api/v1/projects/mcpctl-system/prompts/visible

These are cached with the same 60s TTL as project routers. The prompt content supports template variables:

  • {{prompt_index}} — replaced with the current project's prompt index
  • {{project_name}} — replaced with the current project name
  • {{matched_prompts}} — replaced with tag-matched prompt content
  • {{remaining_prompts}} — replaced with the list of non-matched prompts

This way the encouragement text, tone, and structure can be tuned by editing prompts — no code changes needed.
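Substitution of this kind can be as simple as the following sketch; the real helper may differ:

```typescript
// Sketch of {{variable}} substitution for system prompt templates.
// Unknown variables are left intact rather than replaced with empty text.
function renderTemplate(tpl: string, vars: Record<string, string>): string {
  return tpl.replace(/\{\{(\w+)\}\}/g, (match, key) => vars[key] ?? match);
}
```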

Security Considerations

  • Prompt links: content fetched server-side, client never gets direct access to source MCP server
  • RBAC: link creation requires edit permission on target project's RBAC
  • Service account: source project grants read access to linked resource only
  • Dead links: logged as errors, marked in listings, never expose source server errors to client
  • Tag extraction: sanitize tool call arguments before using as keywords (prevent injection)