mcpctl/.taskmaster/docs/prd-gated-prompts.md

# PRD: Gated Project Experience & Prompt Intelligence

## Overview

When 300 developers connect their LLM clients (Claude Code, Cursor, etc.) to mcpctl projects, they need relevant context — security policies, architecture decisions, operational runbooks — without flooding the context window. This feature introduces a gated session flow where the client LLM drives its own context retrieval through keyword-based matching, with the proxy providing a prompt index and encouraging ongoing discovery.

## Problem

- Injecting all prompts into instructions doesn't scale (hundreds of pages of policies)
- Exposing prompts only as MCP resources means LLMs never read them
- An index-only approach works for small numbers but breaks down at scale
- No mechanism to link external knowledge (Notion, Docmost) as prompts
- LLMs tend to work with whatever they have rather than proactively seek more context

## Core Concepts

### Gated Experience

A project-level flag (`gated: boolean`, default: `true`) that controls whether sessions go through a keyword-driven prompt retrieval flow before accessing project tools and resources.

**Flow (A + C):**

1. On `initialize`, instructions include the **prompt index** (names + summaries for all prompts, up to a reasonable cap) and tell client LLM: "Call `begin_session` with 5 keywords describing your task"
2. **If client obeys**: `begin_session({ tags: ["zigbee", "lights", "mqtt", "pairing", "automation"] })` → prompt selection (see below) → returns matched prompt content + full prompt index + encouragement to retrieve more → session ungated
3. **If client ignores**: First `tools/call` is intercepted → keywords extracted from tool name + arguments → same prompt selection → briefing injected alongside tool result → session ungated
4. **Ongoing retrieval**: Client can call `read_prompts({ tags: ["security", "vpn"] })` at any point to retrieve more prompts. The prompt index is always visible so the client LLM can see what's available.

**Prompt selection — tiered approach:**

- **Primary (heavy LLM available)**: Tags + full prompt index (names, priorities, summaries, chapters) are sent to the heavy LLM (e.g. Gemini). The LLM understands synonyms, context, and intent — it knows "zigbee" relates to "Z2M" and "Zigbee2MQTT", and that someone working on "lights" probably needs the "common-mistakes" prompt about pairing. The LLM returns a ranked list of relevant prompt names with brief explanations of why each is relevant. The heavy LLM may use the fast LLM for preprocessing if needed (e.g. generating missing summaries on the fly).
- **Fallback (no LLM, or `llmProvider=none`)**: Deterministic keyword-based tag matching against summaries/chapters with byte-budget allocation (see "Tag Matching Algorithm" below). Same approach as ResponsePaginator's byte-based fallback. Triggered when: no LLM providers configured, project has `llmProvider: "none"`, or local override sets `provider: "none"`.
- **Hybrid (both paths always available)**: Even when heavy LLM does the initial selection, the `read_prompts({ tags: [...] })` tool always uses keyword matching. This way the client LLM can retrieve specific prompts by keyword that the heavy LLM may have missed. The LLM is smart about context, keywords are precise about names — together they cover both fuzzy and exact retrieval.

**LLM availability resolution** (same chain as existing LLM features):
- Project `llmProvider: "none"` → no LLM, keyword fallback only
- Project `llmProvider: null` → inherit from global config
- Local override `provider: "none"` → no LLM, keyword fallback only
- No providers configured → keyword fallback only
- Otherwise → use heavy LLM for `begin_session`, fast LLM for summary generation

### Encouraging Retrieval

LLMs tend to proceed with incomplete information rather than seek more context. The system must actively counter this at multiple points:

**In `initialize` instructions:**
```
You have access to project knowledge containing policies, architecture decisions,
and guidelines. Some may contain critical rules about what you're doing. After your
initial briefing, if you're unsure about conventions, security requirements, or
best practices — request more context using read_prompts. It's always better to
check than to guess wrong. The project may have specific rules you don't know about yet.
```

**In `begin_session` response (after matched prompts):**
```
Other prompts available that may become relevant as your work progresses:
- security-policies: Network segmentation, firewall rules, VPN access
- naming-conventions: Service and resource naming standards
- ...
If any of these seem related to what you're doing now or later, request them
with read_prompts({ tags: [...] }) or resources/read. Don't assume you have
all the context — check when in doubt.
```

**In `read_prompts` response:**
```
Remember: you can request more prompts at any time with read_prompts({ tags: [...] }).
The project may have additional guidelines relevant to your current approach.
```

The tone is not "here's optional reading" but "there are rules you might not know about, and violating them costs more than reading them."

### Prompt Priority (1-10)

Every prompt has a priority level that influences selection order and byte-budget allocation:

| Range | Meaning | Behavior |
|-------|---------|----------|
| 1-3   | Reference | Low priority, included only on strong keyword match |
| 4-6   | Standard | Default priority, included on moderate keyword match |
| 7-9   | Important | High priority, lower match threshold |
| 10    | Critical | Always included in full, regardless of keyword match (guardrails, common mistakes) |

Default priority for new prompts: `5`.

### Prompt Summaries & Chapters (Auto-generated)

Each prompt gets auto-generated metadata used for the prompt index and tag matching:

- `summary` (string, ~20 words) — one-line description of what the prompt covers
- `chapters` (string[]) — key sections/topics extracted from content

Generation pipeline:
- **Fast LLM available**: Summarize content, extract key topics
- **No fast LLM**: First sentence of content + markdown headings via regex
- Regenerated on prompt create/update
- Cached on the prompt record

### Tag Matching Algorithm (No-LLM Fallback)

When no local LLM is available, the system falls back to a deterministic retrieval algorithm:

1. Client provides tags (5 keywords from `begin_session`, or extracted from tool call)
2. For each prompt, compute a match score:
   - Check tags against prompt `summary` and `chapters` (case-insensitive substring match)
   - Score = `number_of_matching_tags * base_priority`
   - Priority 10 prompts: score = infinity (always included)
3. Sort by score descending
4. Fill a byte budget (configurable, default ~8KB) from top down:
   - Include full content until budget exhausted
   - Remaining matched prompts: include as index entries (name + summary)
   - Non-matched prompts: listed as names only in the "other prompts available" section

**When `begin_session` is skipped (intercept path):**
- Extract keywords from tool name + arguments (e.g., `home-assistant/get_entities({ domain: "light" })` → tags: `["home-assistant", "entities", "light"]`)
- Run same matching algorithm
- Inject briefing alongside the real tool result

### `read_prompts` Tool (Ongoing Retrieval)

Available after session is ungated. Allows the client LLM to request more context at any point:

```json
{
  "name": "read_prompts",
  "description": "Request additional project context by keywords. Use this whenever you need guidelines, policies, or conventions related to your current work. It's better to check than to guess.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "tags": {
        "type": "array",
        "items": { "type": "string" },
        "description": "Keywords describing what context you need (e.g. [\"security\", \"vpn\", \"firewall\"])"
      }
    },
    "required": ["tags"]
  }
}
```

Returns matched prompt content + the prompt index reminder.

### Prompt Links

A prompt can be a **link** to an MCP resource in another project's server. The linked content is fetched server-side (by the proxy, not the client), enforcing RBAC.

Format: `project/server:resource-uri`
Example: `system-public/docmost-mcp:docmost://pages/architecture-overview`

Properties:
- The proxy fetches linked content using the source project's service account
- Client LLM never gets direct access to the source MCP server
- Dead links are detected and marked (health check on link resolution)
- Dead links generate error log entries

RBAC for links:
- Creating a link requires `edit` permission on RBAC in the target project
- A service account permission is created on the source project for the linked resource
- Default: admin group members can manage links

## Schema Changes

### Project

Add field:
- `gated: boolean` (default: `true`)

### Prompt

Add fields:
- `priority: integer` (1-10, default: 5)
- `summary: string | null` (auto-generated)
- `chapters: string[] | null` (auto-generated, stored as JSON)
- `linkTarget: string | null` (format: `project/server:resource-uri`, null for regular prompts)

### PromptRequest

Add field:
- `priority: integer` (1-10, default: 5)

## API Changes

### Modified Endpoints

- `POST /api/v1/prompts` — accept `priority`, `linkTarget`
- `PUT /api/v1/prompts/:id` — accept `priority` (not `linkTarget` — links are immutable, delete and recreate)
- `POST /api/v1/promptrequests` — accept `priority`
- `GET /api/v1/prompts` — return `priority`, `summary`, `linkTarget`, `linkStatus` (alive/dead/unknown)
- `GET /api/v1/projects/:name/prompts/visible` — return `priority`, `summary`, `chapters`

### New Endpoints

- `POST /api/v1/prompts/:id/regenerate-summary` — force re-generation of summary/chapters
- `GET /api/v1/projects/:name/prompt-index` — returns compact index (name, priority, summary, chapters)

## MCP Protocol Changes (mcplocal router)

### Session State

Router tracks per-session state:
- `gated: boolean` — starts `true` if project is gated
- `tags: string[]` — accumulated tags from begin_session + read_prompts calls
- `retrievedPrompts: Set<string>` — prompts already sent to client (avoid re-sending)

### Gated Session Flow

1. On `initialize`: instructions include prompt index + gate message + retrieval encouragement
2. `tools/list` while gated: only `begin_session` visible (progressive tool exposure)
3. `begin_session({ tags })`: match tags → return briefing + prompt index + encouragement → ungate → send `notifications/tools/list_changed`
4. On first `tools/call` while still gated: extract keywords → match → inject briefing alongside result → ungate
5. After ungating: all tools work normally, `read_prompts` available for ongoing retrieval

### `begin_session` Tool

```json
{
  "name": "begin_session",
  "description": "Start your session by providing 5 keywords that describe your current task. You'll receive relevant project context, policies, and guidelines. Required before using other tools.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "tags": {
        "type": "array",
        "items": { "type": "string" },
        "maxItems": 10,
        "description": "5 keywords describing your current task (e.g. [\"zigbee\", \"automation\", \"lights\", \"mqtt\", \"pairing\"])"
      }
    },
    "required": ["tags"]
  }
}
```

Response structure:
```
[Priority 10 prompts — always, full content]

[Tag-matched prompts — full content, byte-budget-capped, priority-ordered]

Other prompts available that may become relevant as your work progresses:
- <name>: <summary>
- <name>: <summary>
- ...
If any of these seem related to what you're doing, request them with
read_prompts({ tags: [...] }). Don't assume you have all the context — check.
```

### Prompt Index in Instructions

The `initialize` instructions include a compact prompt index so the client LLM can see what knowledge exists. Format per prompt: `- <name>: <summary>` (~100 chars max per entry).

Cap: if more than 50 prompts, include only priority 7+ in instructions index. Full index always available via `resources/list`.

## CLI Changes

### New/Modified Commands

- `mcpctl create prompt <name> --priority <1-10>` — create with priority
- `mcpctl create prompt <name> --link <project/server:uri>` — create linked prompt
- `mcpctl get prompt -A` — show all prompts across all projects, with link targets
- `mcpctl describe project <name>` — show gated status, session greeting, prompt table
- `mcpctl edit project <name>` — `gated` field editable

### Prompt Link Display

```
$ mcpctl get prompt -A
PROJECT           NAME                 PRIORITY  LINK                                          STATUS
homeautomation    security-policies    8         -                                             -
homeautomation    architecture-adr     6         system-public/docmost-mcp:docmost://pages/a1  alive
homeautomation    common-mistakes      10        -                                             -
system-public     onboarding           4         -                                             -
```

## Describe Project Output

```
$ mcpctl describe project homeautomation
Name:         homeautomation
Gated:        true
LLM Provider: gemini-cli
...

Session greeting:
  You have access to project knowledge containing policies, architecture decisions,
  and guidelines. Call begin_session with 5 keywords describing your task to receive
  relevant context. Some prompts contain critical rules — it's better to check than guess.

Prompts:
  NAME                 PRIORITY  TYPE    LINK
  common-mistakes      10        local   -
  security-policies    8         local   -
  architecture-adr     6         link    system-public/docmost-mcp:docmost://pages/a1
  stack                5         local   -
```

## Testing Strategy

**Full test coverage is required.** Every new module, service, route, and algorithm must have comprehensive tests. No feature ships without tests.

### Unit Tests (mcpd)
- Prompt priority CRUD: create/update/get with priority field, default value, validation (1-10 range)
- Prompt link CRUD: create with linkTarget, immutability (can't update linkTarget), delete
- Prompt summary generation: auto-generation on create/update, regex fallback when no LLM
- `GET /api/v1/prompts` with priority, linkTarget, linkStatus fields
- `GET /api/v1/projects/:name/prompt-index` returns compact index
- `POST /api/v1/prompts/:id/regenerate-summary` triggers re-generation
- Project `gated` field: CRUD, default value

### Unit Tests (mcplocal — gating flow)
- State machine: gated → `begin_session` → ungated (happy path)
- State machine: gated → `tools/call` intercepted → ungated (fallback path)
- State machine: non-gated project skips gate entirely
- LLM selection path: tags + prompt index sent to heavy LLM, ranked results returned, priority 10 always included
- LLM selection path: heavy LLM uses fast LLM for missing summary generation
- No-LLM fallback: tag matching score calculation, priority weighting, substring matching
- No-LLM fallback: byte-budget exhaustion, priority ordering, index fallback, edge cases
- Keyword extraction from tool calls: tool name parsing, argument extraction
- `begin_session` response: matched content + index + encouragement text (both LLM and fallback paths)
- `read_prompts` response: additional matches, deduplication against already-sent prompts (both paths)
- Tools blocked while gated: return error directing to `begin_session`
- `tools/list` while gated: only `begin_session` visible
- `tools/list` after ungating: `begin_session` replaced by `read_prompts` + all upstream tools
- Priority 10 always included regardless of tag match or budget
- Prompt index in instructions: cap at 50, priority 7+ when over cap
- Notifications: `tools/list_changed` sent after ungating

### Unit Tests (mcplocal — prompt links)
- Link resolution: fetch content from source project's MCP server via service account
- Dead link detection: source server unavailable, resource not found, permission denied
- Dead link marking: status field updated, error logged
- RBAC enforcement: link creation requires edit permission on target project RBAC
- Service account permission: auto-created on source project for linked resource
- Content isolation: client LLM cannot access source server directly

### Unit Tests (CLI)
- `create prompt` with `--priority` flag, validation
- `create prompt` with `--link` flag, format validation
- `get prompt -A` output: all projects, link targets, status columns
- `describe project` output: gated status, session greeting, prompt table
- `edit project` with gated field
- Shell completions for new flags and resources

### Integration Tests
- End-to-end gated session: connect → begin_session with tags → tools available → correct prompts returned
- End-to-end intercept: connect → skip begin_session → call tool → keywords extracted → briefing injected
- End-to-end read_prompts: after ungating → request more context → additional prompts returned → no duplicates
- Prompt link resolution: create link → fetch content → verify content matches source
- Dead link lifecycle: create link → kill source → verify dead detection → restore → verify recovery
- Priority ordering: create prompts at various priorities → verify selection order and budget allocation
- Encouragement text: verify retrieval encouragement present in begin_session, read_prompts, and instructions

## System Prompts (mcpctl-system project)

All gate messages, encouragement text, and briefing templates are stored as prompts in a special `mcpctl-system` project. This makes them editable at runtime via `mcpctl edit prompt` without code changes or redeployment.

### Required System Prompts

| Name | Priority | Purpose |
|------|----------|---------|
| `gate-instructions` | 10 | Text injected into `initialize` instructions for gated projects. Tells client to call `begin_session` with 5 keywords. |
| `gate-encouragement` | 10 | Appended after `begin_session` response. Lists remaining prompts and encourages further retrieval. |
| `read-prompts-reminder` | 10 | Appended after `read_prompts` response. Reminds client that more context is available. |
| `gate-intercept-preamble` | 10 | Prepended to briefing when injected via tool call intercept (Option C fallback). |
| `session-greeting` | 10 | Shown in `mcpctl describe project` as the "hello prompt" — what client LLMs see on connect. |

### Bootstrap

The `mcpctl-system` project and its system prompts are created automatically on first startup (seed migration). They can be edited afterward but not deleted — delete attempts return an error.

### How mcplocal Uses Them

On router initialization, mcplocal fetches system prompts from mcpd via:
```
GET /api/v1/projects/mcpctl-system/prompts/visible
```

These are cached with the same 60s TTL as project routers. The prompt content supports template variables:
- `{{prompt_index}}` — replaced with the current project's prompt index
- `{{project_name}}` — replaced with the current project name
- `{{matched_prompts}}` — replaced with tag-matched prompt content
- `{{remaining_prompts}}` — replaced with the list of non-matched prompts

This way the encouragement text, tone, and structure can be tuned by editing prompts — no code changes needed.

## Security Considerations

- Prompt links: content fetched server-side, client never gets direct access to source MCP server
- RBAC: link creation requires edit permission on target project's RBAC
- Service account: source project grants read access to linked resource only
- Dead links: logged as errors, marked in listings, never expose source server errors to client
- Tag extraction: sanitize tool call arguments before using as keywords (prevent injection)