feat(agents): smoke tests + README + docs (Stage 6, final)

Closes the agents feature.

Smoke tests (run via `pnpm test:smoke` against a live mcpd at $MCPD_URL,
default https://mcpctl.ad.itaz.eu):

* tests/smoke/agent.smoke.test.ts - full CRUD round-trip: create secret +
  Llm + agent with sampling defaults; `get agents` surfaces it;
  `get agent foo -o yaml | apply -f` round-trips identically; create +
  list a thread via the HTTP API; agent delete leaves Llm + secret intact
  (Restrict + SetNull as designed). Self-skips with a warning when
  /healthz is unreachable.

* tests/smoke/agent-chat.smoke.test.ts - gated on MCPCTL_SMOKE_LLM_URL +
  MCPCTL_SMOKE_LLM_KEY. Provisions secret + Llm + agent against a real
  upstream, runs `mcpctl chat -m … --no-stream` (asserts a reply lands),
  then runs the streaming default (asserts text on stdout +
  `(thread: …)` on stderr). The fast path for verifying the in-cluster
  qwen3-thinking deployment:

    MCPCTL_SMOKE_LLM_URL=http://litellm.nvidia-nim.svc.cluster.local:4000/v1 \
    MCPCTL_SMOKE_LLM_MODEL=qwen3-thinking \
    MCPCTL_SMOKE_LLM_KEY=$(pulumi config get --stack homelab \
      secrets:litellmMcpctlGatewayToken) \
    pnpm test:smoke

Docs:

* README.md - new "Agents" section under Resources with the
  qwen3-thinking quickstart and links to docs/agents.md and docs/chat.md.
  Adds llm + agent rows to the resources table.

* docs/agents.md (new) - full reference: data model, chat-parameter
  table, HTTP API, RBAC mapping, tool-use loop semantics, yaml round-trip
  shorthand, the kubernetes-deployment wiring recipe, and a
  troubleshooting section (namespace collision, llm-in-use, pending-row
  recovery, Anthropic-tool limitation).

* docs/chat.md (new) - user-facing `mcpctl chat` walkthrough: modes,
  per-call flags, slash-commands, threads, and a troubleshooting section.

* CLAUDE.md - adds a "Resource types" cheatsheet with one-line pointers
  to each, including the new `agent` row that links to the docs.

All suites still green: mcpd 759/759, mcplocal 715/715, cli 430/430.
Smoke tests typecheck and self-skip when no live mcpd is reachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:08:37 +01:00
parent 727e7d628c
commit 8b56f09f25
6 changed files with 767 additions and 0 deletions
--- a/docs/chat.md
+++ b/docs/chat.md
@@ -0,0 +1,124 @@
+# `mcpctl chat`
+
+Open an interactive chat session with an `Agent`, or send a single message
+in one shot. See [agents.md](agents.md) for what an Agent is and how to
+create one.
+
+## Modes
+
+```bash
+mcpctl chat <agent>                 # interactive REPL, new thread
+mcpctl chat <agent> --thread <id>   # interactive REPL, resume thread
+mcpctl chat <agent> -m "hi"         # one-shot, prints reply, no REPL
+mcpctl chat <agent> -m "hi" --no-stream  # one-shot, single JSON response (no SSE)
+```
+
+Streaming is on by default. Text deltas land on stdout as they arrive; tool
+calls and tool results print to stderr in dim brackets so the chat output
+stays clean.
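Because reply text goes to stdout and diagnostics go to stderr, a script can capture the two separately. The sketch below is illustrative only: it assumes stderr contains a line shaped like `(thread: <id>)`, as the example output on this page suggests, and that format is not a documented contract. `extract_thread_id` and `chat.err` are hypothetical names, not part of mcpctl.

```shell
# Sketch only: assumes a "(thread: <id>)" line appears in the captured
# stderr text; the exact format is an assumption, not a guarantee.
extract_thread_id() (
  # Print the id from the last "(thread: ...)" line read on stdin.
  sed -n 's/.*(thread: \([^)]*\)).*/\1/p' | tail -n 1
)

# Hypothetical usage (needs a reachable mcpd and an agent named "reviewer"):
#   reply=$(mcpctl chat reviewer -m "hi" 2>chat.err)
#   thread_id=$(extract_thread_id <chat.err)
#   mcpctl chat reviewer --thread "$thread_id"
```

Keeping the parsing in a small function means the thread-id format only needs adjusting in one place if it turns out to differ.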
+
+## Per-call flags
+
+All optional. They override the agent's `defaultParams` for this session
+only — use the in-REPL `/save` slash-command to persist the current set
+back to the agent.
+
+```bash
+--system <text>              # replace agent.systemPrompt for this session
+--system-file <path>         # read --system text from a file
+--system-append <text>       # append to the agent system block (after project Prompts)
+--temperature <n>            # 0..2
+--top-p <n>                  # 0..1
+--top-k <n>                  # integer; Anthropic-only, OpenAI ignores
+--max-tokens <n>             # cap on assistant tokens
+--seed <n>                   # reproducibility (provider-dependent)
+--stop <text>                # stop sequence (repeatable, up to 4)
+--allow-tool <name>          # repeat to allowlist project MCP tools
+--extra <key=value>          # provider-specific knob (repeatable)
+--no-stream                  # disable SSE; single JSON response
+```
+
+`--extra` is the LiteLLM-style escape hatch: pass anything the underlying
+adapter understands. Numeric values are auto-parsed (`--extra
+repetition_penalty=1.1`); strings stay strings.
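The coercion rule for `--extra` can be pictured with a small stand-alone sketch. This is not mcpctl's parser: the function name `extra_to_json`, the key `mode`, and the definition of "numeric" used here (an optional minus sign, digits, and an optional decimal part) are all assumptions made for illustration.

```shell
# Illustrative only: mimics the documented rule "numeric values are
# auto-parsed; strings stay strings". The real parser may differ.
extra_to_json() (
  pair=$1
  key=${pair%%=*}     # text before the first '='
  value=${pair#*=}    # text after the first '='
  if printf '%s\n' "$value" | grep -Eq '^-?[0-9]+(\.[0-9]+)?$'; then
    printf '"%s": %s\n' "$key" "$value"      # numeric: emitted bare
  else
    printf '"%s": "%s"\n' "$key" "$value"    # anything else: quoted as-is
  fi
)

# extra_to_json 'repetition_penalty=1.1'   prints: "repetition_penalty": 1.1
# extra_to_json 'mode=fast'                prints: "mode": "fast"
```

The sketch splits on the first `=` only, so values that themselves contain `=` survive intact. It does not escape quotes inside string values.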
+
+## In-REPL slash-commands
+
+```
+/set KEY VALUE      adjust an override for the rest of the session
+                    (temperature, top-p, top-k, max-tokens, seed, stop,
+                     or any provider-specific knob — unknown keys go
+                     into `extra`)
+/system <text>      set systemAppend for this turn onward (empty = clear)
+/tools              list MCP servers the agent can call as tools
+/clear              start a fresh thread (same agent)
+/save               PATCH agent.defaultParams = current overrides
+                    (systemOverride / systemAppend are NOT persisted)
+/quit, /exit        leave the REPL (Ctrl-D works too)
+```
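The routing described for `/set` (known sampling keys become overrides, anything else lands in `extra`) amounts to a simple lookup. The sketch below only models that rule as stated in the table above. `set_target` is a hypothetical helper, and the real REPL may accept other spellings of these keys.

```shell
# Sketch of the documented /set routing rule; not the REPL's actual code.
set_target() (
  case "$1" in
    temperature|top-p|top-k|max-tokens|seed|stop)
      echo "override" ;;   # one of the named sampling parameters
    *)
      echo "extra" ;;      # unknown keys are passed through via `extra`
  esac
)

# set_target top-k                 prints: override
# set_target repetition_penalty    prints: extra
```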
+
+## Threads
+
+Threads persist server-side. To resume:
+
+```bash
+mcpctl get threads --agent reviewer
+mcpctl chat reviewer --thread <id>
+```
+
+To read a thread's message log, use `mcpctl get thread <id>`:
+
+```bash
+mcpctl get thread c0abc… -o yaml
+```
+
+## Examples
+
+**Quick gut-check on a deploy:**
+
+```bash
+$ mcpctl chat reviewer -m "is fulldeploy.sh safe to run on the current branch?"
+Yes — I checked: tests are green on commit 727e7d6 and there's no
+in-flight migration. The k8s context is worker0-k8s0 (production); confirm
+that's intended before running.
+(thread: cm9k…)
+```
+
+**Resuming with overrides:**
+
+```bash
+$ mcpctl chat reviewer --thread cm9k… --temperature 0.0 --max-tokens 256
+> walk me through what changed since the last deploy
+…
+```
+
+**Pinning sampling defaults to the agent:**
+
+```bash
+$ mcpctl chat deployer --temperature 0.0 --max-tokens 8000
+> /save
+(saved current overrides as agent.defaultParams)
+> /quit
+```
+
+## Troubleshooting
+
+- **No agents appear in `tools/list`**: check that the agent is attached
+  to a project (`mcpctl describe agent <name>`). The mcplocal plugin only
+  exposes agents on their attached project's session.
+
+- **Tool calls fail with `Project not found`**: the agent is not attached
+  to a project. Either attach it (`mcpctl edit agent <name>` and set the
+  project field) or expect text-only chat.
+
+- **Anthropic agents can't call tools**: known limitation; the Anthropic
+  adapter doesn't translate the OpenAI tool format yet. Use LiteLLM or a
+  direct OpenAI-compatible provider for tool-using agents until the
+  translator ships.
+
+- **`mcpctl chat <agent>` returns 404**: the agent name doesn't resolve.
+  Run `mcpctl get agents` to confirm the spelling.
+
+- **REPL feels stuck**: agent tool calls can take minutes (e.g. running a
+  Grafana query). Watch stderr for `[tool_call: …]` / `[tool_result: …]`
+  brackets; those tell you the loop is alive.
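For the "REPL feels stuck" case, one way to check for progress is to filter captured stderr down to the tool brackets. This is a sketch under assumptions: it relies only on the `[tool_call: …]` / `[tool_result: …]` prefixes mentioned above, and `tool_events` and `chat.err` are hypothetical names, not part of mcpctl.

```shell
# Sketch only: keeps lines containing a tool_call / tool_result bracket.
# The pattern is unanchored, so any terminal dimming codes around the
# bracket do not prevent a match.
tool_events() (
  grep -E '\[tool_(call|result):' || true   # no matches is not an error
)

# Hypothetical usage (needs a reachable mcpd and an agent named "reviewer"):
#   mcpctl chat reviewer -m "check the dashboards" 2>chat.err
#   tool_events <chat.err
```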