# `mcpctl chat`

Open an interactive chat session with an `Agent`, or send a single message
in one shot. See [agents.md](agents.md) for what an Agent is and how to
create one.

## Modes

```bash
mcpctl chat <agent>                 # interactive REPL, new thread
mcpctl chat <agent> --thread <id>   # interactive REPL, resume thread
mcpctl chat <agent> -m "hi"         # one-shot, prints reply, no REPL
mcpctl chat <agent> -m "hi" --no-stream  # one-shot, single JSON response (no SSE)
```

Streaming is on by default. Text deltas land on stdout as they arrive; tool
calls and tool results print to stderr in dim brackets so the chat output
stays clean.
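
Because the reply and the tool activity go to different streams, ordinary shell redirection is enough to separate them. The sketch below uses a stand-in function, `fake_chat` (hypothetical, not part of mcpctl), that imitates the split described above; the bracket text it prints is illustrative only. With the real CLI, the same redirections would presumably apply to `mcpctl chat <agent> -m "…"`.

```bash
# Stand-in for a one-shot chat call (hypothetical; for illustration only).
# It writes the reply to stdout and tool activity to stderr.
fake_chat() {
  echo "[tool_call: example]" >&2
  echo "reply text"
  echo "[tool_result: example]" >&2
}

# Keep only the reply: stderr is discarded, stdout is captured.
reply=$(fake_chat 2>/dev/null)

# Keep only the tool activity: stderr is pointed at the capture first,
# then stdout is discarded.
tool_log=$(fake_chat 2>&1 >/dev/null)

echo "$reply"
```

The order of the redirections in the second capture matters: `2>&1` has to come before `>/dev/null`, otherwise both streams end up discarded.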

## Per-call flags

All optional. They override the agent's `defaultParams` for this session
only — use the in-REPL `/save` slash-command to persist the current set
back to the agent.

```bash
--system <text>              # replace agent.systemPrompt for this session
--system-file <path>         # read --system text from a file
--system-append <text>       # append to the agent system block (after project Prompts)
--personality <name>         # apply a personality overlay for this turn
                             # (additive — see docs/personalities.md)
--temperature <n>            # 0..2
--top-p <n>                  # 0..1
--top-k <n>                  # integer; Anthropic-only, OpenAI ignores
--max-tokens <n>             # cap on assistant tokens
--seed <n>                   # reproducibility (provider-dependent)
--stop <text>                # stop sequence (repeatable, up to 4)
--allow-tool <name>          # repeat to allowlist project MCP tools
--extra <key=value>          # provider-specific knob (repeatable)
--no-stream                  # disable SSE; single JSON response
```

`--extra` is the LiteLLM-style escape hatch: pass anything the underlying
adapter understands. Numeric values are auto-parsed (`--extra
repetition_penalty=1.1`); strings stay strings.
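
As a rough illustration of the auto-parsing rule, the sketch below splits one `key=value` pair and prints it as a JSON member, unquoted when the value looks like a number and quoted otherwise. The helper name `extra_to_json`, the numeric pattern, and the `mode=fast` knob are assumptions made for this example; the CLI's actual parser may accept forms (exponents, booleans) that this sketch treats as strings.

```bash
# Hypothetical helper: convert one key=value pair into a JSON member.
# It does no escaping or validation; it only shows the number/string split.
extra_to_json() {
  key=${1%%=*}      # text before the first '='
  value=${1#*=}     # text after the first '='
  # Assumed rule: optional sign, digits, optional fractional part.
  if printf '%s\n' "$value" | grep -Eq '^-?[0-9]+(\.[0-9]+)?$'; then
    printf '"%s": %s\n' "$key" "$value"
  else
    printf '"%s": "%s"\n' "$key" "$value"
  fi
}

extra_to_json repetition_penalty=1.1   # prints "repetition_penalty": 1.1
extra_to_json mode=fast                # prints "mode": "fast"
```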

## In-REPL slash-commands

```
/set KEY VALUE      adjust an override for the rest of the session
                    (temperature, top-p, top-k, max-tokens, seed, stop,
                     or any provider-specific knob — unknown keys go
                     into `extra`)
/system <text>      set systemAppend for this turn onward (empty = clear)
/tools              list MCP servers the agent can call as tools
/clear              start a fresh thread (same agent)
/save               PATCH agent.defaultParams = current overrides
                    (systemOverride / systemAppend are NOT persisted)
/quit, /exit        leave the REPL (Ctrl-D works too)
```
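
The `/set` routing rule can be pictured as a simple lookup: the six named keys are treated as overrides, and anything else is filed under `extra`. The sketch below is only a model of that rule; `set_target` is a hypothetical helper, not something the REPL exposes.

```bash
# Hypothetical helper: report where a /set key would be stored.
set_target() {
  case "$1" in
    temperature|top-p|top-k|max-tokens|seed|stop)
      echo "override:$1" ;;   # one of the keys listed for /set
    *)
      echo "extra:$1" ;;      # unknown keys go into `extra`
  esac
}

set_target temperature          # prints override:temperature
set_target repetition_penalty   # prints extra:repetition_penalty
```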

## Threads

Threads persist server-side. To resume:

```bash
mcpctl get threads --agent reviewer
mcpctl chat reviewer --thread <id>
```

To read a thread's message log, use `mcpctl get thread <id>`:

```bash
mcpctl get thread c0abc… -o yaml
```

## Examples

**Quick gut-check on a deploy:**

```bash
$ mcpctl chat reviewer -m "is fulldeploy.sh safe to run on the current branch?"
Yes — I checked: tests are green on commit 727e7d6 and there's no
in-flight migration. The k8s context is worker0-k8s0 (production); confirm
that's intended before running.
(thread: cm9k…)
```

**Resuming with overrides:**

```bash
$ mcpctl chat reviewer --thread cm9k… --temperature 0.0 --max-tokens 256
> walk me through what changed since the last deploy
…
```

**Pinning sampling defaults to the agent:**

```bash
$ mcpctl chat deployer --temperature 0.0 --max-tokens 8000
> /save
(saved current overrides as agent.defaultParams)
> /quit
```

## Troubleshooting

- **No agents appear in `tools/list`** — check the agent has a project
  attach (`mcpctl describe agent <name>`). The mcplocal plugin only
  exposes agents on their attached project's session.

- **Tool calls fail with `Project not found`** — the agent has no project
  attach. Either attach it (`mcpctl edit agent <name>` and set the project
  field), or expect text-only chat.

- **Anthropic agents can't call tools** — known limitation; the Anthropic
  adapter doesn't translate OpenAI tool format yet. Use LiteLLM or a
  direct OpenAI-compatible provider for tool-using agents until the
  translator ships.

- **`mcpctl chat <agent>` returns 404** — the agent name doesn't resolve.
  Run `mcpctl get agents` to confirm the spelling.

- **REPL feels stuck** — agent tool calls can take minutes (e.g. running a
  Grafana query). Watch stderr for `[tool_call: …]` / `[tool_result: …]`
  brackets; those tell you the loop is alive.