mcpctl/docs/chat.md
Michal 8b56f09f25 feat(agents): smoke tests + README + docs (Stage 6, final)
Closes the agents feature.

Smoke tests (run via `pnpm test:smoke` against a live mcpd at
$MCPD_URL, default https://mcpctl.ad.itaz.eu):

* tests/smoke/agent.smoke.test.ts — full CRUD round-trip:
  create secret + Llm + agent with sampling defaults; `get agents`
  surfaces it; `get agent foo -o yaml | apply -f` round-trips
  identically; create + list a thread via the HTTP API; agent delete
  leaves Llm + secret intact (Restrict + SetNull as designed). Self-
  skips with a warning when /healthz is unreachable.

* tests/smoke/agent-chat.smoke.test.ts — gated on
  MCPCTL_SMOKE_LLM_URL + MCPCTL_SMOKE_LLM_KEY. Provisions secret +
  Llm + agent against a real upstream, runs `mcpctl chat -m …
  --no-stream` (asserts a reply lands), then runs the streaming
  default (asserts text on stdout + `(thread: …)` on stderr). The
  fast path for verifying the in-cluster qwen3-thinking deployment:

      MCPCTL_SMOKE_LLM_URL=http://litellm.nvidia-nim.svc.cluster.local:4000/v1 \
      MCPCTL_SMOKE_LLM_MODEL=qwen3-thinking \
      MCPCTL_SMOKE_LLM_KEY=$(pulumi config get --stack homelab \
        secrets:litellmMcpctlGatewayToken) \
        pnpm test:smoke

Docs:

* README.md — new "Agents" section under Resources with the
  qwen3-thinking quickstart and links to docs/agents.md and
  docs/chat.md. Adds llm + agent rows to the resources table.

* docs/agents.md (new) — full reference: data model, chat-parameter
  table, HTTP API, RBAC mapping, tool-use loop semantics, yaml
  round-trip shorthand, the kubernetes-deployment wiring recipe,
  and a troubleshooting section (namespace collision, llm-in-use,
  pending-row recovery, Anthropic-tool limitation).

* docs/chat.md (new) — user-facing `mcpctl chat` walkthrough:
  modes, per-call flags, slash-commands, threads, and a
  troubleshooting section.

* CLAUDE.md — adds a "Resource types" cheatsheet with one-line
  pointers to each, including the new `agent` row that links to
  the docs.

All suites still green: mcpd 759/759, mcplocal 715/715, cli 430/430.
Smoke tests typecheck and self-skip when no live mcpd is reachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:08:37 +01:00


mcpctl chat

Open an interactive chat session with an Agent, or send a single message in one shot. See agents.md for what an Agent is and how to create one.

Modes

mcpctl chat <agent>                 # interactive REPL, new thread
mcpctl chat <agent> --thread <id>   # interactive REPL, resume thread
mcpctl chat <agent> -m "hi"         # one-shot, prints reply, no REPL
mcpctl chat <agent> -m "hi" --no-stream  # one-shot, single JSON response (no SSE)

Streaming is on by default. Text deltas land on stdout as they arrive; tool calls and tool results print to stderr in dim brackets so the chat output stays clean.
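Because the reply and the tool chatter land on different streams, ordinary shell redirection separates them. A quick stand-in (echoing to both streams, since this sketch doesn't invoke a live agent) shows the effect; the commented line is the equivalent real call:

```shell
# stand-in for a chat call: reply on stdout, tool chatter on stderr
sh -c 'echo "reply text"; echo "[tool_call: some_tool]" >&2' 2>/dev/null
# prints only: reply text
# the same pattern captures a clean reply from a real session:
#   mcpctl chat reviewer -m "hi" 2>/dev/null > reply.txt
```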

Per-call flags

All optional. They override the agent's defaultParams for this session only — use the in-REPL /save slash-command to persist the current set back to the agent.

--system <text>              # replace agent.systemPrompt for this session
--system-file <path>         # read --system text from a file
--system-append <text>       # append to the agent system block (after project Prompts)
--temperature <n>            # 0..2
--top-p <n>                  # 0..1
--top-k <n>                  # integer; Anthropic-only, OpenAI ignores
--max-tokens <n>             # cap on assistant tokens
--seed <n>                   # reproducibility (provider-dependent)
--stop <text>                # stop sequence (repeatable, up to 4)
--allow-tool <name>          # repeat to allowlist project MCP tools
--extra <key=value>          # provider-specific knob (repeatable)
--no-stream                  # disable SSE; single JSON response

--extra is the LiteLLM-style escape hatch: pass anything the underlying adapter understands. Numeric values are auto-parsed (--extra repetition_penalty=1.1); strings stay strings.
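The numeric auto-parsing can be sketched in shell — this is an illustrative re-implementation of the rule, not mcpctl's actual code:

```shell
# hypothetical sketch of how --extra key=value pairs are split and typed
parse_extra() {
  key=${1%%=*}; val=${1#*=}
  case $val in
    ''|*[!0-9.]*) printf '%s=%s (string)\n' "$key" "$val" ;;  # non-numeric stays a string
    *)            printf '%s=%s (number)\n' "$key" "$val" ;;  # digits/dots parse as a number
  esac
}
parse_extra repetition_penalty=1.1   # → repetition_penalty=1.1 (number)
parse_extra response_format=json     # → response_format=json (string)
```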

In-REPL slash-commands

/set KEY VALUE      adjust an override for the rest of the session
                    (temperature, top-p, top-k, max-tokens, seed, stop,
                     or any provider-specific knob — unknown keys go
                     into `extra`)
/system <text>      set systemAppend for this turn onward (empty = clear)
/tools              list MCP servers the agent can call as tools
/clear              start a fresh thread (same agent)
/save               PATCH agent.defaultParams = current overrides
                    (systemOverride / systemAppend are NOT persisted)
/quit, /exit        leave the REPL (Ctrl-D works too)

Threads

Threads persist server-side. To resume:

mcpctl get threads --agent reviewer
mcpctl chat reviewer --thread <id>

mcpctl get thread <id> reads the message log:

mcpctl get thread c0abc… -o yaml
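Resuming the newest thread can be scripted. A sketch assuming `mcpctl get threads` accepts `-o json` and that jq is installed — both assumptions, since this page only shows `-o yaml`; adjust the jq path to the real payload shape:

```shell
# hypothetical helper: resume the newest thread for an agent
# ASSUMES -o json output and jq; not verified against a live mcpd
resume_latest() {
  agent=$1
  tid=$(mcpctl get threads --agent "$agent" -o json | jq -r '.[0].id')
  mcpctl chat "$agent" --thread "$tid"
}
# usage: resume_latest reviewer
```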

Examples

Quick gut-check on a deploy:

$ mcpctl chat reviewer -m "is fulldeploy.sh safe to run on the current branch?"
Yes — I checked: tests are green on commit 727e7d6 and there's no
in-flight migration. The k8s context is worker0-k8s0 (production); confirm
that's intended before running.
(thread: cm9k…)

Resuming with overrides:

$ mcpctl chat deployer --thread cm9k… --temperature 0.0 --max-tokens 256
> walk me through what changed since the last deploy
…

Pinning sampling defaults to the agent:

$ mcpctl chat deployer --temperature 0.0 --max-tokens 8000
> /save
(saved current overrides as agent.defaultParams)
> /quit

Troubleshooting

  • No agents appear in tools/list — check the agent has a project attach (mcpctl describe agent <name>). The mcplocal plugin only exposes agents on their attached project's session.

  • Tool calls fail with Project not found — the agent has no project attach. Either attach it (mcpctl edit agent <name> and set the project field), or expect text-only chat.

  • Anthropic agents can't call tools — known limitation; the Anthropic adapter doesn't translate OpenAI tool format yet. Use LiteLLM or a direct OpenAI-compatible provider for tool-using agents until the translator ships.

  • mcpctl chat <agent> returns 404 — the agent name doesn't resolve. Run mcpctl get agents to confirm the spelling.

  • REPL feels stuck — agent tool calls can take minutes (e.g. running a Grafana query). Watch stderr for [tool_call: …] / [tool_result: …] brackets; those tell you the loop is alive.