Closes the loop on the user-facing surface:
$ mcpctl get llm
NAME            KIND     STATUS  TYPE    MODEL                     TIER  KEY  ID
qwen3-thinking  public   active  openai  qwen3-thinking            fast  ...  ...
vllm-local      virtual  active  openai  Qwen/Qwen2.5-7B-Instruct  fast  -    ...
$ mcpctl chat-llm vllm-local
────────────────────────────────────────
LLM: vllm-local openai → Qwen/Qwen2.5-7B-Instruct-AWQ
Kind: virtual Status: active
────────────────────────────────────────
> hello?
Hi! …
New: chat-llm command (commands/chat-llm.ts)
- Stateless chat with any mcpd-registered LLM. No threads, no tools,
no project prompts. POSTs to /api/v1/llms/<name>/infer; mcpd's
kind=virtual branch relays through mcplocal transparently, so the
same CLI command works for both public and virtual LLMs (see the
sketch after this list).
- Reuses installStatusBar / formatStats / recordDelta / styleStats /
PhaseStats from chat.ts (now exported) so the bottom-row tokens-per-
second ticker behaves identically to mcpctl chat.
- Flags: --message (one-shot), --system, --temperature, --max-tokens,
--no-stream. Streaming uses OpenAI chat.completion.chunk SSE.
- REPL mode keeps a per-session history array so multi-turn flows
feel natural; each turn is an independent inference call.
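The request/stream path boils down to one POST per turn. A minimal
sketch, assuming an OpenAI-style body on /api/v1/llms/<name>/infer and
chat.completion.chunk SSE when stream is true; MCPD_BASE, ChatMessage,
and inferOnce are illustrative names, not taken from the repo, and the
status-bar helpers from chat.ts are left out since their signatures are
internal:

// Illustrative sketch only; not the actual commands/chat-llm.ts code.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

const MCPD_BASE = process.env.MCPD_BASE ?? "http://127.0.0.1:8080"; // assumed default

async function inferOnce(
  llmName: string,
  history: ChatMessage[],               // per-session REPL history
  userText: string,
  opts: { system?: string; temperature?: number; maxTokens?: number } = {},
): Promise<string> {
  const messages: ChatMessage[] = [
    ...(opts.system ? [{ role: "system" as const, content: opts.system }] : []),
    ...history,
    { role: "user" as const, content: userText },
  ];

  const res = await fetch(
    `${MCPD_BASE}/api/v1/llms/${encodeURIComponent(llmName)}/infer`,
    {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        messages,
        stream: true,                   // --no-stream would send false instead
        temperature: opts.temperature,
        max_tokens: opts.maxTokens,
      }),
    },
  );
  if (!res.ok || !res.body) throw new Error(`infer failed: HTTP ${res.status}`);

  // Minimal SSE reader: each "data:" line carries one chat.completion.chunk
  // JSON payload; a literal "[DONE]" terminates the stream.
  const decoder = new TextDecoder();
  let buffered = "";
  let answer = "";
  for await (const chunk of res.body as unknown as AsyncIterable<Uint8Array>) {
    buffered += decoder.decode(chunk, { stream: true });
    let nl: number;
    while ((nl = buffered.indexOf("\n")) >= 0) {
      const line = buffered.slice(0, nl).trim();
      buffered = buffered.slice(nl + 1);
      if (!line.startsWith("data:")) continue;
      const data = line.slice("data:".length).trim();
      if (data === "[DONE]") return answer;
      const delta: string = JSON.parse(data)?.choices?.[0]?.delta?.content ?? "";
      answer += delta;
      process.stdout.write(delta);      // the real CLI also feeds the
                                        // tokens-per-second ticker here
    }
  }
  return answer;
}

In REPL mode the caller would push the user turn and the returned
assistant text onto history before the next call, which is how
multi-turn flows stay natural while each turn remains an independent
inference.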
Updated: get.ts
- LlmRow gains optional kind/status fields.
- llmColumns layout: NAME, KIND, STATUS, TYPE, MODEL, TIER, KEY, ID.
Falls back gracefully when older mcpd responses omit kind/status
(see the sketch below).
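A rough sketch of the shape; the column-definition structure and the
"-" placeholder are assumptions for illustration, not copied from
get.ts:

// Illustrative only; real get.ts types may differ.
interface LlmRow {
  name: string;
  type: string;
  model: string;
  tier: string;
  key: string;
  id: string;
  kind?: string;    // absent in older mcpd responses
  status?: string;  // absent in older mcpd responses
}

// Column order matches the new `mcpctl get llm` output; optional fields
// fall back to a placeholder so old servers still render a full row.
const llmColumns: Array<{ header: string; cell: (r: LlmRow) => string }> = [
  { header: "NAME",   cell: (r) => r.name },
  { header: "KIND",   cell: (r) => r.kind ?? "-" },
  { header: "STATUS", cell: (r) => r.status ?? "-" },
  { header: "TYPE",   cell: (r) => r.type },
  { header: "MODEL",  cell: (r) => r.model },
  { header: "TIER",   cell: (r) => r.tier },
  { header: "KEY",    cell: (r) => r.key },
  { header: "ID",     cell: (r) => r.id },
];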
Updated: chat.ts
- Re-exports the helpers chat-llm.ts needs (PhaseStats, newPhase,
recordDelta, formatStats, styleStats, styleThinking, STDERR_IS_TTY,
StatusBar, installStatusBar). No behavior change; sketched below.
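A minimal sketch of what the export surface looks like; the PhaseStats
fields and helper bodies here are placeholders, not the real chat.ts
implementations:

// chat.ts (illustrative): the same module-level helpers, now with
// `export` added so chat-llm.ts can import them; bodies are unchanged.
export const STDERR_IS_TTY = Boolean(process.stderr.isTTY);

export interface PhaseStats {
  tokens: number;      // assumed field
  startedAtMs: number; // assumed field
}

export function newPhase(): PhaseStats {
  return { tokens: 0, startedAtMs: Date.now() };
}

// chat-llm.ts then imports the shared pieces, e.g.:
// import { installStatusBar, newPhase, recordDelta, formatStats } from "./chat.js";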
Completions: chat-llm picks up the standard option enumeration
automatically; bash gets a special case for first-arg LLM-name
completion via _mcpctl_resource_names "llms".
CLI suite: 437/437 (was 430, +7 from auto-discovered test cases in
the regenerated completions golden). Workspace: 2043/2043 across
152 files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>