feat(cli+docs): mcpctl get agent KIND/STATUS columns + virtual-agent smoke + docs (v3 Stage 4)

CLI: `mcpctl get agent` table view gains KIND and STATUS columns mirroring the `get llm` shape from v1. Public agents render as `public/active` (the AgentRow defaults) and virtual ones surface their true lifecycle state, so `mcpctl get agent` becomes a single-pane view for both manually-created and mcplocal-published personas.

Smoke: tests/smoke/virtual-agent.smoke.test.ts mirrors virtual-llm's in-process registrar pattern: it publishes a fake provider + agent in one round-trip, confirms mcpd surfaces the agent as kind=virtual / status=active under /api/v1/agents, then disconnects and verifies that the paired Llm and Agent both flip to inactive (deletion is GC-driven, not disconnect-driven, so the rows must still exist post-stop). The heartbeat-stale and 4 h sweep paths are covered by the unit suite to keep smoke duration in check.

Docs: docs/virtual-llms.md gets a "Virtual agents (v3)" section with a config sample, lifecycle notes, listing example, and the cluster-wide name-uniqueness caveat. The API surface block now mentions the new `agents[]` field on _provider-register, the join-by-session heartbeat behavior, and the `GET /api/v1/agents` lifecycle fields. docs/agents.md gains a one-paragraph note pointing to the v3 publishing path.

Tests: full smoke suite 141/141 (was 139, +2 new), unit suites unchanged (mcpd 860/860, mcplocal 723/723).
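
For readers skimming the message, the lifecycle the smoke test relies on can be sketched roughly as below. This is a hypothetical in-memory model, not mcpd's code: `FakeRegistry`, `Row`, `GC_AFTER_MS`, and every method name are made up for illustration, and the `>=` boundary on the 4 h window is an assumption.

```typescript
// Hypothetical in-memory model of the virtual Llm/Agent lifecycle.
// None of these names come from mcpd; they only illustrate the behavior.
type Status = "active" | "inactive";

interface Row {
  name: string;
  kind: "virtual";
  status: Status;
  sessionId: string;
  inactiveSince: number | null; // epoch ms, set when the row goes inactive
}

const GC_AFTER_MS = 4 * 60 * 60 * 1000; // 4 h inactive -> row deleted

class FakeRegistry {
  llms = new Map<string, Row>();
  agents = new Map<string, Row>();

  // One call publishes the provider and the agent pinned to it.
  register(sessionId: string, llmName: string, agentName: string): void {
    const row = (name: string): Row => ({
      name,
      kind: "virtual",
      status: "active",
      sessionId,
      inactiveSince: null,
    });
    this.llms.set(llmName, row(llmName));
    this.agents.set(agentName, row(agentName));
  }

  // Disconnect only flips rows to inactive; it never deletes them.
  disconnect(sessionId: string, now: number): void {
    for (const table of [this.llms, this.agents]) {
      for (const r of table.values()) {
        if (r.sessionId === sessionId && r.status === "active") {
          r.status = "inactive";
          r.inactiveSince = now;
        }
      }
    }
  }

  // GC removes rows that have been inactive for 4 h. Agents are swept
  // before Llms, mirroring the ordering the FK constraint requires.
  gcSweep(now: number): void {
    for (const table of [this.agents, this.llms]) {
      for (const [name, r] of table) {
        if (r.inactiveSince !== null && now - r.inactiveSince >= GC_AFTER_MS) {
          table.delete(name);
        }
      }
    }
  }
}
```

In this model, calling `disconnect` leaves both rows present with `status: "inactive"`, and only a later `gcSweep` at or past the 4 h mark removes them, which is the property the smoke test checks post-stop.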
2026-04-27 18:47:03 +01:00
parent 610808b9e7
commit 1998b733b2
4 changed files with 314 additions and 6 deletions
--- a/docs/agents.md
+++ b/docs/agents.md
@@ -204,5 +204,9 @@ mcpctl chat reviewer
 - [virtual-llms.md](./virtual-llms.md) — local LLMs (e.g. `vllm-local`)
  publishing themselves into `mcpctl get llm` so anyone can chat with
  them via `mcpctl chat-llm <name>`. Inference is relayed through the
-  publishing mcplocal — mcpd never holds the local URL or key.
+  publishing mcplocal — mcpd never holds the local URL or key. **v3**
+  extends the same publishing model to **virtual agents** declared in
+  mcplocal config — they show up in `mcpctl get agent` with
+  `KIND=virtual / STATUS=active` and become chat-able via
+  `mcpctl chat <name>` like any other agent.
 - [chat.md](./chat.md) — `mcpctl chat` flow and LiteLLM-style flags.
--- a/docs/virtual-llms.md
+++ b/docs/virtual-llms.md
@@ -199,10 +199,87 @@ provider doesn't come up within `maxWaitSeconds`), every queued infer
 is rejected with a clear error and the row stays `hibernating` —
 the next request gets a fresh wake attempt.

+## Virtual agents (v3)
+
+Virtual agents extend the same publishing model to **agents** — named
+LLM personas with their own system prompt and sampling defaults. mcplocal
+declares them in its config alongside its providers, and the existing
+`_provider-register` endpoint atomically publishes both Llms and Agents
+in one round-trip. They show up under `mcpctl get agent` next to
+manually-created public agents and become chat-able via
+`mcpctl chat <agent>` — no special command.
+
+### Declaring a virtual agent in mcplocal config
+
+```jsonc
+// ~/.mcpctl/config.json
+{
+  "llm": {
+    "providers": [
+      { "name": "vllm-local", "type": "vllm", "model": "Qwen/Qwen2.5-7B-Instruct-AWQ", "publish": true }
+    ]
+  },
+  "agents": [
+    {
+      "name": "local-coder",
+      "llm": "vllm-local",
+      "description": "Local coding assistant on the workstation GPU",
+      "systemPrompt": "You are a senior engineer. Be terse.",
+      "defaultParams": { "temperature": 0.2 }
+    }
+  ]
+}
+```
+
+`llm` references a published provider's name from the same config. Agents
+pinned to a name that isn't being published are still forwarded to mcpd —
+the server validates `llmName` and 404s with a clear message if it's
+genuinely missing, which lets you point at a *public* Llm if you want.
+
+### Lifecycle
+
+Same shape as virtual Llms — 30 s heartbeat from mcplocal, 90 s
+heartbeat-stale → status flips to `inactive`, 4 h inactive → row deleted
+by mcpd's GC sweep. Heartbeats cover both Llms and Agents owned by the
+session.
+
+The GC orders agent deletes **before** their pinned virtual Llm so the
+`Agent.llmId onDelete: Restrict` FK doesn't block the sweep.
+
+### Listing
+
+```sh
+$ mcpctl get agents
+NAME          KIND     STATUS    LLM             PROJECT             DESCRIPTION
+local-coder   virtual  active    vllm-local      -                   Local coding assistant on…
+reviewer      public   active    qwen3-thinking  mcpctl-development  I review what you're shipping…
+```
+
+The `KIND` and `STATUS` columns are the v3 additions. Round-tripping
+through `mcpctl get agent X -o yaml | mcpctl apply -f -` strips those
+runtime fields cleanly so a virtual agent can be re-declared as a public
+one (or vice versa) without manual editing.
+
+### Chatting
+
+```sh
+$ mcpctl chat local-coder
+> hello?
+… streams through mcpd → SSE → mcplocal's vllm-local provider …
+```
+
+Same command as for public agents. Works because chat.service has a
+`kind=virtual` branch that hands off to `VirtualLlmService.enqueueInferTask`
+when the agent's pinned Llm is virtual.
+
+### Cluster-wide name uniqueness
+
+`Agent.name` is unique cluster-wide. Two mcplocals trying to publish the
+same agent name collide on the second register with HTTP 409. Per-publisher
+namespacing is a v4+ concern — same constraint as virtual Llms in v1.
+
 ## Roadmap (later stages)

- **v3 — Virtual agents**: mcplocal publishes its local agent configs
-  (model + system prompt + sampling defaults) into mcpd's `Agent` table.
 - **v4 — LB pool by model**: agents can target a model name instead of
  a specific Llm; mcpd picks the healthiest pool member per request.
 - **v5 — Task queue**: persisted requests for hibernating/saturated
@@ -211,18 +288,23 @@ the next request gets a fresh wake attempt.
 ## API surface (v1)

 ```
-POST  /api/v1/llms/_provider-register      → returns { providerSessionId, llms[] }
+POST  /api/v1/llms/_provider-register      → returns { providerSessionId, llms[], agents[] }
+                                              v3: body accepts an optional `agents[]` array
+                                              alongside `providers[]`. Atomic publish; older
+                                              clients (providers-only) keep working.
 GET   /api/v1/llms/_provider-stream        → SSE channel; require x-mcpctl-provider-session header
-POST  /api/v1/llms/_provider-heartbeat     → { providerSessionId }
+POST  /api/v1/llms/_provider-heartbeat     → { providerSessionId } — bumps both Llms and Agents
+                                              owned by the session
 POST  /api/v1/llms/_provider-task/:id/result
                                           → one of:
                                             { error: "msg" }
                                             { chunk: { data, done? } }
                                             { status, body }

-GET   /api/v1/llms                         → list (now includes kind, status, lastHeartbeatAt, inactiveSince)
+GET   /api/v1/llms                         → list (includes kind, status, lastHeartbeatAt, inactiveSince)
 POST  /api/v1/llms/<virtual>/infer         → routes through the SSE relay
 DELETE /api/v1/llms/<virtual>              → delete unconditionally (also runs GC's job)
+GET   /api/v1/agents                       → list (v3: includes kind, status, lastHeartbeatAt, inactiveSince)
 ```

 RBAC piggybacks on `view/edit/create:llms` — no new resource. Publishing