fix(chat): real fixes for thinking-model + URL conventions, not test tweaks
Some checks failed
CI/CD / lint (pull_request) Successful in 54s
CI/CD / test (pull_request) Successful in 1m7s
CI/CD / typecheck (pull_request) Successful in 2m37s
CI/CD / smoke (pull_request) Failing after 1m43s
CI/CD / build (pull_request) Successful in 5m42s
CI/CD / publish (pull_request) Has been skipped

Five real bugs surfaced by the agent-chat smoke against live
qwen3-thinking. None of these are fixed by changing the test — the
test was right to fail.

1. openai-passthrough adapter doubled `/v1` in the request URL. The
   adapter hard-codes `/v1/chat/completions` after the configured base,
   but every OpenAI-compat provider documents its base URL with a
   trailing `/v1` (api.openai.com/v1, llm.example.com/v1, …). Users
   pasting that conventional shape produced
   `https://x/v1/v1/chat/completions` → 404. endpointUrl now strips a
   trailing `/v1` so both forms canonicalize. `/v1beta` (Anthropic-style)
   is preserved.
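A minimal standalone sketch of the canonicalization described above (the real `endpointUrl` is an adapter method; this version is hypothetical and only models the trailing-`/v1` strip):

```typescript
// Strip any trailing slash, then one trailing `/v1`, before appending the
// hard-coded completions path — so both documented base-URL shapes resolve
// to the same request URL. The `$`-anchored regex does not match `/v1beta`,
// so Anthropic-style bases are left intact.
function endpointUrl(baseUrl: string, path = "/v1/chat/completions"): string {
  const base = baseUrl.replace(/\/+$/, "").replace(/\/v1$/, "");
  return `${base}${path}`;
}

// Both conventional shapes canonicalize to the same URL:
endpointUrl("https://api.openai.com/v1");
// → "https://api.openai.com/v1/chat/completions"
endpointUrl("https://llm.example.com");
// → "https://llm.example.com/v1/chat/completions"
```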

2. Non-streaming chat returned an empty assistant message when thinking models
   (qwen3-thinking, deepseek-reasoner, OpenAI o1) emitted only
   `reasoning_content` with `content: null`. extractChoice now also
   pulls reasoning (every spelling the streaming parser already knows
   about), and a new pickAssistantText helper falls back to it when
   content is empty. A `[response truncated by max_tokens]` marker is
   appended when finish_reason is `length`, so users see the cut-off
   instead of guessing why the answer is short. Symmetric streaming
   fix: the chatStream loop accumulates reasoning and yields ONE
   synthesized `text` frame at the end when content stayed empty,
   keeping the CLI's stdout (which only prints `text` deltas) in sync
   with the persisted thread message.
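The fallback and truncation marker can be sketched roughly like this (hypothetical standalone shape; the real helper lives in chat-service and handles every reasoning-field spelling the streaming parser knows):

```typescript
// Two common spellings of the reasoning field, as emitted by
// qwen3-thinking / deepseek-reasoner style providers. The real extractChoice
// covers more variants; this interface is illustrative only.
interface ChoiceMessage {
  content: string | null;
  reasoning_content?: string | null;
  reasoning?: string | null;
}

function pickAssistantText(msg: ChoiceMessage, finishReason?: string): string {
  // Prefer real content; fall back to reasoning only when content is empty.
  let text = msg.content ?? "";
  if (text.trim() === "") {
    text = msg.reasoning_content ?? msg.reasoning ?? "";
  }
  // Surface the cut-off so a short answer is never mistaken for a complete one.
  if (finishReason === "length") {
    text += "\n[response truncated by max_tokens]";
  }
  return text;
}
```

The streaming path mirrors this: reasoning deltas are accumulated during the loop, and a single synthesized `text` frame is emitted at the end only if no content delta ever arrived.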

3. `mcpctl get agent X -o yaml` emitted `kind: public` (the v3
   lifecycle field) instead of `kind: agent` (apply envelope), so
   round-tripping through `apply -f` failed. Same fix shape as the v1
   Llm strip in toApplyDocs — drop kind/status/lastHeartbeatAt/
   inactiveSince/providerSessionId for the agents resource too.

4. Non-streaming `mcpctl chat` printed `thread:<cuid>` (no space) on
   stderr; streaming printed `(thread: <cuid>)` (with space). Tests
   and any other regex watching for one form missed the other.
   Standardize on `thread: <cuid>` (single space) in both paths.

5. agent-chat.smoke's `run()` used `execSync`, which discards stderr on
   success — making any `expect(stderr).toMatch(...)` assertion
   structurally impossible to satisfy in the happy path. Switch to
   `spawnSync` so stderr is actually captured. Includes a small
   shell-style argv splitter so the existing call sites with quoted
   multi-word values (`--system-prompt "..."`) keep working.
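Since `spawnSync` takes an argv array rather than a command string, the splitter can be sketched like this (a minimal hypothetical version; `splitArgv` is an illustrative name, and it deliberately handles only the quoting the existing call sites use):

```typescript
// Tokenize a shell-ish command line: double-quoted runs, single-quoted runs,
// or bare whitespace-delimited tokens. Embedded quotes glued to bare tokens
// (e.g. --x="a b") are NOT handled — the smoke's call sites don't need them.
function splitArgv(cmd: string): string[] {
  const out: string[] = [];
  const re = /"([^"]*)"|'([^']*)'|(\S+)/g;
  for (const m of cmd.matchAll(re)) {
    out.push(m[1] ?? m[2] ?? m[3] ?? "");
  }
  return out;
}
```

The smoke's `run()` then passes the result to `spawnSync(bin, args, { encoding: "utf8" })`, whose result object exposes `stderr` even on exit code 0, which is exactly what `execSync` discards.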

Tests: +6 new mcpd unit tests (4 chat-service for the reasoning
fallback / truncation marker / content-preference / streaming synth;
2 llm-adapters for the URL strip + /v1beta preservation). Full mcpd
+ mcplocal + smoke green: 860/860 + 723/723 + 139/139.
Michal
2026-04-27 18:39:01 +01:00
parent 58bc277242
commit 610808b9e7
7 changed files with 293 additions and 29 deletions


@@ -151,7 +151,10 @@ async function runOneShot(
   const sec = Math.max(0.05, (Date.now() - startMs) / 1000);
   const words = (res.assistant.match(/\S+/g) ?? []).length;
   process.stdout.write(`${res.assistant}\n`);
-  process.stderr.write(styleStats(`(${String(words)}w · ${(words / sec).toFixed(1)} w/s · ${sec.toFixed(1)}s)`) + ` thread:${res.threadId}\n`);
+  // `thread: <id>` — single space after the colon, matching the streaming
+  // path (line 160 below) so any tooling/regex that watches one form picks
+  // up the other too.
+  process.stderr.write(styleStats(`(${String(words)}w · ${(words / sec).toFixed(1)} w/s · ${sec.toFixed(1)}s)`) + ` thread: ${res.threadId}\n`);
   return;
 }
 const bar = installStatusBar();


@@ -408,8 +408,8 @@ function toApplyDocs(resource: string, items: unknown[]): Array<{ kind: string }
   const kind = RESOURCE_KIND[resource] ?? resource;
   return items.map((item) => {
     const cleaned = stripInternalFields(item as Record<string, unknown>);
-    // Llm-specific: the new virtual-provider lifecycle fields collide with
-    // the apply-doc `kind` envelope (the schema uses `kind: public|virtual`)
+    // Llm-specific: the virtual-provider lifecycle fields collide with the
+    // apply-doc `kind` envelope (the schema uses `kind: public|virtual`)
     // and aren't apply-able anyway — they're derived runtime state managed
     // by VirtualLlmService. Drop them so YAML round-trips stay clean.
     if (resource === 'llms') {
@@ -419,6 +419,17 @@ function toApplyDocs(resource: string, items: unknown[]): Array<{ kind: string }
       delete cleaned['inactiveSince'];
       delete cleaned['providerSessionId'];
     }
+    // Agent-specific: same shape as Llm — Agent gained kind/status/etc. in
+    // v3 Stage 1 (virtual agent lifecycle) and the schema-`kind` field
+    // shadows the apply-envelope `kind: agent`. Strip the same set so
+    // `get agent X -o yaml | apply -f -` round-trips without diff.
+    if (resource === 'agents') {
+      delete cleaned['kind'];
+      delete cleaned['status'];
+      delete cleaned['lastHeartbeatAt'];
+      delete cleaned['inactiveSince'];
+      delete cleaned['providerSessionId'];
+    }
     return { kind, ...cleaned };
   });
 }