fix(chat): real fixes for thinking-model + URL conventions, not test tweaks

Five real bugs surfaced by the agent-chat smoke against live
qwen3-thinking. None of these are fixed by changing the test; the test
was right to fail.

1. openai-passthrough adapter doubled `/v1` in the request URL. The
   adapter hard-codes `/v1/chat/completions` after the configured base,
   but every OpenAI-compat provider documents its base URL with a
   trailing `/v1` (api.openai.com/v1, llm.example.com/v1, …). Users
   pasting that conventional shape produced
   `https://x/v1/v1/chat/completions` → 404. endpointUrl now strips a
   trailing `/v1` so both forms canonicalize. `/v1beta`
   (Anthropic-style) is preserved.

2. Non-streaming chat returned an empty assistant when thinking models
   (qwen3-thinking, deepseek-reasoner, OpenAI o1) emitted only
   `reasoning_content` with `content: null`. extractChoice now also
   pulls reasoning (every spelling the streaming parser already knows
   about), and a new pickAssistantText helper falls back to it when
   content is empty. A `[response truncated by max_tokens]` marker is
   appended when finish_reason is `length`, so users see the cut-off
   instead of guessing why the answer is short.

   Symmetric streaming fix: the chatStream loop accumulates reasoning
   and yields ONE synthesized `text` frame at the end when content
   stayed empty, keeping the CLI's stdout (which only prints `text`
   deltas) in sync with the persisted thread message.

3. `mcpctl get agent X -o yaml` emitted `kind: public` (the v3
   lifecycle field) instead of `kind: agent` (apply envelope), so
   round-tripping through `apply -f` failed. Same fix shape as the v1
   Llm strip in toApplyDocs: drop
   kind/status/lastHeartbeatAt/inactiveSince/providerSessionId for the
   agents resource too.

4. Non-streaming `mcpctl chat` printed `thread:<cuid>` (no space) on
   stderr; streaming printed `(thread: <cuid>)` (with space). Tests and
   any other regex watching for one form missed the other. Standardize
   on `thread: <cuid>` (single space) in both paths.

5. agent-chat.smoke's `run()` used `execSync`, which discards stderr on
   success, making any `expect(stderr).toMatch(...)` assertion
   structurally impossible to satisfy in the happy path. Switch to
   `spawnSync` so stderr is actually captured. Includes a small
   shell-style argv splitter so the existing call sites with quoted
   multi-word values (`--system-prompt "..."`) keep working.

Tests: +6 new mcpd unit tests (4 chat-service for the reasoning
fallback / truncation marker / content-preference / streaming synth;
2 llm-adapters for the URL strip + /v1beta preservation). Full mcpd +
mcplocal + smoke green: 860/860 + 723/723 + 139/139.
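
For readers without the mcpd sources at hand, items 1 and 2 above can be sketched roughly as follows. This is an illustration, not the committed code: the function names `endpointUrl` and `pickAssistantText` come from the message, but their signatures, the `Choice` shape, and the blank-line separator before the truncation marker are assumptions.

```typescript
// Hypothetical sketch of items 1 and 2; bodies and types are assumed,
// only the function names are taken from the commit message.

/** Build the chat-completions URL, tolerating a base with or without `/v1`. */
function endpointUrl(base: string): string {
  const trimmed = base.replace(/\/+$/, ''); // drop trailing slashes
  // `$` anchors directly after `v1`, so `/v1beta` is left untouched.
  const root = trimmed.replace(/\/v1$/, '');
  return `${root}/v1/chat/completions`;
}

/** Assumed shape of what extractChoice returns after this change. */
interface Choice {
  content: string | null;
  reasoning: string | null;
  finishReason: string | null;
}

const TRUNCATION_MARKER = '[response truncated by max_tokens]';

/** Prefer content; fall back to reasoning; flag truncation on `length`. */
function pickAssistantText(choice: Choice): string {
  const content = choice.content?.trim() ?? '';
  const reasoning = choice.reasoning?.trim() ?? '';
  const text = content !== '' ? content : reasoning;
  if (choice.finishReason === 'length') {
    // Separator is an assumption; the message only says the marker is appended.
    return text === '' ? TRUNCATION_MARKER : `${text}\n\n${TRUNCATION_MARKER}`;
  }
  return text;
}
```

With this sketch, `endpointUrl('https://llm.example.com/v1')` and `endpointUrl('https://llm.example.com')` both yield `https://llm.example.com/v1/chat/completions`, and a choice with `content: null` plus non-empty reasoning returns the reasoning text instead of an empty string.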
2026-04-27 18:39:01 +01:00
parent 58bc277242
commit 610808b9e7
7 changed files with 293 additions and 29 deletions
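
The stderr behaviour behind item 5 can be reproduced outside the smoke suite. The sketch below is a standalone illustration under stated assumptions: it spawns the current Node binary instead of `mcpctl`, and the `thread: abc123` value and the `compareCapture` helper are made up for the demo.

```typescript
import { execFileSync, spawnSync } from 'node:child_process';

// Child script that writes to both streams and exits 0 (hypothetical output).
const script = "console.log('hello'); console.error('thread: abc123');";

interface Capture {
  execStdout: string;
  spawnStdout: string;
  spawnStderr: string;
  spawnCode: number | null;
}

function compareCapture(): Capture {
  // The exec*Sync family returns only stdout on success; the piped
  // stderr is reachable only through the error thrown on failure.
  const execStdout = execFileSync(process.execPath, ['-e', script], {
    encoding: 'utf-8',
    stdio: ['ignore', 'pipe', 'pipe'],
  });

  // spawnSync returns status, stdout and stderr regardless of exit code.
  const res = spawnSync(process.execPath, ['-e', script], { encoding: 'utf-8' });

  return {
    execStdout: execStdout.trim(),
    spawnStdout: (res.stdout ?? '').trim(),
    spawnStderr: (res.stderr ?? '').trim(),
    spawnCode: res.status,
  };
}

const capture = compareCapture();
// The standardized `thread: <id>` form from item 4 is now matchable.
console.log(/^thread: \S+$/m.test(capture.spawnStderr));
```

Only the `spawnSync` result carries the `thread:` line, which is why a stderr assertion can only be evaluated on a successful run after the switch.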
--- a/src/mcplocal/tests/smoke/agent-chat.smoke.test.ts
+++ b/src/mcplocal/tests/smoke/agent-chat.smoke.test.ts
@@ -17,7 +17,7 @@
 import { describe, it, expect, beforeAll, afterAll } from 'vitest';
 import http from 'node:http';
 import https from 'node:https';
-import { execSync } from 'node:child_process';
+import { spawnSync, execSync } from 'node:child_process';

 const MCPD_URL = process.env.MCPD_URL ?? 'https://mcpctl.ad.itaz.eu';
 const LLM_URL = process.env.MCPCTL_SMOKE_LLM_URL;
@@ -31,21 +31,37 @@ const AGENT_NAME = `smoke-chat-agent-${SUFFIX}`;
 interface CliResult { code: number; stdout: string; stderr: string }

 function run(args: string): CliResult {
-  try {
-    const stdout = execSync(`mcpctl --direct ${args}`, {
-      encoding: 'utf-8',
-      timeout: 60_000,
-      stdio: ['ignore', 'pipe', 'pipe'],
-    });
-    return { code: 0, stdout: stdout.trim(), stderr: '' };
-  } catch (err) {
-    const e = err as { status?: number; stdout?: Buffer | string; stderr?: Buffer | string };
-    return {
-      code: e.status ?? 1,
-      stdout: e.stdout ? (typeof e.stdout === 'string' ? e.stdout : e.stdout.toString('utf-8')) : '',
-      stderr: e.stderr ? (typeof e.stderr === 'string' ? e.stderr : e.stderr.toString('utf-8')) : '',
-    };
+  // spawnSync (not execSync) — execSync returns only stdout on success and
+  // discards stderr, which made any `thread:` assertion against a successful
+  // chat impossible to evaluate. Splitting the args correctly handles the
+  // few existing call sites that quote-wrap multi-word values like
+  // `--system-prompt "You are..."`.
+  const argv = splitArgs(args);
+  const res = spawnSync('mcpctl', ['--direct', ...argv], {
+    encoding: 'utf-8',
+    timeout: 60_000,
+  });
+  return {
+    code: res.status ?? 1,
+    stdout: (res.stdout ?? '').trim(),
+    stderr: (res.stderr ?? '').trim(),
+  };
+}
+
+/**
+ * Tokenize a shell-style argv string with simple double-quote support — just
+ * enough for the smoke test's call shapes. Not a full POSIX parser; we only
+ * need to keep `--system-prompt "You are a smoke test..."` together as one
+ * arg.
+ */
+function splitArgs(s: string): string[] {
+  const out: string[] = [];
+  const re = /"([^"]*)"|(\S+)/g;
+  let m: RegExpExecArray | null;
+  while ((m = re.exec(s)) !== null) {
+    out.push(m[1] !== undefined ? m[1] : (m[2] ?? ''));
  }
+  return out;
 }

 function healthz(url: string, timeoutMs = 5000): Promise<boolean> {