fix(chat): real fixes for thinking-model + URL conventions, not test tweaks
Some checks failed
CI/CD / lint (pull_request) Successful in 54s
CI/CD / test (pull_request) Successful in 1m7s
CI/CD / typecheck (pull_request) Successful in 2m37s
CI/CD / smoke (pull_request) Failing after 1m43s
CI/CD / build (pull_request) Successful in 5m42s
CI/CD / publish (pull_request) Has been skipped

Five real bugs surfaced by the agent-chat smoke against live
qwen3-thinking. None of these are fixed by changing the test — the
test was right to fail.

1. openai-passthrough adapter doubled `/v1` in the request URL. The
   adapter hard-codes `/v1/chat/completions` after the configured base,
   but every OpenAI-compat provider documents its base URL with a
   trailing `/v1` (api.openai.com/v1, llm.example.com/v1, …). Users
   pasting that conventional shape produced
   `https://x/v1/v1/chat/completions` → 404. endpointUrl now strips a
   trailing `/v1` so both forms canonicalize. `/v1beta` (Anthropic-style)
   is preserved.

2. Non-streaming chat returned an empty assistant when thinking models
   (qwen3-thinking, deepseek-reasoner, OpenAI o1) emitted only
   `reasoning_content` with `content: null`. extractChoice now also
   pulls reasoning (every spelling the streaming parser already knows
   about), and a new pickAssistantText helper falls back to it when
   content is empty. A `[response truncated by max_tokens]` marker is
   appended when finish_reason is `length`, so users see the cut-off
   instead of guessing why the answer is short. Symmetric streaming
   fix: the chatStream loop accumulates reasoning and yields ONE
   synthesized `text` frame at the end when content stayed empty,
   keeping the CLI's stdout (which only prints `text` deltas) in sync
   with the persisted thread message.

3. `mcpctl get agent X -o yaml` emitted `kind: public` (the v3
   lifecycle field) instead of `kind: agent` (apply envelope), so
   round-tripping through `apply -f` failed. Same fix shape as the v1
   Llm strip in toApplyDocs — drop kind/status/lastHeartbeatAt/
   inactiveSince/providerSessionId for the agents resource too.

4. Non-streaming `mcpctl chat` printed `thread:<cuid>` (no space) on
   stderr; streaming printed `(thread: <cuid>)` (with space). Tests
   and any other regex watching for one form missed the other.
   Standardize on `thread: <cuid>` (single space) in both paths.

5. agent-chat.smoke's `run()` used `execSync`, which discards stderr on
   success — making any `expect(stderr).toMatch(...)` assertion
   structurally impossible to satisfy in the happy path. Switch to
   `spawnSync` so stderr is actually captured. Includes a small
   shell-style argv splitter so the existing call sites with quoted
   multi-word values (`--system-prompt "..."`) keep working.

Tests: +6 new mcpd unit tests (4 chat-service for the reasoning
fallback / truncation marker / content-preference / streaming synth;
2 llm-adapters for the URL strip + /v1beta preservation). Full mcpd
+ mcplocal + smoke green: 860/860 + 723/723 + 139/139.
This commit is contained in:
Michal
2026-04-27 18:39:01 +01:00
parent 58bc277242
commit 610808b9e7
7 changed files with 293 additions and 29 deletions

View File

@@ -17,7 +17,7 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import http from 'node:http';
import https from 'node:https';
import { execSync } from 'node:child_process';
import { spawnSync, execSync } from 'node:child_process';
const MCPD_URL = process.env.MCPD_URL ?? 'https://mcpctl.ad.itaz.eu';
const LLM_URL = process.env.MCPCTL_SMOKE_LLM_URL;
@@ -31,21 +31,37 @@ const AGENT_NAME = `smoke-chat-agent-${SUFFIX}`;
interface CliResult { code: number; stdout: string; stderr: string }
function run(args: string): CliResult {
try {
const stdout = execSync(`mcpctl --direct ${args}`, {
encoding: 'utf-8',
timeout: 60_000,
stdio: ['ignore', 'pipe', 'pipe'],
});
return { code: 0, stdout: stdout.trim(), stderr: '' };
} catch (err) {
const e = err as { status?: number; stdout?: Buffer | string; stderr?: Buffer | string };
return {
code: e.status ?? 1,
stdout: e.stdout ? (typeof e.stdout === 'string' ? e.stdout : e.stdout.toString('utf-8')) : '',
stderr: e.stderr ? (typeof e.stderr === 'string' ? e.stderr : e.stderr.toString('utf-8')) : '',
};
// spawnSync (not execSync) — execSync returns only stdout on success and
// discards stderr, which made any `thread:` assertion against a successful
// chat impossible to evaluate. Splitting the args correctly handles the
// few existing call sites that quote-wrap multi-word values like
// `--system-prompt "You are..."`.
const argv = splitArgs(args);
const res = spawnSync('mcpctl', ['--direct', ...argv], {
encoding: 'utf-8',
timeout: 60_000,
});
return {
code: res.status ?? 1,
stdout: (res.stdout ?? '').trim(),
stderr: (res.stderr ?? '').trim(),
};
}
/**
* Tokenize a shell-style argv string with simple double-quote support — just
* enough for the smoke test's call shapes. Not a full POSIX parser; we only
* need to keep `--system-prompt "You are a smoke test..."` together as one
* arg.
*/
function splitArgs(s: string): string[] {
const out: string[] = [];
const re = /"([^"]*)"|(\S+)/g;
let m: RegExpExecArray | null;
while ((m = re.exec(s)) !== null) {
out.push(m[1] !== undefined ? m[1] : (m[2] ?? ''));
}
return out;
}
function healthz(url: string, timeoutMs = 5000): Promise<boolean> {