fix(chat): real fixes for thinking-model + URL conventions, not test tweaks
Some checks failed
CI/CD / lint (pull_request) Successful in 54s
CI/CD / test (pull_request) Successful in 1m7s
CI/CD / typecheck (pull_request) Successful in 2m37s
CI/CD / smoke (pull_request) Failing after 1m43s
CI/CD / build (pull_request) Successful in 5m42s
CI/CD / publish (pull_request) Has been skipped

Five real bugs surfaced by the agent-chat smoke against live
qwen3-thinking. None of these are fixed by changing the test — the
test was right to fail.

1. openai-passthrough adapter doubled `/v1` in the request URL. The
   adapter hard-codes `/v1/chat/completions` after the configured base,
   but every OpenAI-compat provider documents its base URL with a
   trailing `/v1` (api.openai.com/v1, llm.example.com/v1, …). Users
   pasting that conventional shape produced
   `https://x/v1/v1/chat/completions` → 404. endpointUrl now strips a
   trailing `/v1` so both forms canonicalize. `/v1beta` (Anthropic-style)
   is preserved.
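A minimal standalone sketch of the canonicalization described above (the real `endpointUrl` is an adapter method; this version is hypothetical and only models the trailing-`/v1` strip):

```typescript
// Strip any trailing slash, then one trailing `/v1`, before appending the
// hard-coded completions path — so both documented base-URL shapes resolve
// to the same request URL. The `$`-anchored regex does not match `/v1beta`,
// so Anthropic-style bases are left intact.
function endpointUrl(baseUrl: string, path = "/v1/chat/completions"): string {
  const base = baseUrl.replace(/\/+$/, "").replace(/\/v1$/, "");
  return `${base}${path}`;
}

// Both conventional shapes canonicalize to the same URL:
endpointUrl("https://api.openai.com/v1");
// → "https://api.openai.com/v1/chat/completions"
endpointUrl("https://llm.example.com");
// → "https://llm.example.com/v1/chat/completions"
```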

2. Non-streaming chat returned an empty assistant message when thinking models
   (qwen3-thinking, deepseek-reasoner, OpenAI o1) emitted only
   `reasoning_content` with `content: null`. extractChoice now also
   pulls reasoning (every spelling the streaming parser already knows
   about), and a new pickAssistantText helper falls back to it when
   content is empty. A `[response truncated by max_tokens]` marker is
   appended when finish_reason is `length`, so users see the cut-off
   instead of guessing why the answer is short. Symmetric streaming
   fix: the chatStream loop accumulates reasoning and yields ONE
   synthesized `text` frame at the end when content stayed empty,
   keeping the CLI's stdout (which only prints `text` deltas) in sync
   with the persisted thread message.
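The fallback and truncation marker can be sketched roughly like this (hypothetical standalone shape; the real helper lives in chat-service and handles every reasoning-field spelling the streaming parser knows):

```typescript
// Two common spellings of the reasoning field, as emitted by
// qwen3-thinking / deepseek-reasoner style providers. The real extractChoice
// covers more variants; this interface is illustrative only.
interface ChoiceMessage {
  content: string | null;
  reasoning_content?: string | null;
  reasoning?: string | null;
}

function pickAssistantText(msg: ChoiceMessage, finishReason?: string): string {
  // Prefer real content; fall back to reasoning only when content is empty.
  let text = msg.content ?? "";
  if (text.trim() === "") {
    text = msg.reasoning_content ?? msg.reasoning ?? "";
  }
  // Surface the cut-off so a short answer is never mistaken for a complete one.
  if (finishReason === "length") {
    text += "\n[response truncated by max_tokens]";
  }
  return text;
}
```

The streaming path mirrors this: reasoning deltas are accumulated during the loop, and a single synthesized `text` frame is emitted at the end only if no content delta ever arrived.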

3. `mcpctl get agent X -o yaml` emitted `kind: public` (the v3
   lifecycle field) instead of `kind: agent` (apply envelope), so
   round-tripping through `apply -f` failed. Same fix shape as the v1
   Llm strip in toApplyDocs — drop kind/status/lastHeartbeatAt/
   inactiveSince/providerSessionId for the agents resource too.

4. Non-streaming `mcpctl chat` printed `thread:<cuid>` (no space) on
   stderr; streaming printed `(thread: <cuid>)` (with space). Tests
   and any other regex watching for one form missed the other.
   Standardize on `thread: <cuid>` (single space) in both paths.

5. agent-chat.smoke's `run()` used `execSync`, which discards stderr on
   success — making any `expect(stderr).toMatch(...)` assertion
   structurally impossible to satisfy in the happy path. Switch to
   `spawnSync` so stderr is actually captured. Includes a small
   shell-style argv splitter so the existing call sites with quoted
   multi-word values (`--system-prompt "..."`) keep working.
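Since `spawnSync` takes an argv array rather than a command string, the splitter can be sketched like this (a minimal hypothetical version; `splitArgv` is an illustrative name, and it deliberately handles only the quoting the existing call sites use):

```typescript
// Tokenize a shell-ish command line: double-quoted runs, single-quoted runs,
// or bare whitespace-delimited tokens. Embedded quotes glued to bare tokens
// (e.g. --x="a b") are NOT handled — the smoke's call sites don't need them.
function splitArgv(cmd: string): string[] {
  const out: string[] = [];
  const re = /"([^"]*)"|'([^']*)'|(\S+)/g;
  for (const m of cmd.matchAll(re)) {
    out.push(m[1] ?? m[2] ?? m[3] ?? "");
  }
  return out;
}
```

The smoke's `run()` then passes the result to `spawnSync(bin, args, { encoding: "utf8" })`, whose result object exposes `stderr` even on exit code 0, which is exactly what `execSync` discards.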

Tests: +6 new mcpd unit tests (4 chat-service for the reasoning
fallback / truncation marker / content-preference / streaming synth;
2 llm-adapters for the URL strip + /v1beta preservation). Full mcpd
+ mcplocal + smoke green: 860/860 + 723/723 + 139/139.
Michal
2026-04-27 18:39:01 +01:00
parent 58bc277242
commit 610808b9e7
7 changed files with 293 additions and 29 deletions


@@ -151,7 +151,10 @@ async function runOneShot(
   const sec = Math.max(0.05, (Date.now() - startMs) / 1000);
   const words = (res.assistant.match(/\S+/g) ?? []).length;
   process.stdout.write(`${res.assistant}\n`);
-  process.stderr.write(styleStats(`(${String(words)}w · ${(words / sec).toFixed(1)} w/s · ${sec.toFixed(1)}s)`) + ` thread:${res.threadId}\n`);
+  // `thread: <id>` — single space after the colon, matching the streaming
+  // path (line 160 below) so any tooling/regex that watches one form picks
+  // up the other too.
+  process.stderr.write(styleStats(`(${String(words)}w · ${(words / sec).toFixed(1)} w/s · ${sec.toFixed(1)}s)`) + ` thread: ${res.threadId}\n`);
   return;
 }
 const bar = installStatusBar();


@@ -408,8 +408,8 @@ function toApplyDocs(resource: string, items: unknown[]): Array<{ kind: string }
   const kind = RESOURCE_KIND[resource] ?? resource;
   return items.map((item) => {
     const cleaned = stripInternalFields(item as Record<string, unknown>);
-    // Llm-specific: the new virtual-provider lifecycle fields collide with
-    // the apply-doc `kind` envelope (the schema uses `kind: public|virtual`)
+    // Llm-specific: the virtual-provider lifecycle fields collide with the
+    // apply-doc `kind` envelope (the schema uses `kind: public|virtual`)
     // and aren't apply-able anyway — they're derived runtime state managed
     // by VirtualLlmService. Drop them so YAML round-trips stay clean.
     if (resource === 'llms') {
@@ -419,6 +419,17 @@ function toApplyDocs(resource: string, items: unknown[]): Array<{ kind: string }
       delete cleaned['inactiveSince'];
       delete cleaned['providerSessionId'];
     }
+    // Agent-specific: same shape as Llm — Agent gained kind/status/etc. in
+    // v3 Stage 1 (virtual agent lifecycle) and the schema-`kind` field
+    // shadows the apply-envelope `kind: agent`. Strip the same set so
+    // `get agent X -o yaml | apply -f -` round-trips without diff.
+    if (resource === 'agents') {
+      delete cleaned['kind'];
+      delete cleaned['status'];
+      delete cleaned['lastHeartbeatAt'];
+      delete cleaned['inactiveSince'];
+      delete cleaned['providerSessionId'];
+    }
     return { kind, ...cleaned };
   });
 }