Live deploy showed qwen3-thinking failing the probe with "empty
content": at max_tokens=8 the model spent its entire budget on the
reasoning trace and never emitted a final `content` block.
Fix:
- Bump max_tokens to 64. This still caps latency at ~1-2 s on cheap
  models while giving reasoning models enough headroom.
- If `message.content` is empty but `reasoning_content` is non-empty,
  count the model as alive and prefix the preview with "[thinking]",
  so the user can see the model is responsive even though it never
  produced the final "hi" answer.
- Replace the prompt with the terser "Reply with just: hi", which is
  closer to something a thinking model can short-circuit on.
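The fallback logic in the second bullet can be sketched roughly as
follows. This is illustrative, not the actual probe code: the field
names (`content`, `reasoning_content`) follow the OpenAI-compatible
response schema that Qwen-style thinking models return, and the
function name and preview length are made up for the example.

```python
def classify_probe(message: dict, preview_len: int = 40):
    """Return (alive, preview) for a probe completion message.

    Sketch of the intended behavior: a non-empty `content` wins;
    otherwise a non-empty `reasoning_content` still counts as alive,
    with the preview flagged so the user knows the model only thought
    and never answered.
    """
    content = (message.get("content") or "").strip()
    reasoning = (message.get("reasoning_content") or "").strip()

    if content:
        return True, content[:preview_len]
    if reasoning:
        # Responsive, but the token budget went to the reasoning
        # trace; surface that instead of reporting "empty content".
        return True, "[thinking] " + reasoning[:preview_len]
    return False, ""
```

Note that the dead branch (both fields empty) is unchanged, which is
why the existing failure-path test still passes as-is.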
Tests: all 25 existing tests pass; the failure-path test still hits
the "empty content" branch because reasoning_content is empty there
too.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>