feat(cli): live "say hi" probe for server LLMs in mcpctl status #61

Merged
michal merged 1 commit from feat/status-llm-say-hi into main 2026-04-27 11:02:28 +00:00
Owner

Summary

`mcpctl status` was listing server-side LLMs but not telling you whether they actually serve inference. This adds a per-LLM "say hi" probe: POST an 8-token prompt to `/api/v1/llms/<name>/infer` and render the result inline.

```
Server LLMs: 2 registered (probing live "say hi"...)
  fast   qwen3-thinking  ✓ "hi" 312ms
            openai → qwen3-thinking  http://litellm.../v1  key:litellm/API_KEY
  heavy  sonnet  ✗ upstream auth failed: 401
            anthropic → claude-sonnet-4-5  provider default  no key
```
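
Under the hood each probe is a single POST to the infer endpoint. A minimal sketch of one probe, assuming Node 18+ `fetch`; the `baseUrl`/`token` parameters, the `LlmHealth` type name, and the `{ content }` response shape are illustrative, not the actual mcpctl internals:

```ts
// Hypothetical sketch of one "say hi" probe. baseUrl, token, LlmHealth and the
// { content } response shape are illustrative assumptions, not mcpctl's real code.
interface LlmHealth {
  ok: boolean;
  ms: number;
  say?: string;   // first words of the model's reply, e.g. "hi"
  error?: string; // short reason when the probe fails
}

async function probeLlm(baseUrl: string, token: string, name: string): Promise<LlmHealth> {
  const started = Date.now();
  try {
    const res = await fetch(`${baseUrl}/api/v1/llms/${encodeURIComponent(name)}/infer`, {
      method: "POST",
      headers: { "content-type": "application/json", authorization: `Bearer ${token}` },
      body: JSON.stringify({
        messages: [{ role: "user", content: "Say exactly the word 'hi' and nothing else." }],
        max_tokens: 8,
        temperature: 0,
      }),
      signal: AbortSignal.timeout(15_000), // 15s per-probe timeout
    });
    const ms = Date.now() - started;
    if (!res.ok) {
      // The real CLI renders friendlier messages, e.g. "upstream auth failed: 401".
      return { ok: false, ms, error: `HTTP ${res.status}` };
    }
    const body = (await res.json()) as { content?: string };
    return { ok: true, ms, say: (body.content ?? "").trim() };
  } catch (err) {
    return { ok: false, ms: Date.now() - started, error: String(err) };
  }
}
```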

Probes run in parallel (one slow LLM doesn't block the others) with a 15s per-probe timeout. JSON/YAML output gains a `health: { ok, ms, say?, error? }` field so dashboards get the same signal.
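
The fan-out can be as simple as `Promise.allSettled` over the registered LLMs, attaching the result as the `health` field the JSON/YAML renderers emit. Again a sketch with illustrative names (`ServerLlm`, `probeAll`), not the actual implementation:

```ts
// Hypothetical fan-out: probe every registered LLM concurrently, so one slow
// model doesn't block the rest, and attach the result as `health`.
interface ServerLlm { name: string; model: string; health?: LlmHealth }

async function probeAll(baseUrl: string, token: string, llms: ServerLlm[]): Promise<ServerLlm[]> {
  const results = await Promise.allSettled(
    llms.map((llm) => probeLlm(baseUrl, token, llm.name)),
  );
  return llms.map((llm, i) => {
    const r = results[i];
    return {
      ...llm,
      health: r.status === "fulfilled" ? r.value : { ok: false, ms: 0, error: String(r.reason) },
    };
  });
}
```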

Test plan

  • CLI status: 25/25 (was 24, +1 for the failure-path render)
  • Workspace: 2006/2006 across 149 files
  • Typecheck clean
  • Manual: `mcpctl status` against the live cluster shows ✓ "hi" + ms for qwen3-thinking.

🤖 Generated with Claude Code

michal added 1 commit 2026-04-27 11:02:14 +00:00
feat(cli): live "say hi" probe for server LLMs in mcpctl status
Some checks failed
CI/CD / lint (pull_request) Successful in 55s
CI/CD / test (pull_request) Successful in 1m13s
CI/CD / typecheck (pull_request) Successful in 3m10s
CI/CD / smoke (pull_request) Failing after 1m46s
CI/CD / build (pull_request) Successful in 3m24s
CI/CD / publish (pull_request) Has been skipped
e4af16477c
Status was showing the server-side LLM list but not whether each one
actually serves inference. This adds a per-LLM probe that POSTs a
tiny prompt to /api/v1/llms/<name>/infer:

  messages: [{ role: 'user', content: "Say exactly the word 'hi' and nothing else." }]
  max_tokens: 8, temperature: 0

Each registered LLM gets a one-line health line:

  Server LLMs: 2 registered (probing live "say hi"...)
    fast   qwen3-thinking  ✓ "hi" 312ms
              openai → qwen3-thinking  http://litellm.../v1  key:litellm/API_KEY
    heavy  sonnet  ✗ upstream auth failed: 401
              anthropic → claude-sonnet-4-5  provider default  no key

Probes run in parallel so a single slow LLM doesn't gate the others;
each has its own 15-second timeout. JSON/YAML output gains a
health: { ok, ms, say?, error? } field per server LLM so dashboards
get the same liveness signal.

Tests: 25/25 (was 24, +1 new for the failure-path render). Workspace
suite: 2006/2006 across 149 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
michal merged commit 54e56f7b71 into main 2026-04-27 11:02:28 +00:00
Reference: michal/mcpctl#61