feat(llm): probe upstream auth at registration time
mcpd now runs a cheap auth probe whenever an Llm is created (or its
apiKeyRef/url is updated). Catches misconfigured tokens / wrong URLs at
registration with a 422 + structured error message, instead of silently
500-ing on first chat with a generic "fetch failed". Caught in the wild
today: the homelab Pulumi config exposed `MCPCTL_GATEWAY_TOKEN` (which
is mcpctl_pat_-prefixed, intended for LiteLLM→mcplocal direction) where
LiteLLM expects `LITELLM_MASTER_KEY` (sk-prefixed). The probe makes
this immediate.
Probe shape (LlmAdapter.verifyAuth):
- OpenAI passthrough → GET <url>/v1/models. Cheap, idempotent, gated
by the same auth as chat/completions.
- Anthropic → POST /v1/messages with max_tokens:1, "ping". Anthropic
has no list-models endpoint; this is the cheapest auth-exercising
call.
- Returns one of:
{ ok: true }
{ ok: false, reason: "auth", status, body } — 401/403, fail hard
{ ok: false, reason: "unreachable", error } — network, warn-only
{ ok: false, reason: "unexpected", status, body } — non-auth 4xx, warn-only
Behavior:
- LlmService.create()/update() runs the probe after resolveApiKey.
Throws LlmAuthVerificationError on `auth`, logs warn for
unreachable/unexpected, swallows for offline registration.
- Probe is skipped when there's no apiKeyRef (nothing to verify) or
when the caller passes skipAuthCheck=true.
- update() probes only when apiKeyRef OR url changes — pure
description/tier updates don't trigger upstream calls.
- Routes catch LlmAuthVerificationError and return 422 with
`{ error, status }`. The CLI surfaces the message verbatim via
ApiError.
Opt-out:
- CLI: `mcpctl create llm ... --skip-auth-check` for offline
registration before the upstream is reachable.
- HTTP: side-channel body field `_skipAuthCheck: true` (stripped
before validation, never persisted on the row).
Side fix in same commit (caught while testing): src/cli/src/index.ts
read `program.opts()` BEFORE `program.parse()`, so `--direct` was a
no-op for ApiClient — every command went to mcplocal regardless. Some
commands accidentally still worked because mcplocal forwards plain
`/api/v1/*` to mcpd, but flows that need direct SSE streaming (e.g.
`mcpctl chat`) couldn't reach mcpd. Fixed by peeking at process.argv
directly for the two global flags before Commander's parse runs.
Tests:
- llm-adapters.test.ts (+8): OpenAI 200/401/403/404/network, Anthropic
200/401/400 (typo'd model = unexpected, NOT auth — registration
shouldn't block on bad model names that surface at chat time).
- llm-service.test.ts (+6): create-throws-on-auth-fail (no row
written), warn-only on unreachable/unexpected, skipAuthCheck
bypass, no-key skip, update-only-probes-on-auth-affecting-change.
mcpd 775/775, mcplocal 715/715, cli 430/430.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -264,6 +264,7 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
|
||||
.option('--api-key-ref <ref>', 'API key reference in SECRET/KEY form (e.g. anthropic-key/token)')
|
||||
.option('--extra <entry>', 'Extra config key=value (repeat)', collect, [])
|
||||
.option('--force', 'Update if already exists')
|
||||
.option('--skip-auth-check', 'Skip the upstream auth probe (for offline registration before infra exists)')
|
||||
.action(async (name: string, opts) => {
|
||||
const body: Record<string, unknown> = {
|
||||
name,
|
||||
@@ -290,6 +291,11 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
|
||||
}
|
||||
body.extraConfig = extra;
|
||||
}
|
||||
// _skipAuthCheck is a side-channel field consumed (and stripped) by the
|
||||
// mcpd route — it never makes it into the Llm row. mcpd defaults to
|
||||
// running an auth probe at create/update time so wrong tokens fail fast
|
||||
// with a 422 instead of silently 502'ing on first chat.
|
||||
if (opts.skipAuthCheck === true) body._skipAuthCheck = true;
|
||||
|
||||
try {
|
||||
const row = await client.post<{ id: string; name: string }>('/api/v1/llms', body);
|
||||
|
||||
@@ -40,14 +40,31 @@ export function createProgram(): Command {
|
||||
program.addCommand(createLoginCommand());
|
||||
program.addCommand(createLogoutCommand());
|
||||
|
||||
// Resolve target URL: --direct goes to mcpd, default goes to mcplocal
|
||||
// Resolve target URL: --direct goes to mcpd, default goes to mcplocal.
|
||||
//
|
||||
// Commander's `program.opts()` returns the default values until
|
||||
// `program.parse(argv)` runs — but commands (and ApiClient) need the
|
||||
// resolved baseUrl at construction time. The chicken-and-egg meant
|
||||
// `--direct` was previously a no-op for ApiClient: every command went to
|
||||
// mcplocal regardless. Some commands accidentally still worked because
|
||||
// mcplocal forwards plain `/api/v1/*` to mcpd, but flows that need direct
|
||||
// SSE streaming (e.g. `mcpctl chat`) went to mcplocal:3200, which doesn't
|
||||
// route them.
|
||||
//
|
||||
// Fix: peek at process.argv directly for the two global flags we need
|
||||
// before Commander's full parse runs.
|
||||
const config = loadConfig();
|
||||
const creds = loadCredentials();
|
||||
const opts = program.opts();
|
||||
const argv = process.argv;
|
||||
const directFlag = argv.includes('--direct');
|
||||
const daemonUrlIdx = argv.indexOf('--daemon-url');
|
||||
const daemonUrlVal = daemonUrlIdx > -1 && daemonUrlIdx + 1 < argv.length
|
||||
? argv[daemonUrlIdx + 1]
|
||||
: undefined;
|
||||
let baseUrl: string;
|
||||
if (opts.daemonUrl) {
|
||||
baseUrl = opts.daemonUrl as string;
|
||||
} else if (opts.direct) {
|
||||
if (daemonUrlVal !== undefined) {
|
||||
baseUrl = daemonUrlVal;
|
||||
} else if (directFlag) {
|
||||
baseUrl = config.mcpdUrl;
|
||||
} else {
|
||||
baseUrl = config.mcplocalUrl;
|
||||
|
||||
Reference in New Issue
Block a user