feat(mcplocal): RBAC-bounded vllm-managed failover #54

Merged
michal merged 1 commit from feat/llm-failover into main 2026-04-19 21:39:48 +00:00

Summary

Phase 3 of the Llm plan. Adds the infrastructure for clients to fall back to a local `vllm-managed` provider when mcpd's inference proxy is unreachable — but only if RBAC still grants them view permission on the central Llm.

Based on `feat/llm-infer` (PR #53) — needs the proxy to fall back from. The actual agent / mcplocal HTTP-mode wire-up will land when those clients pivot to mcpd's proxy (the agent source isn't currently in this branch). What this PR ships is the reusable infra they'll consume.

  • `LlmProviderFileEntry` gains optional `failoverFor: <central llm name>`.
  • `ProviderRegistry` tracks a failover map (`registerFailover` / `getFailoverFor` / `listFailovers`). Unregister cleans up dangling entries.
  • New `FailoverRouter` orchestrates: try primary → on failure HEAD-probe `mcpd /api/v1/llms/:name` with the caller's bearer → `200` invoke local + return `failover: true`; `403` / `401` / network error → re-throw primary (fail-closed).
  • Server: `GET /api/v1/llms/:idOrName` accepts both CUID and human name. HEAD auto-derives from GET in Fastify — same RBAC hook, body discarded, perfect for the probe.
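The decision flow in the `FailoverRouter` bullet can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `inferWithFailover`, `HeadFetch`, `ProbeVerdict`, and `verdictForStatus` are hypothetical names, and the `checkAuth` signature is assumed. Only the try-primary, HEAD-probe, fail-closed behaviour is taken from the description above.

```typescript
// Hypothetical shapes for illustration; the PR does not show these signatures.
type InferFn = (prompt: string) => Promise<string>;
type ProbeVerdict = "allowed" | "denied";

// Minimal fetch-like type so the sketch needs no DOM or Node typings.
type HeadFetch = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ status: number }>;

interface InferResult {
  text: string;
  failover: boolean; // true only when the local provider answered
}

// Only 200 permits failover; 401, 403, and anything else deny it.
function verdictForStatus(status: number): ProbeVerdict {
  return status === 200 ? "allowed" : "denied";
}

// HEAD-probe mcpd with the caller's bearer token.
// A network error is treated the same as a denial (fail-closed).
async function checkAuth(
  fetchImpl: HeadFetch,
  mcpdUrl: string,
  llmName: string,
  bearer: string,
): Promise<ProbeVerdict> {
  try {
    const res = await fetchImpl(
      `${mcpdUrl}/api/v1/llms/${encodeURIComponent(llmName)}`,
      { method: "HEAD", headers: { Authorization: `Bearer ${bearer}` } },
    );
    return verdictForStatus(res.status);
  } catch {
    return "denied";
  }
}

async function inferWithFailover(opts: {
  primary: InferFn;
  local: InferFn | undefined; // undefined when no failover is registered
  probe: () => Promise<ProbeVerdict>;
  prompt: string;
}): Promise<InferResult> {
  try {
    return { text: await opts.primary(opts.prompt), failover: false };
  } catch (primaryError) {
    // No local substitute, or RBAC could not be confirmed: surface the
    // original primary error rather than a probe error.
    if (!opts.local) throw primaryError;
    if ((await opts.probe()) !== "allowed") throw primaryError;
    return { text: await opts.local(opts.prompt), failover: true };
  }
}
```

In this sketch a caller would pass the platform `fetch` as `fetchImpl` and build `probe` as `() => checkAuth(fetch, mcpdUrl, llmName, bearer)`. Re-throwing the primary error on every non-200 outcome is what keeps a revoked Llm revoked even when the proxy is down.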

Test plan

  • 11 unit tests for the registry map + FailoverRouter decisions (success path, fallback path, no-failover, 403, network unreachable, status mapping)
  • 4 new route tests (name-based GET, HEAD existing, HEAD missing)
  • Full workspace suite: 1844/1844 passing (+14 from Phase 2's 1830)
  • TypeScript clean across mcpd + mcplocal
  • End-to-end: hard to test without an agent in this repo — defer until the agent source returns or mcplocal HTTP-mode pivots

🤖 Generated with Claude Code

michal changed target branch from feat/llm-infer to main 2026-04-19 21:39:44 +00:00
michal added 1 commit 2026-04-19 21:39:44 +00:00
Why: when mcpd's inference proxy is unreachable, clients with a local
vllm-managed provider should be able to substitute — but only if they still
have view permission on the centralized Llm. Otherwise revoking an Llm
wouldn't actually stop a misbehaving client.

Infrastructure (the agent + mcplocal HTTP-mode wire-up will land separately
when those clients pivot to mcpd's proxy):

- LlmProviderFileEntry gains optional `failoverFor: <central llm name>`. The
  entry is otherwise the same local provider it always was; the new field
  just declares which central Llm it can substitute for.
- ProviderRegistry tracks a failover map (registerFailover / getFailoverFor /
  listFailovers). Unregister removes any failover entry pointing at the
  removed provider so we don't end up with dangling references.
- New FailoverRouter wraps a primary inference call. On primary failure: if
  a local provider is registered for the Llm, HEAD-probe `mcpd /api/v1/llms/
  :name` with the caller's bearer to verify view permission, then either
  invoke the local provider (allowed) or re-throw the primary error (403,
  401, network unreachable, anything else — all fail-closed).
- Server: GET /api/v1/llms/:idOrName accepts both CUID and human name. Lets
  FailoverRouter probe by name without a separate id-resolution call. HEAD
  derives automatically from GET in Fastify, which runs the same RBAC hook
  and drops the body — exactly what the probe needs.
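The failover map from the ProviderRegistry bullet above might look roughly like the sketch below. Only the method names `registerFailover`, `getFailoverFor`, and `listFailovers` come from this commit message; the `Provider` shape, the `register` and `unregister` signatures, and the internal maps are assumptions made for illustration.

```typescript
// Hypothetical provider shape; the real entry carries more fields.
interface Provider {
  name: string;
}

class ProviderRegistry {
  private providers = new Map<string, Provider>();
  // central Llm name -> name of the local provider that may substitute for it
  private failovers = new Map<string, string>();

  // Assumed signature: failoverFor mirrors the optional config field.
  register(provider: Provider, failoverFor?: string): void {
    this.providers.set(provider.name, provider);
    if (failoverFor) this.registerFailover(failoverFor, provider.name);
  }

  registerFailover(centralLlm: string, providerName: string): void {
    if (!this.providers.has(providerName)) {
      throw new Error(`unknown provider: ${providerName}`);
    }
    this.failovers.set(centralLlm, providerName);
  }

  getFailoverFor(centralLlm: string): Provider | undefined {
    const name = this.failovers.get(centralLlm);
    return name === undefined ? undefined : this.providers.get(name);
  }

  listFailovers(): Array<{ centralLlm: string; provider: string }> {
    const out: Array<{ centralLlm: string; provider: string }> = [];
    this.failovers.forEach((provider, centralLlm) => {
      out.push({ centralLlm, provider });
    });
    return out;
  }

  // Removing a provider also drops every failover entry that points at it,
  // so the map never references a provider that no longer exists.
  unregister(providerName: string): void {
    this.providers.delete(providerName);
    this.failovers.forEach((provider, centralLlm) => {
      if (provider === providerName) this.failovers.delete(centralLlm);
    });
  }
}
```

Keying the map by central Llm name means a lookup by the name the caller asked for needs no id resolution, which matches the probe-by-name route described above.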

Tests: 11 failover unit tests (registry map, decision flow, fail-closed for
forbidden + unreachable, checkAuth status mapping) + 4 new route tests
(name lookup, HEAD existing/missing). Full suite 1844/1844 (+14 from Phase
2's 1830). TypeScript clean across mcpd + mcplocal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
michal merged commit 2155910f1c into main 2026-04-19 21:39:48 +00:00

Reference: michal/mcpctl#54