feat(openbao): wizard-provisioning + daily token rotation

One-command setup replaces the 6-step manual flow — `mcpctl create secretbackend bao --type openbao --wizard` takes the OpenBao admin token once, provisions a narrow policy + token role, mints the first periodic token, stores it on mcpd, verifies end-to-end, and prints the migration command. The admin token is NEVER persisted. The stored credential auto-rotates daily: mcpd mints a successor via the token role (self-rotation capability is part of the policy it was issued with), verifies the successor, writes it over the backing Secret, then revokes the predecessor by accessor. TTL 720h means a week of rotation failures still leaves 20+ days of runway. Shared: - New `@mcpctl/shared/vault` — pure HTTP wrappers (verifyHealth, ensureKvV2, writePolicy, ensureTokenRole, mintRoleToken, revokeAccessor, lookupSelf, testWriteReadDelete) and policy HCL builder. mcpd: - `tokenMeta Json @default("{}")` on SecretBackend. Self-healing schema migration — empty default lets `prisma db push` add the column cleanly. - SecretBackendRotator.rotateOne: mint → verify → persist → revoke-old → update tokenMeta. Failures surface via `lastRotationError` on the row; the old token keeps working. - SecretBackendRotatorLoop: on startup rotates overdue backends, schedules per-backend timers with ±10min jitter. Stops cleanly on shutdown. - New `POST /api/v1/secretbackends/:id/rotate` (operation `rotate-secretbackend` — added to bootstrap-admin's auto-migrated ops alongside migrate-secrets, which was previously missing too). CLI: - `--wizard` on `create secretbackend` delegates to the interactive flow. All prompts can be pre-answered via flags (--url, --admin-token, --mount, --path-prefix, --policy-name, --token-role, --no-promote-default) for CI. - `mcpctl rotate secretbackend <name>` — convenience verb; hits the new rotate endpoint. - `describe secretbackend` renders a Token health section (healthy / STALE / WARNING / ERROR) with generated/renewal/expiry timestamps and last rotation error. Only shown when tokenMeta.rotatable is true — the existing k8s-auth + static-token backends don't surface it. Tests: 15 vault-client unit tests (shared), 8 rotator unit tests (mcpd), 3 wizard flow tests (cli, including a regression test that the admin token never appears in stdout). Full suite 1885/1885 (+32). Completions regenerated for the new flags. Out of scope (explicit): kubernetes-auth wizard, Vault Enterprise namespaces in the wizard path, rotation for non-wizard static-token backends. See plan file for details. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(openbao): kubernetes ServiceAccount auth — no static token in DB
2026-04-20 17:20:37 +01:00 · 2026-04-19 23:23:05 +01:00 · 2026-04-19 22:59:07 +01:00 · 2026-04-19 22:55:39 +01:00 · 2026-04-19 22:45:08 +01:00 · 2026-04-19 21:39:54 +00:00
50 changed files with 4457 additions and 45 deletions
--- a/completions/mcpctl.bash
+++ b/completions/mcpctl.bash
@@ -5,7 +5,7 @@ _mcpctl() {
  local cur prev words cword
  _init_completion || return

-  local commands="status login logout config get describe delete logs create edit apply patch backup approve console cache test migrate"
+  local commands="status login logout config get describe delete logs create edit apply patch backup approve console cache test migrate rotate"
  local project_commands="get describe delete logs create edit attach-server detach-server"
  local global_opts="-v --version --daemon-url --direct -p --project -h --help"
  local resources="servers instances secrets secretbackends llms templates projects users groups rbac prompts promptrequests serverattachments proxymodels all"
@@ -188,10 +188,10 @@ _mcpctl() {
            COMPREPLY=($(compgen -W "--type --model --url --tier --description --api-key-ref --extra --force -h --help" -- "$cur"))
            ;;
          secretbackend)
-            COMPREPLY=($(compgen -W "--type --description --default --url --namespace --mount --path-prefix --token-secret --config --force -h --help" -- "$cur"))
+            COMPREPLY=($(compgen -W "--type --description --default --url --namespace --mount --path-prefix --auth --token-secret --role --auth-mount --sa-token-path --config --wizard --admin-token --policy-name --token-role --no-promote-default --force -h --help" -- "$cur"))
            ;;
          project)
-            COMPREPLY=($(compgen -W "-d --description --proxy-model --prompt --gated --no-gated --server --force -h --help" -- "$cur"))
+            COMPREPLY=($(compgen -W "-d --description --proxy-model --prompt --llm --llm-model --gated --no-gated --server --force -h --help" -- "$cur"))
            ;;
          user)
            COMPREPLY=($(compgen -W "--password --name --force -h --help" -- "$cur"))
@@ -350,6 +350,21 @@ _mcpctl() {
        esac
      fi
      return ;;
+    rotate)
+      local rotate_sub=$(_mcpctl_get_subcmd $subcmd_pos)
+      if [[ -z "$rotate_sub" ]]; then
+        COMPREPLY=($(compgen -W "secretbackend help" -- "$cur"))
+      else
+        case "$rotate_sub" in
+          secretbackend)
+            COMPREPLY=($(compgen -W "-h --help" -- "$cur"))
+            ;;
+          *)
+            COMPREPLY=($(compgen -W "-h --help" -- "$cur"))
+            ;;
+        esac
+      fi
+      return ;;
    help)
      COMPREPLY=($(compgen -W "$commands" -- "$cur"))
      return ;;
--- a/completions/mcpctl.fish
+++ b/completions/mcpctl.fish
@@ -4,7 +4,7 @@
 # Erase any stale completions from previous versions
 complete -c mcpctl -e

-set -l commands status login logout config get describe delete logs create edit apply patch backup approve console cache test migrate
+set -l commands status login logout config get describe delete logs create edit apply patch backup approve console cache test migrate rotate
 set -l project_commands get describe delete logs create edit attach-server detach-server

 # Disable file completions by default
@@ -235,6 +235,7 @@ complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a cache -d 'Manage ProxyModel pipeline cache'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a test -d 'Utilities for testing MCP endpoints and config'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a migrate -d 'Move resources between backends (currently: secrets between SecretBackends)'
+complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a rotate -d 'Force rotation of a credential-rotating resource (currently: secretbackend)'

 # Project-scoped commands (with --project)
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a get -d 'List resources (servers, projects, instances, all)'
@@ -336,14 +337,25 @@ complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l url -d 'o
 complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l namespace -d 'openbao: X-Vault-Namespace header value' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l mount -d 'openbao: KV v2 mount point (default: secret)' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l path-prefix -d 'openbao: path prefix under mount (default: mcpctl)' -x
-complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l token-secret -d 'openbao: token secret reference in SECRET/KEY form (e.g. bao-creds/token)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l auth -d 'openbao: auth method — \'token\' (default) or \'kubernetes\'' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l token-secret -d 'openbao token auth: token secret reference in SECRET/KEY form (e.g. bao-creds/token)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l role -d 'openbao kubernetes auth: vault role to login as (e.g. \'mcpctl\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l auth-mount -d 'openbao kubernetes auth: vault auth method mount path (default: \'kubernetes\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l sa-token-path -d 'openbao kubernetes auth: filesystem path to projected SA token (default: \'/var/run/secrets/kubernetes.io/serviceaccount/token\')' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l config -d 'Extra config as key=value (repeat for multiple)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l wizard -d 'Interactive wizard (openbao only): provision policy + token role, mint token, store on mcpd, suggest migration'
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l admin-token -d 'openbao wizard: OpenBao admin/root token (prompted if omitted). Used only for provisioning; NEVER persisted.' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l policy-name -d 'openbao wizard: name for the policy created on OpenBao (default: \'app-mcpd\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l token-role -d 'openbao wizard: name for the token role created on OpenBao (default: \'app-mcpd-role\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l no-promote-default -d 'openbao wizard: do not promote this backend to default after creation'
 complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l force -d 'Update if already exists'

 # create project options
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -s d -l description -d 'Project description' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l proxy-model -d 'Plugin name (default, content-pipeline, gate, none)' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l prompt -d 'Project-level prompt / instructions for the LLM' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l llm -d 'Name of an Llm resource (see \'mcpctl get llms\'), or \'none\' to disable' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l llm-model -d 'Override the model string for this project (defaults to the Llm\'s own model)' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l gated -d '[deprecated: use --proxy-model default]'
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l no-gated -d '[deprecated: use --proxy-model content-pipeline]'
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l server -d 'Server name (repeat for multiple)' -x
@@ -429,6 +441,10 @@ complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l names -d 'Comm
 complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l keep-source -d 'Leave the source copy intact (default: delete from source after write+commit)'
 complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l dry-run -d 'Show which secrets would be migrated without touching them'

+# rotate subcommands
+set -l rotate_cmds secretbackend
+complete -c mcpctl -n "__fish_seen_subcommand_from rotate; and not __fish_seen_subcommand_from $rotate_cmds" -a secretbackend -d 'Rotate the vault token on an OpenBao SecretBackend (wizard-provisioned)'
+
 # status options
 complete -c mcpctl -n "__fish_seen_subcommand_from status" -s o -l output -d 'output format (table, json, yaml)' -x

--- a/deploy/entrypoint.sh
+++ b/deploy/entrypoint.sh
@@ -1,8 +1,23 @@
 #!/bin/sh
 set -e

+# Self-healing schema push:
+#   1. Try once — for fresh installs and already-migrated clusters this is all
+#      that's needed.
+#   2. On failure (typically a Phase 0 upgrade where the new SecretBackend FK
+#      can't attach because pre-existing Secret rows reference nothing), run
+#      the pre-migrate bootstrap to seed a default SecretBackend + backfill
+#      Secret.backendId, then retry.
+#   3. If the retry still fails, let the error surface so the pod crashes
+#      visibly rather than starting in a half-migrated state.
 echo "mcpd: pushing database schema..."
-pnpm -F @mcpctl/db exec prisma db push --schema=prisma/schema.prisma --accept-data-loss 2>&1
+if pnpm -F @mcpctl/db exec prisma db push --schema=prisma/schema.prisma --accept-data-loss 2>&1; then
+  :
+else
+  echo "mcpd: schema push failed — running pre-migrate bootstrap + retrying..."
+  node src/db/dist/scripts/pre-migrate-bootstrap.js || true
+  pnpm -F @mcpctl/db exec prisma db push --schema=prisma/schema.prisma --accept-data-loss 2>&1
+fi

 echo "mcpd: seeding templates..."
 TEMPLATES_DIR=templates node src/mcpd/dist/seed-runner.js
--- a/src/cli/src/commands/apply.ts
+++ b/src/cli/src/commands/apply.ts
@@ -149,7 +149,12 @@ const ProjectSpecSchema = z.object({
  prompt: z.string().max(10000).default(''),
  proxyModel: z.string().optional(),
  gated: z.boolean().optional(),
+  // Name of an `Llm` resource (see `mcpctl get llms`), or the literal 'none'
+  // to disable LLM features for this project. Unknown names fall back to the
+  // consumer's registry default — `mcpctl describe project` will flag that.
  llmProvider: z.string().optional(),
+  // Override the model string for this project; defaults to the Llm's own
+  // model when unset.
  llmModel: z.string().optional(),
  servers: z.array(z.string()).default([]),
 });
--- a/src/cli/src/commands/config-setup.ts
+++ b/src/cli/src/commands/config-setup.ts
@@ -153,7 +153,7 @@ async function defaultConfirm(message: string, defaultValue?: boolean): Promise<
  return answer as boolean;
 }

-const defaultPrompt: ConfigSetupPrompt = {
+export const defaultPrompt: ConfigSetupPrompt = {
  select: defaultSelect,
  input: defaultInput,
  password: defaultPassword,
--- a/src/cli/src/commands/create-secretbackend-wizard.ts
+++ b/src/cli/src/commands/create-secretbackend-wizard.ts
@@ -0,0 +1,231 @@
+/**
+ * Interactive wizard that provisions an OpenBao backend end-to-end:
+ *
+ *   1. Asks the user for the OpenBao URL + admin/root token.
+ *   2. Verifies connectivity (`/sys/health`).
+ *   3. Ensures KV v2 is mounted at `<mount>/`.
+ *   4. Writes policy `app-mcpd` scoped to `<mount>/{data,metadata}/<prefix>/*`
+ *      plus the self-rotation paths.
+ *   5. Ensures a token role `app-mcpd-role` with `period=720h, renewable=true`.
+ *   6. Mints the first periodic token via that role.
+ *   7. Stores the token as a plaintext `Secret` on mcpd.
+ *   8. Creates the `SecretBackend` row with rotation config pointing at the role.
+ *   9. Kicks an initial rotate via `POST /api/v1/secretbackends/:id/rotate`
+ *      to seed `tokenMeta` + prove the self-rotation policy works.
+ *  10. (Optional) promotes the new backend to default.
+ *  11. Prints the migration command for the user to run.
+ *
+ * Admin token is used only for steps 2–6 and is never persisted.
+ *
+ * All prompts go through `ConfigSetupPrompt` (from `config-setup.ts`) so the
+ * wizard is testable without real stdin.
+ */
+import type { ApiClient } from '../api-client.js';
+import {
+  verifyHealth,
+  ensureKvV2,
+  writePolicy,
+  ensureTokenRole,
+  mintRoleToken,
+  testWriteReadDelete,
+  buildAppMcpdPolicyHcl,
+  type VaultDeps,
+} from '@mcpctl/shared';
+import { type ConfigSetupPrompt, defaultPrompt } from './config-setup.js';
+
+export interface WizardDeps {
+  client: ApiClient;
+  log: (...args: unknown[]) => void;
+  prompt?: ConfigSetupPrompt;
+  /** Overridable for tests. Forwarded to all vault HTTP calls. */
+  fetch?: typeof globalThis.fetch;
+}
+
+export interface WizardInput {
+  /** Backend name. Required — supplied via `mcpctl create secretbackend <name> --wizard`. */
+  name: string;
+  /** Pre-filled via flags for CI; falls back to prompt. */
+  url?: string | undefined;
+  adminToken?: string | undefined;
+  mount?: string | undefined;
+  pathPrefix?: string | undefined;
+  policyName?: string | undefined;
+  tokenRole?: string | undefined;
+  promoteToDefault?: boolean | undefined;
+  /** If set, skip the test write/read/delete (for dev/debugging only). */
+  skipSmoke?: boolean | undefined;
+}
+
+export async function runSecretBackendOpenbaoWizard(
+  input: WizardInput,
+  deps: WizardDeps,
+): Promise<void> {
+  const prompt = deps.prompt ?? defaultPrompt;
+  const log = deps.log;
+
+  const url = input.url ?? await prompt.input('OpenBao URL', 'https://bao.ad.itaz.eu');
+  const adminToken = input.adminToken ?? await prompt.password('OpenBao admin / root token');
+  if (adminToken === '') throw new Error('admin token is required');
+
+  const vaultDeps: VaultDeps = {};
+  if (deps.fetch !== undefined) vaultDeps.fetch = deps.fetch;
+
+  // 1. Health check.
+  log('  → checking OpenBao health …');
+  const health = await verifyHealth(url, adminToken, vaultDeps);
+  if (!health.initialized || health.sealed) {
+    throw new Error(`OpenBao is not ready (initialized=${String(health.initialized)}, sealed=${String(health.sealed)})`);
+  }
+  log(`    ok (version ${health.version})`);
+
+  const mount = input.mount ?? await prompt.input('KV v2 mount', 'secret');
+  const pathPrefix = input.pathPrefix ?? await prompt.input('Path prefix under mount', 'mcpd');
+  const policyName = input.policyName ?? await prompt.input('Policy name', 'app-mcpd');
+  const tokenRole = input.tokenRole ?? await prompt.input('Token role name', 'app-mcpd-role');
+
+  // 2. Enable KV v2 if needed.
+  log(`  → ensuring KV v2 at ${mount}/ …`);
+  const created = await ensureKvV2(url, adminToken, mount, vaultDeps);
+  log(`    ${created ? 'mounted' : 'already mounted'}`);
+
+  // 3. Write policy.
+  log(`  → writing policy '${policyName}' …`);
+  const hcl = buildAppMcpdPolicyHcl({ mount, pathPrefix, tokenRole });
+  await writePolicy(url, adminToken, policyName, hcl, vaultDeps);
+  log(`    written (scope: ${mount}/{data,metadata}/${pathPrefix}/* + self-rotation paths)`);
+
+  // 4. Ensure token role.
+  log(`  → ensuring token role '${tokenRole}' (period=720h, renewable) …`);
+  await ensureTokenRole(url, adminToken, tokenRole, {
+    allowedPolicies: [policyName],
+    period: 720 * 3600,
+    renewable: true,
+    orphan: false,
+  }, vaultDeps);
+  log('    ok');
+
+  // 5. Mint the first periodic token using the admin token.
+  log('  → minting first periodic token …');
+  const minted = await mintRoleToken(url, adminToken, tokenRole, vaultDeps);
+  if (!minted.renewable) {
+    throw new Error(`minted token is not renewable — the role '${tokenRole}' config is wrong`);
+  }
+  log(`    minted (accessor ${minted.accessor.slice(0, 12)}…)`);
+
+  // 6. Smoke test with the minted token before committing to mcpd.
+  if (input.skipSmoke !== true) {
+    log('  → smoke-testing write/read/delete with the minted token …');
+    await testWriteReadDelete(url, minted.clientToken, mount, `${pathPrefix}/.__mcpctl_wizard_smoke__`, vaultDeps);
+    log('    ok');
+  }
+
+  // 7. Store token on mcpd as a plaintext Secret.
+  const credsSecretName = `${input.name}-creds`;
+  log(`  → creating Secret '${credsSecretName}' on mcpd (plaintext) …`);
+  await createSecret(deps.client, credsSecretName, { token: minted.clientToken });
+
+  // 8. Create SecretBackend row (non-default by default; promote later).
+  log(`  → creating SecretBackend '${input.name}' …`);
+  const backendBody = {
+    name: input.name,
+    type: 'openbao',
+    config: {
+      url,
+      auth: 'token',
+      mount,
+      pathPrefix,
+      tokenSecretRef: { name: credsSecretName, key: 'token' },
+      rotation: {
+        enabled: true,
+        tokenRole,
+        intervalHours: 24,
+      },
+    },
+  };
+  const backend = await deps.client.post<{ id: string; name: string }>('/api/v1/secretbackends', backendBody);
+  log(`    created (id: ${backend.id})`);
+
+  // 9. Kick initial rotation so tokenMeta is populated + self-rotation is proven.
+  //    This uses the FIRST token (just-minted) to mint its successor. The old
+  //    first token is then revoked by accessor.
+  log('  → running initial rotation (seeds tokenMeta) …');
+  try {
+    await deps.client.post(`/api/v1/secretbackends/${backend.id}/rotate`, {});
+    log('    rotated — tokenMeta populated');
+  } catch (err) {
+    log(`    warn: initial rotation failed: ${err instanceof Error ? err.message : String(err)}`);
+    log('         backend is still usable; rotation will retry on the 24h loop');
+  }
+
+  // 10. Optional promote.
+  const promote = input.promoteToDefault
+    ?? await prompt.confirm(`Promote '${input.name}' to default backend?`, true);
+  if (promote) {
+    await deps.client.post(`/api/v1/secretbackends/${backend.id}/default`, {});
+    log(`    promoted '${input.name}' to default`);
+  }
+
+  // 11. Migration hint.
+  log('');
+  await printMigrationHint(deps.client, input.name, log);
+
+  log('');
+  log(`Describe the new backend:   mcpctl --direct describe secretbackend ${input.name}`);
+  log(`Force a rotation manually:  mcpctl --direct rotate secretbackend ${input.name}`);
+}
+
+async function createSecret(
+  client: ApiClient,
+  name: string,
+  data: Record<string, string>,
+): Promise<void> {
+  try {
+    await client.post('/api/v1/secrets', { name, data });
+  } catch (err) {
+    // 409 → secret already exists with this name. Update its data instead so
+    // re-running the wizard with the same --name is idempotent.
+    const status = (err as { status?: number }).status;
+    if (status !== 409) throw err;
+    const existing = (await client.get<Array<{ id: string; name: string }>>('/api/v1/secrets'))
+      .find((s) => s.name === name);
+    if (existing === undefined) throw err;
+    await client.put(`/api/v1/secrets/${existing.id}`, { data });
+  }
+}
+
+async function printMigrationHint(
+  client: ApiClient,
+  newBackendName: string,
+  log: (...args: unknown[]) => void,
+): Promise<void> {
+  // Find the current default backend name (likely 'default') so the hint
+  // points at a real source.
+  let defaultName = 'default';
+  try {
+    const rows = await client.get<Array<{ name: string; isDefault: boolean }>>('/api/v1/secretbackends');
+    const d = rows.find((r) => r.isDefault);
+    if (d !== undefined && d.name !== newBackendName) defaultName = d.name;
+  } catch {
+    /* fall through with 'default' guess */
+  }
+
+  // Count candidate secrets.
+  try {
+    const body = await client.post<{ candidates: Array<{ name: string }> }>(
+      '/api/v1/secrets/migrate',
+      { from: defaultName, to: newBackendName, dryRun: true },
+    );
+    const n = body.candidates.length;
+    if (n === 0) {
+      log(`No secrets to migrate — '${defaultName}' is empty.`);
+      return;
+    }
+    log(`You have ${String(n)} secret(s) on '${defaultName}'. To migrate them to '${newBackendName}':`);
+    log('');
+    log(`    mcpctl --direct migrate secrets --from ${defaultName} --to ${newBackendName} --dry-run`);
+    log(`    mcpctl --direct migrate secrets --from ${defaultName} --to ${newBackendName}`);
+  } catch (err) {
+    log(`(could not dry-run migration: ${err instanceof Error ? err.message : String(err)})`);
+    log(`Manual command:  mcpctl --direct migrate secrets --from ${defaultName} --to ${newBackendName}`);
+  }
+}
--- a/src/cli/src/commands/create.ts
+++ b/src/cli/src/commands/create.ts
@@ -319,23 +319,64 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
    .option('--namespace <ns>', 'openbao: X-Vault-Namespace header value')
    .option('--mount <mount>', 'openbao: KV v2 mount point (default: secret)')
    .option('--path-prefix <prefix>', 'openbao: path prefix under mount (default: mcpctl)')
-    .option('--token-secret <ref>', 'openbao: token secret reference in SECRET/KEY form (e.g. bao-creds/token)')
+    .option('--auth <method>', "openbao: auth method — 'token' (default) or 'kubernetes'")
+    .option('--token-secret <ref>', 'openbao token auth: token secret reference in SECRET/KEY form (e.g. bao-creds/token)')
+    .option('--role <name>', "openbao kubernetes auth: vault role to login as (e.g. 'mcpctl')")
+    .option('--auth-mount <path>', "openbao kubernetes auth: vault auth method mount path (default: 'kubernetes')")
+    .option('--sa-token-path <path>', "openbao kubernetes auth: filesystem path to projected SA token (default: '/var/run/secrets/kubernetes.io/serviceaccount/token')")
    .option('--config <entry>', 'Extra config as key=value (repeat for multiple)', collect, [])
+    .option('--wizard', 'Interactive wizard (openbao only): provision policy + token role, mint token, store on mcpd, suggest migration')
+    .option('--admin-token <token>', "openbao wizard: OpenBao admin/root token (prompted if omitted). Used only for provisioning; NEVER persisted.")
+    .option('--policy-name <name>', "openbao wizard: name for the policy created on OpenBao (default: 'app-mcpd')")
+    .option('--token-role <name>', "openbao wizard: name for the token role created on OpenBao (default: 'app-mcpd-role')")
+    .option('--no-promote-default', 'openbao wizard: do not promote this backend to default after creation')
    .option('--force', 'Update if already exists')
    .action(async (name: string, opts) => {
      const type = opts.type as string;
+      // Wizard path — delegates to create-secretbackend-wizard.ts.
+      if (opts.wizard === true) {
+        if (type !== 'openbao') {
+          throw new Error(`--wizard is only supported for --type openbao (got '${type}')`);
+        }
+        const { runSecretBackendOpenbaoWizard } = await import('./create-secretbackend-wizard.js');
+        const wizardInput: Parameters<typeof runSecretBackendOpenbaoWizard>[0] = { name };
+        if (opts.url !== undefined) wizardInput.url = opts.url as string;
+        if (opts.adminToken !== undefined) wizardInput.adminToken = opts.adminToken as string;
+        if (opts.mount !== undefined) wizardInput.mount = opts.mount as string;
+        if (opts.pathPrefix !== undefined) wizardInput.pathPrefix = opts.pathPrefix as string;
+        if (opts.policyName !== undefined) wizardInput.policyName = opts.policyName as string;
+        if (opts.tokenRole !== undefined) wizardInput.tokenRole = opts.tokenRole as string;
+        // `--no-promote-default` → opts.promoteDefault === false (commander negated flag)
+        if (opts.promoteDefault !== undefined) wizardInput.promoteToDefault = opts.promoteDefault as boolean;
+        await runSecretBackendOpenbaoWizard(wizardInput, { client, log });
+        return;
+      }
      const config: Record<string, unknown> = {};

      if (type === 'openbao') {
        if (!opts.url) throw new Error('--url is required for openbao backend');
-        if (!opts.tokenSecret) throw new Error('--token-secret is required for openbao backend (format: SECRET/KEY)');
-        const slashIdx = (opts.tokenSecret as string).indexOf('/');
-        if (slashIdx < 1) throw new Error(`Invalid --token-secret '${opts.tokenSecret as string}'. Expected SECRET_NAME/KEY_NAME`);
+        const auth = (opts.auth as string | undefined) ?? 'token';
+        if (auth !== 'token' && auth !== 'kubernetes') {
+          throw new Error(`--auth must be 'token' or 'kubernetes' (got '${auth}')`);
+        }
        config.url = opts.url;
-        config.tokenSecretRef = {
-          name: (opts.tokenSecret as string).slice(0, slashIdx),
-          key: (opts.tokenSecret as string).slice(slashIdx + 1),
-        };
+        config.auth = auth;
+
+        if (auth === 'token') {
+          if (!opts.tokenSecret) throw new Error('--token-secret is required for openbao token auth (format: SECRET/KEY)');
+          const slashIdx = (opts.tokenSecret as string).indexOf('/');
+          if (slashIdx < 1) throw new Error(`Invalid --token-secret '${opts.tokenSecret as string}'. Expected SECRET_NAME/KEY_NAME`);
+          config.tokenSecretRef = {
+            name: (opts.tokenSecret as string).slice(0, slashIdx),
+            key: (opts.tokenSecret as string).slice(slashIdx + 1),
+          };
+        } else {
+          if (!opts.role) throw new Error("--role is required for openbao kubernetes auth (the vault role bound to this pod's ServiceAccount)");
+          config.role = opts.role;
+          if (opts.authMount) config.authMount = opts.authMount;
+          if (opts.saTokenPath) config.serviceAccountTokenPath = opts.saTokenPath;
+        }
+
        if (opts.namespace) config.namespace = opts.namespace;
        if (opts.mount) config.mount = opts.mount;
        if (opts.pathPrefix) config.pathPrefix = opts.pathPrefix;
@@ -378,6 +419,8 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
    .option('-d, --description <text>', 'Project description', '')
    .option('--proxy-model <name>', 'Plugin name (default, content-pipeline, gate, none)')
    .option('--prompt <text>', 'Project-level prompt / instructions for the LLM')
+    .option('--llm <name>', "Name of an Llm resource (see 'mcpctl get llms'), or 'none' to disable")
+    .option('--llm-model <model>', 'Override the model string for this project (defaults to the Llm\'s own model)')
    .option('--gated', '[deprecated: use --proxy-model default]')
    .option('--no-gated', '[deprecated: use --proxy-model content-pipeline]')
    .option('--server <name>', 'Server name (repeat for multiple)', collect, [])
@@ -397,6 +440,8 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
      // Pass gated for backward compat with older mcpd
      if (opts.gated !== undefined) body.gated = opts.gated as boolean;
      if (opts.server.length > 0) body.servers = opts.server;
+      if (opts.llm) body.llmProvider = opts.llm;
+      if (opts.llmModel) body.llmModel = opts.llmModel;

      try {
        const project = await client.post<{ id: string; name: string }>('/api/v1/projects', body);
--- a/src/cli/src/commands/describe.ts
+++ b/src/cli/src/commands/describe.ts
@@ -137,6 +137,7 @@ function formatInstanceDetail(instance: Record<string, unknown>, inspect?: Recor
 function formatProjectDetail(
  project: Record<string, unknown>,
  prompts: Array<{ name: string; priority: number; linkTarget: string | null }> = [],
+  knownLlmNames?: Set<string>,
 ): string {
  const lines: string[] = [];
  lines.push(`=== Project: ${project.name} ===`);
@@ -151,8 +152,21 @@ function formatProjectDetail(
  lines.push('');
  lines.push('Plugin Config:');
  lines.push(`  ${pad('Plugin:', 18)}${proxyModel}`);
-  if (llmProvider) lines.push(`  ${pad('LLM Provider:', 18)}${llmProvider}`);
-  if (llmModel) lines.push(`  ${pad('LLM Model:', 18)}${llmModel}`);
+  if (llmProvider) {
+    // As of Phase 4, llmProvider names a centralized Llm resource (see
+    // `mcpctl get llms`). A value like "none" disables LLM for the project;
+    // anything else that doesn't match a registered Llm falls back to the
+    // registry default on consumers — flag it so operators notice.
+    const resolvable = knownLlmNames === undefined
+      || llmProvider === 'none'
+      || knownLlmNames.has(llmProvider);
+    if (resolvable) {
+      lines.push(`  ${pad('LLM:', 18)}${llmProvider}`);
+    } else {
+      lines.push(`  ${pad('LLM:', 18)}${llmProvider}  [warning: no Llm registered with this name — will fall back to registry default]`);
+    }
+  }
+  if (llmModel) lines.push(`  ${pad('LLM Model:', 18)}${llmModel} (override)`);

  // Servers section
  const servers = project.servers as Array<{ server: { name: string } }> | undefined;
@@ -283,6 +297,12 @@ function formatSecretBackendDetail(backend: Record<string, unknown>): string {
    }
  }

+  const tokenMeta = (backend.tokenMeta ?? {}) as Record<string, unknown>;
+  if (tokenMeta.rotatable === true) {
+    lines.push('');
+    lines.push(...formatTokenHealth(tokenMeta));
+  }
+
  lines.push('');
  lines.push('Metadata:');
  lines.push(`  ${pad('ID:', 12)}${backend.id}`);
@@ -292,6 +312,66 @@ function formatSecretBackendDetail(backend: Record<string, unknown>): string {
  return lines.join('\n');
 }

+/**
+ * Render the Token health section for a wizard-provisioned openbao backend.
+ * Returns an array of lines (caller pushes them). Stale = no successful
+ * rotation in >26h (2h grace over the nominal 24h cadence).
+ */
+function formatTokenHealth(meta: Record<string, unknown>): string[] {
+  const lines: string[] = [];
+  const generatedAt = parseIso(meta.generatedAt);
+  const nextRenewalAt = parseIso(meta.nextRenewalAt);
+  const validUntil = parseIso(meta.validUntil);
+  const lastRotationAt = parseIso(meta.lastRotationAt);
+  const lastError = meta.lastRotationError as string | null | undefined;
+  const now = Date.now();
+
+  const STALE_GRACE_MS = 26 * 3600 * 1000;
+  const staleByAge = lastRotationAt !== null && (now - lastRotationAt.getTime()) > STALE_GRACE_MS;
+  const hasError = typeof lastError === 'string' && lastError !== '';
+
+  let status: string;
+  if (hasError && staleByAge) status = 'ERROR (stale)';
+  else if (staleByAge) status = 'STALE — no successful rotation in the last cycle';
+  else if (hasError) status = 'WARNING — last rotation hit an error but token is still fresh';
+  else status = 'healthy';
+
+  lines.push(`Token health:   ${status}`);
+  if (generatedAt !== null) {
+    lines.push(`  ${pad('Generated:', 16)}${generatedAt.toISOString()}${describeAge(generatedAt, now)}`);
+  }
+  if (nextRenewalAt !== null) {
+    lines.push(`  ${pad('Next renewal:', 16)}${nextRenewalAt.toISOString()}${describeAge(nextRenewalAt, now)}`);
+  }
+  if (validUntil !== null) {
+    lines.push(`  ${pad('Valid until:', 16)}${validUntil.toISOString()}${describeAge(validUntil, now)}`);
+  }
+  if (lastRotationAt !== null) {
+    lines.push(`  ${pad('Last rotation:', 16)}${lastRotationAt.toISOString()}${describeAge(lastRotationAt, now)}`);
+  }
+  if (hasError) {
+    lines.push(`  ${pad('Last error:', 16)}${lastError}`);
+  }
+  return lines;
+}
+
+function parseIso(v: unknown): Date | null {
+  if (typeof v !== 'string' || v === '') return null;
+  const d = new Date(v);
+  return Number.isNaN(d.getTime()) ? null : d;
+}
+
+function describeAge(target: Date, now: number): string {
+  const diffMs = target.getTime() - now;
+  const abs = Math.abs(diffMs);
+  const hours = Math.round(abs / 3600_000);
+  const days = Math.round(abs / 86_400_000);
+  if (abs < 60_000) return '  (just now)';
+  if (abs < 3600_000) return `  (${String(Math.round(abs / 60_000))} min ${diffMs < 0 ? 'ago' : 'away'})`;
+  if (hours < 48) return `  (${String(hours)}h ${diffMs < 0 ? 'ago' : 'away'})`;
+  return `  (${String(days)}d ${diffMs < 0 ? 'ago' : 'away'})`;
+}
+
 function formatTemplateDetail(template: Record<string, unknown>): string {
  const lines: string[] = [];
  lines.push(`=== Template: ${template.name} ===`);
@@ -887,10 +967,16 @@ export function createDescribeCommand(deps: DescribeCommandDeps): Command {
            deps.log(formatLlmDetail(item));
            break;
          case 'projects': {
-            const projectPrompts = await deps.client
-              .get<Array<{ name: string; priority: number; linkTarget: string | null }>>(`/api/v1/prompts?projectId=${item.id as string}`)
-              .catch(() => []);
-            deps.log(formatProjectDetail(item, projectPrompts));
+            const [projectPrompts, llms] = await Promise.all([
+              deps.client
+                .get<Array<{ name: string; priority: number; linkTarget: string | null }>>(`/api/v1/prompts?projectId=${item.id as string}`)
+                .catch(() => []),
+              deps.client
+                .get<Array<{ name: string }>>('/api/v1/llms')
+                .catch(() => [] as Array<{ name: string }>),
+            ]);
+            const llmNames = new Set(llms.map((l) => l.name));
+            deps.log(formatProjectDetail(item, projectPrompts, llmNames));
            break;
          }
          case 'users': {
--- a/src/cli/src/commands/rotate.ts
+++ b/src/cli/src/commands/rotate.ts
@@ -0,0 +1,50 @@
+/**
+ * `mcpctl rotate secretbackend <name>` — force an immediate token rotation on
+ * a wizard-provisioned OpenBao backend.
+ *
+ * Hits `POST /api/v1/secretbackends/:id/rotate` after resolving name → id.
+ * Gated server-side by the `rotate-secretbackend` operation.
+ */
+import { Command } from 'commander';
+import type { ApiClient } from '../api-client.js';
+import { resolveNameOrId } from './shared.js';
+
+export interface RotateCommandDeps {
+  client: ApiClient;
+  log: (...args: unknown[]) => void;
+}
+
+export function createRotateCommand(deps: RotateCommandDeps): Command {
+  const { client, log } = deps;
+
+  const cmd = new Command('rotate')
+    .description('Force rotation of a credential-rotating resource (currently: secretbackend)');
+
+  cmd.command('secretbackend')
+    .alias('sb')
+    .description('Rotate the vault token on an OpenBao SecretBackend (wizard-provisioned)')
+    .argument('<name>', 'SecretBackend name or id')
+    .action(async (nameOrId: string) => {
+      const id = await resolveNameOrId(client, 'secretbackends', nameOrId);
+      const res = await client.post<{ ok?: boolean; tokenMeta?: Record<string, unknown>; error?: string }>(
+        `/api/v1/secretbackends/${id}/rotate`,
+        {},
+      );
+      if (res.ok !== true) {
+        throw new Error(`rotation failed: ${res.error ?? 'unknown error'}`);
+      }
+      log(`secretbackend '${nameOrId}' rotated.`);
+      const meta = res.tokenMeta ?? {};
+      if (typeof meta.generatedAt === 'string') {
+        log(`  generated:    ${meta.generatedAt}`);
+      }
+      if (typeof meta.nextRenewalAt === 'string') {
+        log(`  next renewal: ${meta.nextRenewalAt}`);
+      }
+      if (typeof meta.validUntil === 'string') {
+        log(`  valid until:  ${meta.validUntil}`);
+      }
+    });
+
+  return cmd;
+}
--- a/src/cli/src/index.ts
+++ b/src/cli/src/index.ts
@@ -19,6 +19,7 @@ import { createPatchCommand } from './commands/patch.js';
 import { createConsoleCommand } from './commands/console/index.js';
 import { createCacheCommand } from './commands/cache.js';
 import { createMigrateCommand } from './commands/migrate.js';
+import { createRotateCommand } from './commands/rotate.js';
 import { ApiClient, ApiError } from './api-client.js';
 import { loadConfig } from './config/index.js';
 import { loadCredentials } from './auth/index.js';
@@ -255,6 +256,11 @@ export function createProgram(): Command {
    log: (...args) => console.log(...args),
  }));

+  program.addCommand(createRotateCommand({
+    client,
+    log: (...args) => console.log(...args),
+  }));
+
  return program;
 }

--- a/src/cli/tests/commands/create-secretbackend-wizard.test.ts
+++ b/src/cli/tests/commands/create-secretbackend-wizard.test.ts
@@ -0,0 +1,150 @@
+import { describe, it, expect, vi } from 'vitest';
+import { runSecretBackendOpenbaoWizard } from '../../src/commands/create-secretbackend-wizard.js';
+import type { ApiClient } from '../../src/api-client.js';
+import type { ConfigSetupPrompt } from '../../src/commands/config-setup.js';
+
+function mockClient(handlers: Record<string, (body?: unknown) => unknown>): ApiClient {
+  const call = (method: 'GET' | 'POST' | 'PUT' | 'DELETE') => async (path: string, body?: unknown) => {
+    const handler = handlers[`${method} ${path}`] ?? handlers[path];
+    if (handler === undefined) throw new Error(`unmocked ${method} ${path}`);
+    return handler(body);
+  };
+  return {
+    get: call('GET'),
+    post: call('POST'),
+    put: call('PUT'),
+    delete: call('DELETE'),
+  } as unknown as ApiClient;
+}
+
+function vaultFetch(responses: Array<{ match: RegExp; status: number; body?: unknown }>): ReturnType<typeof vi.fn> {
+  return vi.fn(async (url: string | URL, init?: RequestInit) => {
+    const key = `${init?.method ?? 'GET'} ${String(url)}`;
+    const match = responses.find((r) => r.match.test(key) || r.match.test(String(url)));
+    if (!match) throw new Error(`unexpected vault fetch: ${key}`);
+    const body = match.body !== undefined ? JSON.stringify(match.body) : '';
+    return new Response(body, { status: match.status });
+  });
+}
+
+function scriptedPrompt(answers: {
+  input?: Record<string, string>;
+  password?: Record<string, string>;
+  confirm?: Record<string, boolean>;
+}): ConfigSetupPrompt {
+  return {
+    async input(message, def) {
+      return answers.input?.[message] ?? def ?? '';
+    },
+    async password(message) {
+      return answers.password?.[message] ?? '';
+    },
+    async confirm(message, def) {
+      return answers.confirm?.[message] ?? def ?? true;
+    },
+    select: vi.fn(),
+  };
+}
+
+describe('runSecretBackendOpenbaoWizard', () => {
+  it('walks through provisioning and creates Secret + SecretBackend + triggers initial rotate', async () => {
+    const logs: string[] = [];
+    const log = (...args: unknown[]) => logs.push(args.map(String).join(' '));
+
+    const vaultResponses = [
+      { match: /GET .*\/v1\/sys\/health$/, status: 200, body: { initialized: true, sealed: false, standby: false, version: '2.5.2' } },
+      { match: /GET .*\/v1\/sys\/mounts$/, status: 200, body: { 'secret/': { type: 'kv', options: { version: '2' } } } },
+      { match: /PUT .*\/v1\/sys\/policies\/acl\/app-mcpd$/, status: 200 },
+      { match: /POST .*\/v1\/auth\/token\/roles\/app-mcpd-role$/, status: 200 },
+      { match: /POST .*\/v1\/auth\/token\/create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'hvs.AAA', accessor: 'acc-first', lease_duration: 2592000, renewable: true } } },
+      // smoke test: write / read / delete
+      { match: /POST .*\/v1\/secret\/data\/mcpd\/\.__mcpctl_wizard_smoke__$/, status: 200 },
+      { match: /GET .*\/v1\/secret\/data\/mcpd\/\.__mcpctl_wizard_smoke__$/, status: 200, body: { data: { data: { marker: 'mcpctl-smoke' } } } },
+      { match: /DELETE .*\/v1\/secret\/metadata\/mcpd\/\.__mcpctl_wizard_smoke__$/, status: 200 },
+    ];
+    const fetchFn = vaultFetch(vaultResponses);
+
+    const created: Record<string, unknown> = {};
+    const client = mockClient({
+      'POST /api/v1/secrets': (body) => { created.secret = body; return { id: 'sec-new', name: (body as { name: string }).name }; },
+      'POST /api/v1/secretbackends': (body) => { created.backend = body; return { id: 'backend-new', name: (body as { name: string }).name }; },
+      'POST /api/v1/secretbackends/backend-new/rotate': () => ({ ok: true, tokenMeta: { generatedAt: 'now' } }),
+      'POST /api/v1/secretbackends/backend-new/default': () => ({ id: 'backend-new' }),
+      'GET /api/v1/secretbackends': () => [{ name: 'default', isDefault: true }],
+      'POST /api/v1/secrets/migrate': () => ({ dryRun: true, candidates: [{ id: 's1', name: 'grafana-creds' }, { id: 's2', name: 'unifi-creds' }] }),
+    });
+
+    const prompt = scriptedPrompt({
+      input: {
+        'OpenBao URL': 'http://bao.example:8200',
+        'KV v2 mount': 'secret',
+        'Path prefix under mount': 'mcpd',
+        'Policy name': 'app-mcpd',
+        'Token role name': 'app-mcpd-role',
+      },
+      password: {
+        'OpenBao admin / root token': 'root.admin.token',
+      },
+      confirm: {
+        "Promote 'bao' to default backend?": true,
+      },
+    });
+
+    await runSecretBackendOpenbaoWizard(
+      { name: 'bao' },
+      { client, log, prompt, fetch: fetchFn as unknown as typeof fetch },
+    );
+
+    // Admin token used for the provisioning calls (first 5 vault requests)
+    const firstCallInit = fetchFn.mock.calls[0]![1] as RequestInit;
+    expect((firstCallInit.headers as Record<string, string>)['X-Vault-Token']).toBe('root.admin.token');
+
+    // Secret was created with the minted token value (hvs.AAA), not the admin token
+    expect(created.secret).toMatchObject({ name: 'bao-creds', data: { token: 'hvs.AAA' } });
+
+    // SecretBackend created with rotation config
+    expect(created.backend).toMatchObject({
+      name: 'bao',
+      type: 'openbao',
+      config: expect.objectContaining({
+        url: 'http://bao.example:8200',
+        auth: 'token',
+        tokenSecretRef: { name: 'bao-creds', key: 'token' },
+        rotation: expect.objectContaining({ enabled: true, tokenRole: 'app-mcpd-role' }),
+      }),
+    });
+
+    // Migration hint mentions both candidate count + the concrete command
+    const fullLog = logs.join('\n');
+    expect(fullLog).toContain("You have 2 secret(s) on 'default'");
+    expect(fullLog).toContain('mcpctl --direct migrate secrets --from default --to bao');
+
+    // Admin token never appears in the log (critical)
+    expect(fullLog).not.toContain('root.admin.token');
+  });
+
+  it('rejects when admin token is empty', async () => {
+    const prompt = scriptedPrompt({
+      input: { 'OpenBao URL': 'http://x' },
+      password: { 'OpenBao admin / root token': '' },
+    });
+    await expect(runSecretBackendOpenbaoWizard(
+      { name: 'bao' },
+      { client: mockClient({}), log: () => {}, prompt, fetch: vi.fn() as unknown as typeof fetch },
+    )).rejects.toThrow(/admin token is required/);
+  });
+
+  it('rejects when vault is sealed', async () => {
+    const fetchFn = vaultFetch([
+      { match: /\/sys\/health$/, status: 200, body: { initialized: true, sealed: true, standby: false, version: '2.5.2' } },
+    ]);
+    const prompt = scriptedPrompt({
+      input: { 'OpenBao URL': 'http://x' },
+      password: { 'OpenBao admin / root token': 't' },
+    });
+    await expect(runSecretBackendOpenbaoWizard(
+      { name: 'bao' },
+      { client: mockClient({}), log: () => {}, prompt, fetch: fetchFn as unknown as typeof fetch },
+    )).rejects.toThrow(/not ready/);
+  });
+});
--- a/src/cli/tests/commands/describe.test.ts
+++ b/src/cli/tests/commands/describe.test.ts
@@ -108,6 +108,77 @@ describe('describe command', () => {
    expect(text).not.toContain('Gated:');
  });

+  it('shows project Llm reference without warning when the name matches a registered Llm', async () => {
+    const deps = makeDeps({
+      id: 'proj-1',
+      name: 'with-llm',
+      description: '',
+      ownerId: 'user-1',
+      proxyModel: 'default',
+      llmProvider: 'claude',
+      llmModel: 'claude-3-opus',
+      createdAt: '2025-01-01',
+    });
+    // /api/v1/llms returns a claude entry → no warning
+    deps.client = {
+      get: vi.fn(async (path: string) => {
+        if (path === '/api/v1/llms') return [{ name: 'claude' }];
+        return [];
+      }),
+    } as unknown as typeof deps.client;
+    const cmd = createDescribeCommand(deps);
+    await cmd.parseAsync(['node', 'test', 'project', 'proj-1']);
+    const text = deps.output.join('\n');
+    expect(text).toContain('LLM:');
+    expect(text).toContain('claude');
+    expect(text).not.toContain('warning:');
+  });
+
+  it('warns on describe project when llmProvider does not resolve to any registered Llm', async () => {
+    const deps = makeDeps({
+      id: 'proj-1',
+      name: 'orphan',
+      description: '',
+      ownerId: 'user-1',
+      proxyModel: 'default',
+      llmProvider: 'claude-ghost',
+      createdAt: '2025-01-01',
+    });
+    deps.client = {
+      get: vi.fn(async (path: string) => {
+        if (path === '/api/v1/llms') return [{ name: 'claude' }, { name: 'gpt-4o' }];
+        return [];
+      }),
+    } as unknown as typeof deps.client;
+    const cmd = createDescribeCommand(deps);
+    await cmd.parseAsync(['node', 'test', 'project', 'proj-1']);
+    const text = deps.output.join('\n');
+    expect(text).toContain('claude-ghost');
+    expect(text).toContain('warning:');
+    expect(text).toContain('fall back to registry default');
+  });
+
+  it('does not warn when llmProvider is "none" (explicit disable)', async () => {
+    const deps = makeDeps({
+      id: 'proj-1',
+      name: 'no-llm',
+      description: '',
+      ownerId: 'user-1',
+      proxyModel: 'default',
+      llmProvider: 'none',
+      createdAt: '2025-01-01',
+    });
+    deps.client = {
+      get: vi.fn(async () => []),
+    } as unknown as typeof deps.client;
+    const cmd = createDescribeCommand(deps);
+    await cmd.parseAsync(['node', 'test', 'project', 'proj-1']);
+    const text = deps.output.join('\n');
+    expect(text).toContain('LLM:');
+    expect(text).toContain('none');
+    expect(text).not.toContain('warning:');
+  });
+
  it('shows project Plugin Config defaulting to "default" when proxyModel is empty', async () => {
    const deps = makeDeps({
      id: 'proj-1',
--- a/src/db/prisma/schema.prisma
+++ b/src/db/prisma/schema.prisma
@@ -125,6 +125,12 @@ model SecretBackend {
  name        String   @unique
  type        String                                  // plaintext | openbao | (future: vault, aws-sm, ...)
  config      Json     @default("{}")                 // type-specific: url, mount, namespace, tokenSecretRef
+  // Runtime metadata for auto-rotating backend credentials (openbao token
+  // auth). Fields: generatedAt, nextRenewalAt, validUntil, lastRotationAt,
+  // lastRotationError, rotatable (true only for wizard-provisioned tokens).
+  // Empty object for backends that don't use rotation (plaintext, kubernetes
+  // auth, or static tokens). Managed entirely by the rotator service.
+  tokenMeta   Json     @default("{}")
  isDefault   Boolean  @default(false)                // exactly one row has isDefault=true
  description String   @default("")
  version     Int      @default(1)
@@ -142,7 +148,12 @@ model SecretBackend {
 model Secret {
  id          String   @id @default(cuid())
  name        String   @unique
-  backendId   String                                  // FK to SecretBackend — dispatches read/write
+  // FK to SecretBackend. Default empty string lets `prisma db push` add the
+  // column to pre-existing rows without a data-loss reset; `bootstrapSecretBackends`
+  // then points any empty-string values at the seeded `default` plaintext backend
+  // on next mcpd startup. New rows written by SecretService always carry a
+  // valid FK immediately.
+  backendId   String   @default("")
  data        Json     @default("{}")                 // populated by plaintext backend only
  externalRef String   @default("")                   // populated by non-plaintext backends (e.g. "mount/path#v3")
  version     Int      @default(1)
--- a/src/db/src/scripts/pre-migrate-bootstrap.ts
+++ b/src/db/src/scripts/pre-migrate-bootstrap.ts
@@ -0,0 +1,105 @@
+/**
+ * Self-healing pre-migration step for the SecretBackend rollout (Phase 0).
+ *
+ * Why this exists: `prisma db push` applies schema changes sequentially. When
+ * a cluster upgrades from a pre-SecretBackend DB:
+ *   1. `Secret.backendId` column is added with `DEFAULT ''`
+ *   2. `SecretBackend` table is created (empty)
+ *   3. The FK `Secret.backendId → SecretBackend.id` is added — and FAILS
+ *      because every Secret row now has `backendId = ''` which references no
+ *      row in SecretBackend.
+ *
+ * This script runs AFTER a failed `prisma db push` attempt:
+ *   - If SecretBackend table doesn't exist yet → noop (fresh install case;
+ *     db push will create everything and the FK succeeds because there are
+ *     no Secret rows to violate it).
+ *   - If SecretBackend exists but is empty → insert a default plaintext row.
+ *   - If any Secret rows have `backendId = ''` → point them at the default.
+ *
+ * Idempotent: safe to run multiple times. No-op on a fully-migrated cluster.
+ * Never throws; logs and exits 0 even on errors so the subsequent
+ * `prisma db push` retry is still attempted.
+ */
+import { PrismaClient, Prisma } from '@prisma/client';
+
+const DEFAULT_ID = 'cdefault000backend00000001';
+
+async function main(): Promise<void> {
+  const prisma = new PrismaClient();
+  try {
+    // Does the SecretBackend table exist yet? We check by querying the
+    // information_schema rather than catching Prisma's error — cleaner, and
+    // lets us distinguish "table missing" from "query succeeded but empty".
+    const tableExists = await prisma.$queryRaw<Array<{ exists: boolean }>>`
+      SELECT EXISTS (
+        SELECT 1 FROM information_schema.tables
+        WHERE table_schema = 'public' AND table_name = 'SecretBackend'
+      ) AS exists
+    `;
+    if (!tableExists[0]?.exists) {
+      console.log('bootstrap: SecretBackend table not present yet — skipping');
+      return;
+    }
+
+    // Ensure at least one row exists, marked isDefault.
+    const existingDefault = await prisma.$queryRaw<Array<{ id: string }>>`
+      SELECT id FROM "SecretBackend" WHERE "isDefault" = true LIMIT 1
+    `;
+    let defaultId: string;
+    if (existingDefault.length === 0) {
+      await prisma.$executeRaw`
+        INSERT INTO "SecretBackend"
+          ("id", "name", "type", "config", "isDefault", "description", "version", "createdAt", "updatedAt")
+        VALUES (
+          ${DEFAULT_ID},
+          'default',
+          'plaintext',
+          '{}'::jsonb,
+          true,
+          'Default in-database plaintext backend. Seeded by pre-migrate-bootstrap.',
+          1,
+          CURRENT_TIMESTAMP,
+          CURRENT_TIMESTAMP
+        )
+        ON CONFLICT (name) DO NOTHING
+      `;
+      // Re-read — if there was an existing row with the same name but no
+      // isDefault flag we need its id, not the one we tried to insert.
+      const afterInsert = await prisma.$queryRaw<Array<{ id: string }>>`
+        SELECT id FROM "SecretBackend" WHERE name = 'default' LIMIT 1
+      `;
+      if (afterInsert.length === 0) {
+        console.log('bootstrap: could not establish a default SecretBackend — bailing');
+        return;
+      }
+      defaultId = afterInsert[0]!.id;
+      // Make sure it's flagged default.
+      await prisma.$executeRaw`
+        UPDATE "SecretBackend" SET "isDefault" = true WHERE id = ${defaultId}
+      `;
+      console.log(`bootstrap: seeded default SecretBackend (id=${defaultId})`);
+    } else {
+      defaultId = existingDefault[0]!.id;
+    }
+
+    // Backfill Secret.backendId for any rows left with an empty value.
+    // Using $executeRaw returns affected row count.
+    const updated = await prisma.$executeRaw(
+      Prisma.sql`UPDATE "Secret" SET "backendId" = ${defaultId} WHERE "backendId" = ''`,
+    );
+    if (updated > 0) {
+      console.log(`bootstrap: backfilled ${updated} Secret row(s) with default backendId`);
+    }
+  } catch (err) {
+    // Never fail the deploy — worst case prisma db push tries again anyway.
+    // Log the error so it's visible in pod logs.
+    console.error('bootstrap: non-fatal error:', err instanceof Error ? err.message : err);
+  } finally {
+    await prisma.$disconnect();
+  }
+}
+
+main().catch((err: unknown) => {
+  console.error('bootstrap: fatal error (ignored):', err);
+  // Intentionally exit 0 — we don't want to block the deploy on this.
+});
--- a/src/mcpd/src/main.ts
+++ b/src/mcpd/src/main.ts
@@ -26,9 +26,14 @@ import { SecretMigrateService } from './services/secret-migrate.service.js';
 import { bootstrapSecretBackends } from './bootstrap/secret-backends.js';
 import { registerSecretBackendRoutes } from './routes/secret-backends.js';
 import { registerSecretMigrateRoutes } from './routes/secret-migrate.js';
+import { SecretBackendRotator } from './services/secret-backend-rotator.service.js';
+import { SecretBackendRotatorLoop } from './services/secret-backend-rotator-loop.js';
+import { registerSecretBackendRotateRoutes } from './routes/secret-backend-rotate.js';
 import { LlmRepository } from './repositories/llm.repository.js';
 import { LlmService } from './services/llm.service.js';
+import { LlmAdapterRegistry } from './services/llm/dispatcher.js';
 import { registerLlmRoutes } from './routes/llms.js';
+import { registerLlmInferRoutes } from './routes/llm-infer.js';
 import { PromptRepository } from './repositories/prompt.repository.js';
 import { PromptRequestRepository } from './repositories/prompt-request.repository.js';
 import { bootstrapSystemProject } from './bootstrap/system-project.js';
@@ -104,6 +109,18 @@ function mapUrlToPermission(method: string, url: string): PermissionCheck {
  if (segment === 'audit-logs' && method === 'DELETE') return { kind: 'operation', operation: 'audit-purge' };
  // /api/v1/secrets/migrate is a bulk cross-backend operation — treat as op, not a plain secret write.
  if (url.startsWith('/api/v1/secrets/migrate')) return { kind: 'operation', operation: 'migrate-secrets' };
+  // /api/v1/secretbackends/:id/rotate — manual rotation trigger. Operation so
+  // only explicitly-granted callers can force it (the loop itself bypasses
+  // RBAC by calling the rotator in-process).
+  if (/^\/api\/v1\/secretbackends\/[^/?]+\/rotate/.test(url)) {
+    return { kind: 'operation', operation: 'rotate-secretbackend' };
+  }
+
+  // /api/v1/llms/:name/infer → `run:llms:<name>` (not the default create:llms).
+  const inferMatch = url.match(/^\/api\/v1\/llms\/([^/?]+)\/infer/);
+  if (inferMatch?.[1]) {
+    return { kind: 'resource', resource: 'llms', action: 'run', resourceName: inferMatch[1] };
+  }

  const resourceMap: Record<string, string | undefined> = {
    'servers': 'servers',
@@ -223,7 +240,7 @@ async function migrateAdminRole(rbacRepo: InstanceType<typeof RbacDefinitionRepo
    // Add operation bindings (idempotent — only for wildcard admin)
    const hasWildcard = bindings.some((b) => b['role'] === 'admin' && b['resource'] === '*');
    if (hasWildcard) {
-      const ops = ['impersonate', 'logs', 'backup', 'restore', 'audit-purge'];
+      const ops = ['impersonate', 'logs', 'backup', 'restore', 'audit-purge', 'migrate-secrets', 'rotate-secretbackend'];
      for (const op of ops) {
        if (!newBindings.some((b) => b['action'] === op)) {
          newBindings.push({ role: 'run', action: op });
@@ -333,7 +350,16 @@ async function main(): Promise<void> {
  });
  const secretService = new SecretService(secretRepo, secretBackendService);
  const secretMigrateService = new SecretMigrateService(secretRepo, secretBackendService);
+  const secretBackendRotator = new SecretBackendRotator({
+    backends: secretBackendService,
+    secrets: secretService,
+  });
+  const secretBackendRotatorLoop = new SecretBackendRotatorLoop({
+    backends: secretBackendService,
+    rotator: secretBackendRotator,
+  });
  const llmService = new LlmService(llmRepo, secretService);
+  const llmAdapters = new LlmAdapterRegistry();
  const instanceService = new InstanceService(instanceRepo, serverRepo, orchestrator, secretService);
  serverService.setInstanceService(instanceService);
  const projectService = new ProjectService(projectRepo, serverRepo);
@@ -473,8 +499,26 @@ async function main(): Promise<void> {
  registerTemplateRoutes(app, templateService);
  registerSecretRoutes(app, secretService);
  registerSecretBackendRoutes(app, secretBackendService);
+  registerSecretBackendRotateRoutes(app, secretBackendRotator);
  registerSecretMigrateRoutes(app, secretMigrateService);
  registerLlmRoutes(app, llmService);
+  registerLlmInferRoutes(app, {
+    llmService,
+    adapters: llmAdapters,
+    onInferenceEvent: (event) => {
+      app.log.info({
+        event: 'llm_inference_call',
+        llm: event.llmName,
+        model: event.model,
+        type: event.type,
+        userId: event.userId,
+        tokenSha: event.tokenSha,
+        streaming: event.streaming,
+        durationMs: event.durationMs,
+        status: event.status,
+      });
+    },
+  });
  registerInstanceRoutes(app, instanceService);
  registerProjectRoutes(app, projectService);
  registerAuditLogRoutes(app, auditLogService);
@@ -615,11 +659,19 @@ async function main(): Promise<void> {
  );
  healthProbeRunner.start(15_000);

+  // SecretBackend token rotator — wakes up for wizard-provisioned openbao
+  // backends only, noop for the rest. Errors inside the loop are logged +
+  // surfaced in `describe secretbackend`, never kill the process.
+  secretBackendRotatorLoop.start().catch((err: unknown) => {
+    app.log.error({ err }, 'secret-backend rotator loop failed to start');
+  });
+
  // Graceful shutdown
  setupGracefulShutdown(app, {
    disconnectDb: async () => {
      clearInterval(reconcileTimer);
      healthProbeRunner.stop();
+      secretBackendRotatorLoop.stop();
      gitBackup.stop();
      await prisma.$disconnect();
    },
--- a/src/mcpd/src/repositories/secret-backend.repository.ts
+++ b/src/mcpd/src/repositories/secret-backend.repository.ts
@@ -12,6 +12,7 @@ export interface UpdateSecretBackendInput {
  config?: Record<string, unknown>;
  isDefault?: boolean;
  description?: string;
+  tokenMeta?: Record<string, unknown>;
 }

 export interface ISecretBackendRepository {
@@ -79,6 +80,7 @@ export class SecretBackendRepository implements ISecretBackendRepository {
      if (data.config !== undefined) updateData.config = data.config as Prisma.InputJsonValue;
      if (data.isDefault !== undefined) updateData.isDefault = data.isDefault;
      if (data.description !== undefined) updateData.description = data.description;
+      if (data.tokenMeta !== undefined) updateData.tokenMeta = data.tokenMeta as Prisma.InputJsonValue;
      return tx.secretBackend.update({ where: { id }, data: updateData });
    });
  }
--- a/src/mcpd/src/routes/llm-infer.ts
+++ b/src/mcpd/src/routes/llm-infer.ts
@@ -0,0 +1,145 @@
+/**
+ * POST /api/v1/llms/:name/infer
+ *
+ * OpenAI-compatible chat completions endpoint. The RBAC check runs in the
+ * global hook — this URL maps to `run:llms:<name>`, not the default
+ * `create:llms`. See `main.ts:mapUrlToPermission`.
+ *
+ * Non-streaming: resolves the Llm, dispatches to the right provider adapter,
+ * returns the OpenAI chat.completion JSON.
+ *
+ * Streaming (`stream: true`): pipes adapter-emitted chunks back as
+ * `text/event-stream`. Adapters translate provider-native SSE into OpenAI
+ * `chat.completion.chunk`s so clients can use any OpenAI SDK unchanged.
+ */
+import type { FastifyInstance, FastifyReply } from 'fastify';
+import type { LlmService } from '../services/llm.service.js';
+import type { LlmAdapterRegistry } from '../services/llm/dispatcher.js';
+import { NotFoundError } from '../services/mcp-server.service.js';
+import type { OpenAiChatRequest, InferContext } from '../services/llm/types.js';
+
+export interface LlmInferDeps {
+  llmService: LlmService;
+  adapters: LlmAdapterRegistry;
+  /** Optional hook to emit audit events — consumer may ignore. */
+  onInferenceEvent?: (event: InferenceAuditEvent) => void;
+}
+
+export interface InferenceAuditEvent {
+  kind: 'llm_inference_call';
+  llmName: string;
+  model: string;
+  type: string;
+  userId?: string | undefined;
+  tokenSha?: string | undefined;
+  streaming: boolean;
+  durationMs: number;
+  status: number;
+}
+
+export function registerLlmInferRoutes(
+  app: FastifyInstance,
+  deps: LlmInferDeps,
+): void {
+  app.post<{ Params: { name: string }; Body: OpenAiChatRequest }>(
+    '/api/v1/llms/:name/infer',
+    async (request, reply) => {
+      const started = Date.now();
+      let llm;
+      try {
+        llm = await deps.llmService.getByName(request.params.name);
+      } catch (err) {
+        if (err instanceof NotFoundError) {
+          reply.code(404);
+          return { error: err.message };
+        }
+        throw err;
+      }
+
+      const body = (request.body ?? {}) as OpenAiChatRequest;
+      if (!body.messages || body.messages.length === 0) {
+        reply.code(400);
+        return { error: 'messages is required' };
+      }
+
+      // Resolve API key (may be empty string for providers that don't take one).
+      let apiKey = '';
+      if (llm.apiKeyRef !== null) {
+        try {
+          apiKey = await deps.llmService.resolveApiKey(llm.name);
+        } catch (err) {
+          reply.code(500);
+          return { error: `Failed to resolve API key: ${err instanceof Error ? err.message : String(err)}` };
+        }
+      }
+
+      const ctx: InferContext = {
+        body,
+        modelOverride: llm.model,
+        apiKey,
+        url: llm.url,
+        extraConfig: llm.extraConfig,
+      };
+
+      const adapter = deps.adapters.get(llm.type);
+      const streaming = body.stream === true;
+
+      const audit = (status: number): void => {
+        if (deps.onInferenceEvent === undefined) return;
+        deps.onInferenceEvent({
+          kind: 'llm_inference_call',
+          llmName: llm.name,
+          model: llm.model,
+          type: llm.type,
+          userId: request.userId,
+          tokenSha: request.mcpToken?.tokenSha,
+          streaming,
+          durationMs: Date.now() - started,
+          status,
+        });
+      };
+
+      if (!streaming) {
+        try {
+          const result = await adapter.infer(ctx);
+          reply.code(result.status);
+          audit(result.status);
+          return result.body;
+        } catch (err) {
+          audit(502);
+          reply.code(502);
+          return { error: err instanceof Error ? err.message : String(err) };
+        }
+      }
+
+      // Streaming path — set SSE headers and pipe chunks.
+      reply.raw.writeHead(200, {
+        'Content-Type': 'text/event-stream',
+        'Cache-Control': 'no-cache',
+        Connection: 'keep-alive',
+        'X-Accel-Buffering': 'no',
+      });
+      try {
+        for await (const chunk of adapter.stream(ctx)) {
+          writeSseChunk(reply, chunk.data);
+          if (chunk.done === true) break;
+        }
+        audit(200);
+      } catch (err) {
+        const payload = JSON.stringify({
+          error: { message: err instanceof Error ? err.message : String(err) },
+        });
+        writeSseChunk(reply, payload);
+        writeSseChunk(reply, '[DONE]');
+        audit(502);
+      } finally {
+        reply.raw.end();
+      }
+      return reply;
+    },
+  );
+}
+
+function writeSseChunk(reply: FastifyReply, data: string): void {
+  reply.raw.write(`data: ${data}\n\n`);
+}
--- a/src/mcpd/src/routes/llms.ts
+++ b/src/mcpd/src/routes/llms.ts
@@ -10,9 +10,12 @@ export function registerLlmRoutes(
    return service.list();
  });

+  // Accepts either CUID or human name. Used both by the CLI (which usually
+  // resolves to CUID first) and by FailoverRouter's RBAC pre-check (which
+  // hands over the user-facing name to avoid an extra round-trip).
  app.get<{ Params: { id: string } }>('/api/v1/llms/:id', async (request, reply) => {
    try {
-      return await service.getById(request.params.id);
+      return await getByIdOrName(service, request.params.id);
    } catch (err) {
      if (err instanceof NotFoundError) {
        reply.code(404);
@@ -22,6 +25,10 @@ export function registerLlmRoutes(
    }
  });

+  // No explicit HEAD handler: Fastify auto-derives HEAD from GET, which runs
+  // the same RBAC hook + lookup and drops the body. That's exactly what
+  // FailoverRouter wants for its "can the caller still view this Llm?" probe.
+
  app.post('/api/v1/llms', async (request, reply) => {
    try {
      const row = await service.create(request.body);
@@ -62,3 +69,17 @@ export function registerLlmRoutes(
    }
  });
 }
+
+const CUID_RE = /^c[a-z0-9]{24}/i;
+
+/**
+ * Look up by CUID first; if the input doesn't look like one, fall back to
+ * findByName. Lets the same URL serve both `mcpctl describe llm <name>` and
+ * the FailoverRouter's name-based RBAC check.
+ */
+async function getByIdOrName(service: LlmService, idOrName: string) {
+  if (CUID_RE.test(idOrName)) {
+    return service.getById(idOrName);
+  }
+  return service.getByName(idOrName);
+}
--- a/src/mcpd/src/routes/secret-backend-rotate.ts
+++ b/src/mcpd/src/routes/secret-backend-rotate.ts
@@ -0,0 +1,29 @@
+/**
+ * POST /api/v1/secretbackends/:id/rotate — force an immediate rotation.
+ *
+ * Used by the wizard (final verify step) + operators troubleshooting a
+ * stale backend. RBAC handled in the global hook via the operation
+ * `rotate-secretbackend` (see `main.ts:mapUrlToPermission`).
+ */
+import type { FastifyInstance } from 'fastify';
+import type { SecretBackendRotator } from '../services/secret-backend-rotator.service.js';
+import { NotFoundError } from '../services/mcp-server.service.js';
+
+export function registerSecretBackendRotateRoutes(
+  app: FastifyInstance,
+  rotator: SecretBackendRotator,
+): void {
+  app.post<{ Params: { id: string } }>('/api/v1/secretbackends/:id/rotate', async (request, reply) => {
+    try {
+      const tokenMeta = await rotator.rotateOne(request.params.id);
+      return { ok: true, tokenMeta };
+    } catch (err) {
+      if (err instanceof NotFoundError) {
+        reply.code(404);
+        return { error: err.message };
+      }
+      reply.code(502);
+      return { error: err instanceof Error ? err.message : String(err) };
+    }
+  });
+}
--- a/src/mcpd/src/services/llm/adapters/anthropic.ts
+++ b/src/mcpd/src/services/llm/adapters/anthropic.ts
@@ -0,0 +1,256 @@
+/**
+ * Anthropic adapter — translates between OpenAI chat/completions format and
+ * the Anthropic Messages API (`POST /v1/messages`).
+ *
+ * Key differences we translate:
+ *   - OpenAI `role: 'system'` messages become a top-level `system` string.
+ *   - Anthropic returns `content: [{ type: 'text', text }]` — we join into
+ *     OpenAI's `content: "…"` string.
+ *   - Streaming: Anthropic emits a sequence of
+ *     `message_start / content_block_{start,delta,stop} / message_delta /
+ *     message_stop` events. We translate those to OpenAI
+ *     `chat.completion.chunk` deltas.
+ *
+ * This adapter implements the subset needed for plain-text chat — tool-use
+ * translation is intentionally left out for this phase; agents that need tool
+ * calling should target an OpenAI-compatible provider until the translator
+ * covers it.
+ */
+import type {
+  LlmAdapter,
+  InferContext,
+  NonStreamingResult,
+  StreamingChunk,
+  AdapterDeps,
+  OpenAiMessage,
+} from '../types.js';
+
+const DEFAULT_ANTHROPIC_URL = 'https://api.anthropic.com';
+const ANTHROPIC_VERSION = '2023-06-01';
+
+interface AnthropicMessageResponse {
+  id: string;
+  model: string;
+  role: 'assistant';
+  content: Array<{ type: 'text'; text: string } | { type: string; [k: string]: unknown }>;
+  stop_reason?: string;
+  usage?: { input_tokens: number; output_tokens: number };
+}
+
+export class AnthropicAdapter implements LlmAdapter {
+  readonly kind = 'anthropic';
+  private readonly fetchImpl: typeof globalThis.fetch;
+
+  constructor(deps: AdapterDeps = {}) {
+    this.fetchImpl = deps.fetch ?? globalThis.fetch;
+  }
+
+  async infer(ctx: InferContext): Promise<NonStreamingResult> {
+    const url = (ctx.url !== '' ? ctx.url : DEFAULT_ANTHROPIC_URL).replace(/\/+$/, '');
+    const body = this.toAnthropicRequest(ctx, false);
+    const res = await this.fetchImpl(`${url}/v1/messages`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    if (!res.ok) {
+      const text = await res.text().catch(() => '');
+      return {
+        status: res.status,
+        body: { error: { message: `anthropic: HTTP ${String(res.status)} ${text}` } },
+      };
+    }
+    const anth = await res.json() as AnthropicMessageResponse;
+    return { status: 200, body: this.toOpenAiResponse(anth) };
+  }
+
+  async *stream(ctx: InferContext): AsyncGenerator<StreamingChunk> {
+    const url = (ctx.url !== '' ? ctx.url : DEFAULT_ANTHROPIC_URL).replace(/\/+$/, '');
+    const body = this.toAnthropicRequest(ctx, true);
+    const res = await this.fetchImpl(`${url}/v1/messages`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    if (!res.ok || res.body === null) {
+      const text = await res.text().catch(() => '');
+      throw new Error(`anthropic stream: HTTP ${String(res.status)} ${text}`);
+    }
+
+    const id = `chatcmpl-${cryptoNonce()}`;
+    const model = body.model;
+    const created = Math.floor(Date.now() / 1000);
+
+    // Parse Anthropic SSE. Each event is `event: <name>\ndata: <json>\n\n`.
+    const decoder = new TextDecoder();
+    let buf = '';
+    const reader = res.body.getReader();
+    let emittedFirst = false;
+
+    const baseChunk = (delta: Record<string, unknown>, finishReason?: string): string => {
+      const chunk = {
+        id,
+        object: 'chat.completion.chunk',
+        created,
+        model,
+        choices: [{
+          index: 0,
+          delta,
+          finish_reason: finishReason ?? null,
+        }],
+      };
+      return JSON.stringify(chunk);
+    };
+
+    try {
+      // eslint-disable-next-line no-constant-condition
+      while (true) {
+        const { value, done } = await reader.read();
+        if (done) break;
+        buf += decoder.decode(value, { stream: true });
+
+        let idx: number;
+        while ((idx = buf.indexOf('\n\n')) !== -1) {
+          const rawEvent = buf.slice(0, idx);
+          buf = buf.slice(idx + 2);
+          const parsed = parseSseEvent(rawEvent);
+          if (parsed === null) continue;
+          const { event, data } = parsed;
+
+          if (event === 'content_block_delta') {
+            const textDelta = (data as { delta?: { type?: string; text?: string } }).delta;
+            if (textDelta?.type === 'text_delta' && typeof textDelta.text === 'string') {
+              if (!emittedFirst) {
+                yield { data: baseChunk({ role: 'assistant', content: '' }) };
+                emittedFirst = true;
+              }
+              yield { data: baseChunk({ content: textDelta.text }) };
+            }
+          } else if (event === 'message_delta') {
+            const stopReason = (data as { delta?: { stop_reason?: string } }).delta?.stop_reason;
+            if (typeof stopReason === 'string') {
+              yield { data: baseChunk({}, mapStopReason(stopReason)) };
+            }
+          } else if (event === 'message_stop') {
+            yield { data: '[DONE]', done: true };
+            return;
+          } else if (event === 'error') {
+            throw new Error(`anthropic stream error: ${JSON.stringify(data)}`);
+          }
+        }
+      }
+    } finally {
+      reader.releaseLock();
+    }
+    // Anthropic closed without message_stop — give consumer a clean end.
+    yield { data: '[DONE]', done: true };
+  }
+
+  private headers(ctx: InferContext): Record<string, string> {
+    return {
+      'Content-Type': 'application/json',
+      'x-api-key': ctx.apiKey,
+      'anthropic-version': ANTHROPIC_VERSION,
+    };
+  }
+
+  /** Translate the OpenAI request to the Anthropic Messages shape. */
+  private toAnthropicRequest(ctx: InferContext, stream: boolean): {
+    model: string;
+    max_tokens: number;
+    messages: Array<{ role: 'user' | 'assistant'; content: string }>;
+    system?: string;
+    stream?: boolean;
+    temperature?: number;
+    top_p?: number;
+    stop_sequences?: string[];
+  } {
+    const { body } = ctx;
+    const systemParts: string[] = [];
+    const messages: Array<{ role: 'user' | 'assistant'; content: string }> = [];
+
+    for (const msg of body.messages) {
+      const text = normaliseContent(msg);
+      if (msg.role === 'system') {
+        systemParts.push(text);
+      } else if (msg.role === 'user' || msg.role === 'assistant') {
+        messages.push({ role: msg.role, content: text });
+      }
+      // `tool` role messages are dropped — tool translation is out of scope
+      // for this phase.
+    }
+
+    const out: ReturnType<typeof this.toAnthropicRequest> = {
+      model: body.model !== '' ? body.model : ctx.modelOverride,
+      max_tokens: typeof body.max_tokens === 'number' ? body.max_tokens : 1024,
+      messages,
+    };
+    if (systemParts.length > 0) out.system = systemParts.join('\n\n');
+    if (stream) out.stream = true;
+    if (typeof body.temperature === 'number') out.temperature = body.temperature;
+    if (typeof body.top_p === 'number') out.top_p = body.top_p;
+    if (body.stop !== undefined) {
+      out.stop_sequences = Array.isArray(body.stop) ? body.stop : [body.stop];
+    }
+    return out;
+  }
+
+  private toOpenAiResponse(anth: AnthropicMessageResponse): Record<string, unknown> {
+    const text = anth.content
+      .map((c) => (c.type === 'text' && typeof (c as { text?: unknown }).text === 'string'
+        ? (c as { text: string }).text
+        : ''))
+      .join('');
+    return {
+      id: `chatcmpl-${anth.id}`,
+      object: 'chat.completion',
+      created: Math.floor(Date.now() / 1000),
+      model: anth.model,
+      choices: [{
+        index: 0,
+        message: { role: 'assistant', content: text },
+        finish_reason: mapStopReason(anth.stop_reason ?? 'end_turn'),
+      }],
+      usage: anth.usage ? {
+        prompt_tokens: anth.usage.input_tokens,
+        completion_tokens: anth.usage.output_tokens,
+        total_tokens: anth.usage.input_tokens + anth.usage.output_tokens,
+      } : undefined,
+    };
+  }
+}
+
+function normaliseContent(msg: OpenAiMessage): string {
+  if (typeof msg.content === 'string') return msg.content;
+  return msg.content
+    .map((part) => (typeof part.text === 'string' ? part.text : ''))
+    .join('');
+}
+
+function mapStopReason(r: string): string {
+  // Anthropic → OpenAI finish_reason
+  if (r === 'end_turn' || r === 'stop_sequence') return 'stop';
+  if (r === 'max_tokens') return 'length';
+  if (r === 'tool_use') return 'tool_calls';
+  return r;
+}
+
+function parseSseEvent(raw: string): { event: string; data: unknown } | null {
+  let event = '';
+  let dataLine = '';
+  for (const line of raw.split('\n')) {
+    if (line.startsWith('event:')) event = line.slice(6).trim();
+    else if (line.startsWith('data:')) dataLine += line.slice(5).trim();
+  }
+  if (dataLine === '') return null;
+  try {
+    return { event, data: JSON.parse(dataLine) as unknown };
+  } catch {
+    return null;
+  }
+}
+
+function cryptoNonce(): string {
+  // Not security-sensitive — just a short randomish id.
+  return Math.random().toString(36).slice(2, 10);
+}
--- a/src/mcpd/src/services/llm/adapters/openai-passthrough.ts
+++ b/src/mcpd/src/services/llm/adapters/openai-passthrough.ts
@@ -0,0 +1,112 @@
+/**
+ * OpenAI-passthrough adapter.
+ *
+ * Covers any provider that already speaks OpenAI chat/completions on the
+ * wire: `openai`, `vllm`, `deepseek`, `ollama` (with their openai-compatible
+ * endpoint enabled). The adapter forwards the request body verbatim and
+ * streams the response straight through — no wire translation.
+ *
+ * Defaults when `url` is empty:
+ *   - openai → https://api.openai.com
+ *   - deepseek → https://api.deepseek.com
+ *   - vllm/ollama → must be configured; these have no canonical public URL.
+ */
+import type { LlmAdapter, InferContext, NonStreamingResult, StreamingChunk, AdapterDeps } from '../types.js';
+
+const DEFAULT_URLS: Record<string, string> = {
+  openai: 'https://api.openai.com',
+  deepseek: 'https://api.deepseek.com',
+};
+
+export class OpenAiPassthroughAdapter implements LlmAdapter {
+  readonly kind: string;
+  private readonly fetchImpl: typeof globalThis.fetch;
+
+  constructor(kind: 'openai' | 'vllm' | 'deepseek' | 'ollama', deps: AdapterDeps = {}) {
+    this.kind = kind;
+    this.fetchImpl = deps.fetch ?? globalThis.fetch;
+  }
+
+  async infer(ctx: InferContext): Promise<NonStreamingResult> {
+    const url = this.endpointUrl(ctx.url);
+    const body = this.prepareBody(ctx, false);
+    const res = await this.fetchImpl(`${url}/v1/chat/completions`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    const json = await res.json() as unknown;
+    return { status: res.status, body: json };
+  }
+
+  async *stream(ctx: InferContext): AsyncGenerator<StreamingChunk> {
+    const url = this.endpointUrl(ctx.url);
+    const body = this.prepareBody(ctx, true);
+    const res = await this.fetchImpl(`${url}/v1/chat/completions`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    if (!res.ok || res.body === null) {
+      const text = await res.text().catch(() => '');
+      throw new Error(`${this.kind} stream: HTTP ${String(res.status)} ${text}`);
+    }
+
+    // Re-frame the provider's SSE stream into our `StreamingChunk` shape.
+    // OpenAI-compat providers already emit `data: {...}` + `data: [DONE]` —
+    // we just unwrap the `data: ` prefix, forward payloads, and emit a
+    // single terminal `done` chunk so the consumer always gets one.
+    const decoder = new TextDecoder();
+    let buf = '';
+    const reader = res.body.getReader();
+    try {
+      // eslint-disable-next-line no-constant-condition
+      while (true) {
+        const { value, done } = await reader.read();
+        if (done) break;
+        buf += decoder.decode(value, { stream: true });
+        let idx: number;
+        while ((idx = buf.indexOf('\n\n')) !== -1) {
+          const event = buf.slice(0, idx);
+          buf = buf.slice(idx + 2);
+          for (const line of event.split('\n')) {
+            if (!line.startsWith('data:')) continue;
+            const payload = line.slice(5).trim();
+            if (payload === '') continue;
+            if (payload === '[DONE]') {
+              yield { data: '[DONE]', done: true };
+              return;
+            }
+            yield { data: payload };
+          }
+        }
+      }
+    } finally {
+      reader.releaseLock();
+    }
+    // Provider closed without emitting [DONE] — give the consumer a clean end.
+    yield { data: '[DONE]', done: true };
+  }
+
+  private endpointUrl(url: string): string {
+    if (url !== '') return url.replace(/\/+$/, '');
+    const def = DEFAULT_URLS[this.kind];
+    if (def === undefined) {
+      throw new Error(`${this.kind}: url is required (no default endpoint for this provider)`);
+    }
+    return def;
+  }
+
+  private headers(ctx: InferContext): Record<string, string> {
+    const headers: Record<string, string> = { 'Content-Type': 'application/json' };
+    if (ctx.apiKey !== '') headers['Authorization'] = `Bearer ${ctx.apiKey}`;
+    return headers;
+  }
+
+  private prepareBody(ctx: InferContext, stream: boolean): Record<string, unknown> {
+    const out: Record<string, unknown> = { ...ctx.body };
+    if (out.model === undefined || out.model === '') out.model = ctx.modelOverride;
+    out.stream = stream;
+    return out;
+  }
+}
--- a/src/mcpd/src/services/llm/dispatcher.ts
+++ b/src/mcpd/src/services/llm/dispatcher.ts
@@ -0,0 +1,52 @@
+/**
+ * Adapter dispatcher for the inference proxy.
+ *
+ * `getAdapter(type)` returns the right adapter instance for an Llm's `type`
+ * column. Adapters are cached per-type — they carry no per-request state.
+ * The caller (the infer route) supplies the resolved API key + request body
+ * through `InferContext`, so a single adapter instance serves every Llm of
+ * that type.
+ */
+import type { LlmAdapter, AdapterDeps } from './types.js';
+import { OpenAiPassthroughAdapter } from './adapters/openai-passthrough.js';
+import { AnthropicAdapter } from './adapters/anthropic.js';
+
+export class UnsupportedProviderError extends Error {
+  constructor(type: string) {
+    super(`Unsupported LLM provider: ${type}`);
+    this.name = 'UnsupportedProviderError';
+  }
+}
+
+export class LlmAdapterRegistry {
+  private readonly cache = new Map<string, LlmAdapter>();
+
+  constructor(private readonly deps: AdapterDeps = {}) {}
+
+  get(type: string): LlmAdapter {
+    const cached = this.cache.get(type);
+    if (cached !== undefined) return cached;
+    const adapter = this.build(type);
+    this.cache.set(type, adapter);
+    return adapter;
+  }
+
+  private build(type: string): LlmAdapter {
+    switch (type) {
+      case 'openai':
+      case 'vllm':
+      case 'deepseek':
+      case 'ollama':
+        return new OpenAiPassthroughAdapter(type, this.deps);
+      case 'anthropic':
+        return new AnthropicAdapter(this.deps);
+      case 'gemini-cli':
+        // Intentionally deferred — gemini-cli requires the binary on the mcpd
+        // pod filesystem and subprocess lifecycle management. Flagged as
+        // homelab-only in the plan; not landing in this phase.
+        throw new UnsupportedProviderError(`${type} (subprocess providers are not supported in the proxy yet)`);
+      default:
+        throw new UnsupportedProviderError(type);
+    }
+  }
+}
--- a/src/mcpd/src/services/llm/types.ts
+++ b/src/mcpd/src/services/llm/types.ts
@@ -0,0 +1,70 @@
+/**
+ * Shared types for the LLM inference proxy.
+ *
+ * The wire format on the mcpctl side is OpenAI's chat/completions v1 — it's
+ * the de-facto lingua franca and every client library already speaks it.
+ * Provider-specific adapters translate to/from that shape.
+ */
+
+export interface OpenAiMessage {
+  role: 'system' | 'user' | 'assistant' | 'tool';
+  content: string | Array<{ type: string; text?: string; [k: string]: unknown }>;
+  name?: string;
+  tool_call_id?: string;
+  tool_calls?: Array<{ id: string; type: 'function'; function: { name: string; arguments: string } }>;
+}
+
+export interface OpenAiChatRequest {
+  model: string;
+  messages: OpenAiMessage[];
+  stream?: boolean;
+  temperature?: number;
+  max_tokens?: number;
+  top_p?: number;
+  stop?: string | string[];
+  tools?: Array<{ type: 'function'; function: { name: string; description?: string; parameters?: Record<string, unknown> } }>;
+  tool_choice?: unknown;
+  // Passthrough: unknown extras forwarded as-is.
+  [k: string]: unknown;
+}
+
+export interface InferContext {
+  /** Normalised OpenAI-format body. Adapters read/transform from here. */
+  body: OpenAiChatRequest;
+  /** The Llm row's `model` field, used when the request body has an empty model. */
+  modelOverride: string;
+  /** The resolved API key, or empty string for providers that don't take one. */
+  apiKey: string;
+  /** Target URL from the Llm row (may be empty for provider-default). */
+  url: string;
+  /** Arbitrary config from the Llm row (e.g. vllm gpu settings). */
+  extraConfig: Record<string, unknown>;
+}
+
+export interface NonStreamingResult {
+  status: number;
+  /** OpenAI chat.completion response body. */
+  body: unknown;
+}
+
+export interface StreamingChunk {
+  /** Raw SSE data payload. Consumer emits `data: <payload>\n\n`. */
+  data: string;
+  /** Mark the end of stream — consumer emits `data: [DONE]\n\n`. */
+  done?: boolean;
+}
+
+export interface LlmAdapter {
+  readonly kind: string;
+  /** Non-streaming request. Returns the final chat.completion body. */
+  infer(ctx: InferContext): Promise<NonStreamingResult>;
+  /**
+   * Streaming request. Yields OpenAI-format SSE chunks. Adapters translate
+   * provider-native stream formats into OpenAI `chat.completion.chunk`s.
+   */
+  stream(ctx: InferContext): AsyncGenerator<StreamingChunk>;
+}
+
+export interface AdapterDeps {
+  fetch?: typeof globalThis.fetch;
+}
--- a/src/mcpd/src/services/secret-backend-rotator-loop.ts
+++ b/src/mcpd/src/services/secret-backend-rotator-loop.ts
@@ -0,0 +1,129 @@
+/**
+ * Background loop that drives `SecretBackendRotator` on a 24h cadence.
+ *
+ * - On `start()`: scan all rotatable backends. For each that is overdue
+ *   (never rotated OR last rotation > 24h ago), kick rotation immediately.
+ *   Then schedule a per-backend setTimeout for the next tick.
+ * - On `stop()`: clear every pending timer. Called from the graceful-shutdown
+ *   hook so restarts don't leak timers or interrupt an in-flight rotation.
+ *
+ * Jitter (±10 min by default) keeps multiple mcpd replicas from hammering
+ * OpenBao simultaneously if someone scales the Deployment up.
+ *
+ * Failures are swallowed with a warn log — the next scheduled tick will
+ * retry. The rotator service itself writes `lastRotationError` to the row
+ * so operators see the failure in `describe`.
+ */
+import type { SecretBackend } from '@prisma/client';
+import type { SecretBackendService } from './secret-backend.service.js';
+import type { SecretBackendRotator } from './secret-backend-rotator.service.js';
+
+export interface SecretBackendRotatorLoopDeps {
+  backends: SecretBackendService;
+  rotator: SecretBackendRotator;
+  /** Millisecond jitter applied to the 24h base interval; defaults to ±600_000 (10 min). */
+  jitterMs?: number;
+  /** Override in tests. */
+  setTimeout?: (cb: () => void, ms: number) => NodeJS.Timeout;
+  clearTimeout?: (t: NodeJS.Timeout) => void;
+  log?: { info: (msg: string) => void; warn: (msg: string) => void };
+}
+
+const DEFAULT_INTERVAL_MS = 24 * 3600 * 1000;
+const DEFAULT_JITTER_MS = 10 * 60 * 1000;
+
+export class SecretBackendRotatorLoop {
+  private readonly timers = new Map<string, NodeJS.Timeout>();
+  private readonly setT: (cb: () => void, ms: number) => NodeJS.Timeout;
+  private readonly clearT: (t: NodeJS.Timeout) => void;
+  private readonly log: { info: (msg: string) => void; warn: (msg: string) => void };
+  private stopped = false;
+
+  constructor(private readonly deps: SecretBackendRotatorLoopDeps) {
+    this.setT = deps.setTimeout ?? ((cb, ms) => global.setTimeout(cb, ms));
+    this.clearT = deps.clearTimeout ?? ((t) => global.clearTimeout(t));
+    this.log = deps.log ?? {
+      // eslint-disable-next-line no-console
+      info: (m) => console.log(`[rotator] ${m}`),
+      // eslint-disable-next-line no-console
+      warn: (m) => console.warn(`[rotator] ${m}`),
+    };
+  }
+
+  async start(): Promise<void> {
+    const backends = (await this.deps.backends.list())
+      .filter((b) => this.deps.rotator.isRotatable(b));
+
+    if (backends.length === 0) {
+      this.log.info('no rotatable backends registered — loop idle');
+      return;
+    }
+    this.log.info(`starting rotation loop for ${String(backends.length)} backend(s)`);
+
+    for (const b of backends) {
+      if (this.deps.rotator.isOverdue(b)) {
+        this.log.info(`backend '${b.name}' is overdue — rotating now`);
+        this.runOnce(b.id, b.name).catch((err) => {
+          this.log.warn(`initial rotation of '${b.name}' failed: ${err instanceof Error ? err.message : String(err)}`);
+        });
+      }
+      this.schedule(b);
+    }
+  }
+
+  stop(): void {
+    this.stopped = true;
+    for (const [, t] of this.timers) this.clearT(t);
+    this.timers.clear();
+    this.log.info('rotation loop stopped');
+  }
+
+  /** Test hook — force a rotation + rescheduling for one backend. */
+  async rotateNow(backendId: string): Promise<void> {
+    const backend = await this.deps.backends.getById(backendId);
+    await this.runOnce(backendId, backend.name);
+    this.schedule(backend);
+  }
+
+  private schedule(backend: SecretBackend): void {
+    if (this.stopped) return;
+    // Clear any existing timer for this backend
+    const prev = this.timers.get(backend.id);
+    if (prev !== undefined) this.clearT(prev);
+
+    const delay = this.nextDelayMs(backend);
+    const t = this.setT(() => {
+      this.runOnce(backend.id, backend.name)
+        .catch((err) => this.log.warn(`scheduled rotation of '${backend.name}' failed: ${err instanceof Error ? err.message : String(err)}`))
+        .finally(() => {
+          // Re-fetch to pick up latest tokenMeta (nextRenewalAt) for the next delay calc.
+          if (this.stopped) return;
+          this.deps.backends.getById(backend.id)
+            .then((b) => this.schedule(b))
+            .catch((err) => this.log.warn(`re-schedule lookup for '${backend.name}' failed: ${err instanceof Error ? err.message : String(err)}`));
+        });
+    }, delay);
+    this.timers.set(backend.id, t);
+  }
+
+  private async runOnce(backendId: string, name: string): Promise<void> {
+    try {
+      await this.deps.rotator.rotateOne(backendId);
+      this.log.info(`rotated '${name}' successfully`);
+    } catch (err) {
+      // Error already recorded in tokenMeta by rotator; just log.
+      throw err;
+    }
+  }
+
+  private nextDelayMs(backend: SecretBackend): number {
+    const cfg = backend.config as { rotation?: { intervalHours?: number } };
+    const baseMs = cfg.rotation?.intervalHours !== undefined
+      ? cfg.rotation.intervalHours * 3600 * 1000
+      : DEFAULT_INTERVAL_MS;
+    const jitter = this.deps.jitterMs ?? DEFAULT_JITTER_MS;
+    // Uniform in [-jitter, +jitter]
+    const offset = (Math.random() * 2 - 1) * jitter;
+    return Math.max(60_000, Math.floor(baseMs + offset));
+  }
+}
--- a/src/mcpd/src/services/secret-backend-rotator.service.ts
+++ b/src/mcpd/src/services/secret-backend-rotator.service.ts
@@ -0,0 +1,186 @@
+/**
+ * Rotator for wizard-provisioned OpenBao backends.
+ *
+ * Flow on every tick:
+ *   1. Read the CURRENT mcpd token from its backing plaintext Secret.
+ *   2. Use that token to mint a SUCCESSOR via `auth/token/create/<role>`
+ *      (the `app-mcpd` policy grants the caller exactly this path).
+ *   3. Verify the successor with `auth/token/lookup-self`.
+ *   4. Persist the successor in the same Secret (overwriting the old value).
+ *   5. Revoke the predecessor by accessor (best-effort; old tokens expire on
+ *      their own anyway).
+ *   6. Update `tokenMeta` on the SecretBackend row with the new timestamps.
+ *
+ * On any failure: old token remains in place, `tokenMeta.lastRotationError`
+ * is populated, the exception is re-thrown. Old tokens still have ~29 days
+ * of remaining TTL by design (ttl=720h, rotation cadence=24h), so a few
+ * days of rotation failures are survivable without a user outage.
+ */
+import type { SecretBackend } from '@prisma/client';
+import {
+  mintRoleToken,
+  lookupSelf,
+  revokeAccessor,
+  type VaultDeps,
+  type MintedToken,
+} from '@mcpctl/shared';
+import type { SecretBackendService } from './secret-backend.service.js';
+import type { SecretService } from './secret.service.js';
+
+/** Shape of `SecretBackend.config` we require for rotation. */
+export interface RotatableOpenBaoConfig {
+  url: string;
+  auth?: 'token';
+  mount?: string;
+  pathPrefix?: string;
+  namespace?: string;
+  tokenSecretRef: { name: string; key: string };
+  rotation: {
+    enabled: true;
+    tokenRole: string;
+    intervalHours?: number;
+  };
+}
+
+/** Shape we store in `SecretBackend.tokenMeta`. */
+export interface TokenMeta {
+  generatedAt?: string;
+  nextRenewalAt?: string;
+  validUntil?: string;
+  lastRotationAt?: string;
+  lastRotationError?: string | null;
+  currentAccessor?: string;
+  rotatable?: boolean;
+}
+
+export interface SecretBackendRotatorDeps {
+  backends: SecretBackendService;
+  secrets: SecretService;
+  fetch?: typeof globalThis.fetch;
+  now?: () => Date;
+}
+
+export class SecretBackendRotator {
+  private readonly now: () => Date;
+
+  constructor(private readonly deps: SecretBackendRotatorDeps) {
+    this.now = deps.now ?? (() => new Date());
+  }
+
+  /** True iff this backend is a wizard-provisioned token-auth openbao with rotation enabled. */
+  isRotatable(backend: SecretBackend): boolean {
+    if (backend.type !== 'openbao') return false;
+    const cfg = backend.config as Partial<RotatableOpenBaoConfig>;
+    return (cfg.auth ?? 'token') === 'token'
+      && cfg.rotation?.enabled === true
+      && typeof cfg.rotation?.tokenRole === 'string'
+      && typeof cfg.tokenSecretRef?.name === 'string';
+  }
+
+  /**
+   * Execute one rotation pass on the given backend. Returns the freshly
+   * recorded `tokenMeta`. Throws on any failure — callers decide whether to
+   * log + move on (loop) or propagate (manual trigger).
+   */
+  async rotateOne(backendId: string): Promise<TokenMeta> {
+    const backend = await this.deps.backends.getById(backendId);
+    if (!this.isRotatable(backend)) {
+      throw new Error(`SecretBackend '${backend.name}' is not rotatable (need type=openbao, auth=token, rotation.enabled=true)`);
+    }
+    const cfg = backend.config as unknown as RotatableOpenBaoConfig;
+    const meta = (backend.tokenMeta as unknown as TokenMeta | null | undefined) ?? {};
+
+    const vaultDeps: VaultDeps = {};
+    if (this.deps.fetch !== undefined) vaultDeps.fetch = this.deps.fetch;
+    if (cfg.namespace !== undefined) vaultDeps.namespace = cfg.namespace;
+
+    // 1. Read current token from the backing plaintext Secret.
+    const secretRow = await this.deps.secrets.getByName(cfg.tokenSecretRef.name);
+    const data = await this.deps.secrets.resolveData(secretRow);
+    const currentToken = data[cfg.tokenSecretRef.key];
+    if (currentToken === undefined || currentToken === '') {
+      const err = new Error(`rotation: current token missing at ${cfg.tokenSecretRef.name}/${cfg.tokenSecretRef.key}`);
+      await this.recordError(backendId, meta, err.message);
+      throw err;
+    }
+    const oldAccessor = meta.currentAccessor;
+
+    let minted: MintedToken;
+    try {
+      // 2. Mint successor.
+      minted = await mintRoleToken(cfg.url, currentToken, cfg.rotation.tokenRole, vaultDeps);
+      if (!minted.renewable) {
+        throw new Error(`minted token from role '${cfg.rotation.tokenRole}' is not renewable — check the token role's renewable + period settings`);
+      }
+
+      // 3. Verify successor works (belt-and-suspenders — if bao returned a token
+      //    that can't auth back, we'd lock ourselves out on persist).
+      await lookupSelf(cfg.url, minted.clientToken, vaultDeps);
+
+      // 4. Persist successor in the same Secret. Update in-place — we keep
+      //    the other keys (if any) intact.
+      const nextData = { ...data, [cfg.tokenSecretRef.key]: minted.clientToken };
+      await this.deps.secrets.update(secretRow.id, { data: nextData });
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err);
+      await this.recordError(backendId, meta, msg);
+      throw err;
+    }
+
+    // 5. Revoke predecessor (best-effort — old tokens expire anyway).
+    if (oldAccessor !== undefined && oldAccessor !== '') {
+      try {
+        await revokeAccessor(cfg.url, minted.clientToken, oldAccessor, vaultDeps);
+      } catch (err) {
+        // Log but don't fail the rotation — the new token is already live.
+        const msg = err instanceof Error ? err.message : String(err);
+        // eslint-disable-next-line no-console
+        console.warn(`rotation: revoke old accessor '${oldAccessor}' on backend '${backend.name}' failed (continuing): ${msg}`);
+      }
+    }
+
+    // 6. Record success in tokenMeta.
+    const now = this.now();
+    const intervalHours = cfg.rotation.intervalHours ?? 24;
+    const nextMeta: TokenMeta = {
+      generatedAt: now.toISOString(),
+      nextRenewalAt: new Date(now.getTime() + intervalHours * 3600 * 1000).toISOString(),
+      validUntil: minted.leaseDuration > 0
+        ? new Date(now.getTime() + minted.leaseDuration * 1000).toISOString()
+        : undefined as unknown as string, // typed but optional; undefined drops on JSON round-trip
+      lastRotationAt: now.toISOString(),
+      lastRotationError: null,
+      currentAccessor: minted.accessor,
+      rotatable: true,
+    };
+    // Strip undefined so JSON is clean.
+    const cleanMeta: Record<string, unknown> = {};
+    for (const [k, v] of Object.entries(nextMeta)) {
+      if (v !== undefined) cleanMeta[k] = v;
+    }
+    await this.deps.backends.updateTokenMeta(backendId, cleanMeta);
+    return nextMeta;
+  }
+
+  /** Is this backend overdue for rotation? Used by the loop on startup. */
+  isOverdue(backend: SecretBackend): boolean {
+    const meta = (backend.tokenMeta as unknown as TokenMeta | null | undefined) ?? {};
+    if (meta.lastRotationAt === undefined) return true;
+    const last = new Date(meta.lastRotationAt).getTime();
+    if (Number.isNaN(last)) return true;
+    const cfg = backend.config as Partial<RotatableOpenBaoConfig>;
+    const intervalHours = cfg.rotation?.intervalHours ?? 24;
+    return this.now().getTime() - last > intervalHours * 3600 * 1000;
+  }
+
+  private async recordError(backendId: string, prev: TokenMeta, message: string): Promise<void> {
+    const nextMeta: Record<string, unknown> = { ...prev, lastRotationError: message };
+    try {
+      await this.deps.backends.updateTokenMeta(backendId, nextMeta);
+    } catch (inner) {
+      // Don't mask the original error — just log the DB failure.
+      // eslint-disable-next-line no-console
+      console.warn(`rotation: failed to persist lastRotationError (${message}): ${inner instanceof Error ? inner.message : String(inner)}`);
+    }
+  }
+}
--- a/src/mcpd/src/services/secret-backend.service.ts
+++ b/src/mcpd/src/services/secret-backend.service.ts
@@ -63,6 +63,16 @@ export class SecretBackendService {
    return row;
  }

+  /**
+   * Replace `tokenMeta` on a backend row. Called exclusively by the rotator
+   * service every time it mints or fails to mint a successor token. The field
+   * is runtime state (not user-managed config) so it bypasses the normal
+   * update path + doesn't invalidate the driver cache.
+   */
+  async updateTokenMeta(id: string, tokenMeta: Record<string, unknown>): Promise<SecretBackend> {
+    return this.repo.update(id, { tokenMeta });
+  }
+
  async setDefault(id: string): Promise<SecretBackend> {
    await this.getById(id);
    return this.repo.setAsDefault(id);
--- a/src/mcpd/src/services/secret-backends/factory.ts
+++ b/src/mcpd/src/services/secret-backends/factory.ts
@@ -25,10 +25,26 @@ export function createDriver(row: SecretBackend, deps: DriverFactoryDeps): Secre

    case 'openbao': {
      const cfg = row.config as unknown as OpenBaoConfig;
-      if (!cfg.url || !cfg.tokenSecretRef?.name || !cfg.tokenSecretRef?.key) {
-        throw new Error(
-          `SecretBackend '${row.name}' (openbao): config must provide url + tokenSecretRef {name, key}`,
-        );
+      if (!cfg.url) {
+        throw new Error(`SecretBackend '${row.name}' (openbao): config.url is required`);
+      }
+      const auth = cfg.auth ?? 'token';
+      if (auth === 'token') {
+        const t = cfg as Extract<OpenBaoConfig, { auth?: 'token' }>;
+        if (!t.tokenSecretRef?.name || !t.tokenSecretRef?.key) {
+          throw new Error(
+            `SecretBackend '${row.name}' (openbao token auth): config.tokenSecretRef {name, key} is required`,
+          );
+        }
+      } else if (auth === 'kubernetes') {
+        const k = cfg as Extract<OpenBaoConfig, { auth: 'kubernetes' }>;
+        if (!k.role) {
+          throw new Error(
+            `SecretBackend '${row.name}' (openbao kubernetes auth): config.role is required`,
+          );
+        }
+      } else {
+        throw new Error(`SecretBackend '${row.name}' (openbao): unknown auth '${String(auth)}'`);
      }
      const driverDeps: { fetch?: typeof globalThis.fetch; secretRefResolver: SecretRefResolver } = {
        secretRefResolver: deps.secretRefResolver,
--- a/src/mcpd/src/services/secret-backends/openbao.ts
+++ b/src/mcpd/src/services/secret-backends/openbao.ts
@@ -8,33 +8,69 @@
 *   GET    <url>/v1/<mount>/data/<path>     -- read latest
 *   DELETE <url>/v1/<mount>/metadata/<path> -- full delete (all versions)
 *   LIST   <url>/v1/<mount>/metadata/        -- for migration
+ *   POST   <url>/v1/auth/<mount>/login      -- kubernetes auth
 *
- * Auth: static token for v1. The token is stored in a `Secret` on the
- * plaintext backend (see `config.tokenSecretRef = { name, key }`); the driver
- * resolves it on construction via the injected `SecretRefResolver`. Follow-up
- * work (not here) adds Kubernetes ServiceAccount auth.
+ * Auth strategies (`config.auth`):
+ *   - `token` (default): static token loaded once via the injected
+ *     SecretRefResolver from a Secret on the plaintext backend
+ *     (`tokenSecretRef = { name, key }`). Cached for the driver's lifetime —
+ *     no expiry handling.
+ *   - `kubernetes`: log in to OpenBao's Kubernetes auth method using the
+ *     pod's projected ServiceAccount token. Vault returns a client token +
+ *     lease TTL; we cache it and renew lazily on TTL expiry, with a 60s
+ *     grace window. No static credentials in the database — the bao-side
+ *     role binds to the mcpd ServiceAccount + namespace.
 *
 * Path layout inside OpenBao:
 *   <mount>/<pathPrefix>/<secretName>
 * `mount` and `pathPrefix` come from the backend's `config` JSON; defaults are
 * `secret` and `mcpctl/`.
 */
+import { readFile } from 'node:fs/promises';
 import type { SecretBackendDriver, SecretData, ExternalRef, SecretRefResolver } from './types.js';

-export interface OpenBaoConfig {
+export interface OpenBaoConfigBase {
  url: string;
  mount?: string;
  pathPrefix?: string;
  namespace?: string;
+}
+
+export interface OpenBaoConfigToken extends OpenBaoConfigBase {
+  auth?: 'token';
  tokenSecretRef: { name: string; key: string };
 }

+export interface OpenBaoConfigKubernetes extends OpenBaoConfigBase {
+  auth: 'kubernetes';
+  /** Vault role to login as (configured server-side at `auth/<authMount>/role/<role>`). */
+  role: string;
+  /** Auth method mount path. Defaults to `kubernetes`. */
+  authMount?: string;
+  /**
+   * Filesystem path to the projected ServiceAccount token. Defaults to
+   * `/var/run/secrets/kubernetes.io/serviceaccount/token` (the standard
+   * mount). Override only for tests or non-default projections.
+   */
+  serviceAccountTokenPath?: string;
+}
+
+export type OpenBaoConfig = OpenBaoConfigToken | OpenBaoConfigKubernetes;
+
 export interface OpenBaoDriverDeps {
  /** Injected HTTP fetcher — mockable in tests. */
  fetch?: typeof globalThis.fetch;
-  secretRefResolver: SecretRefResolver;
+  /** Required only for `auth: 'token'`. */
+  secretRefResolver?: SecretRefResolver;
+  /** Override for the SA-token reader; tests use this to supply a fake JWT. */
+  readServiceAccountToken?: (path: string) => Promise<string>;
+  /** Clock for cache TTL — overridable in tests. */
+  now?: () => number;
 }

+const SA_TOKEN_DEFAULT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token';
+const TOKEN_RENEW_GRACE_MS = 60_000;
+
 export class OpenBaoDriver implements SecretBackendDriver {
  readonly kind = 'openbao';

@@ -42,19 +78,48 @@ export class OpenBaoDriver implements SecretBackendDriver {
  private readonly mount: string;
  private readonly pathPrefix: string;
  private readonly namespace: string | undefined;
-  private readonly tokenSecretRef: { name: string; key: string };
+  private readonly authStrategy: 'token' | 'kubernetes';
+  private readonly tokenSecretRef: { name: string; key: string } | undefined;
+  private readonly k8sRole: string | undefined;
+  private readonly k8sAuthMount: string;
+  private readonly k8sTokenPath: string;
  private readonly fetchImpl: typeof globalThis.fetch;
-  private readonly resolver: SecretRefResolver;
+  private readonly resolver: SecretRefResolver | undefined;
+  private readonly readSaToken: (path: string) => Promise<string>;
+  private readonly nowFn: () => number;
+
+  // Cached vault token + when (epoch ms) it should be considered expired and refetched.
  private cachedToken: string | undefined;
+  private cachedTokenExpiresAt: number = Number.POSITIVE_INFINITY;

  constructor(config: OpenBaoConfig, deps: OpenBaoDriverDeps) {
    this.url = config.url.replace(/\/+$/, '');
    this.mount = (config.mount ?? 'secret').replace(/^\/|\/$/g, '');
    this.pathPrefix = (config.pathPrefix ?? 'mcpctl').replace(/^\/|\/$/g, '');
    if (config.namespace !== undefined) this.namespace = config.namespace;
-    this.tokenSecretRef = config.tokenSecretRef;
+
+    this.authStrategy = config.auth ?? 'token';
+    if (this.authStrategy === 'kubernetes') {
+      const k = config as OpenBaoConfigKubernetes;
+      if (!k.role) throw new Error('openbao kubernetes auth: `role` is required');
+      this.k8sRole = k.role;
+      this.k8sAuthMount = (k.authMount ?? 'kubernetes').replace(/^\/|\/$/g, '');
+      this.k8sTokenPath = k.serviceAccountTokenPath ?? SA_TOKEN_DEFAULT_PATH;
+    } else {
+      const t = config as OpenBaoConfigToken;
+      if (!t.tokenSecretRef) throw new Error('openbao token auth: `tokenSecretRef` is required');
+      if (deps.secretRefResolver === undefined) {
+        throw new Error('openbao token auth: secretRefResolver dependency is required');
+      }
+      this.tokenSecretRef = t.tokenSecretRef;
+      this.k8sAuthMount = 'kubernetes';
+      this.k8sTokenPath = SA_TOKEN_DEFAULT_PATH;
+    }
+
    this.fetchImpl = deps.fetch ?? globalThis.fetch;
-    this.resolver = deps.secretRefResolver;
+    if (deps.secretRefResolver !== undefined) this.resolver = deps.secretRefResolver;
+    this.readSaToken = deps.readServiceAccountToken ?? ((path) => readFile(path, 'utf-8').then((s) => s.trim()));
+    this.nowFn = deps.now ?? (() => Date.now());
  }

  async read(input: { name: string; externalRef: ExternalRef; data: SecretData }): Promise<SecretData> {
@@ -113,10 +178,44 @@ export class OpenBaoDriver implements SecretBackendDriver {
  }

  private async getToken(): Promise<string> {
-    if (this.cachedToken !== undefined) return this.cachedToken;
-    const token = await this.resolver.resolve(this.tokenSecretRef.name, this.tokenSecretRef.key);
-    this.cachedToken = token;
-    return token;
+    if (this.cachedToken !== undefined && this.nowFn() < this.cachedTokenExpiresAt - TOKEN_RENEW_GRACE_MS) {
+      return this.cachedToken;
+    }
+
+    if (this.authStrategy === 'token') {
+      // Static token from a plaintext Secret. No TTL — cache for the driver's lifetime.
+      const token = await this.resolver!.resolve(this.tokenSecretRef!.name, this.tokenSecretRef!.key);
+      this.cachedToken = token;
+      this.cachedTokenExpiresAt = Number.POSITIVE_INFINITY;
+      return token;
+    }
+
+    // Kubernetes auth: read the projected SA JWT, exchange it for a Vault token.
+    const jwt = await this.readSaToken(this.k8sTokenPath);
+    const loginUrl = `${this.url}/v1/auth/${this.k8sAuthMount}/login`;
+    const headers: Record<string, string> = { 'Content-Type': 'application/json' };
+    if (this.namespace !== undefined) headers['X-Vault-Namespace'] = this.namespace;
+    const res = await this.fetchImpl(loginUrl, {
+      method: 'POST',
+      headers,
+      body: JSON.stringify({ role: this.k8sRole, jwt }),
+    });
+    if (!res.ok) {
+      const text = await res.text().catch(() => '');
+      throw new Error(`OpenBao kubernetes login (role=${this.k8sRole!}): HTTP ${String(res.status)} ${text}`);
+    }
+    const body = await res.json() as { auth?: { client_token?: string; lease_duration?: number } };
+    const clientToken = body.auth?.client_token;
+    if (clientToken === undefined || clientToken === '') {
+      throw new Error(`OpenBao kubernetes login: response missing auth.client_token`);
+    }
+    // lease_duration is seconds; 0 means token doesn't expire (rare for k8s auth).
+    const leaseSec = body.auth?.lease_duration ?? 0;
+    this.cachedToken = clientToken;
+    this.cachedTokenExpiresAt = leaseSec > 0
+      ? this.nowFn() + leaseSec * 1000
+      : Number.POSITIVE_INFINITY;
+    return clientToken;
  }

  private async request(method: string, path: string, body?: unknown): Promise<Response> {
@@ -128,6 +227,21 @@ export class OpenBaoDriver implements SecretBackendDriver {
    const init: RequestInit = { method, headers };
    if (body !== undefined) init.body = JSON.stringify(body);

-    return this.fetchImpl(`${this.url}${path}`, init);
+    const res = await this.fetchImpl(`${this.url}${path}`, init);
+
+    // If the cached token expired between cache-check and request (k8s clock
+    // skew, server-side revocation, etc.), purge cache and retry once.
+    if (res.status === 403 && this.cachedToken !== undefined) {
+      this.cachedToken = undefined;
+      this.cachedTokenExpiresAt = 0;
+      const fresh = await this.getToken();
+      const retryHeaders: Record<string, string> = { 'X-Vault-Token': fresh };
+      if (this.namespace !== undefined) retryHeaders['X-Vault-Namespace'] = this.namespace;
+      if (body !== undefined) retryHeaders['Content-Type'] = 'application/json';
+      const retryInit: RequestInit = { method, headers: retryHeaders };
+      if (body !== undefined) retryInit.body = JSON.stringify(body);
+      return this.fetchImpl(`${this.url}${path}`, retryInit);
+    }
+    return res;
  }
 }
--- a/src/mcpd/tests/llm-adapters.test.ts
+++ b/src/mcpd/tests/llm-adapters.test.ts
@@ -0,0 +1,210 @@
+import { describe, it, expect, vi } from 'vitest';
+import { OpenAiPassthroughAdapter } from '../src/services/llm/adapters/openai-passthrough.js';
+import { AnthropicAdapter } from '../src/services/llm/adapters/anthropic.js';
+import { LlmAdapterRegistry, UnsupportedProviderError } from '../src/services/llm/dispatcher.js';
+import type { InferContext } from '../src/services/llm/types.js';
+
+function mockFetch(responses: Array<{ match: RegExp; status: number; body?: unknown; text?: string }>): ReturnType<typeof vi.fn> {
+  return vi.fn(async (input: string | URL, _init?: RequestInit) => {
+    const url = String(input);
+    const match = responses.find((r) => r.match.test(url));
+    if (!match) throw new Error(`unexpected fetch: ${url}`);
+    const body = match.body !== undefined ? JSON.stringify(match.body) : (match.text ?? '');
+    return new Response(body, { status: match.status, headers: { 'Content-Type': 'application/json' } });
+  });
+}
+
+function makeCtx(overrides: Partial<InferContext> = {}): InferContext {
+  return {
+    body: { model: '', messages: [{ role: 'user', content: 'hello' }] },
+    modelOverride: 'default-model',
+    apiKey: 'test-key',
+    url: '',
+    extraConfig: {},
+    ...overrides,
+  };
+}
+
+// Helper to build a streaming Response from SSE lines.
+function sseResponse(events: string[]): Response {
+  const body = events.join('\n\n') + '\n\n';
+  const stream = new ReadableStream<Uint8Array>({
+    start(controller) {
+      controller.enqueue(new TextEncoder().encode(body));
+      controller.close();
+    },
+  });
+  return new Response(stream, { status: 200, headers: { 'Content-Type': 'text/event-stream' } });
+}
+
+describe('OpenAiPassthroughAdapter', () => {
+  it('infer: POSTs to <url>/v1/chat/completions with Authorization + body', async () => {
+    const fetchFn = mockFetch([{
+      match: /\/v1\/chat\/completions$/,
+      status: 200,
+      body: { id: 'x', choices: [{ message: { role: 'assistant', content: 'hi' } }] },
+    }]);
+    const adapter = new OpenAiPassthroughAdapter('openai', { fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({ url: 'https://api.example.com' });
+    const res = await adapter.infer(ctx);
+    expect(res.status).toBe(200);
+    const [url, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect(url).toBe('https://api.example.com/v1/chat/completions');
+    expect(init.method).toBe('POST');
+    const headers = init.headers as Record<string, string>;
+    expect(headers['Authorization']).toBe('Bearer test-key');
+    const sent = JSON.parse(init.body as string) as { model: string; stream: boolean };
+    expect(sent.model).toBe('default-model');  // filled from modelOverride
+    expect(sent.stream).toBe(false);
+  });
+
+  it('infer: uses default URL for openai when url is empty', async () => {
+    const fetchFn = mockFetch([{ match: /api\.openai\.com/, status: 200, body: {} }]);
+    const adapter = new OpenAiPassthroughAdapter('openai', { fetch: fetchFn as unknown as typeof fetch });
+    await adapter.infer(makeCtx());
+    const [url] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect(url).toBe('https://api.openai.com/v1/chat/completions');
+  });
+
+  it('infer: throws for vllm when url is empty (no default)', async () => {
+    const adapter = new OpenAiPassthroughAdapter('vllm', { fetch: vi.fn() as unknown as typeof fetch });
+    await expect(adapter.infer(makeCtx())).rejects.toThrow(/no default endpoint/);
+  });
+
+  it('infer: omits Authorization when apiKey is empty', async () => {
+    const fetchFn = mockFetch([{ match: /ollama/, status: 200, body: {} }]);
+    const adapter = new OpenAiPassthroughAdapter('ollama', { fetch: fetchFn as unknown as typeof fetch });
+    await adapter.infer(makeCtx({ url: 'http://ollama:11434', apiKey: '' }));
+    const [, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    const headers = init.headers as Record<string, string>;
+    expect(headers['Authorization']).toBeUndefined();
+  });
+
+  it('stream: forwards SSE chunks and emits terminal [DONE]', async () => {
+    const fetchFn = vi.fn(async () => sseResponse([
+      'data: {"choices":[{"delta":{"content":"hi"}}]}',
+      'data: {"choices":[{"delta":{"content":"!"}}]}',
+      'data: [DONE]',
+    ]));
+    const adapter = new OpenAiPassthroughAdapter('openai', { fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({ url: 'http://example', body: { model: '', messages: [], stream: true } });
+    const chunks: { data: string; done?: boolean }[] = [];
+    for await (const c of adapter.stream(ctx)) chunks.push(c);
+    expect(chunks).toHaveLength(3);
+    expect(chunks[2]?.done).toBe(true);
+  });
+});
+
+describe('AnthropicAdapter', () => {
+  it('infer: translates system+user messages, posts to /v1/messages', async () => {
+    const fetchFn = mockFetch([{
+      match: /\/v1\/messages$/,
+      status: 200,
+      body: {
+        id: 'msg_01', model: 'claude-3-5-sonnet-20241022', role: 'assistant',
+        content: [{ type: 'text', text: 'howdy' }],
+        stop_reason: 'end_turn',
+        usage: { input_tokens: 5, output_tokens: 2 },
+      },
+    }]);
+    const adapter = new AnthropicAdapter({ fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({
+      body: {
+        model: '', messages: [
+          { role: 'system', content: 'be nice' },
+          { role: 'user', content: 'hi' },
+        ],
+      },
+      modelOverride: 'claude-3-5-sonnet-20241022',
+    });
+    const res = await adapter.infer(ctx);
+    expect(res.status).toBe(200);
+
+    const [url, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect(url).toBe('https://api.anthropic.com/v1/messages');
+    const headers = init.headers as Record<string, string>;
+    expect(headers['x-api-key']).toBe('test-key');
+    expect(headers['anthropic-version']).toBeDefined();
+
+    const sent = JSON.parse(init.body as string) as {
+      model: string; system: string; messages: Array<{ role: string; content: string }>; max_tokens: number;
+    };
+    expect(sent.model).toBe('claude-3-5-sonnet-20241022');
+    expect(sent.system).toBe('be nice');
+    expect(sent.messages).toEqual([{ role: 'user', content: 'hi' }]);
+    expect(sent.max_tokens).toBe(1024); // default
+
+    // Response shape: OpenAI chat.completion
+    const body = res.body as { choices: Array<{ message: { content: string }; finish_reason: string }>; usage: { total_tokens: number } };
+    expect(body.choices[0]!.message.content).toBe('howdy');
+    expect(body.choices[0]!.finish_reason).toBe('stop');
+    expect(body.usage.total_tokens).toBe(7);
+  });
+
+  it('infer: returns a synthetic error body on non-2xx', async () => {
+    const fetchFn = vi.fn(async () => new Response('boom', { status: 500 }));
+    const adapter = new AnthropicAdapter({ fetch: fetchFn as unknown as typeof fetch });
+    const res = await adapter.infer(makeCtx({ body: { model: '', messages: [{ role: 'user', content: 'x' }] } }));
+    expect(res.status).toBe(500);
+    const body = res.body as { error: { message: string } };
+    expect(body.error.message).toMatch(/HTTP 500/);
+  });
+
+  it('stream: translates anthropic event stream into OpenAI chunks', async () => {
+    const events = [
+      'event: message_start\ndata: {"type":"message_start","message":{"id":"m","content":[]}}',
+      'event: content_block_delta\ndata: {"type":"content_block_delta","delta":{"type":"text_delta","text":"he"}}',
+      'event: content_block_delta\ndata: {"type":"content_block_delta","delta":{"type":"text_delta","text":"llo"}}',
+      'event: message_delta\ndata: {"type":"message_delta","delta":{"stop_reason":"end_turn"}}',
+      'event: message_stop\ndata: {"type":"message_stop"}',
+    ];
+    const fetchFn = vi.fn(async () => sseResponse(events));
+    const adapter = new AnthropicAdapter({ fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({ body: { model: '', messages: [{ role: 'user', content: 'hi' }], stream: true } });
+
+    const chunks: { data: string; done?: boolean }[] = [];
+    for await (const c of adapter.stream(ctx)) chunks.push(c);
+
+    // Expect: role-prime, two text deltas, finish-reason, [DONE]
+    expect(chunks[chunks.length - 1]?.data).toBe('[DONE]');
+    expect(chunks[chunks.length - 1]?.done).toBe(true);
+
+    // First chunk is the role-prime (role: assistant, content: '').
+    const first = JSON.parse(chunks[0]!.data) as { choices: [{ delta: { role: string; content: string } }] };
+    expect(first.choices[0]!.delta.role).toBe('assistant');
+
+    // Next two chunks carry the text.
+    const d1 = JSON.parse(chunks[1]!.data) as { choices: [{ delta: { content: string } }] };
+    const d2 = JSON.parse(chunks[2]!.data) as { choices: [{ delta: { content: string } }] };
+    expect(d1.choices[0]!.delta.content).toBe('he');
+    expect(d2.choices[0]!.delta.content).toBe('llo');
+
+    // Finish-reason chunk.
+    const stopped = JSON.parse(chunks[3]!.data) as { choices: [{ finish_reason: string }] };
+    expect(stopped.choices[0]!.finish_reason).toBe('stop');
+  });
+});
+
+describe('LlmAdapterRegistry', () => {
+  it('returns the right adapter kind for each type', () => {
+    const reg = new LlmAdapterRegistry();
+    expect(reg.get('openai').kind).toBe('openai');
+    expect(reg.get('vllm').kind).toBe('vllm');
+    expect(reg.get('deepseek').kind).toBe('deepseek');
+    expect(reg.get('ollama').kind).toBe('ollama');
+    expect(reg.get('anthropic').kind).toBe('anthropic');
+  });
+
+  it('caches adapters between calls', () => {
+    const reg = new LlmAdapterRegistry();
+    const a = reg.get('openai');
+    const b = reg.get('openai');
+    expect(a).toBe(b);
+  });
+
+  it('rejects unsupported providers (gemini-cli is deferred)', () => {
+    const reg = new LlmAdapterRegistry();
+    expect(() => reg.get('gemini-cli')).toThrow(UnsupportedProviderError);
+    expect(() => reg.get('bogus')).toThrow(UnsupportedProviderError);
+  });
+});
--- a/src/mcpd/tests/llm-infer-route.test.ts
+++ b/src/mcpd/tests/llm-infer-route.test.ts
@@ -0,0 +1,208 @@
+import { describe, it, expect, vi, afterEach } from 'vitest';
+import Fastify from 'fastify';
+import type { FastifyInstance } from 'fastify';
+import { registerLlmInferRoutes } from '../src/routes/llm-infer.js';
+import { LlmAdapterRegistry } from '../src/services/llm/dispatcher.js';
+import { errorHandler } from '../src/middleware/error-handler.js';
+import type { LlmView } from '../src/services/llm.service.js';
+import { NotFoundError } from '../src/services/mcp-server.service.js';
+
+let app: FastifyInstance;
+
+function makeLlmView(overrides: Partial<LlmView> = {}): LlmView {
+  return {
+    id: 'llm-1',
+    name: 'claude',
+    type: 'anthropic',
+    model: 'claude-3-5-sonnet-20241022',
+    url: '',
+    tier: 'heavy',
+    description: '',
+    apiKeyRef: { name: 'anthropic-key', key: 'token' },
+    extraConfig: {},
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+afterEach(async () => {
+  if (app) await app.close();
+});
+
+function sseResponse(events: string[]): Response {
+  const body = events.join('\n\n') + '\n\n';
+  const stream = new ReadableStream<Uint8Array>({
+    start(controller) {
+      controller.enqueue(new TextEncoder().encode(body));
+      controller.close();
+    },
+  });
+  return new Response(stream, { status: 200 });
+}
+
+interface LlmServiceLike {
+  getByName: (name: string) => Promise<LlmView>;
+  resolveApiKey: (name: string) => Promise<string>;
+}
+
+async function setupApp(
+  llmService: LlmServiceLike,
+  adapters: LlmAdapterRegistry,
+  onInferenceEvent?: Parameters<typeof registerLlmInferRoutes>[1]['onInferenceEvent'],
+): Promise<FastifyInstance> {
+  app = Fastify({ logger: false });
+  app.setErrorHandler(errorHandler);
+  const deps: Parameters<typeof registerLlmInferRoutes>[1] = {
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    llmService: llmService as any,
+    adapters,
+  };
+  if (onInferenceEvent !== undefined) deps.onInferenceEvent = onInferenceEvent;
+  registerLlmInferRoutes(app, deps);
+  await app.ready();
+  return app;
+}
+
+describe('POST /api/v1/llms/:name/infer', () => {
+  it('returns 404 when the Llm does not exist', async () => {
+    const svc: LlmServiceLike = {
+      getByName: async () => { throw new NotFoundError('Llm not found: missing'); },
+      resolveApiKey: async () => '',
+    };
+    await setupApp(svc, new LlmAdapterRegistry());
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/missing/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(404);
+  });
+
+  it('returns 400 when messages is missing', async () => {
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView({ apiKeyRef: null }),
+      resolveApiKey: async () => '',
+    };
+    await setupApp(svc, new LlmAdapterRegistry());
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: {},
+    });
+    expect(res.statusCode).toBe(400);
+  });
+
+  it('dispatches non-streaming to the adapter and returns its JSON', async () => {
+    const fetchFn = vi.fn(async () => new Response(JSON.stringify({
+      id: 'msg_1', model: 'claude-3-5-sonnet-20241022', role: 'assistant',
+      content: [{ type: 'text', text: 'hello' }],
+      stop_reason: 'end_turn',
+      usage: { input_tokens: 1, output_tokens: 1 },
+    }), { status: 200 }));
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView(),
+      resolveApiKey: async () => 'sk-ant-xyz',
+    };
+    const events: unknown[] = [];
+    await setupApp(svc, adapters, (e) => events.push(e));
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(200);
+    const body = res.json<{ choices: Array<{ message: { content: string } }> }>();
+    expect(body.choices[0]!.message.content).toBe('hello');
+
+    // Audit event emitted
+    expect(events).toHaveLength(1);
+    expect((events[0] as { kind: string; llmName: string; status: number }).kind).toBe('llm_inference_call');
+    expect((events[0] as { llmName: string }).llmName).toBe('claude');
+    expect((events[0] as { streaming: boolean }).streaming).toBe(false);
+    expect((events[0] as { status: number }).status).toBe(200);
+  });
+
+  it('500s when apiKey resolution fails', async () => {
+    const adapters = new LlmAdapterRegistry();
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView(),
+      resolveApiKey: async () => { throw new Error('secret not found'); },
+    };
+    await setupApp(svc, adapters);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(500);
+    expect(res.json<{ error: string }>().error).toMatch(/secret not found/);
+  });
+
+  it('skips apiKey resolution when the Llm has no apiKeyRef', async () => {
+    const fetchFn = vi.fn(async () => new Response(JSON.stringify({ id: 'x', choices: [] }), { status: 200 }));
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const resolveSpy = vi.fn();
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView({ type: 'ollama', url: 'http://ollama:11434', apiKeyRef: null }),
+      resolveApiKey: resolveSpy as unknown as LlmServiceLike['resolveApiKey'],
+    };
+    await setupApp(svc, adapters);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/ollama-local/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(200);
+    expect(resolveSpy).not.toHaveBeenCalled();
+  });
+
+  it('streams SSE chunks for stream: true', async () => {
+    const fetchFn = vi.fn(async () => sseResponse([
+      'event: content_block_delta\ndata: {"type":"content_block_delta","delta":{"type":"text_delta","text":"hi"}}',
+      'event: message_stop\ndata: {"type":"message_stop"}',
+    ]));
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView(),
+      resolveApiKey: async () => 'sk-ant-xyz',
+    };
+    const events: Array<{ streaming: boolean; status: number }> = [];
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    await setupApp(svc, adapters, ((e: any) => events.push(e)) as any);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }], stream: true },
+    });
+    expect(res.statusCode).toBe(200);
+    expect(res.body).toContain('data:');
+    expect(res.body).toContain('[DONE]');
+    expect(events).toHaveLength(1);
+    expect(events[0]!.streaming).toBe(true);
+  });
+
+  it('502s on adapter errors (non-streaming)', async () => {
+    const fetchFn = vi.fn(async () => { throw new Error('upstream down'); });
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView({ type: 'openai', url: 'http://example', apiKeyRef: null }),
+      resolveApiKey: async () => '',
+    };
+    await setupApp(svc, adapters);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/x/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(502);
+    expect(res.json<{ error: string }>().error).toMatch(/upstream down/);
+  });
+});
--- a/src/mcpd/tests/llm-routes.test.ts
+++ b/src/mcpd/tests/llm-routes.test.ts
@@ -104,6 +104,25 @@ describe('Llm Routes', () => {
    expect(res.statusCode).toBe(404);
  });

+  it('GET /api/v1/llms/:nameOrId resolves by human name when not a CUID', async () => {
+    await createApp(mockRepo([makeLlm({ id: 'llm-1', name: 'claude' })]));
+    const res = await app.inject({ method: 'GET', url: '/api/v1/llms/claude' });
+    expect(res.statusCode).toBe(200);
+    expect(res.json<{ name: string; id: string }>().name).toBe('claude');
+  });
+
+  it('HEAD /api/v1/llms/:name returns 200 for an existing Llm (failover RBAC pre-check)', async () => {
+    await createApp(mockRepo([makeLlm({ name: 'claude' })]));
+    const res = await app.inject({ method: 'HEAD', url: '/api/v1/llms/claude' });
+    expect(res.statusCode).toBe(200);
+  });
+
+  it('HEAD /api/v1/llms/:name returns 404 for a missing Llm', async () => {
+    await createApp(mockRepo());
+    const res = await app.inject({ method: 'HEAD', url: '/api/v1/llms/missing' });
+    expect(res.statusCode).toBe(404);
+  });
+
  it('POST /api/v1/llms creates and returns 201', async () => {
    await createApp(mockRepo());
    const res = await app.inject({
--- a/src/mcpd/tests/secret-backend-rotator.test.ts
+++ b/src/mcpd/tests/secret-backend-rotator.test.ts
@@ -0,0 +1,276 @@
+import { describe, it, expect, vi } from 'vitest';
+import { SecretBackendRotator } from '../src/services/secret-backend-rotator.service.js';
+import type { SecretBackend, Secret } from '@prisma/client';
+
+function makeBackend(overrides: Partial<SecretBackend> = {}): SecretBackend {
+  return {
+    id: 'backend-1',
+    name: 'bao',
+    type: 'openbao',
+    config: {
+      url: 'http://bao.example:8200',
+      auth: 'token',
+      mount: 'secret',
+      pathPrefix: 'mcpd',
+      tokenSecretRef: { name: 'bao-creds', key: 'token' },
+      rotation: { enabled: true, tokenRole: 'app-mcpd-role', intervalHours: 24 },
+    } as unknown as SecretBackend['config'],
+    tokenMeta: { rotatable: true } as unknown as SecretBackend['tokenMeta'],
+    isDefault: false,
+    description: '',
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+function makeSecret(overrides: Partial<Secret> = {}): Secret {
+  return {
+    id: 'sec-1',
+    name: 'bao-creds',
+    backendId: 'backend-plaintext',
+    data: { token: 'old.token.value' },
+    externalRef: '',
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+interface MockState {
+  backend: SecretBackend;
+  secret: Secret;
+  secretData: Record<string, string>;
+  lastTokenMeta: Record<string, unknown> | null;
+  lastSecretUpdate: Record<string, unknown> | null;
+}
+
+function mockDeps(state: MockState, vaultResponses: Array<{ match: RegExp; status: number; body?: unknown }>) {
+  const fetchFn = vi.fn(async (url: string | URL, init?: RequestInit) => {
+    const key = `${init?.method ?? 'GET'} ${String(url)}`;
+    const match = vaultResponses.find((r) => r.match.test(key) || r.match.test(String(url)));
+    if (!match) throw new Error(`unexpected vault call: ${key}`);
+    const body = match.body !== undefined ? JSON.stringify(match.body) : '';
+    return new Response(body, { status: match.status });
+  });
+
+  const backends = {
+    getById: vi.fn(async (id: string) => {
+      if (id === state.backend.id) return state.backend;
+      throw new Error(`not found: ${id}`);
+    }),
+    updateTokenMeta: vi.fn(async (id: string, meta: Record<string, unknown>) => {
+      expect(id).toBe(state.backend.id);
+      state.lastTokenMeta = meta;
+      state.backend = { ...state.backend, tokenMeta: meta as unknown as SecretBackend['tokenMeta'] };
+      return state.backend;
+    }),
+  };
+
+  const secrets = {
+    getByName: vi.fn(async (name: string) => {
+      if (name === state.secret.name) return state.secret;
+      throw new Error(`secret not found: ${name}`);
+    }),
+    resolveData: vi.fn(async () => ({ ...state.secretData })),
+    update: vi.fn(async (id: string, input: { data: Record<string, string> }) => {
+      expect(id).toBe(state.secret.id);
+      state.secretData = { ...input.data };
+      state.lastSecretUpdate = input as unknown as Record<string, unknown>;
+      return state.secret;
+    }),
+  };
+
+  return { fetchFn, backends, secrets };
+}
+
+describe('SecretBackendRotator', () => {
+  it('isRotatable: true for wizard-provisioned openbao', () => {
+    const state: MockState = {
+      backend: makeBackend(),
+      secret: makeSecret(),
+      secretData: { token: 'x' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const r = new SecretBackendRotator({
+      backends: backends as unknown as Parameters<typeof SecretBackendRotator.prototype.rotateOne>[0] extends never ? never : never,
+      // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    } as any);
+    // Use a real rotator with both deps filled.
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+    });
+    expect(rotator.isRotatable(state.backend)).toBe(true);
+    expect(r).toBeDefined();
+  });
+
+  it('isRotatable: false for kubernetes-auth openbao', () => {
+    const state: MockState = {
+      backend: makeBackend({
+        config: {
+          url: 'http://bao', auth: 'kubernetes', role: 'r',
+          rotation: { enabled: true, tokenRole: 'app-mcpd-role' },
+        } as unknown as SecretBackend['config'],
+      }),
+      secret: makeSecret(),
+      secretData: {},
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const rotator = new SecretBackendRotator({ backends: backends as never, secrets: secrets as never });
+    expect(rotator.isRotatable(state.backend)).toBe(false);
+  });
+
+  it('rotateOne: mints → verifies → persists → revokes old → updates tokenMeta', async () => {
+    const state: MockState = {
+      backend: makeBackend({ tokenMeta: { rotatable: true, currentAccessor: 'old-accessor' } as unknown as SecretBackend['tokenMeta'] }),
+      secret: makeSecret({ data: { token: 'old.token.value' } as Secret['data'] }),
+      secretData: { token: 'old.token.value' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /POST .*auth\/token\/create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'new.token.value', accessor: 'new-accessor', lease_duration: 720 * 3600, renewable: true } } },
+      { match: /GET .*auth\/token\/lookup-self$/, status: 200, body: { data: { accessor: 'new-accessor', ttl: 720 * 3600 } } },
+      { match: /POST .*auth\/token\/revoke-accessor$/, status: 200 },
+    ]);
+
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+      now: () => new Date('2026-04-20T10:00:00Z'),
+    });
+
+    const meta = await rotator.rotateOne(state.backend.id);
+
+    // Correct order of HTTP calls: create (with OLD token) → lookup (with NEW token) → revoke (with NEW token)
+    const calls = fetchFn.mock.calls.map((c) => `${(c[1] as RequestInit).method ?? 'GET'} ${String(c[0])}`);
+    expect(calls[0]).toMatch(/POST .*create\/app-mcpd-role/);
+    expect(calls[1]).toMatch(/GET .*lookup-self/);
+    expect(calls[2]).toMatch(/POST .*revoke-accessor/);
+    expect((fetchFn.mock.calls[0]![1] as RequestInit).headers).toMatchObject({ 'X-Vault-Token': 'old.token.value' });
+    expect((fetchFn.mock.calls[1]![1] as RequestInit).headers).toMatchObject({ 'X-Vault-Token': 'new.token.value' });
+    expect((fetchFn.mock.calls[2]![1] as RequestInit).headers).toMatchObject({ 'X-Vault-Token': 'new.token.value' });
+
+    // Secret was updated BEFORE revoke — state reflects ordering by sequence above.
+    expect(state.secretData.token).toBe('new.token.value');
+
+    // tokenMeta carries fresh timestamps + accessor
+    expect(meta.currentAccessor).toBe('new-accessor');
+    expect(meta.lastRotationError).toBeNull();
+    expect(meta.generatedAt).toBe('2026-04-20T10:00:00.000Z');
+    expect(meta.nextRenewalAt).toBe('2026-04-21T10:00:00.000Z');
+    expect(meta.validUntil).toBe('2026-05-20T10:00:00.000Z');
+    expect(state.lastTokenMeta?.rotatable).toBe(true);
+  });
+
+  it('rotateOne: on mint failure, records lastRotationError and keeps old token', async () => {
+    const state: MockState = {
+      backend: makeBackend(),
+      secret: makeSecret({ data: { token: 'old.token' } as Secret['data'] }),
+      secretData: { token: 'old.token' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /create\/app-mcpd-role$/, status: 403, body: { errors: ['permission denied'] } },
+    ]);
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+    });
+
+    await expect(rotator.rotateOne(state.backend.id)).rejects.toThrow(/HTTP 403/);
+
+    // Secret was NOT updated
+    expect(state.secretData.token).toBe('old.token');
+    expect(secrets.update).not.toHaveBeenCalled();
+    // tokenMeta records the error
+    expect(state.lastTokenMeta?.lastRotationError).toMatch(/HTTP 403/);
+  });
+
+  it('rotateOne: rejects when minted token is not renewable', async () => {
+    const state: MockState = {
+      backend: makeBackend(),
+      secret: makeSecret({ data: { token: 'old' } as Secret['data'] }),
+      secretData: { token: 'old' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'new', accessor: 'a', lease_duration: 100, renewable: false } } },
+    ]);
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+    });
+    await expect(rotator.rotateOne(state.backend.id)).rejects.toThrow(/not renewable/);
+    expect(state.secretData.token).toBe('old');
+  });
+
+  it('rotateOne: continues despite revoke-accessor failure (old token expires anyway)', async () => {
+    const state: MockState = {
+      backend: makeBackend({ tokenMeta: { rotatable: true, currentAccessor: 'old-accessor' } as unknown as SecretBackend['tokenMeta'] }),
+      secret: makeSecret({ data: { token: 'old' } as Secret['data'] }),
+      secretData: { token: 'old' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'new', accessor: 'new-a', lease_duration: 3600, renewable: true } } },
+      { match: /lookup-self$/, status: 200, body: { data: { accessor: 'new-a', ttl: 3600 } } },
+      { match: /revoke-accessor$/, status: 502 },
+    ]);
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+    });
+    const meta = await rotator.rotateOne(state.backend.id);
+    expect(state.secretData.token).toBe('new');
+    expect(meta.lastRotationError).toBeNull();
+  });
+
+  it('isOverdue: true when lastRotationAt missing or >24h old', () => {
+    const state: MockState = {
+      backend: makeBackend({ tokenMeta: { rotatable: true } as unknown as SecretBackend['tokenMeta'] }),
+      secret: makeSecret(),
+      secretData: {},
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const now = () => new Date('2026-04-20T10:00:00Z');
+    const r = new SecretBackendRotator({ backends: backends as never, secrets: secrets as never, now });
+
+    expect(r.isOverdue(state.backend)).toBe(true);
+
+    const fresh = { ...state.backend, tokenMeta: { rotatable: true, lastRotationAt: '2026-04-20T09:00:00Z' } as unknown as SecretBackend['tokenMeta'] };
+    expect(r.isOverdue(fresh)).toBe(false);
+
+    const stale = { ...state.backend, tokenMeta: { rotatable: true, lastRotationAt: '2026-04-18T10:00:00Z' } as unknown as SecretBackend['tokenMeta'] };
+    expect(r.isOverdue(stale)).toBe(true);
+  });
+
+  it('rotateOne: throws when backend is not rotatable', async () => {
+    const state: MockState = {
+      backend: makeBackend({ type: 'plaintext', config: {} as SecretBackend['config'] }),
+      secret: makeSecret(),
+      secretData: {},
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const r = new SecretBackendRotator({ backends: backends as never, secrets: secrets as never });
+    await expect(r.rotateOne(state.backend.id)).rejects.toThrow(/not rotatable/);
+  });
+});
--- a/src/mcpd/tests/secret-backends.test.ts
+++ b/src/mcpd/tests/secret-backends.test.ts
@@ -129,4 +129,116 @@ describe('OpenBaoDriver', () => {
    const headers = init.headers as Record<string, string>;
    expect(headers['X-Vault-Namespace']).toBe('myteam');
  });
+
+  describe('kubernetes auth', () => {
+    it('exchanges the SA JWT for a vault client token via /v1/auth/kubernetes/login', async () => {
+      const calls: Array<{ url: string; init: RequestInit }> = [];
+      const fetchFn = vi.fn(async (url: string | URL, init?: RequestInit) => {
+        const u = String(url);
+        calls.push({ url: u, init: init ?? {} });
+        if (u.endsWith('/v1/auth/kubernetes/login')) {
+          return new Response(JSON.stringify({
+            auth: { client_token: 'vault.client.token.xyz', lease_duration: 3600 },
+          }), { status: 200 });
+        }
+        return new Response(JSON.stringify({}), { status: 200 });
+      });
+
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl' },
+        {
+          fetch: fetchFn as unknown as typeof fetch,
+          readServiceAccountToken: async () => 'eyJ.fake.sa.jwt',
+        },
+      );
+      await driver.write({ name: 'x', data: { k: 'v' } });
+
+      // Two calls: login + write
+      expect(calls).toHaveLength(2);
+      expect(calls[0]!.url).toBe('http://bao.example:8200/v1/auth/kubernetes/login');
+      expect(JSON.parse(calls[0]!.init.body as string)).toEqual({ role: 'mcpctl', jwt: 'eyJ.fake.sa.jwt' });
+
+      // Write uses the returned client token
+      const writeHeaders = calls[1]!.init.headers as Record<string, string>;
+      expect(writeHeaders['X-Vault-Token']).toBe('vault.client.token.xyz');
+    });
+
+    it('caches the vault token across requests and renews after lease expiry', async () => {
+      let nowMs = 1_000_000_000_000;
+      let loginCount = 0;
+      const fetchFn = vi.fn(async (url: string | URL) => {
+        const u = String(url);
+        if (u.endsWith('/v1/auth/kubernetes/login')) {
+          loginCount++;
+          // 600s lease leaves 540s of cached window after the 60s grace.
+          return new Response(JSON.stringify({
+            auth: { client_token: `tok-${String(loginCount)}`, lease_duration: 600 },
+          }), { status: 200 });
+        }
+        return new Response(JSON.stringify({}), { status: 200 });
+      });
+
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl' },
+        {
+          fetch: fetchFn as unknown as typeof fetch,
+          readServiceAccountToken: async () => 'jwt',
+          now: () => nowMs,
+        },
+      );
+
+      await driver.write({ name: 'a', data: { k: 'v' } });
+      await driver.write({ name: 'b', data: { k: 'v' } });
+      expect(loginCount).toBe(1); // both writes share the cached token
+
+      // Advance past lease - grace window → driver re-logs in
+      nowMs += 600_000;
+      await driver.write({ name: 'c', data: { k: 'v' } });
+      expect(loginCount).toBe(2);
+    });
+
+    it('honours custom authMount path', async () => {
+      const calls: string[] = [];
+      const fetchFn = vi.fn(async (url: string | URL) => {
+        calls.push(String(url));
+        if (String(url).includes('/login')) {
+          return new Response(JSON.stringify({ auth: { client_token: 't', lease_duration: 3600 } }), { status: 200 });
+        }
+        return new Response(JSON.stringify({}), { status: 200 });
+      });
+
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl', authMount: 'kubernetes/cluster-a' },
+        { fetch: fetchFn as unknown as typeof fetch, readServiceAccountToken: async () => 'jwt' },
+      );
+      await driver.write({ name: 'x', data: {} });
+      expect(calls[0]).toBe('http://bao.example:8200/v1/auth/kubernetes/cluster-a/login');
+    });
+
+    it('throws on login failure with a clear error', async () => {
+      const fetchFn = vi.fn(async () => new Response('permission denied', { status: 403 }));
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl' },
+        { fetch: fetchFn as unknown as typeof fetch, readServiceAccountToken: async () => 'jwt' },
+      );
+      await expect(driver.read({ name: 'x', externalRef: '', data: {} }))
+        .rejects.toThrow(/kubernetes login.*role=mcpctl.*HTTP 403/);
+    });
+
+    it('rejects construction when role is missing', () => {
+      expect(() => new OpenBaoDriver(
+        // eslint-disable-next-line @typescript-eslint/no-explicit-any
+        { url: 'http://bao.example:8200', auth: 'kubernetes' } as any,
+        { fetch: vi.fn() as unknown as typeof fetch, readServiceAccountToken: async () => 'jwt' },
+      )).toThrow(/role.*required/);
+    });
+
+    it('rejects token-auth construction when tokenSecretRef is missing', () => {
+      expect(() => new OpenBaoDriver(
+        // eslint-disable-next-line @typescript-eslint/no-explicit-any
+        { url: 'http://bao.example:8200' } as any,
+        { fetch: vi.fn() as unknown as typeof fetch, secretRefResolver: resolver },
+      )).toThrow(/tokenSecretRef.*required/);
+    });
+  });
 });
--- a/src/mcplocal/src/discovery.ts
+++ b/src/mcplocal/src/discovery.ts
@@ -57,9 +57,16 @@ export async function refreshProjectUpstreams(

 /**
 * Fetch a project's LLM config (llmProvider, llmModel) from mcpd.
- * These are the project-level "recommendations" — local overrides take priority.
+ *
+ * Phase 4 redefines `llmProvider` semantically: it names a centralized `Llm`
+ * resource (see `mcpctl get llms`) — NOT a local provider. Consumers should
+ * resolve it through mcpd's inference proxy when reachable. The field remains
+ * a free-form string on the wire for backward compatibility; local overrides
+ * in `~/.mcpctl/config.json` still take priority, and unknown names fall
+ * through to the registry default.
 */
 export interface ProjectLlmConfig {
+  /** Name of an `Llm` resource on mcpd, or 'none' to disable LLM features. */
  llmProvider?: string;
  llmModel?: string;
  proxyModel?: string;
@@ -67,6 +74,31 @@ export interface ProjectLlmConfig {
  serverOverrides?: Record<string, { proxyModel?: string }>;
 }

+/**
+ * Resolve a project's `llmProvider` against mcpd's Llm registry. Returns:
+ *   - 'registered'  — an Llm with this name exists
+ *   - 'disabled'    — value is 'none'
+ *   - 'unregistered'— no Llm matches (consumer should fall back to registry default)
+ *   - 'unreachable' — mcpd couldn't be queried
+ */
+export type LlmReferenceStatus = 'registered' | 'disabled' | 'unregistered' | 'unreachable';
+
+export async function resolveProjectLlmReference(
+  mcpdClient: McpdClient,
+  llmProvider: string | undefined,
+): Promise<LlmReferenceStatus> {
+  if (llmProvider === undefined || llmProvider === '') return 'unregistered';
+  if (llmProvider === 'none') return 'disabled';
+  try {
+    await mcpdClient.get(`/api/v1/llms/${encodeURIComponent(llmProvider)}`);
+    return 'registered';
+  } catch (err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    if (msg.includes('404') || msg.toLowerCase().includes('not found')) return 'unregistered';
+    return 'unreachable';
+  }
+}
+
 export async function fetchProjectLlmConfig(
  mcpdClient: McpdClient,
  projectName: string,
--- a/src/mcplocal/src/http/config.ts
+++ b/src/mcplocal/src/http/config.ts
@@ -64,6 +64,14 @@ export interface LlmProviderFileEntry {
  idleTimeoutMinutes?: number;
  /** vllm-managed: extra args for `vllm serve` */
  extraArgs?: string[];
+  /**
+   * If set, this local provider is allowed to substitute for the centralized
+   * Llm of this name when the mcpd inference proxy is unreachable.
+   * RBAC is still enforced — the caller must have view permission on the
+   * named Llm via mcpd before failover is permitted (fail-closed if mcpd
+   * itself can't be reached).
+   */
+  failoverFor?: string;
 }

 export interface ProjectLlmOverride {
--- a/src/mcplocal/src/http/project-mcp-endpoint.ts
+++ b/src/mcplocal/src/http/project-mcp-endpoint.ts
@@ -101,7 +101,16 @@ export function registerProjectMcpEndpoint(app: FastifyInstance, mcpdClient: Mcp
      complete: async () => '',
      available: () => false,
    };
-    // Build cache namespace: provider--model--proxymodel
+    // Build cache namespace: provider--model--proxymodel.
+    // Resolution order:
+    //   1. local ~/.mcpctl override
+    //   2. mcpdConfig.llmProvider (Phase 4: name of a centralized Llm)
+    //   3. local registry default (fast tier → active provider)
+    //   4. literal 'none'
+    // If (2) names an Llm the HTTP-mode proxy-model pipeline can route
+    // through mcpd's /api/v1/llms/:name/infer (pivot lands when the client
+    // integrates that path); meanwhile the value is still usable as a cache
+    // key, and the describe-project warning flags stale configs.
    const llmProvider = localOverride?.provider ?? mcpdConfig.llmProvider
      ?? effectiveRegistry?.getTierProviders('fast')[0]
      ?? effectiveRegistry?.getActiveName()
--- a/src/mcplocal/src/llm-config.ts
+++ b/src/mcplocal/src/llm-config.ts
@@ -173,6 +173,9 @@ export async function createProvidersFromConfig(
    if (entry.tier) {
      registry.assignTier(provider.name, entry.tier);
    }
+    if (entry.failoverFor) {
+      registry.registerFailover(entry.failoverFor, provider.name);
+    }
  }

  return registry;
--- a/src/mcplocal/src/providers/failover-router.ts
+++ b/src/mcplocal/src/providers/failover-router.ts
@@ -0,0 +1,107 @@
+/**
+ * FailoverRouter — orchestrates "try mcpd's centralized Llm, fall back to a
+ * local provider when authorized" for clients that consume the inference
+ * proxy.
+ *
+ * Decision flow on a centralized inference call:
+ *
+ *   1. Call the primary (the supplied `primary` callback, typically an HTTP
+ *      POST to mcpd /api/v1/llms/:name/infer).
+ *   2. If that succeeds → done.
+ *   3. If it fails AND a local provider is registered as failover for this
+ *      Llm name → call mcpd /api/v1/llms/:name (RBAC-gated) to verify the
+ *      caller still has permission to view this Llm. mcpd unreachable →
+ *      fail-closed (re-throw the original error). 403 → fail-closed.
+ *   4. 200 → invoke the local provider's `complete()` and tag the result
+ *      as `failover: true` for client-side audit.
+ *
+ * The check call uses HEAD to avoid pulling the Llm body (and any
+ * description / extraConfig) over the wire — mcpd treats both methods the
+ * same in the RBAC hook because the URL maps to the same permission.
+ */
+import type { LlmProvider } from './types.js';
+import type { ProviderRegistry } from './registry.js';
+
+export interface FailoverDecision<T> {
+  result: T;
+  failover: boolean;
+  /** Name of the local provider used (only set when failover === true). */
+  via?: string;
+}
+
+export interface FailoverRouterDeps {
+  /** Injected fetch for the RBAC pre-check. Tests mock this. */
+  fetch?: typeof globalThis.fetch;
+  /** mcpd base URL (no trailing slash). */
+  mcpdUrl: string;
+  /** Bearer token to attach to the RBAC pre-check call. */
+  bearerToken?: string;
+}
+
+/** Outcome of the RBAC pre-check. Used internally + exposed for tests. */
+export type AuthCheckOutcome = 'allowed' | 'forbidden' | 'unreachable';
+
+export class FailoverRouter {
+  private readonly fetchImpl: typeof globalThis.fetch;
+  private readonly mcpdUrl: string;
+  private readonly bearer: string | undefined;
+
+  constructor(
+    private readonly registry: ProviderRegistry,
+    deps: FailoverRouterDeps,
+  ) {
+    this.fetchImpl = deps.fetch ?? globalThis.fetch;
+    this.mcpdUrl = deps.mcpdUrl.replace(/\/+$/, '');
+    if (deps.bearerToken !== undefined) this.bearer = deps.bearerToken;
+  }
+
+  /**
+   * Run a primary inference attempt; on failure, fall back to the local
+   * provider if one is registered for this Llm AND the caller still has
+   * `view:llms:<llmName>` on mcpd.
+   *
+   * `primary` should reject (throw) when mcpd's proxy is unreachable or
+   * returns a 5xx — that's the signal to consider failover. 4xx errors that
+   * indicate a bad request are surfaced as-is; the router only retries on
+   * primary failure shapes that look like an upstream/network issue.
+   */
+  async run<T>(
+    llmName: string,
+    primary: () => Promise<T>,
+    localCall: (provider: LlmProvider) => Promise<T>,
+  ): Promise<FailoverDecision<T>> {
+    try {
+      const result = await primary();
+      return { result, failover: false };
+    } catch (primaryErr) {
+      const local = this.registry.getFailoverFor(llmName);
+      if (local === null) throw primaryErr;
+
+      const auth = await this.checkAuth(llmName);
+      if (auth !== 'allowed') {
+        // Fail-closed for forbidden AND unreachable.
+        throw primaryErr;
+      }
+
+      const result = await localCall(local);
+      return { result, failover: true, via: local.name };
+    }
+  }
+
+  /** RBAC pre-check exposed for tests / status-display callers. */
+  async checkAuth(llmName: string): Promise<AuthCheckOutcome> {
+    const url = `${this.mcpdUrl}/api/v1/llms/${encodeURIComponent(llmName)}`;
+    const headers: Record<string, string> = {};
+    if (this.bearer !== undefined) headers['Authorization'] = `Bearer ${this.bearer}`;
+    let res: Response;
+    try {
+      res = await this.fetchImpl(url, { method: 'HEAD', headers });
+    } catch {
+      return 'unreachable';
+    }
+    if (res.status === 200 || res.status === 204) return 'allowed';
+    if (res.status === 403 || res.status === 401) return 'forbidden';
+    // Anything else (404, 500…) — treat as unreachable for the failover flow.
+    return 'unreachable';
+  }
+}
--- a/src/mcplocal/src/providers/registry.ts
+++ b/src/mcplocal/src/providers/registry.ts
@@ -8,6 +8,8 @@ export class ProviderRegistry {
  private providers = new Map<string, LlmProvider>();
  private activeProvider: string | null = null;
  private tierProviders = new Map<Tier, string[]>();
+  /** Maps a centralized Llm name → local provider name that can substitute when mcpd is unreachable. */
+  private failoverMap = new Map<string, string>();

  register(provider: LlmProvider): void {
    this.providers.set(provider.name, provider);
@@ -31,6 +33,30 @@ export class ProviderRegistry {
        this.tierProviders.set(tier, filtered);
      }
    }
+    // Remove from failover map (any entry whose local-provider value points at this name)
+    for (const [centralName, localName] of this.failoverMap) {
+      if (localName === name) this.failoverMap.delete(centralName);
+    }
+  }
+
+  /** Mark `localProviderName` as the failover for the centralized Llm named `centralLlmName`. */
+  registerFailover(centralLlmName: string, localProviderName: string): void {
+    if (!this.providers.has(localProviderName)) {
+      throw new Error(`Provider '${localProviderName}' is not registered`);
+    }
+    this.failoverMap.set(centralLlmName, localProviderName);
+  }
+
+  /** Look up the local provider that can substitute for a centralized Llm, if any. */
+  getFailoverFor(centralLlmName: string): LlmProvider | null {
+    const localName = this.failoverMap.get(centralLlmName);
+    if (localName === undefined) return null;
+    return this.providers.get(localName) ?? null;
+  }
+
+  /** Names of central Llms that have a local failover registered. Used in status output. */
+  listFailovers(): Array<{ centralLlmName: string; localProviderName: string }> {
+    return [...this.failoverMap.entries()].map(([centralLlmName, localProviderName]) => ({ centralLlmName, localProviderName }));
  }

  setActive(name: string): void {
--- a/src/mcplocal/tests/failover-router.test.ts
+++ b/src/mcplocal/tests/failover-router.test.ts
@@ -0,0 +1,170 @@
+import { describe, it, expect, vi } from 'vitest';
+import { ProviderRegistry } from '../src/providers/registry.js';
+import { FailoverRouter } from '../src/providers/failover-router.js';
+import type { LlmProvider, CompleteResponse } from '../src/providers/types.js';
+
+function fakeProvider(name: string): LlmProvider {
+  const completeFn = vi.fn(async (): Promise<CompleteResponse> => ({
+    content: 'local response',
+    finishReason: 'stop',
+  }));
+  return {
+    name,
+    complete: completeFn,
+    listModels: vi.fn(async () => [name]),
+    isAvailable: vi.fn(async () => true),
+  };
+}
+
+function makeFetch(behaviour: { method: string; status?: number; throw?: boolean }): ReturnType<typeof vi.fn> {
+  return vi.fn(async (url: string | URL, init?: RequestInit) => {
+    if (behaviour.throw === true) throw new Error('connection refused');
+    expect(init?.method).toBe(behaviour.method);
+    expect(String(url)).toMatch(/\/api\/v1\/llms\//);
+    return new Response(null, { status: behaviour.status ?? 200 });
+  });
+}
+
+describe('ProviderRegistry — failover map', () => {
+  it('registerFailover maps a central name → local provider name', () => {
+    const reg = new ProviderRegistry();
+    const local = fakeProvider('vllm-local');
+    reg.register(local);
+    reg.registerFailover('claude', 'vllm-local');
+
+    const found = reg.getFailoverFor('claude');
+    expect(found?.name).toBe('vllm-local');
+  });
+
+  it('getFailoverFor returns null when no map entry exists', () => {
+    const reg = new ProviderRegistry();
+    reg.register(fakeProvider('vllm-local'));
+    expect(reg.getFailoverFor('claude')).toBeNull();
+  });
+
+  it('registerFailover throws when local provider is not registered', () => {
+    const reg = new ProviderRegistry();
+    expect(() => reg.registerFailover('claude', 'missing')).toThrow(/not registered/);
+  });
+
+  it('unregister removes failover entries that pointed at the removed provider', () => {
+    const reg = new ProviderRegistry();
+    reg.register(fakeProvider('vllm-local'));
+    reg.registerFailover('claude', 'vllm-local');
+    reg.unregister('vllm-local');
+    expect(reg.getFailoverFor('claude')).toBeNull();
+    expect(reg.listFailovers()).toEqual([]);
+  });
+
+  it('listFailovers reports the current map', () => {
+    const reg = new ProviderRegistry();
+    reg.register(fakeProvider('vllm-local'));
+    reg.registerFailover('claude', 'vllm-local');
+    reg.registerFailover('opus', 'vllm-local');
+    expect(reg.listFailovers()).toEqual([
+      { centralLlmName: 'claude', localProviderName: 'vllm-local' },
+      { centralLlmName: 'opus', localProviderName: 'vllm-local' },
+    ]);
+  });
+});
+
+describe('FailoverRouter', () => {
+  it('returns primary result when primary succeeds', async () => {
+    const reg = new ProviderRegistry();
+    reg.register(fakeProvider('vllm-local'));
+    reg.registerFailover('claude', 'vllm-local');
+
+    const router = new FailoverRouter(reg, {
+      mcpdUrl: 'http://mcpd',
+      fetch: vi.fn() as unknown as typeof fetch,
+    });
+    const out = await router.run('claude', async () => 'central', async () => 'local');
+    expect(out.failover).toBe(false);
+    expect(out.result).toBe('central');
+  });
+
+  it('falls back to local when primary fails AND mcpd auth-checks 200', async () => {
+    const reg = new ProviderRegistry();
+    reg.register(fakeProvider('vllm-local'));
+    reg.registerFailover('claude', 'vllm-local');
+
+    const fetchFn = makeFetch({ method: 'HEAD', status: 200 });
+    const router = new FailoverRouter(reg, {
+      mcpdUrl: 'http://mcpd',
+      fetch: fetchFn as unknown as typeof fetch,
+      bearerToken: 'bearer-x',
+    });
+    const out = await router.run(
+      'claude',
+      async () => { throw new Error('upstream down'); },
+      async (provider) => `via:${provider.name}`,
+    );
+    expect(out.failover).toBe(true);
+    expect(out.via).toBe('vllm-local');
+    expect(out.result).toBe('via:vllm-local');
+
+    // Bearer was attached
+    const [, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect((init.headers as Record<string, string>)['Authorization']).toBe('Bearer bearer-x');
+  });
+
+  it('re-throws primary error when no local failover is registered', async () => {
+    const reg = new ProviderRegistry();
+    const router = new FailoverRouter(reg, {
+      mcpdUrl: 'http://mcpd',
+      fetch: vi.fn() as unknown as typeof fetch,
+    });
+    await expect(router.run(
+      'claude',
+      async () => { throw new Error('boom'); },
+      async () => 'never',
+    )).rejects.toThrow('boom');
+  });
+
+  it('re-throws (fail-closed) when mcpd returns 403 to the auth check', async () => {
+    const reg = new ProviderRegistry();
+    reg.register(fakeProvider('vllm-local'));
+    reg.registerFailover('claude', 'vllm-local');
+
+    const router = new FailoverRouter(reg, {
+      mcpdUrl: 'http://mcpd',
+      fetch: makeFetch({ method: 'HEAD', status: 403 }) as unknown as typeof fetch,
+    });
+    await expect(router.run(
+      'claude',
+      async () => { throw new Error('upstream down'); },
+      async () => 'never',
+    )).rejects.toThrow('upstream down');
+  });
+
+  it('re-throws (fail-closed) when mcpd itself is unreachable for the auth check', async () => {
+    const reg = new ProviderRegistry();
+    reg.register(fakeProvider('vllm-local'));
+    reg.registerFailover('claude', 'vllm-local');
+
+    const router = new FailoverRouter(reg, {
+      mcpdUrl: 'http://mcpd',
+      fetch: makeFetch({ method: 'HEAD', throw: true }) as unknown as typeof fetch,
+    });
+    await expect(router.run(
+      'claude',
+      async () => { throw new Error('upstream down'); },
+      async () => 'never',
+    )).rejects.toThrow('upstream down');
+  });
+
+  it('checkAuth maps responses correctly', async () => {
+    const reg = new ProviderRegistry();
+    const make = (status: number) => new FailoverRouter(reg, {
+      mcpdUrl: 'http://mcpd',
+      fetch: (async () => new Response(null, { status })) as unknown as typeof fetch,
+    });
+
+    expect(await make(200).checkAuth('claude')).toBe('allowed');
+    expect(await make(204).checkAuth('claude')).toBe('allowed');
+    expect(await make(401).checkAuth('claude')).toBe('forbidden');
+    expect(await make(403).checkAuth('claude')).toBe('forbidden');
+    expect(await make(404).checkAuth('claude')).toBe('unreachable');
+    expect(await make(500).checkAuth('claude')).toBe('unreachable');
+  });
+});
--- a/src/mcplocal/tests/llm-reference-resolver.test.ts
+++ b/src/mcplocal/tests/llm-reference-resolver.test.ts
@@ -0,0 +1,45 @@
+import { describe, it, expect, vi } from 'vitest';
+import { resolveProjectLlmReference } from '../src/discovery.js';
+import type { McpdClient } from '../src/http/mcpd-client.js';
+
+function mockClient(get: (path: string) => Promise<unknown>): McpdClient {
+  return { get } as unknown as McpdClient;
+}
+
+describe('resolveProjectLlmReference', () => {
+  it('returns "disabled" for the literal string "none"', async () => {
+    const client = mockClient(async () => { throw new Error('should not be called'); });
+    expect(await resolveProjectLlmReference(client, 'none')).toBe('disabled');
+  });
+
+  it('returns "unregistered" when llmProvider is empty or undefined', async () => {
+    const client = mockClient(async () => { throw new Error('should not be called'); });
+    expect(await resolveProjectLlmReference(client, undefined)).toBe('unregistered');
+    expect(await resolveProjectLlmReference(client, '')).toBe('unregistered');
+  });
+
+  it('returns "registered" when mcpd returns 200 for the name', async () => {
+    const get = vi.fn(async () => ({ name: 'claude' }));
+    expect(await resolveProjectLlmReference(mockClient(get), 'claude')).toBe('registered');
+    expect(get).toHaveBeenCalledWith('/api/v1/llms/claude');
+  });
+
+  it('returns "unregistered" on 404', async () => {
+    const client = mockClient(async () => { throw new Error('HTTP 404 not found'); });
+    expect(await resolveProjectLlmReference(client, 'missing')).toBe('unregistered');
+  });
+
+  it('returns "unreachable" on other errors (500, network)', async () => {
+    const client = mockClient(async () => { throw new Error('HTTP 500 internal error'); });
+    expect(await resolveProjectLlmReference(client, 'x')).toBe('unreachable');
+
+    const client2 = mockClient(async () => { throw new Error('ECONNREFUSED'); });
+    expect(await resolveProjectLlmReference(client2, 'x')).toBe('unreachable');
+  });
+
+  it('URL-encodes names with special characters', async () => {
+    const get = vi.fn(async () => ({}));
+    await resolveProjectLlmReference(mockClient(get), 'weird name/with/slashes');
+    expect(get).toHaveBeenCalledWith('/api/v1/llms/weird%20name%2Fwith%2Fslashes');
+  });
+});
--- a/src/mcplocal/tests/smoke/llm-infer.smoke.test.ts
+++ b/src/mcplocal/tests/smoke/llm-infer.smoke.test.ts
@@ -0,0 +1,214 @@
+/**
+ * Smoke tests: `POST /api/v1/llms/:name/infer` against live mcpd.
+ *
+ * Validates the Phase 2 inference proxy path without needing a real provider
+ * key. We exercise the error-shape guarantees:
+ *   1. Missing Llm → 404.
+ *   2. Existing Llm + empty body → 400.
+ *   3. Existing Llm pointed at an unreachable URL → 502 with an error body.
+ *   4. RBAC: non-admin calling infer without `run:llms:<name>` → 403 (skipped
+ *      if we can't mint a scoped McpToken in this environment).
+ *
+ * The happy-path test needs a real provider, so we skip it by default and
+ * gate on LLM_INFER_SMOKE_REAL=1 + a working Llm name supplied via
+ * LLM_INFER_SMOKE_LLM.
+ */
+import { describe, it, expect, beforeAll, afterAll } from 'vitest';
+import http from 'node:http';
+import https from 'node:https';
+import { execSync } from 'node:child_process';
+
+const MCPD_URL = process.env.MCPD_URL ?? 'https://mcpctl.ad.itaz.eu';
+const SUFFIX = Date.now().toString(36);
+const SECRET_NAME = `smoke-infer-sec-${SUFFIX}`;
+const LLM_NAME = `smoke-infer-${SUFFIX}`;
+
+interface CliResult { code: number; stdout: string; stderr: string }
+
+function run(args: string): CliResult {
+  try {
+    const stdout = execSync(`mcpctl --direct ${args}`, {
+      encoding: 'utf-8',
+      timeout: 30_000,
+      stdio: ['ignore', 'pipe', 'pipe'],
+    });
+    return { code: 0, stdout: stdout.trim(), stderr: '' };
+  } catch (err) {
+    const e = err as { status?: number; stdout?: Buffer | string; stderr?: Buffer | string };
+    return {
+      code: e.status ?? 1,
+      stdout: e.stdout ? (typeof e.stdout === 'string' ? e.stdout : e.stdout.toString('utf-8')) : '',
+      stderr: e.stderr ? (typeof e.stderr === 'string' ? e.stderr : e.stderr.toString('utf-8')) : '',
+    };
+  }
+}
+
+function healthz(url: string, timeoutMs = 5000): Promise<boolean> {
+  return new Promise((resolve) => {
+    const parsed = new URL(`${url.replace(/\/$/, '')}/healthz`);
+    const driver = parsed.protocol === 'https:' ? https : http;
+    const req = driver.get(
+      {
+        hostname: parsed.hostname,
+        port: parsed.port || (parsed.protocol === 'https:' ? 443 : 80),
+        path: parsed.pathname,
+        timeout: timeoutMs,
+      },
+      (res) => { resolve((res.statusCode ?? 500) < 500); res.resume(); },
+    );
+    req.on('error', () => resolve(false));
+    req.on('timeout', () => { req.destroy(); resolve(false); });
+  });
+}
+
+/** Look up the current session bearer so we can POST /infer directly. */
+function getBearer(): string | undefined {
+  // Try ~/.mcpctl/credentials.json via the CLI — `mcpctl config get` knows where it lives.
+  // If that shape changes, fall back to MCPCTL_TOKEN env.
+  const envToken = process.env.MCPCTL_TOKEN;
+  if (envToken !== undefined && envToken !== '') return envToken;
+  try {
+    // shape: { "session": { "token": "..." } } or similar — be defensive.
+    const out = execSync('cat ~/.mcpctl/credentials.json 2>/dev/null', { encoding: 'utf-8' });
+    const parsed = JSON.parse(out) as Record<string, unknown>;
+    const token = (parsed.token ?? (parsed.session as { token?: string } | undefined)?.token);
+    return typeof token === 'string' ? token : undefined;
+  } catch {
+    return undefined;
+  }
+}
+
+async function post(
+  path: string,
+  body: unknown,
+  bearer?: string,
+): Promise<{ status: number; body: unknown }> {
+  const url = new URL(`${MCPD_URL.replace(/\/$/, '')}${path}`);
+  const driver = url.protocol === 'https:' ? https : http;
+  const payload = JSON.stringify(body);
+  const headers: Record<string, string> = {
+    'Content-Type': 'application/json',
+    'Content-Length': Buffer.byteLength(payload).toString(),
+  };
+  if (bearer !== undefined) headers['Authorization'] = `Bearer ${bearer}`;
+
+  return new Promise((resolve, reject) => {
+    const req = driver.request(
+      {
+        hostname: url.hostname,
+        port: url.port || (url.protocol === 'https:' ? 443 : 80),
+        path: url.pathname + url.search,
+        method: 'POST',
+        headers,
+        timeout: 15_000,
+      },
+      (res) => {
+        const chunks: Buffer[] = [];
+        res.on('data', (c: Buffer) => chunks.push(c));
+        res.on('end', () => {
+          const raw = Buffer.concat(chunks).toString('utf-8');
+          let parsed: unknown = raw;
+          try { parsed = JSON.parse(raw); } catch { /* leave as string */ }
+          resolve({ status: res.statusCode ?? 0, body: parsed });
+        });
+      },
+    );
+    req.on('error', reject);
+    req.on('timeout', () => { req.destroy(); reject(new Error('request timed out')); });
+    req.write(payload);
+    req.end();
+  });
+}
+
+let mcpdUp = false;
+let bearer: string | undefined;
+
+describe('llm-infer smoke', () => {
+  beforeAll(async () => {
+    mcpdUp = await healthz(MCPD_URL);
+    if (!mcpdUp) {
+      // eslint-disable-next-line no-console
+      console.warn(`\n  ○ llm-infer smoke: skipped — ${MCPD_URL}/healthz unreachable.\n`);
+      return;
+    }
+    bearer = getBearer();
+    if (bearer === undefined) {
+      // eslint-disable-next-line no-console
+      console.warn('\n  ○ llm-infer smoke: no bearer available (set MCPCTL_TOKEN or login). Direct POST tests will skip.\n');
+    }
+  }, 20_000);
+
+  afterAll(() => {
+    if (!mcpdUp) return;
+    run(`delete llm ${LLM_NAME}`);
+    run(`delete secret ${SECRET_NAME}`);
+  });
+
+  it('creates a fixture secret + Llm pointed at an unreachable URL', () => {
+    if (!mcpdUp) return;
+    run(`delete llm ${LLM_NAME}`);
+    run(`delete secret ${SECRET_NAME}`);
+
+    expect(run(`create secret ${SECRET_NAME} --data token=sk-fake`).code).toBe(0);
+    const createLlm = run([
+      `create llm ${LLM_NAME}`,
+      '--type openai',
+      '--model gpt-4o-mini',
+      // Unroutable host so any actual upstream call returns an adapter error → 502
+      '--url http://127.0.0.1:1',
+      `--api-key-ref ${SECRET_NAME}/token`,
+    ].join(' '));
+    expect(createLlm.code, createLlm.stderr || createLlm.stdout).toBe(0);
+  });
+
+  it('returns 404 for an unknown Llm name', async () => {
+    if (!mcpdUp || bearer === undefined) return;
+    const res = await post('/api/v1/llms/__nonexistent_llm__/infer',
+      { messages: [{ role: 'user', content: 'hi' }] }, bearer);
+    expect(res.status).toBe(404);
+  });
+
+  it('returns 400 when messages is missing', async () => {
+    if (!mcpdUp || bearer === undefined) return;
+    const res = await post(`/api/v1/llms/${LLM_NAME}/infer`, {}, bearer);
+    expect(res.status).toBe(400);
+    const body = res.body as { error?: string };
+    expect(body.error ?? '').toMatch(/messages/i);
+  });
+
+  it('returns 502 when the upstream provider is unreachable', async () => {
+    if (!mcpdUp || bearer === undefined) return;
+    const res = await post(`/api/v1/llms/${LLM_NAME}/infer`,
+      { messages: [{ role: 'user', content: 'hi' }] }, bearer);
+    // 502 is what the proxy returns on adapter errors; some paths may return
+    // the upstream's own status if the request reached it, so accept any
+    // non-2xx with an error body.
+    expect(res.status).toBeGreaterThanOrEqual(400);
+    expect(res.status).not.toBe(404);
+    expect(res.status).not.toBe(400);
+    const body = res.body as { error?: string | { message?: string } };
+    const msg = typeof body.error === 'string' ? body.error : body.error?.message ?? '';
+    expect(msg, 'error body must describe the failure').not.toBe('');
+  }, 30_000);
+
+  it('happy-path inference (opt-in: LLM_INFER_SMOKE_REAL=1 + LLM_INFER_SMOKE_LLM=<name>)', async () => {
+    if (!mcpdUp || bearer === undefined) return;
+    if (process.env.LLM_INFER_SMOKE_REAL !== '1') {
+      // eslint-disable-next-line no-console
+      console.warn('    ○ happy-path skipped — set LLM_INFER_SMOKE_REAL=1 and LLM_INFER_SMOKE_LLM=<name> of a working Llm.');
+      return;
+    }
+    const name = process.env.LLM_INFER_SMOKE_LLM;
+    if (name === undefined || name === '') {
+      throw new Error('LLM_INFER_SMOKE_LLM must be set when LLM_INFER_SMOKE_REAL=1');
+    }
+    const res = await post(`/api/v1/llms/${name}/infer`, {
+      messages: [{ role: 'user', content: 'Say "smoke-ok" and nothing else.' }],
+      max_tokens: 8,
+    }, bearer);
+    expect(res.status).toBe(200);
+    const body = res.body as { choices?: Array<{ message?: { content?: string } }> };
+    const content = body.choices?.[0]?.message?.content ?? '';
+    expect(content).toMatch(/smoke-ok/i);
+  }, 60_000);
+});
--- a/src/mcplocal/tests/smoke/llm.smoke.test.ts
+++ b/src/mcplocal/tests/smoke/llm.smoke.test.ts
@@ -0,0 +1,168 @@
+/**
+ * Smoke tests: Llm resource CRUD + apiKeyRef linkage against live mcpd.
+ *
+ * Exercises the Phase 1 CLI contract end-to-end:
+ *   1. Create a secret carrying a fake API key.
+ *   2. `mcpctl create llm` referencing that secret via --api-key-ref.
+ *   3. `mcpctl describe llm` shows type/model/tier + the secret ref.
+ *   4. `mcpctl get llms -o yaml` round-trips cleanly into `apply -f`.
+ *   5. Delete llm + secret.
+ *
+ * Inference itself is covered in llm-infer.smoke.test.ts — this file is
+ * purely about the registry.
+ */
+import { describe, it, expect, beforeAll, afterAll } from 'vitest';
+import http from 'node:http';
+import https from 'node:https';
+import { execSync } from 'node:child_process';
+import { writeFileSync, unlinkSync, mkdtempSync } from 'node:fs';
+import { join } from 'node:path';
+import { tmpdir } from 'node:os';
+
+const MCPD_URL = process.env.MCPD_URL ?? 'https://mcpctl.ad.itaz.eu';
+const SUFFIX = Date.now().toString(36);
+const SECRET_NAME = `smoke-llm-sec-${SUFFIX}`;
+const LLM_NAME = `smoke-llm-${SUFFIX}`;
+
+interface CliResult { code: number; stdout: string; stderr: string }
+
+function run(args: string): CliResult {
+  try {
+    const stdout = execSync(`mcpctl --direct ${args}`, {
+      encoding: 'utf-8',
+      timeout: 30_000,
+      stdio: ['ignore', 'pipe', 'pipe'],
+    });
+    return { code: 0, stdout: stdout.trim(), stderr: '' };
+  } catch (err) {
+    const e = err as { status?: number; stdout?: Buffer | string; stderr?: Buffer | string };
+    return {
+      code: e.status ?? 1,
+      stdout: e.stdout ? (typeof e.stdout === 'string' ? e.stdout : e.stdout.toString('utf-8')) : '',
+      stderr: e.stderr ? (typeof e.stderr === 'string' ? e.stderr : e.stderr.toString('utf-8')) : '',
+    };
+  }
+}
+
+function healthz(url: string, timeoutMs = 5000): Promise<boolean> {
+  return new Promise((resolve) => {
+    const parsed = new URL(`${url.replace(/\/$/, '')}/healthz`);
+    const driver = parsed.protocol === 'https:' ? https : http;
+    const req = driver.get(
+      {
+        hostname: parsed.hostname,
+        port: parsed.port || (parsed.protocol === 'https:' ? 443 : 80),
+        path: parsed.pathname,
+        timeout: timeoutMs,
+      },
+      (res) => { resolve((res.statusCode ?? 500) < 500); res.resume(); },
+    );
+    req.on('error', () => resolve(false));
+    req.on('timeout', () => { req.destroy(); resolve(false); });
+  });
+}
+
+let mcpdUp = false;
+
+describe('llm smoke', () => {
+  beforeAll(async () => {
+    mcpdUp = await healthz(MCPD_URL);
+    if (!mcpdUp) {
+      // eslint-disable-next-line no-console
+      console.warn(`\n  ○ llm smoke: skipped — ${MCPD_URL}/healthz unreachable. Set MCPD_URL to override.\n`);
+    }
+  }, 20_000);
+
+  afterAll(() => {
+    if (!mcpdUp) return;
+    run(`delete llm ${LLM_NAME}`);
+    run(`delete secret ${SECRET_NAME}`);
+  });
+
+  it('creates a secret to hold the fake API key', () => {
+    if (!mcpdUp) return;
+    run(`delete secret ${SECRET_NAME}`); // idempotent cleanup
+    const result = run(`create secret ${SECRET_NAME} --data token=sk-fake-xyz`);
+    expect(result.code, result.stderr).toBe(0);
+  });
+
+  it('creates an Llm pointing at the secret via --api-key-ref', () => {
+    if (!mcpdUp) return;
+    run(`delete llm ${LLM_NAME}`);
+    const cmd = [
+      `create llm ${LLM_NAME}`,
+      '--type openai',
+      '--model gpt-4o-mini',
+      '--tier fast',
+      '--url http://nowhere.example:9000',
+      `--api-key-ref ${SECRET_NAME}/token`,
+      '--description smoke-test',
+    ].join(' ');
+    const result = run(cmd);
+    expect(result.code, result.stderr || result.stdout).toBe(0);
+    expect(result.stdout).toMatch(new RegExp(`llm '${LLM_NAME}'`));
+  });
+
+  it('describe llm shows the secret ref in sectioned output', () => {
+    if (!mcpdUp) return;
+    const result = run(`describe llm ${LLM_NAME}`);
+    expect(result.code, result.stderr).toBe(0);
+    expect(result.stdout).toContain(`=== LLM: ${LLM_NAME} ===`);
+    expect(result.stdout).toContain('Type:');
+    expect(result.stdout).toContain('openai');
+    expect(result.stdout).toContain('Model:');
+    expect(result.stdout).toContain('gpt-4o-mini');
+    expect(result.stdout).toContain('API Key:');
+    expect(result.stdout).toContain(SECRET_NAME);
+    expect(result.stdout).toContain('token');
+    // Raw key value must NOT appear — only the ref
+    expect(result.stdout).not.toContain('sk-fake-xyz');
+  });
+
+  it('get llms shows the row with KEY column rendered as "secret://name/key"', () => {
+    if (!mcpdUp) return;
+    // Table output truncates the KEY column (≈34 chars), so the full
+    // "secret://<name>/<key>" string won't appear verbatim in the row. Assert
+    // against JSON output where the apiKeyRef round-trips as a structured
+    // object.
+    const result = run('get llms -o json');
+    expect(result.code).toBe(0);
+    const rows = JSON.parse(result.stdout) as Array<{ name: string; apiKeyRef?: { name: string; key: string } }>;
+    const row = rows.find((r) => r.name === LLM_NAME);
+    expect(row, `row ${LLM_NAME} must be present`).toBeDefined();
+    expect(row!.apiKeyRef).toEqual({ name: SECRET_NAME, key: 'token' });
+  });
+
+  it('round-trips yaml output → apply -f', () => {
+    if (!mcpdUp) return;
+    const yaml = run(`get llm ${LLM_NAME} -o yaml`);
+    expect(yaml.code).toBe(0);
+    expect(yaml.stdout).toMatch(/kind:\s+llm/);
+    expect(yaml.stdout).toContain(`name: ${LLM_NAME}`);
+    expect(yaml.stdout).toContain(`name: ${SECRET_NAME}`); // apiKeyRef block
+
+    // Change the description via apply -f with the YAML we just pulled.
+    const dir = mkdtempSync(join(tmpdir(), 'mcpctl-smoke-'));
+    const path = join(dir, 'llm.yaml');
+    const amended = yaml.stdout.replace('description: smoke-test', 'description: smoke-test-amended');
+    writeFileSync(path, amended);
+    try {
+      const applied = run(`apply -f ${path}`);
+      expect(applied.code, applied.stderr || applied.stdout).toBe(0);
+      const described = run(`describe llm ${LLM_NAME}`);
+      expect(described.stdout).toContain('smoke-test-amended');
+    } finally {
+      unlinkSync(path);
+    }
+  });
+
+  it('deletes the llm and leaves the underlying secret intact', () => {
+    if (!mcpdUp) return;
+    const del = run(`delete llm ${LLM_NAME}`);
+    expect(del.code, del.stderr).toBe(0);
+
+    // Secret still exists (apiKeyRef uses onDelete: SetNull so the secret isn't touched)
+    const secret = run(`describe secret ${SECRET_NAME}`);
+    expect(secret.code).toBe(0);
+  });
+});
--- a/src/mcplocal/tests/smoke/project-llm-ref.smoke.test.ts
+++ b/src/mcplocal/tests/smoke/project-llm-ref.smoke.test.ts
@@ -0,0 +1,130 @@
+/**
+ * Smoke tests: Project.llmProvider as Llm reference (Phase 4).
+ *
+ * Verifies the describe-project warning behavior against live mcpd:
+ *   1. Project with `--llm <existing>` → no warning.
+ *   2. Project with `--llm <nonexistent>` → describe flags the orphan.
+ *   3. Project with `--llm none` → explicit disable, no warning.
+ */
+import { describe, it, expect, beforeAll, afterAll } from 'vitest';
+import http from 'node:http';
+import https from 'node:https';
+import { execSync } from 'node:child_process';
+
+const MCPD_URL = process.env.MCPD_URL ?? 'https://mcpctl.ad.itaz.eu';
+const SUFFIX = Date.now().toString(36);
+const LLM_NAME = `smoke-proj-llm-${SUFFIX}`;
+const PROJ_OK = `smoke-proj-ok-${SUFFIX}`;
+const PROJ_ORPHAN = `smoke-proj-orphan-${SUFFIX}`;
+const PROJ_NONE = `smoke-proj-none-${SUFFIX}`;
+
+interface CliResult { code: number; stdout: string; stderr: string }
+
+function run(args: string): CliResult {
+  try {
+    const stdout = execSync(`mcpctl --direct ${args}`, {
+      encoding: 'utf-8',
+      timeout: 30_000,
+      stdio: ['ignore', 'pipe', 'pipe'],
+    });
+    return { code: 0, stdout: stdout.trim(), stderr: '' };
+  } catch (err) {
+    const e = err as { status?: number; stdout?: Buffer | string; stderr?: Buffer | string };
+    return {
+      code: e.status ?? 1,
+      stdout: e.stdout ? (typeof e.stdout === 'string' ? e.stdout : e.stdout.toString('utf-8')) : '',
+      stderr: e.stderr ? (typeof e.stderr === 'string' ? e.stderr : e.stderr.toString('utf-8')) : '',
+    };
+  }
+}
+
+function healthz(url: string, timeoutMs = 5000): Promise<boolean> {
+  return new Promise((resolve) => {
+    const parsed = new URL(`${url.replace(/\/$/, '')}/healthz`);
+    const driver = parsed.protocol === 'https:' ? https : http;
+    const req = driver.get(
+      {
+        hostname: parsed.hostname,
+        port: parsed.port || (parsed.protocol === 'https:' ? 443 : 80),
+        path: parsed.pathname,
+        timeout: timeoutMs,
+      },
+      (res) => { resolve((res.statusCode ?? 500) < 500); res.resume(); },
+    );
+    req.on('error', () => resolve(false));
+    req.on('timeout', () => { req.destroy(); resolve(false); });
+  });
+}
+
+let mcpdUp = false;
+
+describe('project-llm-ref smoke', () => {
+  beforeAll(async () => {
+    mcpdUp = await healthz(MCPD_URL);
+    if (!mcpdUp) {
+      // eslint-disable-next-line no-console
+      console.warn(`\n  ○ project-llm-ref smoke: skipped — ${MCPD_URL}/healthz unreachable.\n`);
+      return;
+    }
+    // Fixture: an Llm we can point projects at.
+    run(`delete llm ${LLM_NAME}`);
+    const createLlm = run([
+      `create llm ${LLM_NAME}`,
+      '--type openai',
+      '--model gpt-4o-mini',
+      '--tier fast',
+      '--url http://127.0.0.1:1',
+    ].join(' '));
+    if (createLlm.code !== 0) {
+      // eslint-disable-next-line no-console
+      console.warn(`    ○ could not create fixture Llm: ${createLlm.stderr || createLlm.stdout}`);
+    }
+  }, 30_000);
+
+  afterAll(() => {
+    if (!mcpdUp) return;
+    run(`delete project ${PROJ_OK} --force`);
+    run(`delete project ${PROJ_ORPHAN} --force`);
+    run(`delete project ${PROJ_NONE} --force`);
+    run(`delete llm ${LLM_NAME}`);
+  });
+
+  it('project with --llm pointing at a registered Llm describes without warning', () => {
+    if (!mcpdUp) return;
+    run(`delete project ${PROJ_OK} --force`);
+    const created = run(`create project ${PROJ_OK} --llm ${LLM_NAME}`);
+    expect(created.code, created.stderr || created.stdout).toBe(0);
+
+    const described = run(`describe project ${PROJ_OK}`);
+    expect(described.code).toBe(0);
+    expect(described.stdout).toContain('LLM:');
+    expect(described.stdout).toContain(LLM_NAME);
+    expect(described.stdout).not.toContain('warning:');
+  });
+
+  it('project with --llm naming an unregistered Llm shows the warning line', () => {
+    if (!mcpdUp) return;
+    run(`delete project ${PROJ_ORPHAN} --force`);
+    const created = run(`create project ${PROJ_ORPHAN} --llm claude-ghost-${SUFFIX}`);
+    expect(created.code, created.stderr || created.stdout).toBe(0);
+
+    const described = run(`describe project ${PROJ_ORPHAN}`);
+    expect(described.code).toBe(0);
+    expect(described.stdout).toContain(`claude-ghost-${SUFFIX}`);
+    expect(described.stdout).toContain('warning:');
+    expect(described.stdout).toContain('registry default');
+  });
+
+  it('project with --llm none treats it as an explicit disable (no warning)', () => {
+    if (!mcpdUp) return;
+    run(`delete project ${PROJ_NONE} --force`);
+    const created = run(`create project ${PROJ_NONE} --llm none`);
+    expect(created.code).toBe(0);
+
+    const described = run(`describe project ${PROJ_NONE}`);
+    expect(described.code).toBe(0);
+    expect(described.stdout).toContain('LLM:');
+    expect(described.stdout).toContain('none');
+    expect(described.stdout).not.toContain('warning:');
+  });
+});
--- a/src/mcplocal/tests/smoke/secretbackend.smoke.test.ts
+++ b/src/mcplocal/tests/smoke/secretbackend.smoke.test.ts
@@ -0,0 +1,146 @@
+/**
+ * Smoke tests: SecretBackend CRUD against live mcpd.
+ *
+ * Exercises the Phase 0 CLI contract end-to-end:
+ *   1. `mcpctl get secretbackends` — the seeded `default` (plaintext) row exists
+ *      and is marked isDefault.
+ *   2. `mcpctl create secretbackend <name> --type plaintext` — create + list.
+ *   3. `mcpctl describe secretbackend <name>` — sectioned output; config
+ *      values that look like credentials are masked.
+ *   4. `mcpctl delete secretbackend default` — fails with 409 (cannot delete
+ *      the default row).
+ *   5. Cleanup: delete the test row; confirm it's gone.
+ *
+ * Target: mcpd direct (not mcplocal). We use `--direct` so the CLI bypasses
+ * mcplocal and hits mcpd at the configured URL. If mcpd is unreachable we
+ * skip with a clear message — same pattern as the mcptoken smoke.
+ *
+ * Run with: pnpm test:smoke
+ */
+import { describe, it, expect, beforeAll, afterAll } from 'vitest';
+import http from 'node:http';
+import https from 'node:https';
+import { execSync } from 'node:child_process';
+
+const MCPD_URL = process.env.MCPD_URL ?? 'https://mcpctl.ad.itaz.eu';
+const BACKEND_NAME = `smoke-sb-${Date.now().toString(36)}`;
+
+interface CliResult { code: number; stdout: string; stderr: string }
+
+function run(args: string): CliResult {
+  try {
+    const stdout = execSync(`mcpctl --direct ${args}`, {
+      encoding: 'utf-8',
+      timeout: 30_000,
+      stdio: ['ignore', 'pipe', 'pipe'],
+    });
+    return { code: 0, stdout: stdout.trim(), stderr: '' };
+  } catch (err) {
+    const e = err as { status?: number; stdout?: Buffer | string; stderr?: Buffer | string };
+    return {
+      code: e.status ?? 1,
+      stdout: e.stdout ? (typeof e.stdout === 'string' ? e.stdout : e.stdout.toString('utf-8')) : '',
+      stderr: e.stderr ? (typeof e.stderr === 'string' ? e.stderr : e.stderr.toString('utf-8')) : '',
+    };
+  }
+}
+
+function healthz(url: string, timeoutMs = 5000): Promise<boolean> {
+  return new Promise((resolve) => {
+    const parsed = new URL(`${url.replace(/\/$/, '')}/healthz`);
+    const driver = parsed.protocol === 'https:' ? https : http;
+    const req = driver.get(
+      {
+        hostname: parsed.hostname,
+        port: parsed.port || (parsed.protocol === 'https:' ? 443 : 80),
+        path: parsed.pathname,
+        timeout: timeoutMs,
+      },
+      (res) => { resolve((res.statusCode ?? 500) < 500); res.resume(); },
+    );
+    req.on('error', () => resolve(false));
+    req.on('timeout', () => { req.destroy(); resolve(false); });
+  });
+}
+
+let mcpdUp = false;
+
+describe('secretbackend smoke', () => {
+  beforeAll(async () => {
+    mcpdUp = await healthz(MCPD_URL);
+    if (!mcpdUp) {
+      // eslint-disable-next-line no-console
+      console.warn(`\n  ○ secretbackend smoke: skipped — ${MCPD_URL}/healthz unreachable. Set MCPD_URL to override.\n`);
+    }
+  }, 20_000);
+
+  afterAll(() => {
+    if (!mcpdUp) return;
+    run(`delete secretbackend ${BACKEND_NAME}`);
+  });
+
+  it('lists at least one secretbackend (the seeded plaintext default)', () => {
+    if (!mcpdUp) return;
+    const result = run('get secretbackends -o json');
+    expect(result.code, result.stderr).toBe(0);
+    const rows = JSON.parse(result.stdout) as Array<{ name: string; type: string; isDefault: boolean }>;
+    expect(rows.length).toBeGreaterThan(0);
+    const defaultRow = rows.find((r) => r.isDefault === true);
+    expect(defaultRow, 'a default backend must exist').toBeDefined();
+    expect(defaultRow!.type).toBe('plaintext');
+  });
+
+  it('creates a plaintext backend and round-trips it through describe', () => {
+    if (!mcpdUp) return;
+    // Idempotent cleanup in case a prior run left debris
+    run(`delete secretbackend ${BACKEND_NAME}`);
+
+    const created = run(`create secretbackend ${BACKEND_NAME} --type plaintext --description smoke-test`);
+    expect(created.code, created.stderr || created.stdout).toBe(0);
+    expect(created.stdout).toMatch(new RegExp(`secretbackend '${BACKEND_NAME}'`));
+
+    const described = run(`describe secretbackend ${BACKEND_NAME}`);
+    expect(described.code, described.stderr).toBe(0);
+    expect(described.stdout).toContain(`=== SecretBackend: ${BACKEND_NAME} ===`);
+    expect(described.stdout).toContain('Type:');
+    expect(described.stdout).toContain('plaintext');
+    expect(described.stdout).toContain('smoke-test');
+  });
+
+  it('refuses to delete the seeded default backend', () => {
+    if (!mcpdUp) return;
+    // Find whichever row is currently the default — we don't hard-code the name
+    // because operators may have renamed or swapped it.
+    const listed = run('get secretbackends -o json');
+    expect(listed.code).toBe(0);
+    const rows = JSON.parse(listed.stdout) as Array<{ name: string; isDefault: boolean }>;
+    const def = rows.find((r) => r.isDefault);
+    expect(def).toBeDefined();
+
+    const del = run(`delete secretbackend ${def!.name}`);
+    // 409 surfaces as exit 1 with a descriptive error
+    expect(del.code).toBe(1);
+    const combined = (del.stderr + del.stdout).toLowerCase();
+    expect(combined).toMatch(/default|in use|cannot delete/);
+  });
+
+  it('round-trips get -o yaml → apply -f', () => {
+    if (!mcpdUp) return;
+    const yaml = run(`get secretbackend ${BACKEND_NAME} -o yaml`);
+    expect(yaml.code).toBe(0);
+    // Apply-compatible output must start with `kind: secretbackend`
+    expect(yaml.stdout).toMatch(/kind:\s+secretbackend/);
+    expect(yaml.stdout).toContain(`name: ${BACKEND_NAME}`);
+    expect(yaml.stdout).toContain('type: plaintext');
+  });
+
+  it('deletes the test backend and confirms it is gone', () => {
+    if (!mcpdUp) return;
+    const del = run(`delete secretbackend ${BACKEND_NAME}`);
+    expect(del.code, del.stderr).toBe(0);
+
+    const listed = run('get secretbackends -o json');
+    const rows = JSON.parse(listed.stdout) as Array<{ name: string }>;
+    expect(rows.find((r) => r.name === BACKEND_NAME)).toBeUndefined();
+  });
+});
--- a/src/shared/src/index.ts
+++ b/src/shared/src/index.ts
@@ -5,3 +5,4 @@ export * from './utils/index.js';
 export * from './secrets/index.js';
 export * from './tokens/index.js';
 export * from './mcp-http/index.js';
+export * from './vault/index.js';
--- a/src/shared/src/vault/client.ts
+++ b/src/shared/src/vault/client.ts
@@ -0,0 +1,308 @@
+/**
+ * Thin HTTP wrappers around the OpenBao / Vault REST API.
+ *
+ * Used by:
+ *   - the CLI wizard (admin-token-scoped calls: enable engine, write policy,
+ *     create role, mint first token, smoke-test write/read)
+ *   - the mcpd rotator (caller-token-scoped calls: mint successor, revoke
+ *     predecessor, lookup-self for verification)
+ *
+ * Plain `fetch()` — no SDK dep, consistent with the OpenBaoDriver. All
+ * functions accept an injectable `fetch` in a deps arg so tests can mock.
+ */
+
+export interface VaultDeps {
+  fetch?: typeof globalThis.fetch;
+  /** Optional Vault Enterprise namespace (X-Vault-Namespace header). */
+  namespace?: string;
+}
+
+export interface VaultHealth {
+  initialized: boolean;
+  sealed: boolean;
+  standby: boolean;
+  version: string;
+}
+
+export interface MintedToken {
+  /** The raw client token (treat as secret — surface to user only in wizard transcript). */
+  clientToken: string;
+  /** Accessor used to revoke without knowing the token value. */
+  accessor: string;
+  /** TTL in seconds reported by Vault. For periodic tokens this is the period. */
+  leaseDuration: number;
+  /** True iff Vault said the token is renewable. The wizard bails if false. */
+  renewable: boolean;
+  policies: string[];
+}
+
+function baseUrl(url: string): string {
+  return url.replace(/\/+$/, '');
+}
+
+function headers(token: string | undefined, ns: string | undefined, withBody: boolean): Record<string, string> {
+  const h: Record<string, string> = {};
+  if (token !== undefined && token !== '') h['X-Vault-Token'] = token;
+  if (ns !== undefined && ns !== '') h['X-Vault-Namespace'] = ns;
+  if (withBody) h['Content-Type'] = 'application/json';
+  return h;
+}
+
+async function readError(res: Response): Promise<string> {
+  const text = await res.text().catch(() => '');
+  try {
+    const parsed = JSON.parse(text) as { errors?: string[] };
+    if (Array.isArray(parsed.errors) && parsed.errors.length > 0) return parsed.errors.join('; ');
+  } catch { /* fall through */ }
+  return text;
+}
+
+/** GET /v1/sys/health. Returns a normalised shape; throws on network error. */
+export async function verifyHealth(
+  url: string,
+  adminToken: string,
+  deps: VaultDeps = {},
+): Promise<VaultHealth> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  // /sys/health returns 200/429/472/473/501/503 depending on state. All are
+  // valid responses to parse; anything else is a hard error.
+  const res = await fetchImpl(`${baseUrl(url)}/v1/sys/health`, {
+    method: 'GET',
+    headers: headers(adminToken, deps.namespace, false),
+  });
+  if (res.status >= 500 && res.status !== 501 && res.status !== 503) {
+    throw new Error(`vault health: HTTP ${String(res.status)} ${await readError(res)}`);
+  }
+  const body = await res.json() as Partial<VaultHealth> & { version?: string };
+  return {
+    initialized: body.initialized ?? false,
+    sealed: body.sealed ?? false,
+    standby: body.standby ?? false,
+    version: body.version ?? 'unknown',
+  };
+}
+
+/**
+ * Enable KV v2 at `mount` if not already mounted there. Idempotent.
+ * Returns `true` if a mount was created, `false` if it was already present.
+ */
+export async function ensureKvV2(
+  url: string,
+  adminToken: string,
+  mount: string,
+  deps: VaultDeps = {},
+): Promise<boolean> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  const clean = mount.replace(/^\/|\/$/g, '');
+  // Check existing mounts
+  const listRes = await fetchImpl(`${baseUrl(url)}/v1/sys/mounts`, {
+    method: 'GET',
+    headers: headers(adminToken, deps.namespace, false),
+  });
+  if (!listRes.ok) {
+    throw new Error(`vault list mounts: HTTP ${String(listRes.status)} ${await readError(listRes)}`);
+  }
+  const mounts = await listRes.json() as Record<string, { type?: string; options?: { version?: string } }>;
+  const key = `${clean}/`;
+  const existing = mounts[key];
+  if (existing !== undefined) {
+    if (existing.type !== 'kv') {
+      throw new Error(`mount at '${clean}/' exists but is type '${String(existing.type)}', not kv`);
+    }
+    // Accept either v2 or unspecified (older Vault treats kv without options as v1 — surface a clear error).
+    if (existing.options?.version !== '2') {
+      throw new Error(`mount '${clean}/' is KV but not v2 (version='${String(existing.options?.version)}'). Use a different mount.`);
+    }
+    return false;
+  }
+  // Mount it
+  const mountRes = await fetchImpl(`${baseUrl(url)}/v1/sys/mounts/${clean}`, {
+    method: 'POST',
+    headers: headers(adminToken, deps.namespace, true),
+    body: JSON.stringify({ type: 'kv', options: { version: '2' } }),
+  });
+  if (!mountRes.ok) {
+    throw new Error(`vault mount ${clean}: HTTP ${String(mountRes.status)} ${await readError(mountRes)}`);
+  }
+  return true;
+}
+
+/** PUT /v1/sys/policies/acl/<name> with the provided HCL. Idempotent. */
+export async function writePolicy(
+  url: string,
+  adminToken: string,
+  name: string,
+  hcl: string,
+  deps: VaultDeps = {},
+): Promise<void> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  const res = await fetchImpl(`${baseUrl(url)}/v1/sys/policies/acl/${encodeURIComponent(name)}`, {
+    method: 'PUT',
+    headers: headers(adminToken, deps.namespace, true),
+    body: JSON.stringify({ policy: hcl }),
+  });
+  if (!res.ok) {
+    throw new Error(`vault write policy ${name}: HTTP ${String(res.status)} ${await readError(res)}`);
+  }
+}
+
+export interface TokenRoleConfig {
+  allowedPolicies: string[];
+  /** Seconds. For `period`, pass 0 to omit. */
+  period?: number;
+  renewable?: boolean;
+  orphan?: boolean;
+}
+
+/** POST /v1/auth/token/roles/<role>. Idempotent: upserts the role config. */
+export async function ensureTokenRole(
+  url: string,
+  adminToken: string,
+  role: string,
+  cfg: TokenRoleConfig,
+  deps: VaultDeps = {},
+): Promise<void> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  const body: Record<string, unknown> = {
+    allowed_policies: cfg.allowedPolicies,
+    renewable: cfg.renewable ?? true,
+    orphan: cfg.orphan ?? false,
+  };
+  if (cfg.period !== undefined && cfg.period > 0) body.period = cfg.period;
+  const res = await fetchImpl(`${baseUrl(url)}/v1/auth/token/roles/${encodeURIComponent(role)}`, {
+    method: 'POST',
+    headers: headers(adminToken, deps.namespace, true),
+    body: JSON.stringify(body),
+  });
+  if (!res.ok) {
+    throw new Error(`vault ensure role ${role}: HTTP ${String(res.status)} ${await readError(res)}`);
+  }
+}
+
+/**
+ * POST /v1/auth/token/create/<role>. Caller must hold a token with
+ * `create` on that path (admin, or a previously-minted successor).
+ */
+export async function mintRoleToken(
+  url: string,
+  callerToken: string,
+  role: string,
+  deps: VaultDeps = {},
+): Promise<MintedToken> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  const res = await fetchImpl(`${baseUrl(url)}/v1/auth/token/create/${encodeURIComponent(role)}`, {
+    method: 'POST',
+    headers: headers(callerToken, deps.namespace, true),
+    body: JSON.stringify({}),
+  });
+  if (!res.ok) {
+    throw new Error(`vault mint role-token ${role}: HTTP ${String(res.status)} ${await readError(res)}`);
+  }
+  const body = await res.json() as {
+    auth?: {
+      client_token?: string;
+      accessor?: string;
+      lease_duration?: number;
+      renewable?: boolean;
+      policies?: string[];
+    };
+  };
+  const a = body.auth;
+  if (a?.client_token === undefined || a?.accessor === undefined) {
+    throw new Error(`vault mint role-token ${role}: response missing auth.client_token or accessor`);
+  }
+  return {
+    clientToken: a.client_token,
+    accessor: a.accessor,
+    leaseDuration: a.lease_duration ?? 0,
+    renewable: a.renewable ?? false,
+    policies: a.policies ?? [],
+  };
+}
+
+/** POST /v1/auth/token/revoke-accessor. Idempotent — revoking an unknown accessor returns 204. */
+export async function revokeAccessor(
+  url: string,
+  callerToken: string,
+  accessor: string,
+  deps: VaultDeps = {},
+): Promise<void> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  const res = await fetchImpl(`${baseUrl(url)}/v1/auth/token/revoke-accessor`, {
+    method: 'POST',
+    headers: headers(callerToken, deps.namespace, true),
+    body: JSON.stringify({ accessor }),
+  });
+  // 204 = revoked, 400 = already revoked/unknown (treat as noop)
+  if (!res.ok && res.status !== 400) {
+    throw new Error(`vault revoke-accessor: HTTP ${String(res.status)} ${await readError(res)}`);
+  }
+}
+
+/** GET /v1/auth/token/lookup-self. Returns accessor + remaining TTL on the caller's token. */
+export async function lookupSelf(
+  url: string,
+  callerToken: string,
+  deps: VaultDeps = {},
+): Promise<{ accessor: string; ttl: number; policies: string[] }> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  const res = await fetchImpl(`${baseUrl(url)}/v1/auth/token/lookup-self`, {
+    method: 'GET',
+    headers: headers(callerToken, deps.namespace, false),
+  });
+  if (!res.ok) {
+    throw new Error(`vault lookup-self: HTTP ${String(res.status)} ${await readError(res)}`);
+  }
+  const body = await res.json() as { data?: { accessor?: string; ttl?: number; policies?: string[] } };
+  return {
+    accessor: body.data?.accessor ?? '',
+    ttl: body.data?.ttl ?? 0,
+    policies: body.data?.policies ?? [],
+  };
+}
+
+/**
+ * Round-trip smoke test: write a marker secret, read it back, delete metadata.
+ * Used by the wizard to prove the minted token's policy is wired correctly
+ * before reporting success to the user.
+ */
+export async function testWriteReadDelete(
+  url: string,
+  callerToken: string,
+  mount: string,
+  relPath: string,
+  deps: VaultDeps = {},
+): Promise<void> {
+  const fetchImpl = deps.fetch ?? globalThis.fetch;
+  const dataUrl = `${baseUrl(url)}/v1/${mount}/data/${relPath.replace(/^\//, '')}`;
+  const metaUrl = `${baseUrl(url)}/v1/${mount}/metadata/${relPath.replace(/^\//, '')}`;
+
+  const writeRes = await fetchImpl(dataUrl, {
+    method: 'POST',
+    headers: headers(callerToken, deps.namespace, true),
+    body: JSON.stringify({ data: { marker: 'mcpctl-smoke', at: new Date().toISOString() } }),
+  });
+  if (!writeRes.ok) {
+    throw new Error(`vault smoke write ${relPath}: HTTP ${String(writeRes.status)} ${await readError(writeRes)}`);
+  }
+
+  const readRes = await fetchImpl(dataUrl, {
+    method: 'GET',
+    headers: headers(callerToken, deps.namespace, false),
+  });
+  if (!readRes.ok) {
+    throw new Error(`vault smoke read ${relPath}: HTTP ${String(readRes.status)} ${await readError(readRes)}`);
+  }
+  const readBody = await readRes.json() as { data?: { data?: { marker?: string } } };
+  if (readBody.data?.data?.marker !== 'mcpctl-smoke') {
+    throw new Error(`vault smoke: read-back didn't match written marker`);
+  }
+
+  const delRes = await fetchImpl(metaUrl, {
+    method: 'DELETE',
+    headers: headers(callerToken, deps.namespace, false),
+  });
+  if (!delRes.ok && delRes.status !== 404) {
+    throw new Error(`vault smoke delete ${relPath}: HTTP ${String(delRes.status)} ${await readError(delRes)}`);
+  }
+}
--- a/src/shared/src/vault/index.ts
+++ b/src/shared/src/vault/index.ts
@@ -0,0 +1,2 @@
+export * from './client.js';
+export * from './policy.js';
--- a/src/shared/src/vault/policy.ts
+++ b/src/shared/src/vault/policy.ts
@@ -0,0 +1,35 @@
+/**
+ * OpenBao / Vault policy template for mcpd's wizard-provisioned backend.
+ *
+ * The policy is deliberately narrow:
+ *   - Read/write/list/delete under `<mount>/{data,metadata}/<pathPrefix>/*`
+ *   - Self-rotation: mcpd can mint its successor via the dedicated token role
+ *     and revoke its predecessor by accessor.
+ *
+ * Keeping the paths in one place lets the wizard and the rotator agree on
+ * exactly which capabilities the stored token has, and lets tests assert the
+ * generated HCL is stable.
+ */
+
+export interface AppMcpdPolicyConfig {
+  /** KV v2 mount name. Default: 'secret'. */
+  mount: string;
+  /** Path prefix under the mount (the directory mcpd is confined to). Default: 'mcpd'. */
+  pathPrefix: string;
+  /** Token role name the policy allows self-rotation against. Default: 'app-mcpd-role'. */
+  tokenRole: string;
+}
+
+export function buildAppMcpdPolicyHcl(cfg: AppMcpdPolicyConfig): string {
+  const { mount, pathPrefix, tokenRole } = cfg;
+  const prefix = pathPrefix.replace(/^\/|\/$/g, '');
+  return [
+    `path "${mount}/data/${prefix}/*"      { capabilities = ["create", "read", "update"] }`,
+    `path "${mount}/metadata/${prefix}/*"  { capabilities = ["list", "delete"] }`,
+    `path "${mount}/metadata/${prefix}/"   { capabilities = ["list"] }`,
+    `path "auth/token/create/${tokenRole}" { capabilities = ["create", "update"] }`,
+    `path "auth/token/revoke-accessor"     { capabilities = ["update"] }`,
+    `path "auth/token/lookup-self"         { capabilities = ["read"] }`,
+    '',
+  ].join('\n');
+}
--- a/src/shared/tests/vault-client.test.ts
+++ b/src/shared/tests/vault-client.test.ts
@@ -0,0 +1,183 @@
+import { describe, it, expect, vi } from 'vitest';
+import {
+  buildAppMcpdPolicyHcl,
+  verifyHealth,
+  ensureKvV2,
+  writePolicy,
+  ensureTokenRole,
+  mintRoleToken,
+  revokeAccessor,
+  lookupSelf,
+  testWriteReadDelete,
+} from '../src/vault/index.js';
+
+function mockFetch(responses: Array<{ match: RegExp; status: number; body?: unknown; text?: string }>): ReturnType<typeof vi.fn> {
+  return vi.fn(async (url: string | URL, init?: RequestInit) => {
+    const u = String(url);
+    const method = init?.method ?? 'GET';
+    const match = responses.find((r) => r.match.test(`${method} ${u}`) || r.match.test(u));
+    if (!match) throw new Error(`unexpected fetch: ${method} ${u}`);
+    const body = match.body !== undefined ? JSON.stringify(match.body) : (match.text ?? '');
+    return new Response(body, { status: match.status, headers: { 'Content-Type': 'application/json' } });
+  });
+}
+
+describe('buildAppMcpdPolicyHcl', () => {
+  it('emits stable HCL for the documented default', () => {
+    const hcl = buildAppMcpdPolicyHcl({ mount: 'secret', pathPrefix: 'mcpd', tokenRole: 'app-mcpd-role' });
+    expect(hcl).toContain('path "secret/data/mcpd/*"');
+    expect(hcl).toContain('path "secret/metadata/mcpd/*"');
+    expect(hcl).toContain('path "auth/token/create/app-mcpd-role"');
+    expect(hcl).toContain('path "auth/token/revoke-accessor"');
+    expect(hcl).toContain('capabilities = ["read"]');
+  });
+
+  it('normalises leading/trailing slashes in pathPrefix', () => {
+    const hcl = buildAppMcpdPolicyHcl({ mount: 'secret', pathPrefix: '/mcpd/', tokenRole: 'r' });
+    expect(hcl).not.toContain('//');
+    expect(hcl).toContain('path "secret/data/mcpd/*"');
+  });
+});
+
+describe('verifyHealth', () => {
+  it('returns normalised shape for a healthy unsealed vault', async () => {
+    const fetchFn = mockFetch([{ match: /\/v1\/sys\/health$/, status: 200, body: { initialized: true, sealed: false, standby: false, version: '2.5.2' } }]);
+    const h = await verifyHealth('http://bao.example:8200', 'root', { fetch: fetchFn as unknown as typeof fetch });
+    expect(h).toEqual({ initialized: true, sealed: false, standby: false, version: '2.5.2' });
+  });
+
+  it('throws on non-standard 5xx', async () => {
+    const fetchFn = vi.fn(async () => new Response('boom', { status: 502 }));
+    await expect(verifyHealth('http://x', 'root', { fetch: fetchFn as unknown as typeof fetch })).rejects.toThrow(/HTTP 502/);
+  });
+});
+
+describe('ensureKvV2', () => {
+  it('returns false when mount already exists as kv v2', async () => {
+    const fetchFn = mockFetch([
+      { match: /GET .*\/v1\/sys\/mounts$/, status: 200, body: { 'secret/': { type: 'kv', options: { version: '2' } } } },
+    ]);
+    const created = await ensureKvV2('http://x', 'root', 'secret', { fetch: fetchFn as unknown as typeof fetch });
+    expect(created).toBe(false);
+  });
+
+  it('mounts KV v2 when mount is missing', async () => {
+    const fetchFn = mockFetch([
+      { match: /GET .*\/v1\/sys\/mounts$/, status: 200, body: {} },
+      { match: /POST .*\/v1\/sys\/mounts\/secret$/, status: 200 },
+    ]);
+    const created = await ensureKvV2('http://x', 'root', 'secret', { fetch: fetchFn as unknown as typeof fetch });
+    expect(created).toBe(true);
+  });
+
+  it('rejects when mount exists but is kv v1', async () => {
+    const fetchFn = mockFetch([
+      { match: /GET .*\/v1\/sys\/mounts$/, status: 200, body: { 'secret/': { type: 'kv', options: { version: '1' } } } },
+    ]);
+    await expect(ensureKvV2('http://x', 'root', 'secret', { fetch: fetchFn as unknown as typeof fetch }))
+      .rejects.toThrow(/not v2/);
+  });
+});
+
+describe('writePolicy', () => {
+  it('PUTs the HCL to /v1/sys/policies/acl/<name>', async () => {
+    const fetchFn = mockFetch([{ match: /PUT .*\/v1\/sys\/policies\/acl\/app-mcpd$/, status: 200 }]);
+    await writePolicy('http://x', 'root', 'app-mcpd', 'path "x" {}', { fetch: fetchFn as unknown as typeof fetch });
+    const [, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect(init.method).toBe('PUT');
+    const body = JSON.parse(init.body as string) as { policy: string };
+    expect(body.policy).toContain('path "x"');
+  });
+});
+
+describe('ensureTokenRole', () => {
+  it('POSTs the role config with period + renewable', async () => {
+    const fetchFn = mockFetch([{ match: /POST .*\/v1\/auth\/token\/roles\/app-mcpd-role$/, status: 200 }]);
+    await ensureTokenRole('http://x', 'root', 'app-mcpd-role', {
+      allowedPolicies: ['app-mcpd'],
+      period: 720 * 3600,
+      renewable: true,
+    }, { fetch: fetchFn as unknown as typeof fetch });
+    const [, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    const sent = JSON.parse(init.body as string) as Record<string, unknown>;
+    expect(sent.allowed_policies).toEqual(['app-mcpd']);
+    expect(sent.period).toBe(720 * 3600);
+    expect(sent.renewable).toBe(true);
+    expect(sent.orphan).toBe(false);
+  });
+});
+
+describe('mintRoleToken', () => {
+  it('parses the auth block into a MintedToken', async () => {
+    const fetchFn = mockFetch([{
+      match: /POST .*\/v1\/auth\/token\/create\/app-mcpd-role$/,
+      status: 200,
+      body: { auth: { client_token: 'hvs.CAE.xyz', accessor: 'acc-1', lease_duration: 2592000, renewable: true, policies: ['app-mcpd', 'default'] } },
+    }]);
+    const m = await mintRoleToken('http://x', 'caller', 'app-mcpd-role', { fetch: fetchFn as unknown as typeof fetch });
+    expect(m.clientToken).toBe('hvs.CAE.xyz');
+    expect(m.accessor).toBe('acc-1');
+    expect(m.leaseDuration).toBe(2592000);
+    expect(m.renewable).toBe(true);
+    expect(m.policies).toEqual(['app-mcpd', 'default']);
+  });
+
+  it('throws when the response is missing auth.client_token', async () => {
+    const fetchFn = mockFetch([{ match: /create\/r$/, status: 200, body: { auth: { accessor: 'acc' } } }]);
+    await expect(mintRoleToken('http://x', 'caller', 'r', { fetch: fetchFn as unknown as typeof fetch }))
+      .rejects.toThrow(/missing auth.client_token/);
+  });
+});
+
+describe('revokeAccessor', () => {
+  it('swallows 400 (already revoked/unknown)', async () => {
+    const fetchFn = vi.fn(async () => new Response('{}', { status: 400 }));
+    await expect(revokeAccessor('http://x', 'caller', 'acc', { fetch: fetchFn as unknown as typeof fetch }))
+      .resolves.toBeUndefined();
+  });
+});
+
+describe('lookupSelf', () => {
+  it('extracts accessor + ttl from data block', async () => {
+    const fetchFn = mockFetch([{
+      match: /lookup-self$/,
+      status: 200,
+      body: { data: { accessor: 'acc-7', ttl: 2591000, policies: ['app-mcpd'] } },
+    }]);
+    const r = await lookupSelf('http://x', 'caller', { fetch: fetchFn as unknown as typeof fetch });
+    expect(r).toEqual({ accessor: 'acc-7', ttl: 2591000, policies: ['app-mcpd'] });
+  });
+});
+
+describe('testWriteReadDelete', () => {
+  it('runs write→read→delete and succeeds on round-trip match', async () => {
+    const calls: string[] = [];
+    const fetchFn = vi.fn(async (url: string | URL, init?: RequestInit) => {
+      const u = String(url);
+      const m = init?.method ?? 'GET';
+      calls.push(`${m} ${u}`);
+      if (m === 'POST') return new Response('{}', { status: 200 });
+      if (m === 'GET') {
+        return new Response(JSON.stringify({ data: { data: { marker: 'mcpctl-smoke' } } }), { status: 200 });
+      }
+      // DELETE
+      return new Response(null, { status: 200 });
+    });
+    await testWriteReadDelete('http://x', 'caller', 'secret', 'mcpd/smoke', { fetch: fetchFn as unknown as typeof fetch });
+    expect(calls).toHaveLength(3);
+    expect(calls[0]).toMatch(/POST .*\/v1\/secret\/data\/mcpd\/smoke$/);
+    expect(calls[1]).toMatch(/GET .*\/v1\/secret\/data\/mcpd\/smoke$/);
+    expect(calls[2]).toMatch(/DELETE .*\/v1\/secret\/metadata\/mcpd\/smoke$/);
+  });
+
+  it('throws when read-back marker does not match', async () => {
+    const fetchFn = vi.fn(async (_u: string | URL, init?: RequestInit) => {
+      if ((init?.method ?? 'GET') === 'GET') {
+        return new Response(JSON.stringify({ data: { data: { marker: 'wrong' } } }), { status: 200 });
+      }
+      return new Response('{}', { status: 200 });
+    });
+    await expect(testWriteReadDelete('http://x', 'c', 'secret', 'p', { fetch: fetchFn as unknown as typeof fetch }))
+      .rejects.toThrow(/didn't match written marker/);
+  });
+});
Author	SHA1	Message	Date
Michal	dd4246878d	feat(openbao): wizard-provisioning + daily token rotation Some checks failed CI/CD / typecheck (pull_request) Successful in 55s Details CI/CD / test (pull_request) Successful in 1m4s Details CI/CD / lint (pull_request) Successful in 2m2s Details CI/CD / smoke (pull_request) Failing after 1m36s Details CI/CD / build (pull_request) Successful in 4m13s Details CI/CD / publish (pull_request) Has been skipped Details One-command setup replaces the 6-step manual flow — `mcpctl create secretbackend bao --type openbao --wizard` takes the OpenBao admin token once, provisions a narrow policy + token role, mints the first periodic token, stores it on mcpd, verifies end-to-end, and prints the migration command. The admin token is NEVER persisted. The stored credential auto-rotates daily: mcpd mints a successor via the token role (self-rotation capability is part of the policy it was issued with), verifies the successor, writes it over the backing Secret, then revokes the predecessor by accessor. TTL 720h means a week of rotation failures still leaves 20+ days of runway. Shared: - New `@mcpctl/shared/vault` — pure HTTP wrappers (verifyHealth, ensureKvV2, writePolicy, ensureTokenRole, mintRoleToken, revokeAccessor, lookupSelf, testWriteReadDelete) and policy HCL builder. mcpd: - `tokenMeta Json @default("{}")` on SecretBackend. Self-healing schema migration — empty default lets `prisma db push` add the column cleanly. - SecretBackendRotator.rotateOne: mint → verify → persist → revoke-old → update tokenMeta. Failures surface via `lastRotationError` on the row; the old token keeps working. - SecretBackendRotatorLoop: on startup rotates overdue backends, schedules per-backend timers with ±10min jitter. Stops cleanly on shutdown. - New `POST /api/v1/secretbackends/:id/rotate` (operation `rotate-secretbackend` — added to bootstrap-admin's auto-migrated ops alongside migrate-secrets, which was previously missing too). CLI: - `--wizard` on `create secretbackend` delegates to the interactive flow. All prompts can be pre-answered via flags (--url, --admin-token, --mount, --path-prefix, --policy-name, --token-role, --no-promote-default) for CI. - `mcpctl rotate secretbackend <name>` — convenience verb; hits the new rotate endpoint. - `describe secretbackend` renders a Token health section (healthy / STALE / WARNING / ERROR) with generated/renewal/expiry timestamps and last rotation error. Only shown when tokenMeta.rotatable is true — the existing k8s-auth + static-token backends don't surface it. Tests: 15 vault-client unit tests (shared), 8 rotator unit tests (mcpd), 3 wizard flow tests (cli, including a regression test that the admin token never appears in stdout). Full suite 1885/1885 (+32). Completions regenerated for the new flags. Out of scope (explicit): kubernetes-auth wizard, Vault Enterprise namespaces in the wizard path, rotation for non-wizard static-token backends. See plan file for details. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 17:20:37 +01:00
Michal	515206685b	feat(openbao): kubernetes ServiceAccount auth — no static token in DB Some checks failed CI/CD / lint (push) Successful in 52s Details CI/CD / test (push) Successful in 1m5s Details CI/CD / typecheck (push) Successful in 2m8s Details CI/CD / smoke (push) Failing after 3m38s Details CI/CD / build (push) Successful in 4m15s Details CI/CD / publish (push) Has been skipped Details Why: requiring a static OpenBao root token to live (even once-bootstrap) on the plaintext backend is the weakest link in the chain. With the bao-side Kubernetes auth method enabled, mcpd's pod can authenticate using its own projected SA token, exchange it for a short-lived Vault client token, and keep the database free of any vault credentials at all. Driver changes (src/mcpd/src/services/secret-backends/openbao.ts): - New `OpenBaoConfig.auth = 'token' \| 'kubernetes'`. Defaults to 'token' so existing rows keep working. Both shapes share url + mount + pathPrefix + namespace; auth-specific fields are mutually exclusive in the config schema. - Kubernetes auth flow: read JWT from /var/run/secrets/.../token, POST to /v1/auth/<authMount>/login {role, jwt}, cache the returned client_token for `lease_duration - 60s` (grace window), then re-login. - One-shot 403-retry: if a request comes back 403 (revoked / clock skew), purge cache and retry the original request once with a fresh login. - Reads + writes go through the same getToken() path so token-auth is unchanged for existing deployments. CLI (src/cli/src/commands/create.ts): - `mcpctl create secretbackend bao --type openbao --auth kubernetes \ --url https://bao.example:8200 --role mcpctl` - Optional `--auth-mount` (default 'kubernetes') + `--sa-token-path` (default the standard projected-token path) for non-default deployments. - Token-auth path unchanged: `--auth token --token-secret SECRET/KEY` (or omit `--auth` since 'token' is the default). Validation (factory.ts) gates on the auth strategy: each path enforces its own required fields and produces a clear error if misconfigured. Tests: 6 new k8s-auth unit cases (login wire shape, lease-based caching, custom authMount, 403-on-login, missing-role rejection, missing-tokenSecretRef rejection). Full suite 1859/1859. Completions regenerated for the new flags. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 23:23:05 +01:00
Michal	a21220b6f6	fix(deploy): self-healing pre-migrate bootstrap for SecretBackend rollout Some checks failed CI/CD / typecheck (push) Successful in 51s Details CI/CD / lint (push) Successful in 1m42s Details CI/CD / test (push) Successful in 1m6s Details CI/CD / smoke (push) Failing after 3m41s Details CI/CD / build (push) Successful in 4m31s Details CI/CD / publish (push) Has been skipped Details Why: clusters upgrading from the pre-SecretBackend schema crash-loop on the first rollout. `prisma db push` applies the Phase 0 migration as three sequential steps — add Secret.backendId column (default ''), create SecretBackend table, add FK — and the FK fails because empty-string values reference no row in the empty SecretBackend table. This happened on the live cluster today; I fixed it by hand with psql. This PR makes the fix automatic so a fresh cluster or anyone replaying the migration doesn't hit the same trap. - New `src/db/src/scripts/pre-migrate-bootstrap.ts` — idempotent node script. Checks if SecretBackend table exists; if so, ensures a default row exists (insert on conflict noop), then backfills any Secret.backendId = '' to point at it. Uses Prisma raw queries so it runs against a partially- migrated schema. - `deploy/entrypoint.sh` now catches a failed first push, runs the bootstrap, and retries. Fresh installs and fully-migrated clusters take the happy path (one push, no bootstrap needed). Pre-Phase-0 upgrades take the healing path (push fails → bootstrap seeds → retry succeeds). - The bootstrap is deliberately non-fatal — even on unexpected errors it logs and exits 0 so the retry still runs. If that retry also fails, the push error surfaces normally and the pod crash-loops visibly rather than silently starting in a half-migrated state. Verified the idempotent path logically: on the already-bootstrapped cluster (1 backend row, 0 empty-backendId Secrets), the script's UPDATE matches zero rows and the INSERT hits ON CONFLICT DO NOTHING — pure no-op. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:59:07 +01:00
Michal	d5236171cc	fix(smoke): use json output for llm apiKeyRef assertion Some checks failed CI/CD / lint (push) Successful in 51s Details CI/CD / typecheck (push) Successful in 1m42s Details CI/CD / test (push) Successful in 1m5s Details CI/CD / smoke (push) Has started running Details CI/CD / publish (push) Has been cancelled Details CI/CD / build (push) Has been cancelled Details The table KEY column truncates at ~34 chars so `secret://<name>/<key>` wasn't appearing verbatim in stdout — the assertion was correct but brittle against presentation choices. Switched to `-o json` where the ref round-trips as a structured object, which is what actually matters. Caught by the live-cluster smoke run right after Phase 0-4 rolled out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:55:39 +01:00
Michal	860033d3de	fix(db): make Secret.backendId default to empty string for rollout migration Some checks failed CI/CD / typecheck (push) Successful in 53s Details CI/CD / lint (push) Successful in 1m44s Details CI/CD / test (push) Successful in 1m5s Details CI/CD / smoke (push) Failing after 3m43s Details CI/CD / build (push) Failing after 6m52s Details CI/CD / publish (push) Has been skipped Details Why: `prisma db push` refused to add the required `backendId` column on clusters with pre-existing Secret rows — it can't assign NOT NULL without a default, and the cluster DB had 9 live rows. The mcpd pod crash-looped during the Phase 0 rollout because of this. Empty-string default lets the schema apply cleanly; `bootstrapSecretBackends` (which runs on every startup) then rewrites those empty values to the seeded `default` plaintext backend's id. New writes via SecretService always carry a real FK immediately, so the empty-string state only exists during the one-shot migration window. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:45:08 +01:00
michal	e27a0e695e	Merge pull request 'feat(project): Project.llmProvider as Llm reference' (#55 ) from feat/project-llm-ref into main Some checks failed CI/CD / lint (push) Successful in 52s Details CI/CD / test (push) Successful in 1m4s Details CI/CD / typecheck (push) Successful in 1m52s Details CI/CD / build (push) Has been cancelled Details CI/CD / publish (push) Has been cancelled Details CI/CD / smoke (push) Has been cancelled Details	2026-04-19 21:39:54 +00:00
michal	2155910f1c	Merge pull request 'feat(mcplocal): RBAC-bounded vllm-managed failover' (#54 ) from feat/llm-failover into main Some checks failed CI/CD / typecheck (push) Has been cancelled Details CI/CD / test (push) Has been cancelled Details CI/CD / smoke (push) Has been cancelled Details CI/CD / build (push) Has been cancelled Details CI/CD / lint (push) Has been cancelled Details CI/CD / publish (push) Has been cancelled Details	2026-04-19 21:39:47 +00:00
michal	d217eadd13	Merge pull request 'feat(mcpd): LLM inference proxy + OpenAI/Anthropic adapters' (#53 ) from feat/llm-infer into main Some checks failed CI/CD / lint (push) Has started running Details CI/CD / typecheck (push) Has started running Details CI/CD / test (push) Has been cancelled Details CI/CD / smoke (push) Has been cancelled Details CI/CD / build (push) Has been cancelled Details CI/CD / publish (push) Has been cancelled Details	2026-04-19 21:39:39 +00:00
michal	9e3507752f	Merge pull request 'feat(mcpd): Llm resource — CRUD + CLI + apply' (#52 ) from feat/llm into main Some checks failed CI/CD / lint (push) Has started running Details CI/CD / typecheck (push) Has been cancelled Details CI/CD / test (push) Has been cancelled Details CI/CD / smoke (push) Has been cancelled Details CI/CD / build (push) Has been cancelled Details CI/CD / publish (push) Has been cancelled Details	2026-04-19 21:39:27 +00:00
michal	97ac1e75ef	Merge pull request 'feat(mcpd): pluggable SecretBackend + OpenBao driver + migrate' (#51 ) from feat/secretbackend into main Some checks failed CI/CD / lint (push) Has started running Details CI/CD / test (push) Has been cancelled Details CI/CD / typecheck (push) Has been cancelled Details CI/CD / smoke (push) Has been cancelled Details CI/CD / build (push) Has been cancelled Details CI/CD / publish (push) Has been cancelled Details	2026-04-19 21:39:17 +00:00
Michal	58788bc120	test(smoke): end-to-end coverage for SecretBackend, Llm, infer proxy, project-llm-ref Covers the Phase 0-4 CLI contract against live mcpd. Matches the existing mcptoken.smoke pattern: skip gracefully on unreachable /healthz, cleanup fixtures in afterAll, use --direct to bypass mcplocal for admin operations. - secretbackend.smoke.test.ts · seeded plaintext default exists + isDefault · create/describe/delete round-trip · refuses to delete the default backend (409 shape) · get -o yaml output starts with `kind: secretbackend` (apply-compatible) - llm.smoke.test.ts · create secret + llm with --api-key-ref, verify describe hides the raw value but surfaces secret://name/key · yaml round-trip: get -o yaml > file → amend → apply -f → describe shows change · deleting the llm leaves the underlying Secret intact (onDelete: SetNull) - llm-infer.smoke.test.ts · 404 for unknown name, 400 for missing messages · 5xx when upstream url is unreachable (proxy returns a structured error) · opt-in happy-path gated on LLM_INFER_SMOKE_REAL=1 + LLM_INFER_SMOKE_LLM=<name> so CI doesn't need a real provider key - project-llm-ref.smoke.test.ts · describe project with --llm <registered> — no warning · describe project with --llm <nonexistent> — shows "warning: …registry default" · describe project with --llm none — explicit disable, no warning These require PRs #51-55 to be merged and fulldeploy.sh run before they'll find the new endpoints on live mcpd. Until then they skip or fail with "Not Found". Unit tests for the same code paths (1853 total) continue to pass against mocks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:09:41 +01:00
Michal	de854b1944	feat(project): Project.llmProvider semantically names an Llm resource Why: Phases 0-3 built the server-managed Llm registry; this phase pivots the existing Project.llmProvider column from "local provider hint" to "named Llm reference" so operators can pick a centralised Llm per project. No schema change — the column stays a free-form string for backward compat. - `mcpctl create project --llm <name>` (+ `--llm-model <override>`) sets llmProvider/llmModel to a centralised Llm reference, or 'none' to disable. - `mcpctl describe project` fetches the Llm catalogue alongside prompts and flags values that don't resolve with a visible warning. 'none' is treated as an explicit disable, not an orphan. - `apply -f` doc comments updated; --llm-provider still accepted but now documented as naming an Llm resource. - New `resolveProjectLlmReference(mcpdClient, name)` helper in mcplocal's discovery: returns `registered`/`disabled`/`unregistered`/`unreachable`. The HTTP-mode proxy-model pipeline will consume this when it pivots to mcpd's /api/v1/llms/:name/infer proxy. - project-mcp-endpoint.ts cache-namespace path gets a comment explaining the new resolution order — behavior unchanged, just clarified. Tests: 6 resolver unit tests + 3 new describe-warning cases. Full suite 1853/1853 (+9 from Phase 3's 1844). TypeScript clean; completions regenerated for the new create-project flags. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 18:28:46 +01:00
Michal	4d8ee23d0e	feat(mcplocal): RBAC-bounded vllm-managed failover + name-based llm lookup Why: when mcpd's inference proxy is unreachable, clients with a local vllm-managed provider should be able to substitute — but only if they still have view permission on the centralized Llm. Otherwise revoking an Llm wouldn't actually stop a misbehaving client. Infrastructure (the agent + mcplocal HTTP-mode wire-up will land separately when those clients pivot to mcpd's proxy): - LlmProviderFileEntry gains optional `failoverFor: <central llm name>`. The entry is otherwise the same local provider it always was; the new field just declares which central Llm it can substitute for. - ProviderRegistry tracks a failover map (registerFailover / getFailoverFor / listFailovers). Unregister removes any failover entry pointing at the removed provider so we don't end up with dangling references. - New FailoverRouter wraps a primary inference call. On primary failure: if a local provider is registered for the Llm, HEAD-probe `mcpd /api/v1/llms/ :name` with the caller's bearer to verify view permission, then either invoke the local provider (allowed) or re-throw the primary error (403, 401, network unreachable, anything else — all fail-closed). - Server: GET /api/v1/llms/:idOrName accepts both CUID and human name. Lets FailoverRouter probe by name without a separate id-resolution call. HEAD derives automatically from GET in Fastify, which runs the same RBAC hook and drops the body — exactly what the probe needs. Tests: 11 failover unit tests (registry map, decision flow, fail-closed for forbidden + unreachable, checkAuth status mapping) + 4 new route tests (name lookup, HEAD existing/missing). Full suite 1844/1844 (+14 from Phase 2's 1830). TypeScript clean across mcpd + mcplocal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 13:05:43 +01:00
Michal	23f53a0798	feat(mcpd): inference proxy — POST /api/v1/llms/:name/infer Why: the point of the Llm resource (Phase 1) is that credentials never leave the server. This lands the proxy: clients POST OpenAI chat/completions to mcpd, mcpd attaches the provider API key server-side, and the response streams back as OpenAI-format SSE. Design: - Wire format client-side is always OpenAI chat/completions — every existing SDK speaks it. Adapters translate on the provider side. - `openai \| vllm \| deepseek \| ollama` → pure passthrough (they already speak OpenAI). `anthropic` → translator to/from Anthropic Messages API (system-string extraction, content-block flattening, SSE event remap). - Plain fetch; no @anthropic-ai/sdk dep. Consistent with the OpenBao driver shape and keeps the proxy layer thin. - `gemini-cli` intentionally rejected — subprocess providers need extra lifecycle plumbing; deferred to a follow-up. - Streaming: adapters yield `StreamingChunk`s; the route frames them as `data: <json>\n\n` + terminal `data: [DONE]\n\n` so any OpenAI client works unchanged. RBAC: - New URL special-case in mapUrlToPermission: `POST /api/v1/llms/:name/infer` → `run:llms:<name>` (not the default create:llms). Users need an explicit `{role: 'run', resource: 'llms', [name: X]}` binding to call infer. - Possession of `edit:llms` does NOT imply `run` — keeps catalogue management separate from spend. Audit: route emits an `llm_inference_call` event per request (llm name, model, user/tokenSha, streaming, duration, status). main.ts wires it to the structured logger for now; hook is in place for a richer audit sink later. Tests: - 11 adapter tests (passthrough POST shape + default URLs + no-auth ollama + SSE forwarding; anthropic translate request/response + non-2xx wrap + SSE event translation; registry dispatch + caching + unsupported-provider). - 7 route tests (404, 400, non-streaming dispatch + audit, apiKey failure, null apiKeyRef path, streaming SSE output, 502 on adapter error). - Full suite 1830/1830 (+18 from Phase 1's 1812). TypeScript clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 22:43:55 +01:00