feat(openbao): wizard-provisioning + daily token rotation

One-command setup replaces the 6-step manual flow: `mcpctl create secretbackend bao --type openbao --wizard` takes the OpenBao admin token once, provisions a narrow policy + token role, mints the first periodic token, stores it on mcpd, verifies end-to-end, and prints the migration command. The admin token is NEVER persisted.

The stored credential auto-rotates daily: mcpd mints a successor via the token role (self-rotation capability is part of the policy it was issued with), verifies the successor, writes it over the backing Secret, then revokes the predecessor by accessor. TTL 720h means a week of rotation failures still leaves 20+ days of runway.

Shared:
- New `@mcpctl/shared/vault`: pure HTTP wrappers (verifyHealth, ensureKvV2, writePolicy, ensureTokenRole, mintRoleToken, revokeAccessor, lookupSelf, testWriteReadDelete) and policy HCL builder.

mcpd:
- `tokenMeta Json @default("{}")` on SecretBackend. Self-healing schema migration: empty default lets `prisma db push` add the column cleanly.
- SecretBackendRotator.rotateOne: mint → verify → persist → revoke-old → update tokenMeta. Failures surface via `lastRotationError` on the row; the old token keeps working.
- SecretBackendRotatorLoop: on startup rotates overdue backends, schedules per-backend timers with ±10min jitter. Stops cleanly on shutdown.
- New `POST /api/v1/secretbackends/:id/rotate` (operation `rotate-secretbackend`, added to bootstrap-admin's auto-migrated ops alongside migrate-secrets, which was previously missing too).

CLI:
- `--wizard` on `create secretbackend` delegates to the interactive flow. All prompts can be pre-answered via flags (--url, --admin-token, --mount, --path-prefix, --policy-name, --token-role, --no-promote-default) for CI.
- `mcpctl rotate secretbackend <name>`: convenience verb; hits the new rotate endpoint.
- `describe secretbackend` renders a Token health section (healthy / STALE / WARNING / ERROR) with generated/renewal/expiry timestamps and last rotation error. Only shown when tokenMeta.rotatable is true; the existing k8s-auth + static-token backends don't surface it.

Tests: 15 vault-client unit tests (shared), 8 rotator unit tests (mcpd), 3 wizard flow tests (cli, including a regression test that the admin token never appears in stdout). Full suite 1885/1885 (+32). Completions regenerated for the new flags.

Out of scope (explicit): kubernetes-auth wizard, Vault Enterprise namespaces in the wizard path, rotation for non-wizard static-token backends.

See plan file for details.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
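
The rotation order named in the commit message (mint, verify, persist, revoke old) can be sketched roughly as below. This is an illustration under assumptions, not the code in this commit: the `VaultClient`, `TokenStore` and `RotationResult` shapes are hypothetical, and although `mintRoleToken`, `lookupSelf`, `revokeAccessor` and `lastRotationError` are names taken from the message, their real signatures may differ.

```typescript
// Hypothetical interfaces: method names echo the commit message, signatures are assumed.
interface MintedToken {
  token: string;
  accessor: string;
}

interface VaultClient {
  mintRoleToken(currentToken: string, role: string): Promise<MintedToken>;
  lookupSelf(token: string): Promise<void>; // assumed to reject if the token is invalid
  revokeAccessor(token: string, accessor: string): Promise<void>;
}

interface TokenStore {
  read(): Promise<MintedToken>;
  write(next: MintedToken): Promise<void>;
}

interface RotationResult {
  rotated: boolean;
  lastRotationError: string | null;
}

async function rotateOne(
  vault: VaultClient,
  store: TokenStore,
  role: string,
): Promise<RotationResult> {
  const current = await store.read();
  let persisted = false;
  try {
    // 1. Mint a successor using the current token.
    const next = await vault.mintRoleToken(current.token, role);
    // 2. Verify the successor before trusting it.
    await vault.lookupSelf(next.token);
    // 3. Persist the successor over the stored credential.
    await store.write(next);
    persisted = true;
    // 4. Only now revoke the predecessor, by accessor.
    await vault.revokeAccessor(next.token, current.accessor);
    return { rotated: true, lastRotationError: null };
  } catch (err) {
    // A failure before the write leaves the old token stored and unrevoked,
    // so it keeps working; the error is only recorded for display.
    const message = err instanceof Error ? err.message : String(err);
    return { rotated: persisted, lastRotationError: message };
  }
}
```

Revoking last is the point of the ordering: at every step at least one valid token is stored, so a crash or error mid-rotation never locks mcpd out of the backend.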
feat(openbao): kubernetes ServiceAccount auth — no static token in DB
2026-04-20 17:20:37 +01:00 · 2026-04-19 23:23:05 +01:00 · 2026-04-19 22:59:07 +01:00 · 2026-04-19 22:55:39 +01:00 · 2026-04-19 22:45:08 +01:00 · 2026-04-19 21:39:54 +00:00
128 changed files with 11117 additions and 331 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -12,4 +12,3 @@ dist
 .env.*
 deploy/docker-compose.yml
 src/cli
-src/mcplocal
--- a/completions/mcpctl.bash
+++ b/completions/mcpctl.bash
@@ -5,11 +5,11 @@ _mcpctl() {
  local cur prev words cword
  _init_completion || return

-  local commands="status login logout config get describe delete logs create edit apply patch backup approve console cache"
+  local commands="status login logout config get describe delete logs create edit apply patch backup approve console cache test migrate rotate"
  local project_commands="get describe delete logs create edit attach-server detach-server"
  local global_opts="-v --version --daemon-url --direct -p --project -h --help"
-  local resources="servers instances secrets templates projects users groups rbac prompts promptrequests serverattachments proxymodels all"
-  local resource_aliases="servers instances secrets templates projects users groups rbac prompts promptrequests serverattachments proxymodels all server srv instance inst secret sec template tpl project proj user group rbac-definition rbac-binding prompt promptrequest pr serverattachment sa proxymodel pm"
+  local resources="servers instances secrets secretbackends llms templates projects users groups rbac prompts promptrequests serverattachments proxymodels all"
+  local resource_aliases="servers instances secrets secretbackends llms templates projects users groups rbac prompts promptrequests serverattachments proxymodels all server srv instance inst secret sec secretbackend sb llm template tpl project proj user group rbac-definition rbac-binding prompt promptrequest pr serverattachment sa proxymodel pm"

  # Check if --project/-p was given
  local has_project=false
@@ -175,7 +175,7 @@ _mcpctl() {
    create)
      local create_sub=$(_mcpctl_get_subcmd $subcmd_pos)
      if [[ -z "$create_sub" ]]; then
-        COMPREPLY=($(compgen -W "server secret project user group rbac prompt serverattachment promptrequest help" -- "$cur"))
+        COMPREPLY=($(compgen -W "server secret llm secretbackend project user group rbac mcptoken prompt serverattachment promptrequest help" -- "$cur"))
      else
        case "$create_sub" in
          server)
@@ -184,8 +184,14 @@ _mcpctl() {
          secret)
            COMPREPLY=($(compgen -W "--data --force -h --help" -- "$cur"))
            ;;
+          llm)
+            COMPREPLY=($(compgen -W "--type --model --url --tier --description --api-key-ref --extra --force -h --help" -- "$cur"))
+            ;;
+          secretbackend)
+            COMPREPLY=($(compgen -W "--type --description --default --url --namespace --mount --path-prefix --auth --token-secret --role --auth-mount --sa-token-path --config --wizard --admin-token --policy-name --token-role --no-promote-default --force -h --help" -- "$cur"))
+            ;;
          project)
-            COMPREPLY=($(compgen -W "-d --description --proxy-model --prompt --gated --no-gated --server --force -h --help" -- "$cur"))
+            COMPREPLY=($(compgen -W "-d --description --proxy-model --prompt --llm --llm-model --gated --no-gated --server --force -h --help" -- "$cur"))
            ;;
          user)
            COMPREPLY=($(compgen -W "--password --name --force -h --help" -- "$cur"))
@@ -194,7 +200,10 @@ _mcpctl() {
            COMPREPLY=($(compgen -W "--description --member --force -h --help" -- "$cur"))
            ;;
          rbac)
-            COMPREPLY=($(compgen -W "--subject --binding --operation --force -h --help" -- "$cur"))
+            COMPREPLY=($(compgen -W "--subject --roleBindings --force -h --help" -- "$cur"))
+            ;;
+          mcptoken)
+            COMPREPLY=($(compgen -W "-p --project --rbac --bind --ttl --description --force -h --help" -- "$cur"))
            ;;
          prompt)
            COMPREPLY=($(compgen -W "-p --project --content --content-file --priority --link -h --help" -- "$cur"))
@@ -311,6 +320,51 @@ _mcpctl() {
        esac
      fi
      return ;;
+    test)
+      local test_sub=$(_mcpctl_get_subcmd $subcmd_pos)
+      if [[ -z "$test_sub" ]]; then
+        COMPREPLY=($(compgen -W "mcp help" -- "$cur"))
+      else
+        case "$test_sub" in
+          mcp)
+            COMPREPLY=($(compgen -W "--token --tool --args --expect-tools --timeout -o --output --no-health -h --help" -- "$cur"))
+            ;;
+          *)
+            COMPREPLY=($(compgen -W "-h --help" -- "$cur"))
+            ;;
+        esac
+      fi
+      return ;;
+    migrate)
+      local migrate_sub=$(_mcpctl_get_subcmd $subcmd_pos)
+      if [[ -z "$migrate_sub" ]]; then
+        COMPREPLY=($(compgen -W "secrets help" -- "$cur"))
+      else
+        case "$migrate_sub" in
+          secrets)
+            COMPREPLY=($(compgen -W "--from --to --names --keep-source --dry-run -h --help" -- "$cur"))
+            ;;
+          *)
+            COMPREPLY=($(compgen -W "-h --help" -- "$cur"))
+            ;;
+        esac
+      fi
+      return ;;
+    rotate)
+      local rotate_sub=$(_mcpctl_get_subcmd $subcmd_pos)
+      if [[ -z "$rotate_sub" ]]; then
+        COMPREPLY=($(compgen -W "secretbackend help" -- "$cur"))
+      else
+        case "$rotate_sub" in
+          secretbackend)
+            COMPREPLY=($(compgen -W "-h --help" -- "$cur"))
+            ;;
+          *)
+            COMPREPLY=($(compgen -W "-h --help" -- "$cur"))
+            ;;
+        esac
+      fi
+      return ;;
    help)
      COMPREPLY=($(compgen -W "$commands" -- "$cur"))
      return ;;
--- a/completions/mcpctl.fish
+++ b/completions/mcpctl.fish
@@ -4,7 +4,7 @@
 # Erase any stale completions from previous versions
 complete -c mcpctl -e

-set -l commands status login logout config get describe delete logs create edit apply patch backup approve console cache
+set -l commands status login logout config get describe delete logs create edit apply patch backup approve console cache test migrate rotate
 set -l project_commands get describe delete logs create edit attach-server detach-server

 # Disable file completions by default
@@ -31,10 +31,10 @@ function __mcpctl_has_project
 end

 # Resource type detection
-set -l resources servers instances secrets templates projects users groups rbac prompts promptrequests serverattachments proxymodels all
+set -l resources servers instances secrets secretbackends llms templates projects users groups rbac prompts promptrequests serverattachments proxymodels all

 function __mcpctl_needs_resource_type
-    set -l resource_aliases servers instances secrets templates projects users groups rbac prompts promptrequests serverattachments proxymodels all server srv instance inst secret sec template tpl project proj user group rbac-definition rbac-binding prompt promptrequest pr serverattachment sa proxymodel pm
+    set -l resource_aliases servers instances secrets secretbackends llms templates projects users groups rbac prompts promptrequests serverattachments proxymodels all server srv instance inst secret sec secretbackend sb llm template tpl project proj user group rbac-definition rbac-binding prompt promptrequest pr serverattachment sa proxymodel pm
    set -l tokens (commandline -opc)
    set -l found_cmd false
    for tok in $tokens
@@ -59,6 +59,8 @@ function __mcpctl_resolve_resource
        case server srv servers;      echo servers
        case instance inst instances; echo instances
        case secret sec secrets;      echo secrets
+        case secretbackend sb secretbackends; echo secretbackends
+        case llm llms;                echo llms
        case template tpl templates;  echo templates
        case project proj projects;   echo projects
        case user users;              echo users
@@ -74,7 +76,7 @@ function __mcpctl_resolve_resource
 end

 function __mcpctl_get_resource_type
-    set -l resource_aliases servers instances secrets templates projects users groups rbac prompts promptrequests serverattachments proxymodels all server srv instance inst secret sec template tpl project proj user group rbac-definition rbac-binding prompt promptrequest pr serverattachment sa proxymodel pm
+    set -l resource_aliases servers instances secrets secretbackends llms templates projects users groups rbac prompts promptrequests serverattachments proxymodels all server srv instance inst secret sec secretbackend sb llm template tpl project proj user group rbac-definition rbac-binding prompt promptrequest pr serverattachment sa proxymodel pm
    set -l tokens (commandline -opc)
    set -l found_cmd false
    for tok in $tokens
@@ -223,7 +225,7 @@ complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a describe -d 'Show detailed information about a resource'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a delete -d 'Delete a resource (server, instance, secret, project, user, group, rbac)'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a logs -d 'Get logs from an MCP server instance'
-complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a create -d 'Create a resource (server, secret, project, user, group, rbac, serverattachment, prompt)'
+complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a create -d 'Create a resource (server, secret, secretbackend, llm, project, user, group, rbac, serverattachment, prompt)'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a edit -d 'Edit a resource in your default editor (server, project)'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a apply -d 'Apply declarative configuration from a YAML or JSON file'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a patch -d 'Patch a resource field (e.g. mcpctl patch project myproj llmProvider=none)'
@@ -231,13 +233,16 @@ complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a approve -d 'Approve a pending prompt request (atomic: delete request, create prompt)'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a console -d 'Interactive MCP console — unified timeline with tools, provenance, and lab replay'
 complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a cache -d 'Manage ProxyModel pipeline cache'
+complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a test -d 'Utilities for testing MCP endpoints and config'
+complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a migrate -d 'Move resources between backends (currently: secrets between SecretBackends)'
+complete -c mcpctl -n "not __mcpctl_has_project; and not __fish_seen_subcommand_from $commands" -a rotate -d 'Force rotation of a credential-rotating resource (currently: secretbackend)'

 # Project-scoped commands (with --project)
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a get -d 'List resources (servers, projects, instances, all)'
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a describe -d 'Show detailed information about a resource'
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a delete -d 'Delete a resource (server, instance, secret, project, user, group, rbac)'
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a logs -d 'Get logs from an MCP server instance'
-complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a create -d 'Create a resource (server, secret, project, user, group, rbac, serverattachment, prompt)'
+complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a create -d 'Create a resource (server, secret, secretbackend, llm, project, user, group, rbac, serverattachment, prompt)'
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a edit -d 'Edit a resource in your default editor (server, project)'
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a attach-server -d 'Attach a server to a project (requires --project)'
 complete -c mcpctl -n "__mcpctl_has_project; and not __fish_seen_subcommand_from $project_commands" -a detach-server -d 'Detach a server from a project (requires --project)'
@@ -280,13 +285,16 @@ complete -c mcpctl -n "__mcpctl_subcmd_active config claude-generate" -l stdout
 complete -c mcpctl -n "__mcpctl_subcmd_active config impersonate" -l quit -d 'Stop impersonating and return to original identity'

 # create subcommands
-set -l create_cmds server secret project user group rbac prompt serverattachment promptrequest
+set -l create_cmds server secret llm secretbackend project user group rbac mcptoken prompt serverattachment promptrequest
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a server -d 'Create an MCP server definition'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a secret -d 'Create a secret'
+complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a llm -d 'Register a server-managed LLM (anthropic, openai, vllm, ollama, deepseek, gemini-cli)'
+complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a secretbackend -d 'Create a secret backend (plaintext, openbao)'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a project -d 'Create a project'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a user -d 'Create a user'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a group -d 'Create a group'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a rbac -d 'Create an RBAC binding definition'
+complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a mcptoken -d 'Create a project-scoped API token for HTTP-mode mcplocal. The raw token is printed once.'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a prompt -d 'Create an approved prompt'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a serverattachment -d 'Attach a server to a project'
 complete -c mcpctl -n "__fish_seen_subcommand_from create; and not __fish_seen_subcommand_from $create_cmds" -a promptrequest -d 'Create a prompt request (pending proposal that needs approval)'
@@ -311,10 +319,43 @@ complete -c mcpctl -n "__mcpctl_subcmd_active create server" -l force -d 'Update
 complete -c mcpctl -n "__mcpctl_subcmd_active create secret" -l data -d 'Secret data KEY=value (repeat for multiple)' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create secret" -l force -d 'Update if already exists'

+# create llm options
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l type -d 'Provider type (anthropic, openai, deepseek, vllm, ollama, gemini-cli)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l model -d 'Model identifier (e.g. claude-3-5-sonnet-20241022)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l url -d 'Endpoint URL (empty = provider default)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l tier -d 'Tier: fast or heavy' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l description -d 'Description' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l api-key-ref -d 'API key reference in SECRET/KEY form (e.g. anthropic-key/token)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l extra -d 'Extra config key=value (repeat)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create llm" -l force -d 'Update if already exists'
+
+# create secretbackend options
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l type -d 'Backend type (plaintext, openbao)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l description -d 'Description' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l default -d 'Promote this backend to default (atomically demotes the current one)'
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l url -d 'openbao: vault URL (e.g. http://bao.example:8200)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l namespace -d 'openbao: X-Vault-Namespace header value' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l mount -d 'openbao: KV v2 mount point (default: secret)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l path-prefix -d 'openbao: path prefix under mount (default: mcpctl)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l auth -d 'openbao: auth method — \'token\' (default) or \'kubernetes\'' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l token-secret -d 'openbao token auth: token secret reference in SECRET/KEY form (e.g. bao-creds/token)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l role -d 'openbao kubernetes auth: vault role to login as (e.g. \'mcpctl\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l auth-mount -d 'openbao kubernetes auth: vault auth method mount path (default: \'kubernetes\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l sa-token-path -d 'openbao kubernetes auth: filesystem path to projected SA token (default: \'/var/run/secrets/kubernetes.io/serviceaccount/token\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l config -d 'Extra config as key=value (repeat for multiple)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l wizard -d 'Interactive wizard (openbao only): provision policy + token role, mint token, store on mcpd, suggest migration'
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l admin-token -d 'openbao wizard: OpenBao admin/root token (prompted if omitted). Used only for provisioning; NEVER persisted.' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l policy-name -d 'openbao wizard: name for the policy created on OpenBao (default: \'app-mcpd\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l token-role -d 'openbao wizard: name for the token role created on OpenBao (default: \'app-mcpd-role\')' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l no-promote-default -d 'openbao wizard: do not promote this backend to default after creation'
+complete -c mcpctl -n "__mcpctl_subcmd_active create secretbackend" -l force -d 'Update if already exists'
+
 # create project options
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -s d -l description -d 'Project description' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l proxy-model -d 'Plugin name (default, content-pipeline, gate, none)' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l prompt -d 'Project-level prompt / instructions for the LLM' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l llm -d 'Name of an Llm resource (see \'mcpctl get llms\'), or \'none\' to disable' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l llm-model -d 'Override the model string for this project (defaults to the Llm\'s own model)' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l gated -d '[deprecated: use --proxy-model default]'
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l no-gated -d '[deprecated: use --proxy-model content-pipeline]'
 complete -c mcpctl -n "__mcpctl_subcmd_active create project" -l server -d 'Server name (repeat for multiple)' -x
@@ -332,10 +373,17 @@ complete -c mcpctl -n "__mcpctl_subcmd_active create group" -l force -d 'Update

 # create rbac options
 complete -c mcpctl -n "__mcpctl_subcmd_active create rbac" -l subject -d 'Subject as Kind:name (repeat for multiple)' -x
-complete -c mcpctl -n "__mcpctl_subcmd_active create rbac" -l binding -d 'Role binding as role:resource (e.g. edit:servers, run:projects)' -x
-complete -c mcpctl -n "__mcpctl_subcmd_active create rbac" -l operation -d 'Operation binding (e.g. logs, backup)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create rbac" -l roleBindings -d 'Role binding as key:value pairs, e.g. "role:view,resource:servers" or "role:view,resource:servers,name:my-ha" or "action:logs" (repeat for multiple)' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active create rbac" -l force -d 'Update if already exists'

+# create mcptoken options
+complete -c mcpctl -n "__mcpctl_subcmd_active create mcptoken" -s p -l project -d 'Project this token is bound to' -xa '(__mcpctl_project_names)'
+complete -c mcpctl -n "__mcpctl_subcmd_active create mcptoken" -l rbac -d 'Base RBAC: \'empty\' (default, no bindings) or \'clone\' (snapshot creator\'s perms)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create mcptoken" -l bind -d 'Additional role binding as key:value pairs, e.g. "role:view,resource:servers" or "action:logs" (repeat for multiple). Creator perms are the ceiling.' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create mcptoken" -l ttl -d 'Expiry: \'30d\', \'12h\', \'never\', or an ISO8601 datetime' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create mcptoken" -l description -d 'Freeform description' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active create mcptoken" -l force -d 'Revoke any existing active token with this name, then create a new one'
+
 # create prompt options
 complete -c mcpctl -n "__mcpctl_subcmd_active create prompt" -s p -l project -d 'Project name to scope the prompt to' -xa '(__mcpctl_project_names)'
 complete -c mcpctl -n "__mcpctl_subcmd_active create prompt" -l content -d 'Prompt content text' -x
@@ -369,6 +417,34 @@ complete -c mcpctl -n "__fish_seen_subcommand_from cache; and not __fish_seen_su
 complete -c mcpctl -n "__mcpctl_subcmd_active cache clear" -l older-than -d 'Clear entries older than N days' -x
 complete -c mcpctl -n "__mcpctl_subcmd_active cache clear" -s y -l yes -d 'Skip confirmation'

+# test subcommands
+set -l test_cmds mcp
+complete -c mcpctl -n "__fish_seen_subcommand_from test; and not __fish_seen_subcommand_from $test_cmds" -a mcp -d 'Verify a Streamable-HTTP MCP endpoint: health, initialize, tools/list, optionally call a tool.'
+
+# test mcp options
+complete -c mcpctl -n "__mcpctl_subcmd_active test mcp" -l token -d 'Bearer token (also reads $MCPCTL_TOKEN)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active test mcp" -l tool -d 'Invoke a specific tool after listing' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active test mcp" -l args -d 'JSON-encoded arguments for --tool' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active test mcp" -l expect-tools -d 'Comma-separated tool names that MUST appear; fails otherwise' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active test mcp" -l timeout -d 'Per-request timeout in seconds' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active test mcp" -s o -l output -d 'Output format: text or json' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active test mcp" -l no-health -d 'Skip the /healthz preflight check'
+
+# migrate subcommands
+set -l migrate_cmds secrets
+complete -c mcpctl -n "__fish_seen_subcommand_from migrate; and not __fish_seen_subcommand_from $migrate_cmds" -a secrets -d 'Migrate secrets from one SecretBackend to another'
+
+# migrate secrets options
+complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l from -d 'Source SecretBackend name' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l to -d 'Destination SecretBackend name' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l names -d 'Comma-separated secret names (default: all)' -x
+complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l keep-source -d 'Leave the source copy intact (default: delete from source after write+commit)'
+complete -c mcpctl -n "__mcpctl_subcmd_active migrate secrets" -l dry-run -d 'Show which secrets would be migrated without touching them'
+
+# rotate subcommands
+set -l rotate_cmds secretbackend
+complete -c mcpctl -n "__fish_seen_subcommand_from rotate; and not __fish_seen_subcommand_from $rotate_cmds" -a secretbackend -d 'Rotate the vault token on an OpenBao SecretBackend (wizard-provisioned)'
+
 # status options
 complete -c mcpctl -n "__fish_seen_subcommand_from status" -s o -l output -d 'output format (table, json, yaml)' -x

--- a/deploy/Dockerfile.mcplocal
+++ b/deploy/Dockerfile.mcplocal
@@ -0,0 +1,60 @@
+# HTTP-only mcplocal for k8s deploy (Service `mcp`, Ingress `mcp.ad.itaz.eu`).
+# Container CMD runs the `serve.ts` entry which — unlike the systemd/STDIO
+# entry — has no stdin/stdout MCP client and bootstraps exclusively from env.
+
+# Stage 1: Build TypeScript
+FROM node:20-alpine AS builder
+
+RUN corepack enable && corepack prepare pnpm@9.15.0 --activate
+
+WORKDIR /app
+
+# Copy workspace config and package manifests
+COPY pnpm-workspace.yaml pnpm-lock.yaml package.json tsconfig.base.json ./
+COPY src/mcplocal/package.json src/mcplocal/tsconfig.json src/mcplocal/
+COPY src/shared/package.json src/shared/tsconfig.json src/shared/
+COPY src/db/package.json src/db/tsconfig.json src/db/
+
+# Install all dependencies
+RUN pnpm install --frozen-lockfile
+
+# Copy source
+COPY src/mcplocal/src/ src/mcplocal/src/
+COPY src/shared/src/ src/shared/src/
+COPY src/db/src/ src/db/src/
+COPY src/db/prisma/ src/db/prisma/
+
+# Build. mcplocal depends on shared; it does not depend on db at runtime
+# (the prisma client is only used by mcpd).
+RUN pnpm -F @mcpctl/shared build && pnpm -F @mcpctl/mcplocal build
+
+# Stage 2: Production runtime
+FROM node:20-alpine
+
+RUN corepack enable && corepack prepare pnpm@9.15.0 --activate
+
+WORKDIR /app
+
+# Copy workspace config, manifests, and lockfile
+COPY pnpm-workspace.yaml pnpm-lock.yaml package.json ./
+COPY src/mcplocal/package.json src/mcplocal/
+COPY src/shared/package.json src/shared/
+
+# Install deps (production only — no db / prisma runtime here).
+RUN pnpm install --frozen-lockfile
+
+# Copy built output
+COPY --from=builder /app/src/shared/dist/ src/shared/dist/
+COPY --from=builder /app/src/mcplocal/dist/ src/mcplocal/dist/
+
+EXPOSE 3200
+
+# Cache directory — expected to be mounted as a PVC in k8s.
+VOLUME /var/lib/mcplocal/cache
+
+HEALTHCHECK --interval=10s --timeout=5s --retries=3 --start-period=10s \
+  CMD wget -q --spider http://localhost:3200/healthz || exit 1
+
+# MCPLOCAL_MCPD_URL and MCPLOCAL_MCPD_TOKEN are required and must come from
+# the Pulumi-managed Secret. Other env vars default sensibly.
+CMD ["node", "src/mcplocal/dist/serve.js"]
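
The closing comment of the Dockerfile says `MCPLOCAL_MCPD_URL` and `MCPLOCAL_MCPD_TOKEN` are required and that the entry bootstraps exclusively from env. A fail-fast loader for that contract might look like the sketch below. Only the two variable names come from the Dockerfile; `loadMcplocalConfig`, `requireEnv` and the `McplocalConfig` shape are hypothetical and are not claimed to match `serve.ts`.

```typescript
// Hypothetical loader; only the two variable names are taken from the Dockerfile comment.
type Env = Record<string, string | undefined>;

interface McplocalConfig {
  mcpdUrl: string;
  mcpdToken: string;
}

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    // Fail fast so a misconfigured pod crashes visibly at startup.
    throw new Error(`missing required env var: ${name}`);
  }
  return value;
}

function loadMcplocalConfig(env: Env): McplocalConfig {
  return {
    mcpdUrl: requireEnv(env, "MCPLOCAL_MCPD_URL"),
    mcpdToken: requireEnv(env, "MCPLOCAL_MCPD_TOKEN"),
  };
}
```

In a Node entry point this would be called once as `loadMcplocalConfig(process.env)` before the HTTP listener starts, so a missing Secret shows up as a crash loop and not as a running but unusable pod.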
--- a/deploy/entrypoint.sh
+++ b/deploy/entrypoint.sh
@@ -1,8 +1,23 @@
 #!/bin/sh
 set -e

+# Self-healing schema push:
+#   1. Try once — for fresh installs and already-migrated clusters this is all
+#      that's needed.
+#   2. On failure (typically a Phase 0 upgrade where the new SecretBackend FK
+#      can't attach because pre-existing Secret rows reference nothing), run
+#      the pre-migrate bootstrap to seed a default SecretBackend + backfill
+#      Secret.backendId, then retry.
+#   3. If the retry still fails, let the error surface so the pod crashes
+#      visibly rather than starting in a half-migrated state.
 echo "mcpd: pushing database schema..."
-pnpm -F @mcpctl/db exec prisma db push --schema=prisma/schema.prisma --accept-data-loss 2>&1
+if pnpm -F @mcpctl/db exec prisma db push --schema=prisma/schema.prisma --accept-data-loss 2>&1; then
+  :
+else
+  echo "mcpd: schema push failed — running pre-migrate bootstrap + retrying..."
+  node src/db/dist/scripts/pre-migrate-bootstrap.js || true
+  pnpm -F @mcpctl/db exec prisma db push --schema=prisma/schema.prisma --accept-data-loss 2>&1
+fi

 echo "mcpd: seeding templates..."
 TEMPLATES_DIR=templates node src/mcpd/dist/seed-runner.js
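
The entrypoint change above follows a "try, repair once, retry" pattern: push the schema, on failure run a best-effort bootstrap, then push again and let any second failure surface. The same control flow is sketched in TypeScript below. `runWithRepair` is a hypothetical helper written for illustration and is not part of this change.

```typescript
// Hypothetical helper mirroring the shell logic; not code from this change.
function runWithRepair(attempt: () => void, repair: () => void): void {
  try {
    attempt();
    return; // First attempt succeeded: nothing else to do.
  } catch {
    // First attempt failed: run the repair step, ignoring its own failure
    // (the shell script does the same with `|| true`).
    try {
      repair();
    } catch {
      /* ignored on purpose */
    }
  }
  // The second attempt is deliberately not wrapped: if it throws, the error
  // propagates so the caller (here, the pod) fails visibly.
  attempt();
}
```

Ignoring the repair step's own failure is safe only because the retry is unguarded: if the bootstrap did not actually fix anything, the second push fails and the container exits under `set -e`.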
--- a/docs/mcptoken-implementation.md
+++ b/docs/mcptoken-implementation.md
@@ -0,0 +1,174 @@
+# mcptoken + HTTP-mode mcplocal — implementation log
+
+Companion to the approved plan at `/home/michal/.claude/plans/lets-discuss-something-i-bright-lovelace.md`.
+This file is updated as each milestone lands, so you can review what was actually done vs. what was planned.
+
+## Context (why)
+
+You're running your own vLLM inference outside Claude Code and want it to consume mcpctl over MCP with the same UX Claude gets: project-scoped server discovery, proxy models, the pipeline cache. Today `mcplocal` is systemd-only and serves STDIO — unreachable from off-host and unauthenticated. This work adds:
+
+1. A containerized, network-accessible `mcplocal` serving Streamable HTTP.
+2. A new `McpToken` resource (CLI: `mcpctl get/create/delete mcptoken`) — project-scoped bearer tokens with the same RBAC stack as users. Hashed at rest; raw value shown once.
+3. Tokens as a first-class RBAC subject kind (`McpToken:<sha>`), with a creator-permission ceiling so non-admins cannot mint escalated tokens.
+4. k8s deploy (Service `mcp`, Ingress `mcp.ad.itaz.eu`, PVC-backed `FileCache`).
+5. A CLI breaking change: `mcpctl create rbac --binding edit:servers` → `--roleBindings role:edit,resource:servers`. You explicitly asked for this; only one command uses it.
+6. A product-grade `mcpctl test mcp <url>` verb for validating any Streamable-HTTP MCP endpoint, reused by smoke tests.
+
+## Branch
+
+All work lives on `feat/mcptoken` (off `main` at `3149ea3`).
+
+## Pre-work committed to main (outside this branch)
+
+Before starting the feature, we flushed your in-flight changes to main so they wouldn't travel with the branch:
+
+- **`3149ea3 fix: MCP proxy resilience — discovery cache, default liveness probes`** — per-server `tools/list` cache in `McpRouter` with positive+negative TTL so dead upstreams only stall the first call; default liveness probe (tools/list through the real production path) applied to any RUNNING instance without an explicit healthCheck. Already pushed to origin.
+
+## Status legend
+
+- ✅ done
+- 🚧 in progress
+- ⬜ not started
+
+## PR 1 — Schema + token helpers + mcpd CRUD routes ✅
+
+| # | Step | Status |
+|---|---|---|
+| 1 | `McpToken` Prisma model + Project/User reverse relations; `AuditEvent.tokenName` / `tokenSha` + index | ✅ |
+| 2 | `src/shared/src/tokens/index.ts` — `generateToken`, `hashToken`, `isMcpToken`, `timingSafeEqualHex`, `TOKEN_PREFIX` | ✅ |
+| 3 | `src/mcpd/src/repositories/mcp-token.repository.ts` + new interfaces in `repositories/interfaces.ts` | ✅ |
+| 4 | `src/mcpd/src/services/mcp-token.service.ts` — creator-ceiling via `rbacService.canAccess`/`canRunOperation`, raw token returned only once, auto-creates an `RbacDefinition` with subject `McpToken:<sha>` when bindings are non-empty | ✅ |
+| 5 | `src/mcpd/src/routes/mcp-tokens.ts` — POST / GET / GET:id / DELETE:id + POST:id/revoke + GET /introspect | ✅ |
+| 6 | Wired into `main.ts` — repo/service constructed, routes registered, `mcptokens` added to URL→permission map + name resolver; `/mcptokens/introspect` added to auth-skip list so mcplocal can call it with a raw McpToken bearer | ✅ |
+| 7 | RBAC extensions: new subject kind `McpToken` in `rbac-definition.schema.ts`; `mcptokens` added to `RBAC_RESOURCES` and `RESOURCE_ALIASES`; `rbac.service.ts` threads optional `mcpTokenSha` through `canAccess`, `canRunOperation`, `getAllowedScope`, `getPermissions`; resolver matches `{kind:'McpToken', name: sha}` | ✅ |
+| 8 | Unit tests — `tests/mcp-token-service.test.ts` covering: empty/clone modes, ceiling rejection, RbacDefinition auto-create with correct `McpToken:<sha>` subject, duplicate-name conflict, introspect valid/revoked/expired/unknown, revoke deletes the RbacDefinition. 11/11 green. Full mcpd suite still 648/648. | ✅ |
+
+### What this PR does NOT do yet (coming in PR 3)
+
+- The mcpd **auth middleware** does not yet dispatch on the token prefix. A raw `mcpctl_pat_…` bearer sent to any `/api/v1/*` endpoint (other than `/introspect`) is still rejected as an invalid session. That's intentional — PR 3 extends `middleware/auth.ts` to recognize both session bearers and McpToken bearers.
+- No CLI yet. Tokens can be created only via `POST /api/v1/mcptokens` for now.
+
+## PR 2 — RBAC CLI migration ✅
+
+Migrated `mcpctl create rbac` from the positional `role:resource[:name]` flag syntax to the `key:value` form you asked for.
+
+Before:
+```
+mcpctl create rbac developers \
+  --subject User:alice@test.com \
+  --binding edit:servers \
+  --binding view:servers:my-ha \
+  --operation logs
+```
+After:
+```
+mcpctl create rbac developers \
+  --subject User:alice@test.com \
+  --roleBindings role:edit,resource:servers \
+  --roleBindings role:view,resource:servers,name:my-ha \
+  --roleBindings action:logs
+```
+
+| # | Step | Status |
+|---|---|---|
+| 1 | New shared parser at `src/cli/src/commands/rbac-bindings.ts` exporting `parseRoleBinding(entry)` | ✅ |
+| 2 | `src/cli/src/commands/create.ts` — old `--binding`/`--operation` flags replaced with one repeatable `--roleBindings <kv>`. Uses the new parser. | ✅ |
+| 3 | Tests in `src/cli/tests/commands/create.test.ts` rewritten to the new form (8 RBAC tests updated) | ✅ |
+| 4 | New dedicated unit test `src/cli/tests/commands/rbac-bindings.test.ts` — 9 cases covering unscoped / name-scoped / action / trim / empty-value / unknown-key / action-conflict / missing-role rejections | ✅ |
+| 5 | Shell completions regenerated via `pnpm completions:generate` — both `completions/mcpctl.{bash,fish}` now offer `--roleBindings`, no longer `--binding`/`--operation` | ✅ |
+| 6 | Nothing in `docs/` or `README.md` referenced the old flags | ✅ |
+
+Full CLI suite still 406/406 green. On-disk YAML shape (`roleBindings: [...]`) is unchanged, so backups and existing `apply -f` files keep working.
+
+The extracted `parseRoleBinding` helper is what PR 3's `mcpctl create mcptoken --bind <kv>` flag will reuse.
+
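To make the new grammar concrete, here is a rough Python sketch of what a parser for one `--roleBindings` value might do. The accepted keys and the rejection cases follow the test list in step 4; the returned dict shape and the error messages are assumptions, and the real `parseRoleBinding` in `rbac-bindings.ts` may differ.

```python
ALLOWED_KEYS = {"role", "resource", "name", "action"}


def parse_role_binding(entry: str) -> dict:
    """Parse one --roleBindings value into a dict (hypothetical output shape)."""
    binding = {}
    for part in entry.split(","):
        key, sep, value = part.partition(":")
        key, value = key.strip(), value.strip()
        if not sep:
            raise ValueError(f"expected key:value, got {part!r}")
        if key not in ALLOWED_KEYS:
            raise ValueError(f"unknown key {key!r}")
        if not value:
            raise ValueError(f"empty value for {key!r}")
        binding[key] = value
    if "action" in binding:
        # An action binding stands alone; mixing it with role/resource is rejected.
        if len(binding) > 1:
            raise ValueError("action cannot be combined with other keys")
        return binding
    if "role" not in binding or "resource" not in binding:
        raise ValueError("role and resource are both required")
    return binding
```

With this shape, `role:view,resource:servers,name:my-ha` and `action:logs` map onto the old `--binding view:servers:my-ha` and `--operation logs` forms respectively.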
+## PR 3 — CLI mcptoken verbs + mcpd auth dispatch + audit ✅
+
+| # | Step | Status |
+|---|---|---|
+| 1 | `src/mcpd/src/middleware/auth.ts` — dispatch on the bearer prefix. `mcpctl_pat_…` → new `findMcpToken(hash)` dep → populates `request.mcpToken` + `request.userId = ownerId`. Other bearers → existing `findSession` path. Returns 401 for revoked, expired, or unknown tokens. Fastify module augmentation adds `request.mcpToken?: McpTokenPrincipal`. | ✅ |
+| 2 | `src/mcpd/src/main.ts` — wires `findMcpToken: mcpTokenRepo.findByHash`. Threads `mcpTokenSha` into `canAccess` / `canRunOperation` / `getAllowedScope`. Adds a second project-scope check: `McpToken` principals can only reach resources inside their bound project (additional guard on top of the route handler checks). | ✅ |
+| 3 | New auth tests (`tests/auth.test.ts`) — 3 McpToken dispatch cases: happy path sets userId + mcpToken, revoked → 401, no findMcpToken wired → 401. Session path unchanged. | ✅ |
+| 4 | `mcpctl create mcptoken <name> -p <proj> [--rbac empty\|clone] [--bind …] [--ttl …]` — new subcommand. Reuses `parseRoleBinding` from PR 2. `parseTtl` helper accepts `30d`/`12h`/`never`/ISO8601. `--force` revokes the existing active token and creates a new one. Raw token is printed once with a "copy now" banner. | ✅ |
+| 5 | `mcpctl get mcptokens` + `mcpctl get mcptoken <name> -p <proj>` + `mcpctl describe mcptoken <name> -p <proj>` + `mcpctl delete mcptoken <name> -p <proj>`. Names are project-scoped, so all verbs require `-p` unless a CUID is passed. Table columns: NAME / PROJECT / PREFIX / CREATED / LAST USED / EXPIRES / STATUS. Describe surfaces the auto-created RbacDefinition's bindings (matched by `mcptoken-<id>` name convention). | ✅ |
+| 6 | `mcpctl apply -f`: added `McpTokenSpecSchema`, `mcptoken: 'mcptokens'` in `KIND_TO_RESOURCE`, and an applier that creates if missing or logs "already active — skipped" (tokens are immutable). Raw token printed on create. | ✅ |
+| 7 | Resource aliases — `mcptoken`/`mcptokens`/`token`/`tokens` all resolve to `mcptokens`. `stripInternalFields` scrubs the secret and derived fields and promotes `projectName` → `project` for YAML round-trip. | ✅ |
+| 8 | Audit pipeline — `src/mcplocal/src/audit/types.ts` gains `tokenName?`/`tokenSha?`; collector gets `setSessionMcpToken(sessionId, {tokenName, tokenSha})` alongside `setSessionUserName`, both merged into a per-session principal map. `src/mcpd/src/services/audit-event.service.ts` accepts `tokenName` and `tokenSha` query params (repo already extended in PR 1). `console/audit-types.ts` carries the new optional fields so the TUI can surface them in a follow-up. | ✅ |
+| 9 | Shell completions regenerated — `mcpctl create mcptoken` flags (`--project`, `--rbac`, `--bind`, `--ttl`, `--description`, `--force`) and the new resource alias land in both bash and fish completions. `completions.test.ts` freshness check passes. | ✅ |
+
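The `--ttl` forms from step 4 (`30d`, `12h`, `never`, ISO 8601) can be illustrated with a short Python sketch. This is not the CLI's `parseTtl`; treating naive timestamps as UTC and returning `None` for `never` are assumptions made for the example.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

_UNITS = {"h": "hours", "d": "days"}


def parse_ttl(value: str, now: Optional[datetime] = None) -> Optional[datetime]:
    """Return the absolute expiry for a --ttl value, or None for 'never'."""
    now = now or datetime.now(timezone.utc)
    if value == "never":
        return None
    match = re.fullmatch(r"(\d+)([hd])", value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now + timedelta(**{_UNITS[unit]: amount})
    # Anything else is parsed as ISO 8601; invalid input raises ValueError.
    expiry = datetime.fromisoformat(value)
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return expiry
```

Resolving relative durations to an absolute timestamp at create time keeps the later expiry check a plain comparison against the current time.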
+### What this PR does NOT do yet (coming in PR 4)
+
+- No HTTP-mode mcplocal binary yet. Tokens can be used to hit mcpd directly via `/api/v1/…` with `Authorization: Bearer mcpctl_pat_…`, but the containerized `/projects/<p>/mcp` endpoint and its token-auth preHandler don't exist yet.
+- The audit-console TUI still shows only `userName` columns; adding a `TOKEN` column is a UI polish follow-up.
+
+### Test stats
+
+- 1764/1764 tests pass workspace-wide (up from ~1750 before PR 3).
+- Build clean across all 5 packages.
+- Completions freshness check green.
+
+## PR 4 — HTTP-mode mcplocal + container + `mcpctl test mcp` + smoke ✅
+
+| # | Step | Status |
+|---|---|---|
+| 1 | **Shared HTTP MCP client** — `src/shared/src/mcp-http/index.ts`. `McpHttpSession(url, {bearer?, headers?, timeoutMs?})` with `initialize / listTools / callTool / close / send / sendNotification`. Handles http + https, multiplexed SSE bodies, JSON-RPC id correlation. Distinct `McpProtocolError` / `McpTransportError` classes for contract-vs-transport failures. Plus `deriveBaseUrl(url)` + `mcpHealthCheck(base)`. Exported from `@mcpctl/shared`. | ✅ |
+| 2 | **`mcpctl test mcp <url>`** — new CLI verb under `src/cli/src/commands/test-mcp.ts`. Flags: `--token` (also reads `$MCPCTL_TOKEN`), `--tool`, `--args` (JSON), `--expect-tools`, `--timeout`, `-o text\|json`, `--no-health`. Exit codes: 0 PASS, 1 TRANSPORT/AUTH FAIL, 2 CONTRACT FAIL (e.g. missing tool or `isError=true`). | ✅ |
+| 3 | **Unit tests** for the verb — `src/cli/tests/commands/test-mcp.test.ts`. 9 cases: happy path, health preflight failure, `--expect-tools` miss / hit, transport throw, `--tool` + `isError` → exit 2, `-o json` report, `$MCPCTL_TOKEN` env fallback, invalid `--args`. All green. | ✅ |
+| 4 | **`src/mcplocal/src/serve.ts`** — new HTTP-only entry. Drops `StdioProxyServer` and `--upstream`; forces host/port from `MCPLOCAL_HTTP_HOST`/`MCPLOCAL_HTTP_PORT`; requires `MCPLOCAL_MCPD_URL`. Registers a Fastify preHandler that runs the new `token-auth` middleware on `/projects/*` and `/mcp`. Preserves LLM provider loading + proxymodel hot-reload watchers. | ✅ |
+| 5 | **`src/mcplocal/src/http/token-auth.ts`** — Fastify preHandler that validates `mcpctl_pat_…` bearers by calling `GET <mcpd>/api/v1/mcptokens/introspect`. Cache: 30s positive / 5s negative TTL keyed on `hashToken(raw)`. Rejects non-Bearer, non-`mcpctl_pat_`, revoked, expired, and wrong-project (403 when path `projectName` ≠ token's bound project). Sets `request.mcpToken = { tokenName, tokenSha, projectName }` for the audit collector. | ✅ |
+| 6 | **FileCache PVC plumbing** — `src/mcplocal/src/http/project-mcp-endpoint.ts` now honours `process.env.MCPLOCAL_CACHE_DIR` at both `FileCache` construction sites (gated + dynamic). No constructor change needed — `FileCache` already accepted a `dir` config; we just wire the env-derived value through. | ✅ |
+| 7 | **Audit collector integration** — when `request.mcpToken` is set, the `onsessioninitialized` handler in `project-mcp-endpoint.ts` now also calls `collector.setSessionMcpToken(id, {tokenName, tokenSha})` alongside the existing `setSessionUserName`. Session map from PR 3 merges both principals. | ✅ |
+| 8 | **Container image** — `deploy/Dockerfile.mcplocal` mirrors `Dockerfile.mcpd` shape: multi-stage Node 20 Alpine, pnpm workspace build of `@mcpctl/shared` + `@mcpctl/mcplocal`, runtime `CMD node src/mcplocal/dist/serve.js`, `EXPOSE 3200`, `VOLUME /var/lib/mcplocal/cache`, `HEALTHCHECK` on `/healthz`. | ✅ |
+| 9 | **Build + push script** — `scripts/build-mcplocal.sh` (executable, 755) mirrors `build-mcpd.sh`. Pushes to `10.0.0.194:3012/michal/mcplocal:latest`. | ✅ |
+| 10 | **`fulldeploy.sh`** — now a 4-step pipeline: (1) build + push mcpd, (2) build + push mcplocal, (3) rollout both deployments on k8s (mcplocal gated behind a `kubectl get deployment/mcplocal` check so the script stays green before the Pulumi stack lands), (4) RPM release. Smoke suite runs at the end as before. | ✅ |
+| 11 | **`mcpctl test mcp` + new create flags in completions** — bash + fish regenerated. `src/mcplocal/package.json` gains a `serve` script for convenience. | ✅ |
+| 12 | **Smoke test** — `src/mcplocal/tests/smoke/mcptoken.smoke.test.ts`. Gated on `healthz($MCPGW_URL)`; skipped with a clear warning if the gateway is unreachable. Scenarios: happy path via `mcpctl test mcp` → exit 0; cross-project → exit 1 with a 403 message; `--expect-tools __nonexistent__` → exit 2; delete-then-retry after the 5s negative-cache window → exit 1 with 401. Cleans up both projects at the end. | ✅ |
+
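The introspection cache from step 5 (30s positive, 5s negative TTL, keyed on the token hash) might be sketched in Python as follows. The class name, the injected `introspect` callable, and the `clock` parameter are hypothetical and exist only to keep the example self-contained; the real middleware lives in `token-auth.ts`.

```python
import hashlib
import time
from typing import Callable, Optional


class IntrospectionCache:
    """Cache introspection results per token hash with separate TTLs."""

    def __init__(self, introspect: Callable[[str], Optional[dict]],
                 positive_ttl: float = 30.0, negative_ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._introspect = introspect
        self._positive_ttl = positive_ttl
        self._negative_ttl = negative_ttl
        self._clock = clock
        self._entries = {}  # token hash -> (expires_at, principal or None)

    def lookup(self, raw_token: str) -> Optional[dict]:
        # Key on the hash so the raw token is never held as a dict key.
        key = hashlib.sha256(raw_token.encode("utf-8")).hexdigest()
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]
        principal = self._introspect(raw_token)  # None means rejected
        ttl = self._positive_ttl if principal is not None else self._negative_ttl
        self._entries[key] = (now + ttl, principal)
        return principal
```

The shorter negative TTL bounds how long a rejection is remembered, while the positive TTL bounds how long a previously valid token can keep being accepted without a fresh introspection call.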
+### Deploy-time steps still owed (outside this repo)
+
+- **Pulumi (`../kubernetes-deployment`, stack `homelab`)** — add a `Deployment` named `mcplocal` in ns `mcpctl` pointing at `10.0.0.194:3012/michal/mcplocal:latest` (internal registry), a `Service` named `mcp` (port 3200→80, ClusterIP), an `Ingress` for `mcp.ad.itaz.eu` with TLS via the existing cluster-issuer, a PVC `mcplocal-cache` (10Gi RWO, mounted `/var/lib/mcplocal/cache`), and a NetworkPolicy mirroring mcpd's. Required env: **just `MCPLOCAL_MCPD_URL`** (point at `http://mcpd.mcpctl.svc.cluster.local:3100`). Optionally `MCPLOCAL_TOKEN_POSITIVE_TTL_MS` / `MCPLOCAL_TOKEN_NEGATIVE_TTL_MS` for stricter revocation. `fulldeploy.sh` already runs `pulumi preview` first and halts on drift.
+- **No pod-level secret required** (revised from earlier draft) — the pod has no persistent identity to mcpd. Every inbound `Authorization: Bearer mcpctl_pat_…` is forwarded verbatim to mcpd, and mcpd's auth middleware resolves the McpToken principal. This eliminates the original `MCPLOCAL_MCPD_TOKEN` secret and its rotation story. Trade-off: a token with `--rbac=empty` can't read `/api/v1/projects/:name/servers`, but it also can't meaningfully serve MCP, so this is the right failure mode. See `src/mcplocal/src/serve.ts` header comment.
+- **LLM provider config** — if any project served by this pod is `gated: true`, mount your `~/.mcpctl/config.json` as a ConfigMap at `/root/.mcpctl/config.json`. Ungated projects (proxyModel `content-pipeline` or no LLM-driven stages) need nothing.
+
+### Test stats
+
+- 1773/1773 workspace tests pass (up from 1764 before PR 4).
+- All five packages build clean.
+- Shell completions fresh.
+- `mcpctl test mcp --help` and `mcpctl create mcptoken --help` render expected surfaces.
+
+## End-to-end verification (manual, after Pulumi resources land)
+
+```bash
+# From a workstation outside the k8s cluster:
+mcpctl create project vllm --force
+# Extract only the token itself, not the surrounding banner text.
+TOK=$(mcpctl create mcptoken vllm-token --project vllm --rbac clone | grep -oE 'mcpctl_pat_[A-Za-z0-9]+' | head -n 1)
+export MCPCTL_TOKEN="$TOK"
+
+# Probe the public gateway
+mcpctl test mcp https://mcp.ad.itaz.eu/projects/vllm/mcp --expect-tools begin_session
+
+# Negative: wrong project → exit 1
+mcpctl test mcp https://mcp.ad.itaz.eu/projects/other/mcp
+echo $?   # 1
+
+# Audit — the call should be tagged with tokenName=vllm-token
+mcpctl console --audit  # look for the TOKEN column once the TUI patch lands
+```
+
+## Design decisions recap (so you don't have to re-read the plan)
+
+| Decision | Choice |
+|---|---|
+| Transport | Streamable HTTP only |
+| Binary shape | Same `@mcpctl/mcplocal` package, two entry files (`main.ts` STDIO, `serve.ts` HTTP) |
+| Container runtime | Node (not bun-compiled) — mirrors mcpd |
+| Cache | PVC at `/var/lib/mcplocal/cache` |
+| Hostname | k8s Service `mcp`, Ingress `mcp.ad.itaz.eu` |
+| Token format | `mcpctl_pat_<32-byte base62>`, stored as SHA-256, shown-once at create |
+| Resource | `McpToken`, CLI noun `mcptoken`, one-project-per-token, FK cascade |
+| Subject kind | New `McpToken:<sha>` |
+| TTL | No default. Optional `--ttl 30d` / `never` / ISO date |
+| Default bindings | `--rbac=empty` (default), `--rbac=clone`, `--bind <kv>` — creator ceiling enforced server-side |
+| Binding CLI | `--roleBindings role:view,resource:servers[,name:foo]` or `--roleBindings action:logs` |
+| Project enforcement | Endpoint visibility only (no strict create-time check) — same mechanism Claude uses |
--- a/docs/secret-backends.md
+++ b/docs/secret-backends.md
@@ -0,0 +1,167 @@
+# Secret backends
+
+`mcpctl` stores the raw data for `Secret` resources in a pluggable **backend**.
+The default is `plaintext` — the secret payload lives in Postgres as plain JSON
+— which is fine for laptop development but a poor fit for shared clusters. For
+production, point at an external KV store and delete secrets from the DB after
+migration.
+
+This guide covers the model, the shipped drivers, and how to migrate without
+downtime.
+
+## Model
+
+- A `SecretBackend` resource is a single named driver instance (e.g. a pointer
+  at one OpenBao deployment).
+- Every `Secret` row carries a `backendId` FK — the backend that owns its data.
+- Exactly one `SecretBackend` has `isDefault: true`. New secrets created through
+  the API/CLI land on that backend.
+- The `plaintext` backend is seeded at startup and named `default`. It cannot
+  be deleted: there must always be one backend from which the other drivers'
+  own credentials can be bootstrapped (see below).
+
+## CLI
+
+```bash
+mcpctl get secretbackends              # list backends
+mcpctl describe secretbackend <name>   # inspect config (credentials masked)
+mcpctl create secretbackend <name> --type plaintext [--default] [--description ...]
+mcpctl create secretbackend <name> --type openbao \
+  --url http://bao.example:8200 \
+  --token-secret bao-creds/token \
+  [--namespace <ns>] [--mount secret] [--path-prefix mcpctl] \
+  [--default]
+mcpctl delete secretbackend <name>     # blocked if any secret still points at it
+
+mcpctl migrate secrets --from default --to bao
+mcpctl migrate secrets --from default --to bao --names a,b --keep-source
+mcpctl migrate secrets --from default --to bao --dry-run
+```
+
+Anything you can do with `create secretbackend` also works via `apply -f`:
+
+```yaml
+kind: secretbackend
+name: bao
+type: openbao
+description: "shared cluster OpenBao"
+isDefault: true
+config:
+  url: http://bao.svc.cluster.local:8200
+  tokenSecretRef: { name: bao-creds, key: token }
+  namespace: platform
+```
+
+## Drivers
+
+### plaintext
+
+Trivial. `Secret.data` holds the JSON, `externalRef` is empty.
+
+- Storage: Postgres column.
+- Bootstrap: seeded as `default` at startup.
+- Cost: zero setup, zero encryption at rest, full access for any DB reader.
+
+Use for development, CI, or single-tenant self-hosts where the DB itself is
+treated as sensitive.
+
+### openbao
+
+Talks HTTP to an [OpenBao](https://openbao.org) (MPL 2.0 Vault fork) KV v2
+mount. Also compatible with HashiCorp Vault KV v2 — the wire protocol is the
+same.
+
+| Config key       | Required? | Description |
+|------------------|-----------|-------------|
+| `url`            | yes       | Base URL, e.g. `http://bao.svc.cluster.local:8200`. |
+| `tokenSecretRef` | yes       | `{ name, key }` pointing at a `Secret` on the **plaintext** backend that holds the bootstrap token. |
+| `mount`          | no        | KV v2 mount name. Default `secret`. |
+| `pathPrefix`     | no        | Path prefix under the mount. Default `mcpctl`. Secrets land at `<mount>/<pathPrefix>/<secretName>`. |
+| `namespace`      | no        | `X-Vault-Namespace` header for OpenBao/Vault Enterprise namespaces. |
+
+The driver only stores a reference in `Secret.externalRef` (`mount/path`). The
+`Secret.data` column is left empty for openbao-backed rows — you can safely
+drop DB-level access to secrets after migration.
+
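As an illustration of how the config keys map onto the KV v2 HTTP API, the Python sketch below builds the URLs and headers for one secret. The `/v1/<mount>/data/...` and `/v1/<mount>/metadata/...` paths and the `X-Vault-Token` header are standard KV v2; the function name, the returned dict, and the exact `external_ref` format are assumptions for this example and not the driver's actual code.

```python
def kv2_request(url, mount, path_prefix, secret_name, token, namespace=None):
    """Build the URLs and headers for KV v2 access to one secret."""
    base = url.rstrip("/")
    suffix = f"{path_prefix}/{secret_name}"
    headers = {"X-Vault-Token": token}
    if namespace:
        # Only sent when the backend config sets `namespace`.
        headers["X-Vault-Namespace"] = namespace
    return {
        # Versioned payload: read with GET, write with POST {"data": {...}}.
        "data_url": f"{base}/v1/{mount}/data/{suffix}",
        # Metadata endpoint: listing, and full removal of all versions.
        "metadata_url": f"{base}/v1/{mount}/metadata/{suffix}",
        "headers": headers,
        # Assumed shape of the reference kept in Secret.externalRef.
        "external_ref": f"{mount}/{suffix}",
    }
```

Note that the API path inserts `data/` or `metadata/` after the mount, which is why the policy paths use `secret/data/mcpctl/*` even though secrets are described as living at `<mount>/<pathPrefix>/<secretName>`.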
+#### Required OpenBao policy
+
+Minimum token policy for a backend that lives at `secret/mcpctl/`:
+
+```hcl
+path "secret/data/mcpctl/*" {
+  capabilities = ["create", "read", "update"]
+}
+
+path "secret/metadata/mcpctl/*" {
+  capabilities = ["list", "delete"]
+}
+
+path "secret/metadata/mcpctl/" {
+  capabilities = ["list"]
+}
+```
+
+Grant `delete` on `metadata/...` only if you need mcpctl to fully remove
+secrets — OpenBao soft-deletes until the metadata is gone.
+
+#### Chicken-and-egg: where does the OpenBao token live?
+
+mcpd reads the OpenBao token from a `Secret` on the **plaintext** backend.
+That's the whole point of keeping plaintext around — it's the trust root:
+
+1. Operator creates a plaintext `Secret` holding the bootstrap token.
+2. Operator creates the `openbao` backend, pointing at that secret via
+   `tokenSecretRef`.
+3. Operator runs `mcpctl migrate secrets --from default --to bao` to move all
+   other secrets off plaintext.
+4. After migration, the only sensitive row left on plaintext is the OpenBao
+   token itself. DB access is now equivalent to OpenBao token access (a single
+   key), not equivalent to all API keys in the system.
+
+Follow-up work (not shipped yet) replaces static token auth with Kubernetes
+ServiceAccount auth so no bootstrap token is needed at all.
+
+## Migration — `mcpctl migrate secrets`
+
+Atomicity is **per secret**, not per batch. Remote writes can't roll back, so we
+don't pretend. For each secret the service:
+
+1. Reads the plaintext from the source driver.
+2. Writes it to the destination driver.
+3. Updates the `Secret` row: flips `backendId`, sets new `externalRef`, clears
+   `data`.
+4. Deletes from source (skipped with `--keep-source`).
+
+If the command is interrupted between steps 2 and 3, the destination has an
+orphan entry but the source still owns the row. Re-running is idempotent — the
+service skips secrets that are already on the destination and picks up the
+rest.
+
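The four steps and the re-run behaviour can be sketched with in-memory stand-ins. Everything here (`MemoryDriver`, the row dict layout, the summary shape) is hypothetical and only mirrors the description above; the real service works against Prisma rows and driver objects.

```python
class MemoryDriver:
    """Stand-in for a backend driver; real drivers talk to Postgres or OpenBao."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def read(self, secret_name):
        return self.store[secret_name]

    def write(self, secret_name, payload):
        self.store[secret_name] = payload

    def delete(self, secret_name):
        self.store.pop(secret_name, None)


def migrate_secrets(rows, source, dest, keep_source=False):
    """Move rows from source to dest one secret at a time (no batch rollback)."""
    summary = {"migrated": [], "skipped": [], "failed": []}
    for row in rows:
        name = row["name"]
        if row["backend"] == dest.name:
            summary["skipped"].append(name)  # already moved by an earlier run
            continue
        try:
            payload = source.read(name)                    # step 1
            dest.write(name, payload)                      # step 2
            row["backend"] = dest.name                     # step 3: flip ownership
            row["externalRef"] = f"{dest.name}/{name}"
            row["data"] = None
            if not keep_source:
                source.delete(name)                        # step 4
            summary["migrated"].append(name)
        except Exception:
            # One failure does not stop the remaining secrets.
            summary["failed"].append(name)
    return summary
```

Because ownership only flips in step 3, an interruption after step 2 leaves the row on the source and a later run simply rewrites the destination entry.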
+```bash
+# Dry-run first: see what would move.
+mcpctl migrate secrets --from default --to bao --dry-run
+
+# Migrate everything.
+mcpctl migrate secrets --from default --to bao
+
+# Migrate a subset only.
+mcpctl migrate secrets --from default --to bao --names api-keys,oauth-client
+
+# Leave the source copy in place (useful for A/B validation).
+mcpctl migrate secrets --from default --to bao --keep-source
+```
+
+The command prints a per-secret summary (migrated / skipped / failed) and exits
+non-zero if any secret failed. Ctrl-C during the run is safe — restart when you
+want, no duplicate writes.
+
+## RBAC
+
+- `resource: secretbackends` — gated like any other resource (`view`,
+  `create`, `edit`, `delete`).
+- `role: run, action: migrate-secrets` — required to call
+  `POST /api/v1/secrets/migrate`.
+
+Describe output masks config values whose keys look like credentials
+(`token`, `secret`, `password`, `key`), so `mcpctl describe secretbackend` is
+safe to paste into tickets.
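
A minimal Python sketch of the masking rule, assuming a case-insensitive substring match on the key name and a fixed `***` placeholder. Both details are assumptions; the actual describe renderer may match keys differently.

```python
CREDENTIAL_HINTS = ("token", "secret", "password", "key")


def mask_config(config):
    """Return a copy of config with credential-looking values replaced."""
    masked = {}
    for key, value in config.items():
        if any(hint in key.lower() for hint in CREDENTIAL_HINTS):
            masked[key] = "***"  # hide the whole value, including nested refs
        elif isinstance(value, dict):
            masked[key] = mask_config(value)  # recurse into nested config
        else:
            masked[key] = value
    return masked
```

Under this rule `tokenSecretRef` is masked as a whole because its key contains `token`, while `url`, `mount`, and `namespace` pass through unchanged.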
--- a/fulldeploy.sh
+++ b/fulldeploy.sh
@@ -1,5 +1,13 @@
 #!/bin/bash
-# Full deployment: Docker image → Portainer stack → RPM build/publish/install
+# Full deployment: mcpd + mcplocal images → k8s rollout → RPM build/publish/install
+#
+# Production runtime is Kubernetes (context: worker0-k8s0, namespace: mcpctl).
+# The docker-compose stack under stack/ + deploy/ is kept for local/VM testing
+# only and is no longer invoked from here.
+#
+# Infra (Deployment shape, env, RBAC, NetworkPolicies) is managed by Pulumi
+# in ../kubernetes-deployment. This script runs `pulumi preview` before the
+# rollout; if there is infra drift it halts so you can `pulumi up` first.
 set -e

 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
@@ -10,22 +18,65 @@ if [ -f .env ]; then
  set -a; source .env; set +a
 fi

+KUBE_CONTEXT="${KUBE_CONTEXT:-worker0-k8s0}"
+KUBE_NAMESPACE="${KUBE_NAMESPACE:-mcpctl}"
+KUBE_DEPLOYMENT="${KUBE_DEPLOYMENT:-mcpd}"
+PULUMI_DIR="${PULUMI_DIR:-$SCRIPT_DIR/../kubernetes-deployment}"
+PULUMI_STACK="${PULUMI_STACK:-homelab}"
+
 echo "========================================"
 echo "  mcpctl Full Deploy"
 echo "========================================"

+# --- Pre-flight: Pulumi drift check ---
 echo ""
-echo ">>> Step 1/3: Build & push mcpd Docker image"
+echo ">>> Pre-flight: checking for Pulumi infra drift"
+echo ""
+if [ -d "$PULUMI_DIR" ]; then
+  if [ -z "$PULUMI_CONFIG_PASSPHRASE" ]; then
+    echo "  WARNING: PULUMI_CONFIG_PASSPHRASE not set — skipping drift check."
+    echo "           Set it in .env or export it to enable."
+  else
+    preview_output=$(cd "$PULUMI_DIR" && pulumi preview --stack "$PULUMI_STACK" --non-interactive --diff 2>&1) || true
+    if echo "$preview_output" | grep -qE '^\s+[-+~]'; then
+      echo "$preview_output"
+      echo ""
+      echo "ERROR: Pulumi detected infra changes that have not been applied."
+      echo "       Run: cd $PULUMI_DIR && pulumi up -s $PULUMI_STACK"
+      echo "       Then re-run this script."
+      exit 1
+    fi
+    echo "  No drift — infra is in sync."
+  fi # passphrase check
+else
+  echo "  WARNING: Pulumi repo not found at $PULUMI_DIR — skipping drift check."
+fi
+
+echo ""
+echo ">>> Step 1/4: Build & push mcpd Docker image"
 echo ""
 bash scripts/build-mcpd.sh "$@"

 echo ""
-echo ">>> Step 2/3: Deploy stack to production"
+echo ">>> Step 2/4: Build & push mcplocal (HTTP-mode) Docker image"
 echo ""
-bash deploy.sh
+bash scripts/build-mcplocal.sh "$@"

 echo ""
-echo ">>> Step 3/3: Build, publish & install RPM"
+echo ">>> Step 3/4: Roll out mcpd + mcplocal on k8s ($KUBE_CONTEXT / $KUBE_NAMESPACE)"
+echo ""
+kubectl --context "$KUBE_CONTEXT" -n "$KUBE_NAMESPACE" rollout restart "deployment/$KUBE_DEPLOYMENT"
+kubectl --context "$KUBE_CONTEXT" -n "$KUBE_NAMESPACE" rollout status "deployment/$KUBE_DEPLOYMENT" --timeout=3m
+if kubectl --context "$KUBE_CONTEXT" -n "$KUBE_NAMESPACE" get deployment/mcplocal >/dev/null 2>&1; then
+  kubectl --context "$KUBE_CONTEXT" -n "$KUBE_NAMESPACE" rollout restart deployment/mcplocal
+  kubectl --context "$KUBE_CONTEXT" -n "$KUBE_NAMESPACE" rollout status deployment/mcplocal --timeout=3m
+else
+  echo "  NOTE: deployment/mcplocal does not exist in the cluster yet — skipping rollout."
+  echo "        Apply the Pulumi stack in ../kubernetes-deployment to create it."
+fi
+
+echo ""
+echo ">>> Step 4/4: Build, publish & install RPM"
 echo ""
 bash scripts/release.sh

--- a/scripts/build-mcplocal.sh
+++ b/scripts/build-mcplocal.sh
@@ -0,0 +1,83 @@
+#!/bin/bash
+# Build mcplocal (HTTP-only) Docker image and push to Gitea container registry.
+#
+# Usage:
+#   ./build-mcplocal.sh [tag]                    # Build for native arch
+#   ./build-mcplocal.sh [tag] --platform linux/amd64
+#   ./build-mcplocal.sh [tag] --multi-arch
+set -e
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
+cd "$PROJECT_ROOT"
+
+# Load .env for GITEA_TOKEN
+if [ -f .env ]; then
+  set -a; source .env; set +a
+fi
+
+# Push directly to internal address (external proxy has body size limit)
+REGISTRY="10.0.0.194:3012"
+IMAGE="mcplocal"
+TAG="${1:-latest}"
+
+PLATFORM=""
+MULTI_ARCH=false
+shift 2>/dev/null || true
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --platform)
+      PLATFORM="$2"
+      shift 2
+      ;;
+    --multi-arch)
+      MULTI_ARCH=true
+      shift
+      ;;
+    *)
+      shift
+      ;;
+  esac
+done
+
+if [ "$MULTI_ARCH" = true ]; then
+  echo "==> Building multi-arch $IMAGE image (linux/amd64 + linux/arm64)..."
+  podman build --platform linux/amd64,linux/arm64 \
+    --manifest "$IMAGE:$TAG" -f deploy/Dockerfile.mcplocal .
+
+  echo "==> Tagging manifest as $REGISTRY/michal/$IMAGE:$TAG..."
+  podman tag "$IMAGE:$TAG" "$REGISTRY/michal/$IMAGE:$TAG"
+
+  echo "==> Logging in to $REGISTRY..."
+  podman login --tls-verify=false -u michal -p "$GITEA_TOKEN" "$REGISTRY"
+
+  echo "==> Pushing manifest to $REGISTRY/michal/$IMAGE:$TAG..."
+  podman manifest push --tls-verify=false --all \
+    "$REGISTRY/michal/$IMAGE:$TAG" "docker://$REGISTRY/michal/$IMAGE:$TAG"
+else
+  PLATFORM_FLAG=""
+  if [ -n "$PLATFORM" ]; then
+    PLATFORM_FLAG="--platform $PLATFORM"
+    echo "==> Building $IMAGE image for $PLATFORM..."
+  else
+    echo "==> Building $IMAGE image (native arch)..."
+  fi
+
+  podman build $PLATFORM_FLAG -t "$IMAGE:$TAG" -f deploy/Dockerfile.mcplocal .
+
+  echo "==> Tagging as $REGISTRY/michal/$IMAGE:$TAG..."
+  podman tag "$IMAGE:$TAG" "$REGISTRY/michal/$IMAGE:$TAG"
+
+  echo "==> Logging in to $REGISTRY..."
+  podman login --tls-verify=false -u michal -p "$GITEA_TOKEN" "$REGISTRY"
+
+  echo "==> Pushing to $REGISTRY/michal/$IMAGE:$TAG..."
+  podman push --tls-verify=false "$REGISTRY/michal/$IMAGE:$TAG"
+fi
+
+# Ensure package is linked to the repository
+source "$SCRIPT_DIR/link-package.sh"
+link_package "container" "$IMAGE"
+
+echo "==> Done!"
+echo "    Image: $REGISTRY/michal/$IMAGE:$TAG"
--- a/scripts/demo-mcp-call.py
+++ b/scripts/demo-mcp-call.py
@@ -0,0 +1,169 @@
+#!/usr/bin/env python3
+"""
+Demo: make an MCP request against mcplocal using an McpToken bearer.
+
+This is the standalone counterpart to `mcpctl test mcp` — intended to show
+exactly what a non-Claude client (e.g. a vLLM-driven agent) would do.
+
+Usage:
+    # Default: localhost mcplocal, sre project, token from $MCPCTL_TOKEN
+    export MCPCTL_TOKEN=mcpctl_pat_...
+    python3 scripts/demo-mcp-call.py
+
+    # Custom URL/project/tool
+    python3 scripts/demo-mcp-call.py \\
+        --url https://mcp.ad.itaz.eu \\
+        --project sre \\
+        --token "$MCPCTL_TOKEN" \\
+        --tool begin_session \\
+        --args '{"description":"hello"}'
+
+No third-party deps — pure stdlib. Mirrors the protocol that
+src/shared/src/mcp-http/index.ts implements on the TypeScript side.
+"""
+from __future__ import annotations
+
+import argparse
+import json
+import os
+import sys
+import urllib.error
+import urllib.request
+from typing import Any
+
+
+def _parse_sse(body: str) -> list[dict[str, Any]]:
+    """Parse a text/event-stream body into a list of JSON-RPC messages."""
+    out: list[dict[str, Any]] = []
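+    for line in body.splitlines():
+        # The space after "data:" is optional in SSE; json.loads ignores it.
+        if line.startswith("data:"):
+            try:
+                out.append(json.loads(line[5:]))
+            except json.JSONDecodeError:
+                # Skip non-JSON data lines (e.g. keep-alives).
+                pass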
+    return out
+
+
+class McpSession:
+    def __init__(self, url: str, bearer: str | None = None, timeout: float = 30.0):
+        self.url = url
+        self.bearer = bearer
+        self.timeout = timeout
+        self.session_id: str | None = None
+        self._next_id = 1
+
+    def _headers(self) -> dict[str, str]:
+        h = {
+            "Content-Type": "application/json",
+            "Accept": "application/json, text/event-stream",
+        }
+        if self.bearer:
+            h["Authorization"] = f"Bearer {self.bearer}"
+        if self.session_id:
+            h["mcp-session-id"] = self.session_id
+        return h
+
+    def send(self, method: str, params: dict[str, Any] | None = None) -> Any:
+        rid = self._next_id
+        self._next_id += 1
+        payload = {"jsonrpc": "2.0", "id": rid, "method": method, "params": params or {}}
+        req = urllib.request.Request(
+            self.url,
+            data=json.dumps(payload).encode("utf-8"),
+            headers=self._headers(),
+            method="POST",
+        )
+        try:
+            with urllib.request.urlopen(req, timeout=self.timeout) as resp:
+                body = resp.read().decode("utf-8")
+                content_type = resp.headers.get("content-type", "")
+                # First successful response carries the session id.
+                if self.session_id is None:
+                    sid = resp.headers.get("mcp-session-id")
+                    if sid:
+                        self.session_id = sid
+                messages: list[dict[str, Any]] = (
+                    _parse_sse(body) if "text/event-stream" in content_type else [json.loads(body)]
+                )
+        except urllib.error.HTTPError as e:
+            err_body = e.read().decode("utf-8", errors="replace")
+            raise SystemExit(f"HTTP {e.code} from {self.url}: {err_body}") from None
+        except urllib.error.URLError as e:
+            raise SystemExit(f"transport error reaching {self.url}: {e.reason}") from None
+
+        # Pick the response matching our id; fall back to first message.
+        matched = next((m for m in messages if m.get("id") == rid), messages[0] if messages else None)
+        if matched is None:
+            raise SystemExit(f"no response for {method}")
+        if "error" in matched:
+            err = matched["error"]
+            raise SystemExit(f"MCP error {err.get('code')}: {err.get('message')}")
+        return matched.get("result")
+
+    def initialize(self) -> dict[str, Any]:
+        return self.send(
+            "initialize",
+            {
+                "protocolVersion": "2024-11-05",
+                "capabilities": {},
+                "clientInfo": {"name": "demo-mcp-call.py", "version": "1.0.0"},
+            },
+        )
+
+    def list_tools(self) -> list[dict[str, Any]]:
+        result = self.send("tools/list")
+        return result.get("tools", []) if isinstance(result, dict) else []
+
+    def call_tool(self, name: str, args: dict[str, Any]) -> Any:
+        return self.send("tools/call", {"name": name, "arguments": args})
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser(description="Demo MCP request via McpToken bearer.")
+    ap.add_argument("--url", default=os.environ.get("MCPGW_URL", "http://localhost:3200"),
+                    help="Base URL of mcplocal (default: $MCPGW_URL or http://localhost:3200)")
+    ap.add_argument("--project", default="sre",
+                    help="Project name (default: sre). Must match the token's bound project.")
+    ap.add_argument("--token", default=os.environ.get("MCPCTL_TOKEN"),
+                    help="Raw mcpctl_pat_* bearer (default: $MCPCTL_TOKEN)")
+    ap.add_argument("--tool", help="Optionally call a tool after tools/list")
+    ap.add_argument("--args", default="{}", help="JSON-encoded arguments for --tool")
+    ap.add_argument("--timeout", type=float, default=30.0)
+    opts = ap.parse_args()
+
+    if not opts.token:
+        ap.error("--token or $MCPCTL_TOKEN required")
+
+    endpoint = f"{opts.url.rstrip('/')}/projects/{opts.project}/mcp"
+    print(f"→ POST {endpoint}")
+    print(f"  Bearer: {opts.token[:16]}…")
+    print()
+
+    sess = McpSession(endpoint, bearer=opts.token, timeout=opts.timeout)
+
+    info = sess.initialize()
+    server_info = info.get("serverInfo", {}) if isinstance(info, dict) else {}
+    print(f"initialize:  protocol={info.get('protocolVersion') if isinstance(info, dict) else '?'} "
+          f"server={server_info.get('name', '?')}/{server_info.get('version', '?')} "
+          f"sessionId={sess.session_id}")
+
+    tools = sess.list_tools()
+    print(f"tools/list:  {len(tools)} tool(s)")
+    for t in tools:
+        desc = next(iter((t.get("description") or "").splitlines()), "")[:80]
+        print(f"  - {t['name']}  {desc}")
+
+    if opts.tool:
+        try:
+            args = json.loads(opts.args)
+        except json.JSONDecodeError as e:
+            raise SystemExit(f"--args must be valid JSON: {e}")
+        print(f"\ntools/call: {opts.tool} {args}")
+        result = sess.call_tool(opts.tool, args)
+        print(json.dumps(result, indent=2)[:2000])
+
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
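Reviewer note: for anyone comparing the Python demo with a TypeScript client, the SSE handling and response matching it performs can be sketched as below. This is only an illustration under stated assumptions and is not the code in `src/shared/src/mcp-http/index.ts`; the names `JsonRpcMessage`, `parseSse`, and `pickResponse` are hypothetical.

```typescript
// Hypothetical sketch mirroring the Python demo's _parse_sse and id matching.
// It is NOT the implementation in src/shared/src/mcp-http/index.ts.
type JsonRpcMessage = {
  id?: number | string | null;
  result?: unknown;
  error?: { code?: number; message?: string };
};

// Collect every `data: ` line that holds valid JSON; skip everything else.
function parseSse(body: string): JsonRpcMessage[] {
  const out: JsonRpcMessage[] = [];
  for (const line of body.split(/\r?\n/)) {
    if (!line.startsWith('data: ')) continue;
    try {
      out.push(JSON.parse(line.slice(6)) as JsonRpcMessage);
    } catch {
      // Malformed payloads are ignored, as in the Python version.
    }
  }
  return out;
}

// Prefer the message whose id matches the request; otherwise use the first one.
function pickResponse(messages: JsonRpcMessage[], requestId: number): JsonRpcMessage | undefined {
  return messages.find((m) => m.id === requestId) ?? messages[0];
}

const body = [
  'event: message',
  'data: {"jsonrpc":"2.0","method":"notifications/progress"}',
  'data: not-json',
  'data: {"jsonrpc":"2.0","id":2,"result":{"tools":[]}}',
  '',
].join('\n');

const messages = parseSse(body);
const matched = pickResponse(messages, 2);
console.log(messages.length, matched?.id);
```

The fallback to the first message keeps the demo tolerant of servers that omit the id, at the cost of possibly returning a notification instead of the real response.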
--- a/scripts/generate-completions.ts
+++ b/scripts/generate-completions.ts
@@ -184,7 +184,7 @@ async function extractTree(): Promise<CmdInfo> {
 // ============================================================

 const CANONICAL_RESOURCES = [
-  'servers', 'instances', 'secrets', 'templates', 'projects',
+  'servers', 'instances', 'secrets', 'secretbackends', 'llms', 'templates', 'projects',
  'users', 'groups', 'rbac', 'prompts', 'promptrequests',
  'serverattachments', 'proxymodels', 'all',
 ];
@@ -193,6 +193,8 @@ const ALIAS_ENTRIES: [string, string][] = [
  ['server', 'servers'], ['srv', 'servers'],
  ['instance', 'instances'], ['inst', 'instances'],
  ['secret', 'secrets'], ['sec', 'secrets'],
+  ['secretbackend', 'secretbackends'], ['sb', 'secretbackends'],
+  ['llm', 'llms'], ['llms', 'llms'],
  ['template', 'templates'], ['tpl', 'templates'],
  ['project', 'projects'], ['proj', 'projects'],
  ['user', 'users'],
--- a/scripts/release.sh
+++ b/scripts/release.sh
@@ -54,7 +54,7 @@ if command -v dpkg &>/dev/null && ! command -v dnf &>/dev/null; then
  sudo dpkg -i "$DEB_FILE" || sudo apt-get install -f -y
 else
  # RPM filenames use x86_64/aarch64, not amd64/arm64
-  local rpm_arch
+  rpm_arch=""
  case "$NATIVE_ARCH" in amd64) rpm_arch="x86_64" ;; arm64) rpm_arch="aarch64" ;; *) rpm_arch="$NATIVE_ARCH" ;; esac
  RPM_FILE=$(ls dist/mcpctl-*.rpm 2>/dev/null | grep -E "[._]${rpm_arch}[._]" | head -1)
  sudo rpm -U --force "$RPM_FILE"
--- a/src/cli/src/api-client.ts
+++ b/src/cli/src/api-client.ts
@@ -1,4 +1,5 @@
 import http from 'node:http';
+import https from 'node:https';

 export interface ApiClientOptions {
  baseUrl: string;
@@ -31,16 +32,18 @@ function request<T>(method: string, url: string, timeout: number, body?: unknown
    if (token) {
      headers['Authorization'] = `Bearer ${token}`;
    }
+    const isHttps = parsed.protocol === 'https:';
    const opts: http.RequestOptions = {
      hostname: parsed.hostname,
-      port: parsed.port,
+      port: parsed.port || (isHttps ? 443 : 80),
      path: parsed.pathname + parsed.search,
      method,
      timeout,
      headers,
    };

-    const req = http.request(opts, (res) => {
+    const driver = isHttps ? https : http;
+    const req = driver.request(opts, (res) => {
      const chunks: Buffer[] = [];
      res.on('data', (chunk: Buffer) => chunks.push(chunk));
      res.on('end', () => {
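Reviewer note: the driver and port selection in this hunk can be exercised in isolation with a small helper. The sketch below assumes a hypothetical function `pickTransport` and example URLs that do not appear in the patch; it only restates the rule the hunk applies.

```typescript
import * as http from 'node:http';
import * as https from 'node:https';

// Hypothetical helper: choose the request driver and port from a URL,
// following the same rule as the patched request() function.
function pickTransport(rawUrl: string): {
  driver: typeof http | typeof https;
  port: number;
  isHttps: boolean;
} {
  const parsed = new URL(rawUrl);
  const isHttps = parsed.protocol === 'https:';
  // URL.port is '' when the URL omits the port or uses the scheme default.
  const port = parsed.port ? Number(parsed.port) : isHttps ? 443 : 80;
  return { driver: isHttps ? https : http, port, isHttps };
}

// Example URLs are placeholders, not endpoints taken from the patch.
const a = pickTransport('https://mcpd.example.test/api/v1/secretbackends');
const b = pickTransport('http://localhost:3100/healthz');
console.log(a.port, a.isHttps, b.port, b.isHttps);
```

Keeping the choice in one pure function makes the default-port fallback easy to unit test without opening a socket.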
--- a/src/cli/src/commands/apply.ts
+++ b/src/cli/src/commands/apply.ts
@@ -41,6 +41,28 @@ const SecretSpecSchema = z.object({
  data: z.record(z.string()).default({}),
 });

+const SecretBackendSpecSchema = z.object({
+  name: z.string().min(1),
+  type: z.string().min(1),
+  description: z.string().default(''),
+  isDefault: z.boolean().optional(),
+  config: z.record(z.unknown()).default({}),
+});
+
+const LlmSpecSchema = z.object({
+  name: z.string().min(1).max(100).regex(/^[a-z0-9-]+$/),
+  type: z.enum(['anthropic', 'openai', 'deepseek', 'vllm', 'ollama', 'gemini-cli']),
+  model: z.string().min(1),
+  url: z.string().url().optional(),
+  tier: z.enum(['fast', 'heavy']).default('fast'),
+  description: z.string().max(500).default(''),
+  apiKeyRef: z.object({
+    name: z.string().min(1),
+    key: z.string().min(1),
+  }).nullable().optional(),
+  extraConfig: z.record(z.unknown()).default({}),
+});
+
 const TemplateEnvEntrySchema = z.object({
  name: z.string().min(1),
  description: z.string().optional(),
@@ -127,13 +149,29 @@ const ProjectSpecSchema = z.object({
  prompt: z.string().max(10000).default(''),
  proxyModel: z.string().optional(),
  gated: z.boolean().optional(),
+  // Name of an `Llm` resource (see `mcpctl get llms`), or the literal 'none'
+  // to disable LLM features for this project. Unknown names fall back to the
+  // consumer's registry default — `mcpctl describe project` will flag that.
  llmProvider: z.string().optional(),
+  // Override the model string for this project; defaults to the Llm's own
+  // model when unset.
  llmModel: z.string().optional(),
  servers: z.array(z.string()).default([]),
 });

+const McpTokenSpecSchema = z.object({
+  name: z.string().min(1).max(100).regex(/^[a-z0-9-]+$/),
+  project: z.string().min(1),
+  description: z.string().default(''),
+  expiresAt: z.union([z.string().datetime(), z.null()]).optional(),
+  rbacMode: z.enum(['empty', 'clone']).default('empty'),
+  bindings: z.array(RbacRoleBindingSchema).default([]),
+});
+
 const ApplyConfigSchema = z.object({
+  secretbackends: z.array(SecretBackendSpecSchema).default([]),
  secrets: z.array(SecretSpecSchema).default([]),
+  llms: z.array(LlmSpecSchema).default([]),
  servers: z.array(ServerSpecSchema).default([]),
  users: z.array(UserSpecSchema).default([]),
  groups: z.array(GroupSpecSchema).default([]),
@@ -143,6 +181,7 @@ const ApplyConfigSchema = z.object({
  rbacBindings: z.array(RbacBindingSpecSchema).default([]),
  rbac: z.array(RbacBindingSpecSchema).default([]),
  prompts: z.array(PromptSpecSchema).default([]),
+  mcptokens: z.array(McpTokenSpecSchema).default([]),
 }).transform((data) => ({
  ...data,
  // Merge rbac into rbacBindings so both keys work
@@ -173,7 +212,9 @@ export function createApplyCommand(deps: ApplyCommandDeps): Command {

      if (opts.dryRun) {
        log('Dry run - would apply:');
+        if (config.secretbackends.length > 0) log(`  ${config.secretbackends.length} secretbackend(s)`);
        if (config.secrets.length > 0) log(`  ${config.secrets.length} secret(s)`);
+        if (config.llms.length > 0) log(`  ${config.llms.length} llm(s)`);
        if (config.servers.length > 0) log(`  ${config.servers.length} server(s)`);
        if (config.users.length > 0) log(`  ${config.users.length} user(s)`);
        if (config.groups.length > 0) log(`  ${config.groups.length} group(s)`);
@@ -182,6 +223,7 @@ export function createApplyCommand(deps: ApplyCommandDeps): Command {
        if (config.serverattachments.length > 0) log(`  ${config.serverattachments.length} serverattachment(s)`);
        if (config.rbacBindings.length > 0) log(`  ${config.rbacBindings.length} rbacBinding(s)`);
        if (config.prompts.length > 0) log(`  ${config.prompts.length} prompt(s)`);
+        if (config.mcptokens.length > 0) log(`  ${config.mcptokens.length} mcptoken(s)`);
        return;
      }

@@ -217,6 +259,9 @@ const KIND_TO_RESOURCE: Record<string, string> = {
  prompt: 'prompts',
  promptrequest: 'promptrequests',
  serverattachment: 'serverattachments',
+  mcptoken: 'mcptokens',
+  secretbackend: 'secretbackends',
+  llm: 'llms',
 };

 /**
@@ -312,6 +357,30 @@ async function applyConfig(client: ApiClient, config: ApplyConfig, log: (...args
    }
  }

+  // Apply secret backends first — secrets reference them.
+  // When multiple backends claim isDefault: true, the server's atomic swap will
+  // leave whichever was applied last as the effective default.
+  for (const sb of config.secretbackends) {
+    try {
+      const existing = await cachedFindByName('secretbackends', sb.name);
+      if (existing) {
+        const updateBody: Record<string, unknown> = {
+          config: sb.config,
+          description: sb.description,
+        };
+        if (sb.isDefault !== undefined) updateBody.isDefault = sb.isDefault;
+        await withRetry(() => client.put(`/api/v1/secretbackends/${existing.id}`, updateBody));
+        log(`Updated secretbackend: ${sb.name}`);
+      } else {
+        await withRetry(() => client.post('/api/v1/secretbackends', sb));
+        invalidateCache('secretbackends');
+        log(`Created secretbackend: ${sb.name}`);
+      }
+    } catch (err) {
+      log(`Error applying secretbackend '${sb.name}': ${err instanceof Error ? err.message : err}`);
+    }
+  }
+
  // Apply secrets
  for (const secret of config.secrets) {
    try {
@@ -329,6 +398,25 @@ async function applyConfig(client: ApiClient, config: ApplyConfig, log: (...args
    }
  }

+  // Apply LLMs (after secrets — apiKeyRef resolves to an existing Secret)
+  for (const llm of config.llms) {
+    try {
+      const existing = await cachedFindByName('llms', llm.name);
+      if (existing) {
+        // Exclude name and type from the update body; type is immutable.
+        const { name: _n, type: _t, ...updateBody } = llm;
+        await withRetry(() => client.put(`/api/v1/llms/${existing.id}`, updateBody));
+        log(`Updated llm: ${llm.name}`);
+      } else {
+        await withRetry(() => client.post('/api/v1/llms', llm));
+        invalidateCache('llms');
+        log(`Created llm: ${llm.name}`);
+      }
+    } catch (err) {
+      log(`Error applying llm '${llm.name}': ${err instanceof Error ? err.message : err}`);
+    }
+  }
+
  // Apply servers
  for (const server of config.servers) {
    try {
@@ -529,6 +617,46 @@ async function applyConfig(client: ApiClient, config: ApplyConfig, log: (...args
      log(`Error applying prompt '${prompt.name}': ${err instanceof Error ? err.message : err}`);
    }
  }
+
+  // --- McpTokens ---
+  // Apply semantics: tokens are immutable (their secret is minted once). If an
+  // active token with the same name+project already exists we skip, logging the
+  // state. Otherwise we create and log the raw token (shown exactly once).
+  for (const tok of config.mcptokens) {
+    try {
+      const proj = await cachedFindByName('projects', tok.project);
+      if (!proj) {
+        log(`Error applying mcptoken '${tok.name}': project '${tok.project}' not found`);
+        continue;
+      }
+
+      // Check if an active one already exists
+      const existing = await client
+        .get<Array<{ id: string; name: string; status: string }>>(`/api/v1/mcptokens?projectName=${encodeURIComponent(tok.project)}`)
+        .catch(() => []);
+      const active = existing.find((t) => t.name === tok.name && t.status === 'active');
+      if (active) {
+        log(`mcptoken '${tok.name}' already active in project '${tok.project}' — skipped (tokens are immutable)`);
+        continue;
+      }
+
+      const body: Record<string, unknown> = {
+        name: tok.name,
+        projectId: proj.id,
+        description: tok.description,
+        rbacMode: tok.rbacMode,
+        bindings: tok.bindings,
+      };
+      if (tok.expiresAt !== undefined) body.expiresAt = tok.expiresAt;
+
+      const created = await withRetry(() => client.post<{ id: string; name: string; token: string }>('/api/v1/mcptokens', body));
+      log(`Created mcptoken: ${tok.name} (project: ${tok.project})`);
+      log(`  token: ${created.token}`);
+      log('  (raw token shown once — copy it now)');
+    } catch (err) {
+      log(`Error applying mcptoken '${tok.name}': ${err instanceof Error ? err.message : err}`);
+    }
+  }
 }

 async function findByField<T extends string>(client: ApiClient, resource: string, field: T, value: string): Promise<unknown | null> {
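Reviewer note: the skip-or-create decision for `mcptokens` in the hunk above reduces to a small pure function, which may be easier to test than the full apply loop. The sketch below is an assumption-laden illustration: `planMcpToken` is a hypothetical name, and the `'revoked'` status value is made up for the example (the patch only shows `'active'`).

```typescript
// Hypothetical pure helper capturing the mcptoken apply decision:
// skip when an active token with the same name already exists in the
// project's token list, otherwise create a new one.
type TokenRow = { id: string; name: string; status: string };

function planMcpToken(existing: TokenRow[], name: string): 'skip' | 'create' {
  const active = existing.some((t) => t.name === name && t.status === 'active');
  return active ? 'skip' : 'create';
}

// Sample rows; 'revoked' is an assumed status value used only for illustration.
const rows: TokenRow[] = [
  { id: 't1', name: 'ci-bot', status: 'revoked' },
  { id: 't2', name: 'deploy', status: 'active' },
];

console.log(planMcpToken(rows, 'deploy'));
console.log(planMcpToken(rows, 'ci-bot'));
```

Note that the loop in the patch treats a failed list request as an empty list, so under this rule a transient list failure results in a create attempt rather than a skip.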
--- a/src/cli/src/commands/config-setup.ts
+++ b/src/cli/src/commands/config-setup.ts
@@ -153,7 +153,7 @@ async function defaultConfirm(message: string, defaultValue?: boolean): Promise<
  return answer as boolean;
 }

-const defaultPrompt: ConfigSetupPrompt = {
+export const defaultPrompt: ConfigSetupPrompt = {
  select: defaultSelect,
  input: defaultInput,
  password: defaultPassword,
--- a/src/cli/src/commands/console/audit-types.ts
+++ b/src/cli/src/commands/console/audit-types.ts
@@ -23,6 +23,9 @@ export interface AuditEvent {
  serverName: string | null;
  correlationId: string | null;
  parentEventId: string | null;
+  userName?: string | null;
+  tokenName?: string | null;
+  tokenSha?: string | null;
  payload: Record<string, unknown>;
 }

--- a/src/cli/src/commands/create-secretbackend-wizard.ts
+++ b/src/cli/src/commands/create-secretbackend-wizard.ts
@@ -0,0 +1,231 @@
+/**
+ * Interactive wizard that provisions an OpenBao backend end-to-end:
+ *
+ *   1. Asks for the OpenBao URL + admin/root token and verifies `/sys/health`.
+ *   2. Ensures KV v2 is mounted at `<mount>/`.
+ *   3. Writes policy `app-mcpd` scoped to `<mount>/{data,metadata}/<prefix>/*`
+ *      plus the self-rotation paths.
+ *   4. Ensures a token role `app-mcpd-role` with `period=720h, renewable=true`.
+ *   5. Mints the first periodic token via that role.
+ *   6. Smoke-tests write/read/delete with the minted token.
+ *   7. Stores the token as a plaintext `Secret` on mcpd.
+ *   8. Creates the `SecretBackend` row with rotation config pointing at the role.
+ *   9. Kicks an initial rotate via `POST /api/v1/secretbackends/:id/rotate`
+ *      to seed `tokenMeta` + prove the self-rotation policy works.
+ *  10. (Optional) promotes the new backend to default.
+ *  11. Prints the migration command for the user to run.
+ *
+ * Admin token is used only for steps 1–5 and is never persisted.
+ *
+ * All prompts go through `ConfigSetupPrompt` (from `config-setup.ts`) so the
+ * wizard is testable without real stdin.
+ */
+import type { ApiClient } from '../api-client.js';
+import {
+  verifyHealth,
+  ensureKvV2,
+  writePolicy,
+  ensureTokenRole,
+  mintRoleToken,
+  testWriteReadDelete,
+  buildAppMcpdPolicyHcl,
+  type VaultDeps,
+} from '@mcpctl/shared';
+import { type ConfigSetupPrompt, defaultPrompt } from './config-setup.js';
+
+export interface WizardDeps {
+  client: ApiClient;
+  log: (...args: unknown[]) => void;
+  prompt?: ConfigSetupPrompt;
+  /** Overridable for tests. Forwarded to all vault HTTP calls. */
+  fetch?: typeof globalThis.fetch;
+}
+
+export interface WizardInput {
+  /** Backend name. Required — supplied via `mcpctl create secretbackend <name> --wizard`. */
+  name: string;
+  /** Pre-filled via flags for CI; falls back to prompt. */
+  url?: string | undefined;
+  adminToken?: string | undefined;
+  mount?: string | undefined;
+  pathPrefix?: string | undefined;
+  policyName?: string | undefined;
+  tokenRole?: string | undefined;
+  promoteToDefault?: boolean | undefined;
+  /** If set, skip the test write/read/delete (for dev/debugging only). */
+  skipSmoke?: boolean | undefined;
+}
+
+export async function runSecretBackendOpenbaoWizard(
+  input: WizardInput,
+  deps: WizardDeps,
+): Promise<void> {
+  const prompt = deps.prompt ?? defaultPrompt;
+  const log = deps.log;
+
+  const url = input.url ?? await prompt.input('OpenBao URL', 'https://bao.ad.itaz.eu');
+  const adminToken = input.adminToken ?? await prompt.password('OpenBao admin / root token');
+  if (adminToken === '') throw new Error('admin token is required');
+
+  const vaultDeps: VaultDeps = {};
+  if (deps.fetch !== undefined) vaultDeps.fetch = deps.fetch;
+
+  // 1. Health check.
+  log('  → checking OpenBao health …');
+  const health = await verifyHealth(url, adminToken, vaultDeps);
+  if (!health.initialized || health.sealed) {
+    throw new Error(`OpenBao is not ready (initialized=${String(health.initialized)}, sealed=${String(health.sealed)})`);
+  }
+  log(`    ok (version ${health.version})`);
+
+  const mount = input.mount ?? await prompt.input('KV v2 mount', 'secret');
+  const pathPrefix = input.pathPrefix ?? await prompt.input('Path prefix under mount', 'mcpd');
+  const policyName = input.policyName ?? await prompt.input('Policy name', 'app-mcpd');
+  const tokenRole = input.tokenRole ?? await prompt.input('Token role name', 'app-mcpd-role');
+
+  // 2. Enable KV v2 if needed.
+  log(`  → ensuring KV v2 at ${mount}/ …`);
+  const created = await ensureKvV2(url, adminToken, mount, vaultDeps);
+  log(`    ${created ? 'mounted' : 'already mounted'}`);
+
+  // 3. Write policy.
+  log(`  → writing policy '${policyName}' …`);
+  const hcl = buildAppMcpdPolicyHcl({ mount, pathPrefix, tokenRole });
+  await writePolicy(url, adminToken, policyName, hcl, vaultDeps);
+  log(`    written (scope: ${mount}/{data,metadata}/${pathPrefix}/* + self-rotation paths)`);
+
+  // 4. Ensure token role.
+  log(`  → ensuring token role '${tokenRole}' (period=720h, renewable) …`);
+  await ensureTokenRole(url, adminToken, tokenRole, {
+    allowedPolicies: [policyName],
+    period: 720 * 3600,
+    renewable: true,
+    orphan: false,
+  }, vaultDeps);
+  log('    ok');
+
+  // 5. Mint the first periodic token using the admin token.
+  log('  → minting first periodic token …');
+  const minted = await mintRoleToken(url, adminToken, tokenRole, vaultDeps);
+  if (!minted.renewable) {
+    throw new Error(`minted token is not renewable — the role '${tokenRole}' config is wrong`);
+  }
+  log(`    minted (accessor ${minted.accessor.slice(0, 12)}…)`);
+
+  // 6. Smoke test with the minted token before committing to mcpd.
+  if (input.skipSmoke !== true) {
+    log('  → smoke-testing write/read/delete with the minted token …');
+    await testWriteReadDelete(url, minted.clientToken, mount, `${pathPrefix}/.__mcpctl_wizard_smoke__`, vaultDeps);
+    log('    ok');
+  }
+
+  // 7. Store token on mcpd as a plaintext Secret.
+  const credsSecretName = `${input.name}-creds`;
+  log(`  → creating Secret '${credsSecretName}' on mcpd (plaintext) …`);
+  await createSecret(deps.client, credsSecretName, { token: minted.clientToken });
+
+  // 8. Create SecretBackend row (non-default by default; promote later).
+  log(`  → creating SecretBackend '${input.name}' …`);
+  const backendBody = {
+    name: input.name,
+    type: 'openbao',
+    config: {
+      url,
+      auth: 'token',
+      mount,
+      pathPrefix,
+      tokenSecretRef: { name: credsSecretName, key: 'token' },
+      rotation: {
+        enabled: true,
+        tokenRole,
+        intervalHours: 24,
+      },
+    },
+  };
+  const backend = await deps.client.post<{ id: string; name: string }>('/api/v1/secretbackends', backendBody);
+  log(`    created (id: ${backend.id})`);
+
+  // 9. Kick initial rotation so tokenMeta is populated + self-rotation is proven.
+  //    This uses the FIRST token (just-minted) to mint its successor. The old
+  //    first token is then revoked by accessor.
+  log('  → running initial rotation (seeds tokenMeta) …');
+  try {
+    await deps.client.post(`/api/v1/secretbackends/${backend.id}/rotate`, {});
+    log('    rotated — tokenMeta populated');
+  } catch (err) {
+    log(`    warn: initial rotation failed: ${err instanceof Error ? err.message : String(err)}`);
+    log('         backend is still usable; rotation will retry on the 24h loop');
+  }
+
+  // 10. Optional promote.
+  const promote = input.promoteToDefault
+    ?? await prompt.confirm(`Promote '${input.name}' to default backend?`, true);
+  if (promote) {
+    await deps.client.post(`/api/v1/secretbackends/${backend.id}/default`, {});
+    log(`    promoted '${input.name}' to default`);
+  }
+
+  // 11. Migration hint.
+  log('');
+  await printMigrationHint(deps.client, input.name, log);
+
+  log('');
+  log(`Describe the new backend:   mcpctl --direct describe secretbackend ${input.name}`);
+  log(`Force a rotation manually:  mcpctl --direct rotate secretbackend ${input.name}`);
+}
+
+async function createSecret(
+  client: ApiClient,
+  name: string,
+  data: Record<string, string>,
+): Promise<void> {
+  try {
+    await client.post('/api/v1/secrets', { name, data });
+  } catch (err) {
+    // 409 → secret already exists with this name. Update its data instead so
+    // re-running the wizard with the same --name is idempotent.
+    const status = (err as { status?: number }).status;
+    if (status !== 409) throw err;
+    const existing = (await client.get<Array<{ id: string; name: string }>>('/api/v1/secrets'))
+      .find((s) => s.name === name);
+    if (existing === undefined) throw err;
+    await client.put(`/api/v1/secrets/${existing.id}`, { data });
+  }
+}
+
+async function printMigrationHint(
+  client: ApiClient,
+  newBackendName: string,
+  log: (...args: unknown[]) => void,
+): Promise<void> {
+  // Find the current default backend name (likely 'default') so the hint
+  // points at a real source.
+  let defaultName = 'default';
+  try {
+    const rows = await client.get<Array<{ name: string; isDefault: boolean }>>('/api/v1/secretbackends');
+    const d = rows.find((r) => r.isDefault);
+    if (d !== undefined && d.name !== newBackendName) defaultName = d.name;
+  } catch {
+    /* fall through with 'default' guess */
+  }
+
+  // Count candidate secrets.
+  try {
+    const body = await client.post<{ candidates: Array<{ name: string }> }>(
+      '/api/v1/secrets/migrate',
+      { from: defaultName, to: newBackendName, dryRun: true },
+    );
+    const n = body.candidates.length;
+    if (n === 0) {
+      log(`No secrets to migrate — '${defaultName}' is empty.`);
+      return;
+    }
+    log(`You have ${String(n)} secret(s) on '${defaultName}'. To migrate them to '${newBackendName}':`);
+    log('');
+    log(`    mcpctl --direct migrate secrets --from ${defaultName} --to ${newBackendName} --dry-run`);
+    log(`    mcpctl --direct migrate secrets --from ${defaultName} --to ${newBackendName}`);
+  } catch (err) {
+    log(`(could not dry-run migration: ${err instanceof Error ? err.message : String(err)})`);
+    log(`Manual command:  mcpctl --direct migrate secrets --from ${defaultName} --to ${newBackendName}`);
+  }
+}
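Reviewer note: the commit message's claim that a week of failed rotations still leaves 20+ days of runway follows from the wizard's `period: 720 * 3600` and `intervalHours: 24`. A minimal arithmetic sketch, assuming the last successful rotation left a token with the full period remaining and nothing renews it in the meantime (`runwayHours` is a hypothetical helper, not part of the patch):

```typescript
// Hypothetical helper: hours of validity left on the stored token after a run
// of consecutive failed rotations, assuming the token was fresh at the last
// successful rotation and is not renewed in between.
function runwayHours(periodHours: number, intervalHours: number, failedRotations: number): number {
  return Math.max(0, periodHours - intervalHours * failedRotations);
}

// Same expression the wizard passes to ensureTokenRole as `period`.
const periodSeconds = 720 * 3600;

// Seven missed daily rotations against a 720h period.
const afterOneWeek = runwayHours(720, 24, 7);

console.log(periodSeconds, afterOneWeek, afterOneWeek / 24);
```

Under these assumptions, seven missed daily rotations leave 552 hours, or 23 days, before the stored token expires.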
--- a/src/cli/src/commands/create.ts
+++ b/src/cli/src/commands/create.ts
@@ -1,6 +1,7 @@
 import { Command } from 'commander';
 import { type ApiClient, ApiError } from '../api-client.js';
 import { resolveNameOrId } from './shared.js';
+import { parseRoleBinding } from './rbac-bindings.js';
 export interface CreateCommandDeps {
  client: ApiClient;
  log: (...args: unknown[]) => void;
@@ -10,6 +11,37 @@ function collect(value: string, prev: string[]): string[] {
  return [...prev, value];
 }

+/**
+ * Parse a `--ttl` value.
+ *
+ * - `"never"` → null (no expiry)
+ * - `"30d"`, `"12h"`, `"2w"`, `"90m"`, `"60s"` → ISO8601 string relative to now
+ * - An ISO8601 datetime → re-serialized via toISOString() (normalized to UTC)
+ */
+function parseTtl(value: string): string | null {
+  const trimmed = value.trim();
+  if (trimmed.toLowerCase() === 'never') return null;
+  const match = trimmed.match(/^(\d+)([smhdw])$/i);
+  if (match) {
+    const amount = Number(match[1]);
+    const unit = match[2]!.toLowerCase();
+    const multipliers: Record<string, number> = {
+      s: 1000,
+      m: 60 * 1000,
+      h: 3600 * 1000,
+      d: 86400 * 1000,
+      w: 7 * 86400 * 1000,
+    };
+    return new Date(Date.now() + amount * multipliers[unit]!).toISOString();
+  }
+  // Try to parse as ISO8601
+  const parsed = new Date(trimmed);
+  if (isNaN(parsed.getTime())) {
+    throw new Error(`Invalid --ttl '${value}'. Expected 'never', a duration like '30d' / '12h', or an ISO8601 datetime.`);
+  }
+  return parsed.toISOString();
+}
+
 interface ServerEnvEntry {
  name: string;
  value?: string;
@@ -56,7 +88,7 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
  const { client, log } = deps;

  const cmd = new Command('create')
-    .description('Create a resource (server, secret, project, user, group, rbac, serverattachment, prompt)');
+    .description('Create a resource (server, secret, secretbackend, llm, project, user, group, rbac, serverattachment, prompt)');

  // --- create server ---
  cmd.command('server')
@@ -220,6 +252,166 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
      }
    });

+  // --- create llm ---
+  cmd.command('llm')
+    .description('Register a server-managed LLM (anthropic, openai, vllm, ollama, deepseek, gemini-cli)')
+    .argument('<name>', 'LLM name (lowercase alphanumeric with hyphens)')
+    .requiredOption('--type <type>', 'Provider type (anthropic, openai, deepseek, vllm, ollama, gemini-cli)')
+    .requiredOption('--model <model>', 'Model identifier (e.g. claude-3-5-sonnet-20241022)')
+    .option('--url <url>', 'Endpoint URL (empty = provider default)')
+    .option('--tier <tier>', 'Tier: fast or heavy', 'fast')
+    .option('--description <text>', 'Description')
+    .option('--api-key-ref <ref>', 'API key reference in SECRET/KEY form (e.g. anthropic-key/token)')
+    .option('--extra <entry>', 'Extra config key=value (repeat)', collect, [])
+    .option('--force', 'Update if already exists')
+    .action(async (name: string, opts) => {
+      const body: Record<string, unknown> = {
+        name,
+        type: opts.type,
+        model: opts.model,
+        tier: opts.tier,
+      };
+      if (opts.url) body.url = opts.url;
+      if (opts.description !== undefined) body.description = opts.description;
+      if (opts.apiKeyRef) {
+        const slashIdx = (opts.apiKeyRef as string).indexOf('/');
+        if (slashIdx < 1) throw new Error(`Invalid --api-key-ref '${opts.apiKeyRef as string}'. Expected SECRET_NAME/KEY_NAME`);
+        body.apiKeyRef = {
+          name: (opts.apiKeyRef as string).slice(0, slashIdx),
+          key: (opts.apiKeyRef as string).slice(slashIdx + 1),
+        };
+      }
+      if (opts.extra && (opts.extra as string[]).length > 0) {
+        const extra: Record<string, unknown> = {};
+        for (const entry of opts.extra as string[]) {
+          const eqIdx = entry.indexOf('=');
+          if (eqIdx === -1) throw new Error(`Invalid --extra '${entry}'. Expected key=value`);
+          extra[entry.slice(0, eqIdx)] = entry.slice(eqIdx + 1);
+        }
+        body.extraConfig = extra;
+      }
+
+      try {
+        const row = await client.post<{ id: string; name: string }>('/api/v1/llms', body);
+        log(`llm '${row.name}' created (id: ${row.id})`);
+      } catch (err) {
+        if (err instanceof ApiError && err.status === 409 && opts.force) {
+          const existing = (await client.get<Array<{ id: string; name: string }>>('/api/v1/llms')).find((l) => l.name === name);
+          if (!existing) throw err;
+          const { name: _n, type: _t, ...updateBody } = body;
+          await client.put(`/api/v1/llms/${existing.id}`, updateBody);
+          log(`llm '${name}' updated (id: ${existing.id})`);
+        } else {
+          throw err;
+        }
+      }
+    });
+
+  // --- create secretbackend ---
+  cmd.command('secretbackend')
+    .alias('sb')
+    .description('Create a secret backend (plaintext, openbao)')
+    .argument('<name>', 'Backend name (lowercase, hyphens allowed)')
+    .requiredOption('--type <type>', 'Backend type (plaintext, openbao)')
+    .option('--description <text>', 'Description')
+    .option('--default', 'Promote this backend to default (atomically demotes the current one)')
+    .option('--url <url>', 'openbao: vault URL (e.g. http://bao.example:8200)')
+    .option('--namespace <ns>', 'openbao: X-Vault-Namespace header value')
+    .option('--mount <mount>', 'openbao: KV v2 mount point (default: secret)')
+    .option('--path-prefix <prefix>', 'openbao: path prefix under mount (default: mcpctl)')
+    .option('--auth <method>', "openbao: auth method — 'token' (default) or 'kubernetes'")
+    .option('--token-secret <ref>', 'openbao token auth: token secret reference in SECRET/KEY form (e.g. bao-creds/token)')
+    .option('--role <name>', "openbao kubernetes auth: vault role to login as (e.g. 'mcpctl')")
+    .option('--auth-mount <path>', "openbao kubernetes auth: vault auth method mount path (default: 'kubernetes')")
+    .option('--sa-token-path <path>', "openbao kubernetes auth: filesystem path to projected SA token (default: '/var/run/secrets/kubernetes.io/serviceaccount/token')")
+    .option('--config <entry>', 'Extra config as key=value (repeat for multiple)', collect, [])
+    .option('--wizard', 'Interactive wizard (openbao only): provision policy + token role, mint token, store on mcpd, suggest migration')
+    .option('--admin-token <token>', "openbao wizard: OpenBao admin/root token (prompted if omitted). Used only for provisioning; NEVER persisted.")
+    .option('--policy-name <name>', "openbao wizard: name for the policy created on OpenBao (default: 'app-mcpd')")
+    .option('--token-role <name>', "openbao wizard: name for the token role created on OpenBao (default: 'app-mcpd-role')")
+    .option('--no-promote-default', 'openbao wizard: do not promote this backend to default after creation')
+    .option('--force', 'Update if already exists')
+    .action(async (name: string, opts) => {
+      const type = opts.type as string;
+      // Wizard path — delegates to create-secretbackend-wizard.ts.
+      if (opts.wizard === true) {
+        if (type !== 'openbao') {
+          throw new Error(`--wizard is only supported for --type openbao (got '${type}')`);
+        }
+        const { runSecretBackendOpenbaoWizard } = await import('./create-secretbackend-wizard.js');
+        const wizardInput: Parameters<typeof runSecretBackendOpenbaoWizard>[0] = { name };
+        if (opts.url !== undefined) wizardInput.url = opts.url as string;
+        if (opts.adminToken !== undefined) wizardInput.adminToken = opts.adminToken as string;
+        if (opts.mount !== undefined) wizardInput.mount = opts.mount as string;
+        if (opts.pathPrefix !== undefined) wizardInput.pathPrefix = opts.pathPrefix as string;
+        if (opts.policyName !== undefined) wizardInput.policyName = opts.policyName as string;
+        if (opts.tokenRole !== undefined) wizardInput.tokenRole = opts.tokenRole as string;
+        // `--no-promote-default` → opts.promoteDefault === false (commander negated flag)
+        if (opts.promoteDefault !== undefined) wizardInput.promoteToDefault = opts.promoteDefault as boolean;
+        await runSecretBackendOpenbaoWizard(wizardInput, { client, log });
+        return;
+      }
+      const config: Record<string, unknown> = {};
+
+      if (type === 'openbao') {
+        if (!opts.url) throw new Error('--url is required for openbao backend');
+        const auth = (opts.auth as string | undefined) ?? 'token';
+        if (auth !== 'token' && auth !== 'kubernetes') {
+          throw new Error(`--auth must be 'token' or 'kubernetes' (got '${auth}')`);
+        }
+        config.url = opts.url;
+        config.auth = auth;
+
+        if (auth === 'token') {
+          if (!opts.tokenSecret) throw new Error('--token-secret is required for openbao token auth (format: SECRET/KEY)');
+          const slashIdx = (opts.tokenSecret as string).indexOf('/');
+          if (slashIdx < 1 || slashIdx === (opts.tokenSecret as string).length - 1) throw new Error(`Invalid --token-secret '${opts.tokenSecret as string}'. Expected SECRET_NAME/KEY_NAME`);
+          config.tokenSecretRef = {
+            name: (opts.tokenSecret as string).slice(0, slashIdx),
+            key: (opts.tokenSecret as string).slice(slashIdx + 1),
+          };
+        } else {
+          if (!opts.role) throw new Error("--role is required for openbao kubernetes auth (the vault role bound to this pod's ServiceAccount)");
+          config.role = opts.role;
+          if (opts.authMount) config.authMount = opts.authMount;
+          if (opts.saTokenPath) config.serviceAccountTokenPath = opts.saTokenPath;
+        }
+
+        if (opts.namespace) config.namespace = opts.namespace;
+        if (opts.mount) config.mount = opts.mount;
+        if (opts.pathPrefix) config.pathPrefix = opts.pathPrefix;
+      }
+
+      // Extra config key=value pairs (overwrite/extend above)
+      for (const entry of opts.config as string[]) {
+        const eqIdx = entry.indexOf('=');
+        if (eqIdx < 1) throw new Error(`Invalid --config '${entry}'. Expected key=value`);
+        config[entry.slice(0, eqIdx)] = entry.slice(eqIdx + 1);
+      }
+
+      const body: Record<string, unknown> = { name, type, config };
+      if (opts.description !== undefined) body.description = opts.description;
+      if (opts.default) body.isDefault = true;
+
+      try {
+        const row = await client.post<{ id: string; name: string }>('/api/v1/secretbackends', body);
+        log(`secretbackend '${row.name}' created (id: ${row.id})`);
+        if (opts.default) log(`  promoted to default backend`);
+      } catch (err) {
+        if (err instanceof ApiError && err.status === 409 && opts.force) {
+          const existing = (await client.get<Array<{ id: string; name: string }>>('/api/v1/secretbackends')).find((b) => b.name === name);
+          if (!existing) throw err;
+          const updateBody: Record<string, unknown> = { config };
+          if (opts.description !== undefined) updateBody.description = opts.description;
+          if (opts.default) updateBody.isDefault = true;
+          await client.put(`/api/v1/secretbackends/${existing.id}`, updateBody);
+          log(`secretbackend '${name}' updated (id: ${existing.id})`);
+        } else {
+          throw err;
+        }
+      }
+    });
+
  // --- create project ---
  cmd.command('project')
    .description('Create a project')
@@ -227,6 +419,8 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
    .option('-d, --description <text>', 'Project description', '')
    .option('--proxy-model <name>', 'Plugin name (default, content-pipeline, gate, none)')
    .option('--prompt <text>', 'Project-level prompt / instructions for the LLM')
+    .option('--llm <name>', "Name of an Llm resource (see 'mcpctl get llms'), or 'none' to disable")
+    .option('--llm-model <model>', "Override the model string for this project (defaults to the Llm's own model)")
    .option('--gated', '[deprecated: use --proxy-model default]')
    .option('--no-gated', '[deprecated: use --proxy-model content-pipeline]')
    .option('--server <name>', 'Server name (repeat for multiple)', collect, [])
@@ -246,6 +440,8 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
      // Pass gated for backward compat with older mcpd
      if (opts.gated !== undefined) body.gated = opts.gated as boolean;
      if (opts.server.length > 0) body.servers = opts.server;
+      if (opts.llm) body.llmProvider = opts.llm;
+      if (opts.llmModel) body.llmModel = opts.llmModel;

      try {
        const project = await client.post<{ id: string; name: string }>('/api/v1/projects', body);
@@ -331,8 +527,12 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
    .description('Create an RBAC binding definition')
    .argument('<name>', 'RBAC binding name')
    .option('--subject <entry>', 'Subject as Kind:name (repeat for multiple)', collect, [])
-    .option('--binding <entry>', 'Role binding as role:resource (e.g. edit:servers, run:projects)', collect, [])
-    .option('--operation <action>', 'Operation binding (e.g. logs, backup)', collect, [])
+    .option(
+      '--roleBindings <entry>',
+      'Role binding as key:value pairs, e.g. "role:view,resource:servers" or "role:view,resource:servers,name:my-ha" or "action:logs" (repeat for multiple)',
+      collect,
+      [],
+    )
    .option('--force', 'Update if already exists')
    .action(async (name: string, opts) => {
      const subjects = (opts.subject as string[]).map((entry: string) => {
@@ -343,24 +543,7 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
        return { kind: entry.slice(0, colonIdx), name: entry.slice(colonIdx + 1) };
      });

-      const roleBindings: Array<Record<string, string>> = [];
-
-      // Resource bindings from --binding flag (role:resource or role:resource:name)
-      for (const entry of opts.binding as string[]) {
-        const parts = entry.split(':');
-        if (parts.length === 2) {
-          roleBindings.push({ role: parts[0]!, resource: parts[1]! });
-        } else if (parts.length === 3) {
-          roleBindings.push({ role: parts[0]!, resource: parts[1]!, name: parts[2]! });
-        } else {
-          throw new Error(`Invalid binding format '${entry}'. Expected role:resource or role:resource:name (e.g. edit:servers, view:servers:my-ha)`);
-        }
-      }
-
-      // Operation bindings from --operation flag
-      for (const action of opts.operation as string[]) {
-        roleBindings.push({ role: 'run', action });
-      }
+      const roleBindings = (opts.roleBindings as string[]).map((entry: string) => parseRoleBinding(entry));

      const body: Record<string, unknown> = {
        name,
@@ -384,6 +567,83 @@ export function createCreateCommand(deps: CreateCommandDeps): Command {
      }
    });

+  // --- create mcptoken ---
+  cmd.command('mcptoken')
+    .description('Create a project-scoped API token for HTTP-mode mcplocal. The raw token is printed once.')
+    .argument('<name>', 'Token name (unique within a project)')
+    .requiredOption('-p, --project <name>', 'Project this token is bound to')
+    .option('--rbac <mode>', "Base RBAC: 'empty' (default, no bindings) or 'clone' (snapshot creator's perms)", 'empty')
+    .option(
+      '--bind <entry>',
+      'Additional role binding as key:value pairs, e.g. "role:view,resource:servers" or "action:logs" (repeat for multiple). Creator perms are the ceiling.',
+      collect,
+      [],
+    )
+    .option('--ttl <duration>', "Expiry: '30d', '12h', 'never', or an ISO8601 datetime")
+    .option('--description <text>', 'Freeform description')
+    .option('--force', 'Revoke any existing active token with this name, then create a new one')
+    .action(async (name: string, opts) => {
+      // Resolve project name → id (mcpd's create route accepts either, but resolve client-side for clearer errors)
+      const projectId = await resolveNameOrId(client, 'projects', opts.project as string);
+
+      const bindings = (opts.bind as string[]).map((entry: string) => parseRoleBinding(entry));
+
+      const rbacMode = (opts.rbac as string).toLowerCase();
+      if (rbacMode !== 'empty' && rbacMode !== 'clone') {
+        throw new Error(`--rbac must be 'empty' or 'clone' (got '${opts.rbac as string}')`);
+      }
+
+      let expiresAt: string | null | undefined;
+      if (opts.ttl !== undefined) {
+        expiresAt = parseTtl(opts.ttl as string);
+      }
+
+      const body: Record<string, unknown> = {
+        name,
+        projectId,
+        rbacMode,
+        bindings,
+      };
+      if (expiresAt !== undefined) body.expiresAt = expiresAt;
+      if (opts.description !== undefined) body.description = opts.description;
+
+      type Created = {
+        id: string;
+        name: string;
+        projectName: string;
+        tokenPrefix: string;
+        token: string;
+        expiresAt: string | null;
+      };
+
+      const doCreate = async (): Promise<Created> => client.post<Created>('/api/v1/mcptokens', body);
+
+      let created: Created;
+      try {
+        created = await doCreate();
+      } catch (err) {
+        if (err instanceof ApiError && err.status === 409 && opts.force) {
+          // Find the existing active token by name+project and revoke it, then retry.
+          const existing = (await client.get<Array<{ id: string; name: string; status?: string }>>(
+            `/api/v1/mcptokens?projectName=${encodeURIComponent(opts.project as string)}`,
+          )).find((r) => r.name === name && (r.status === undefined || r.status === 'active'));
+          if (!existing) throw err;
+          await client.post(`/api/v1/mcptokens/${existing.id}/revoke`, {});
+          created = await doCreate();
+        } else {
+          throw err;
+        }
+      }
+
+      log(`mcptoken '${created.name}' created (project: ${created.projectName}, id: ${created.id})`);
+      log('');
+      log('Copy this token now — it will NOT be shown again:');
+      log('');
+      log(`  ${created.token}`);
+      log('');
+      log(`Export it with:  export MCPCTL_TOKEN=${created.token}`);
+    });
+
  // --- create prompt ---
  cmd.command('prompt')
    .description('Create an approved prompt')
--- a/src/cli/src/commands/delete.ts
+++ b/src/cli/src/commands/delete.ts
@@ -29,6 +29,27 @@ export function createDeleteCommand(deps: DeleteCommandDeps): Command {
        return;
      }

+      // Mcptokens: names are scoped to a project, so require --project unless the caller passes a CUID
+      if (resource === 'mcptokens') {
+        let tokenId: string;
+        if (/^c[a-z0-9]{24}$/.test(idOrName)) {
+          tokenId = idOrName;
+        } else {
+          if (!opts.project) {
+            throw new Error('--project is required to delete an mcptoken by name (or pass the id).');
+          }
+          const items = await client.get<Array<{ id: string; name: string }>>(
+            `/api/v1/mcptokens?projectName=${encodeURIComponent(opts.project)}`,
+          );
+          const match = items.find((i) => i.name === idOrName);
+          if (!match) throw new Error(`mcptoken '${idOrName}' not found in project '${opts.project}'`);
+          tokenId = match.id;
+        }
+        await client.delete(`/api/v1/mcptokens/${tokenId}`);
+        log(`mcptoken '${idOrName}' deleted.`);
+        return;
+      }
+
      // Resolve name → ID for any resource type
      let id: string;
      try {
--- a/src/cli/src/commands/describe.ts
+++ b/src/cli/src/commands/describe.ts
@@ -137,6 +137,7 @@ function formatInstanceDetail(instance: Record<string, unknown>, inspect?: Recor
 function formatProjectDetail(
  project: Record<string, unknown>,
  prompts: Array<{ name: string; priority: number; linkTarget: string | null }> = [],
+  knownLlmNames?: Set<string>,
 ): string {
  const lines: string[] = [];
  lines.push(`=== Project: ${project.name} ===`);
@@ -151,8 +152,21 @@ function formatProjectDetail(
  lines.push('');
  lines.push('Plugin Config:');
  lines.push(`  ${pad('Plugin:', 18)}${proxyModel}`);
-  if (llmProvider) lines.push(`  ${pad('LLM Provider:', 18)}${llmProvider}`);
-  if (llmModel) lines.push(`  ${pad('LLM Model:', 18)}${llmModel}`);
+  if (llmProvider) {
+    // As of Phase 4, llmProvider names a centralized Llm resource (see
+    // `mcpctl get llms`). A value like "none" disables LLM for the project;
+    // anything else that doesn't match a registered Llm falls back to the
+    // registry default on consumers — flag it so operators notice.
+    const resolvable = knownLlmNames === undefined
+      || llmProvider === 'none'
+      || knownLlmNames.has(llmProvider);
+    if (resolvable) {
+      lines.push(`  ${pad('LLM:', 18)}${llmProvider}`);
+    } else {
+      lines.push(`  ${pad('LLM:', 18)}${llmProvider}  [warning: no Llm registered with this name — will fall back to registry default]`);
+    }
+  }
+  if (llmModel) lines.push(`  ${pad('LLM Model:', 18)}${llmModel} (override)`);

  // Servers section
  const servers = project.servers as Array<{ server: { name: string } }> | undefined;
@@ -218,6 +232,146 @@ function formatSecretDetail(secret: Record<string, unknown>, showValues: boolean
  return lines.join('\n');
 }

+function formatLlmDetail(llm: Record<string, unknown>): string {
+  const lines: string[] = [];
+  lines.push(`=== LLM: ${llm.name} ===`);
+  lines.push(`${pad('Name:')}${llm.name}`);
+  lines.push(`${pad('Type:')}${llm.type}`);
+  lines.push(`${pad('Model:')}${llm.model}`);
+  lines.push(`${pad('Tier:')}${llm.tier ?? 'fast'}`);
+  if (llm.url) lines.push(`${pad('URL:')}${llm.url}`);
+  if (llm.description) lines.push(`${pad('Description:')}${llm.description}`);
+
+  const ref = llm.apiKeyRef as { name: string; key: string } | null | undefined;
+  lines.push('');
+  lines.push('API Key:');
+  if (ref) {
+    lines.push(`  ${pad('Secret:', 12)}${ref.name}`);
+    lines.push(`  ${pad('Key:', 12)}${ref.key}`);
+  } else {
+    lines.push('  (none)');
+  }
+
+  const extra = llm.extraConfig as Record<string, unknown> | undefined;
+  if (extra && Object.keys(extra).length > 0) {
+    lines.push('');
+    lines.push('Extra Config:');
+    const keyW = Math.max(6, ...Object.keys(extra).map((k) => k.length)) + 2;
+    for (const [k, v] of Object.entries(extra)) {
+      let display: string;
+      if (v === null || v === undefined) display = '-';
+      else if (typeof v === 'object') display = JSON.stringify(v);
+      else display = String(v);
+      lines.push(`  ${k.padEnd(keyW)}${display}`);
+    }
+  }
+
+  lines.push('');
+  lines.push('Metadata:');
+  lines.push(`  ${pad('ID:', 12)}${llm.id}`);
+  if (llm.createdAt) lines.push(`  ${pad('Created:', 12)}${llm.createdAt}`);
+  if (llm.updatedAt) lines.push(`  ${pad('Updated:', 12)}${llm.updatedAt}`);
+
+  return lines.join('\n');
+}
+
+function formatSecretBackendDetail(backend: Record<string, unknown>): string {
+  const lines: string[] = [];
+  lines.push(`=== SecretBackend: ${backend.name} ===`);
+  lines.push(`${pad('Name:')}${backend.name}`);
+  lines.push(`${pad('Type:')}${backend.type}`);
+  lines.push(`${pad('Default:')}${backend.isDefault ? 'yes' : 'no'}`);
+  if (backend.description) lines.push(`${pad('Description:')}${backend.description}`);
+
+  const config = backend.config as Record<string, unknown> | undefined;
+  if (config && Object.keys(config).length > 0) {
+    lines.push('');
+    lines.push('Config:');
+    const keyW = Math.max(6, ...Object.keys(config).map((k) => k.length)) + 2;
+    for (const [key, value] of Object.entries(config)) {
+      let display: string;
+      if (value === null || value === undefined) display = '-';
+      else if (typeof value === 'object') display = JSON.stringify(value);
+      else display = String(value);
+      lines.push(`  ${key.padEnd(keyW)}${display}`);
+    }
+  }
+
+  const tokenMeta = (backend.tokenMeta ?? {}) as Record<string, unknown>;
+  if (tokenMeta.rotatable === true) {
+    lines.push('');
+    lines.push(...formatTokenHealth(tokenMeta));
+  }
+
+  lines.push('');
+  lines.push('Metadata:');
+  lines.push(`  ${pad('ID:', 12)}${backend.id}`);
+  if (backend.createdAt) lines.push(`  ${pad('Created:', 12)}${backend.createdAt}`);
+  if (backend.updatedAt) lines.push(`  ${pad('Updated:', 12)}${backend.updatedAt}`);
+
+  return lines.join('\n');
+}
+
+/**
+ * Render the Token health section for a wizard-provisioned openbao backend.
+ * Returns an array of lines (caller pushes them). Stale = no successful
+ * rotation in >26h (2h grace over the nominal 24h cadence).
+ */
+function formatTokenHealth(meta: Record<string, unknown>): string[] {
+  const lines: string[] = [];
+  const generatedAt = parseIso(meta.generatedAt);
+  const nextRenewalAt = parseIso(meta.nextRenewalAt);
+  const validUntil = parseIso(meta.validUntil);
+  const lastRotationAt = parseIso(meta.lastRotationAt);
+  const lastError = meta.lastRotationError as string | null | undefined;
+  const now = Date.now();
+
+  const STALE_THRESHOLD_MS = 26 * 3600 * 1000; // nominal 24h cadence + 2h grace
+  const staleByAge = lastRotationAt !== null && (now - lastRotationAt.getTime()) > STALE_THRESHOLD_MS;
+  const hasError = typeof lastError === 'string' && lastError !== '';
+
+  let status: string;
+  if (hasError && staleByAge) status = 'ERROR (stale)';
+  else if (staleByAge) status = 'STALE — no successful rotation in the last cycle';
+  else if (hasError) status = 'WARNING — last rotation hit an error but token is still fresh';
+  else status = 'healthy';
+
+  lines.push(`Token health:   ${status}`);
+  if (generatedAt !== null) {
+    lines.push(`  ${pad('Generated:', 16)}${generatedAt.toISOString()}${describeAge(generatedAt, now)}`);
+  }
+  if (nextRenewalAt !== null) {
+    lines.push(`  ${pad('Next renewal:', 16)}${nextRenewalAt.toISOString()}${describeAge(nextRenewalAt, now)}`);
+  }
+  if (validUntil !== null) {
+    lines.push(`  ${pad('Valid until:', 16)}${validUntil.toISOString()}${describeAge(validUntil, now)}`);
+  }
+  if (lastRotationAt !== null) {
+    lines.push(`  ${pad('Last rotation:', 16)}${lastRotationAt.toISOString()}${describeAge(lastRotationAt, now)}`);
+  }
+  if (hasError) {
+    lines.push(`  ${pad('Last error:', 16)}${lastError}`);
+  }
+  return lines;
+}
+
+function parseIso(v: unknown): Date | null {
+  if (typeof v !== 'string' || v === '') return null;
+  const d = new Date(v);
+  return Number.isNaN(d.getTime()) ? null : d;
+}
+
+function describeAge(target: Date, now: number): string {
+  const diffMs = target.getTime() - now;
+  const abs = Math.abs(diffMs);
+  const hours = Math.round(abs / 3600_000);
+  const days = Math.round(abs / 86_400_000);
+  if (abs < 60_000) return '  (just now)';
+  if (abs < 3600_000) return `  (${String(Math.round(abs / 60_000))} min ${diffMs < 0 ? 'ago' : 'away'})`;
+  if (hours < 48) return `  (${String(hours)}h ${diffMs < 0 ? 'ago' : 'away'})`;
+  return `  (${String(days)}d ${diffMs < 0 ? 'ago' : 'away'})`;
+}
+
 function formatTemplateDetail(template: Record<string, unknown>): string {
  const lines: string[] = [];
  lines.push(`=== Template: ${template.name} ===`);
@@ -503,6 +657,42 @@ function formatRbacDetail(rbac: Record<string, unknown>): string {
  return lines.join('\n');
 }

+function formatMcpTokenDetail(token: Record<string, unknown>, allRbac: RbacDef[]): string {
+  const lines: string[] = [];
+  lines.push(`=== McpToken: ${token.name} ===`);
+  lines.push(`${pad('Name:')}${token.name}`);
+  lines.push(`${pad('Project:')}${token.projectName ?? token.projectId ?? '-'}`);
+  lines.push(`${pad('Status:')}${token.status ?? '-'}`);
+  lines.push(`${pad('Prefix:')}${token.tokenPrefix ?? '-'}`);
+  if (token.description) lines.push(`${pad('Description:')}${token.description}`);
+  lines.push(`${pad('Owner:')}${token.ownerEmail ?? token.ownerId ?? '-'}`);
+  lines.push(`${pad('Created:')}${token.createdAt ?? '-'}`);
+  lines.push(`${pad('Last Used:')}${token.lastUsedAt ?? 'never'}`);
+  lines.push(`${pad('Expires:')}${token.expiresAt ?? 'never'}`);
+  if (token.revokedAt) lines.push(`${pad('Revoked At:')}${token.revokedAt}`);
+
+  // Find the auto-created RbacDefinition (subject McpToken:<sha>) to surface bindings.
+  // We don't know the sha from the describe response — match by convention: name 'mcptoken-<id>'.
+  const rbacDef = allRbac.find((r) => r.name === `mcptoken-${token.id as string}`);
+  if (rbacDef && Array.isArray(rbacDef.roleBindings) && rbacDef.roleBindings.length > 0) {
+    lines.push('');
+    lines.push('Bindings:');
+    for (const b of rbacDef.roleBindings as Array<{ role: string; resource?: string; action?: string; name?: string }>) {
+      if (b.action !== undefined) {
+        lines.push(`  run ${b.action}`);
+      } else if (b.resource !== undefined) {
+        lines.push(`  ${b.role} ${b.resource}${b.name !== undefined ? `/${b.name}` : ''}`);
+      }
+    }
+  }
+
+  lines.push('');
+  lines.push('Metadata:');
+  lines.push(`  ${pad('ID:', 12)}${token.id}`);
+
+  return lines.join('\n');
+}
+
 async function formatPromptDetail(prompt: Record<string, unknown>, client?: ApiClient): Promise<string> {
  const lines: string[] = [];
  lines.push(`=== Prompt: ${prompt.name} ===`);
@@ -770,11 +960,23 @@ export function createDescribeCommand(deps: DescribeCommandDeps): Command {
          case 'templates':
            deps.log(formatTemplateDetail(item));
            break;
+          case 'secretbackends':
+            deps.log(formatSecretBackendDetail(item));
+            break;
+          case 'llms':
+            deps.log(formatLlmDetail(item));
+            break;
          case 'projects': {
-            const projectPrompts = await deps.client
-              .get<Array<{ name: string; priority: number; linkTarget: string | null }>>(`/api/v1/prompts?projectId=${item.id as string}`)
-              .catch(() => []);
-            deps.log(formatProjectDetail(item, projectPrompts));
+            const [projectPrompts, llms] = await Promise.all([
+              deps.client
+                .get<Array<{ name: string; priority: number; linkTarget: string | null }>>(`/api/v1/prompts?projectId=${item.id as string}`)
+                .catch(() => []),
+              deps.client
+                .get<Array<{ name: string }>>('/api/v1/llms')
+                .catch(() => [] as Array<{ name: string }>),
+            ]);
+            const llmNames = new Set(llms.map((l) => l.name));
+            deps.log(formatProjectDetail(item, projectPrompts, llmNames));
            break;
          }
          case 'users': {
@@ -801,6 +1003,14 @@ export function createDescribeCommand(deps: DescribeCommandDeps): Command {
          case 'prompts':
            deps.log(await formatPromptDetail(item, deps.client));
            break;
+          case 'mcptokens': {
+            // Fetch the auto-created RbacDefinition (if any) so bindings are visible in describe.
+            const rbacForToken = await deps.client
+              .get<RbacDef[]>('/api/v1/rbac')
+              .catch(() => [] as RbacDef[]);
+            deps.log(formatMcpTokenDetail(item, rbacForToken));
+            break;
+          }
          default:
            deps.log(formatGenericDetail(item));
        }
--- a/src/cli/src/commands/get.ts
+++ b/src/cli/src/commands/get.ts
@@ -119,6 +119,64 @@ const rbacColumns: Column<RbacRow>[] = [
  { header: 'ID', key: 'id' },
 ];

+interface LlmRow {
+  id: string;
+  name: string;
+  type: string;
+  model: string;
+  tier: string;
+  url: string;
+  description: string;
+  apiKeyRef: { name: string; key: string } | null;
+}
+
+const llmColumns: Column<LlmRow>[] = [
+  { header: 'NAME', key: 'name' },
+  { header: 'TYPE', key: 'type', width: 12 },
+  { header: 'MODEL', key: 'model', width: 28 },
+  { header: 'TIER', key: 'tier', width: 8 },
+  { header: 'KEY', key: (r) => r.apiKeyRef ? `secret://${r.apiKeyRef.name}/${r.apiKeyRef.key}` : '-', width: 34 },
+  { header: 'ID', key: 'id' },
+];
+
+interface SecretBackendRow {
+  id: string;
+  name: string;
+  type: string;
+  isDefault: boolean;
+  description: string;
+  config?: Record<string, unknown>;
+}
+
+const secretBackendColumns: Column<SecretBackendRow>[] = [
+  { header: 'NAME', key: 'name' },
+  { header: 'TYPE', key: 'type', width: 14 },
+  { header: 'DEFAULT', key: (r) => r.isDefault ? '*' : '', width: 8 },
+  { header: 'DESCRIPTION', key: (r) => r.description || '-', width: 30 },
+  { header: 'ID', key: 'id' },
+];
+
+interface McpTokenRow {
+  id: string;
+  name: string;
+  projectName: string;
+  tokenPrefix: string;
+  createdAt: string;
+  lastUsedAt: string | null;
+  expiresAt: string | null;
+  status: 'active' | 'revoked' | 'expired';
+}
+
+const mcpTokenColumns: Column<McpTokenRow>[] = [
+  { header: 'NAME', key: 'name', width: 24 },
+  { header: 'PROJECT', key: 'projectName', width: 20 },
+  { header: 'PREFIX', key: 'tokenPrefix', width: 18 },
+  { header: 'CREATED', key: (r) => new Date(r.createdAt).toLocaleString(), width: 20 },
+  { header: 'LAST USED', key: (r) => r.lastUsedAt ? new Date(r.lastUsedAt).toLocaleString() : '-', width: 20 },
+  { header: 'EXPIRES', key: (r) => r.expiresAt ? new Date(r.expiresAt).toLocaleString() : 'never', width: 20 },
+  { header: 'STATUS', key: 'status', width: 10 },
+];
+
 const secretColumns: Column<SecretRow>[] = [
  { header: 'NAME', key: 'name' },
  { header: 'KEYS', key: (r) => Object.keys(r.data).join(', ') || '-', width: 40 },
@@ -174,7 +232,7 @@ const promptRequestColumns: Column<PromptRequestRow>[] = [
 const instanceColumns: Column<InstanceRow>[] = [
  { header: 'NAME', key: (r) => r.server?.name ?? '-', width: 20 },
  { header: 'STATUS', key: 'status', width: 10 },
-  { header: 'HEALTH', key: (r) => r.healthStatus ?? '-', width: 10 },
+  { header: 'HEALTH', key: (r) => r.healthStatus ?? 'unknown', width: 10 },
  { header: 'PORT', key: (r) => r.port != null ? String(r.port) : '-', width: 6 },
  { header: 'CONTAINER', key: (r) => r.containerId ? r.containerId.slice(0, 12) : '-', width: 14 },
  { header: 'ID', key: 'id' },
@@ -242,6 +300,12 @@ function getColumnsForResource(resource: string): Column<Record<string, unknown>
      return serverAttachmentColumns as unknown as Column<Record<string, unknown>>[];
    case 'proxymodels':
      return proxymodelColumns as unknown as Column<Record<string, unknown>>[];
+    case 'mcptokens':
+      return mcpTokenColumns as unknown as Column<Record<string, unknown>>[];
+    case 'secretbackends':
+      return secretBackendColumns as unknown as Column<Record<string, unknown>>[];
+    case 'llms':
+      return llmColumns as unknown as Column<Record<string, unknown>>[];
    default:
      return [
        { header: 'ID', key: 'id' as keyof Record<string, unknown> },
@@ -263,6 +327,9 @@ const RESOURCE_KIND: Record<string, string> = {
  prompts: 'prompt',
  promptrequests: 'promptrequest',
  serverattachments: 'serverattachment',
+  mcptokens: 'mcptoken',
+  secretbackends: 'secretbackend',
+  llms: 'llm',
 };

 /**
--- a/src/cli/src/commands/migrate.ts
+++ b/src/cli/src/commands/migrate.ts
@@ -0,0 +1,80 @@
+import { Command } from 'commander';
+import type { ApiClient } from '../api-client.js';
+
+export interface MigrateCommandDeps {
+  client: ApiClient;
+  log: (...args: unknown[]) => void;
+}
+
+interface MigrateResult {
+  migrated: Array<{ name: string }>;
+  skipped: Array<{ name: string; reason: string }>;
+  failed: Array<{ name: string; error: string }>;
+}
+
+interface DryRunResult {
+  dryRun: true;
+  candidates: Array<{ id: string; name: string }>;
+}
+
+/**
+ * Top-level `mcpctl migrate <subcommand>` verb.
+ *
+ * Today only `secrets` is implemented (SecretBackend → SecretBackend move),
+ * but the command is structured so new migrations can slot in.
+ *
+ * Per-secret atomicity is handled server-side — if this command is interrupted
+ * mid-run, re-running is idempotent (skips secrets already on the destination).
+ */
+export function createMigrateCommand(deps: MigrateCommandDeps): Command {
+  const { client, log } = deps;
+
+  const cmd = new Command('migrate')
+    .description('Move resources between backends (currently: secrets between SecretBackends)');
+
+  cmd.command('secrets')
+    .description('Migrate secrets from one SecretBackend to another')
+    .requiredOption('--from <name>', 'Source SecretBackend name')
+    .requiredOption('--to <name>', 'Destination SecretBackend name')
+    .option('--names <csv>', 'Comma-separated secret names (default: all)')
+    .option('--keep-source', 'Leave the source copy intact (default: delete from source after write+commit)')
+    .option('--dry-run', 'Show which secrets would be migrated without touching them')
+    .action(async (opts) => {
+      const body: Record<string, unknown> = { from: opts.from, to: opts.to };
+      if (opts.names) body.names = (opts.names as string).split(',').map((s) => s.trim()).filter(Boolean);
+      if (opts.keepSource) body.keepSource = true;
+      if (opts.dryRun) body.dryRun = true;
+
+      if (opts.dryRun) {
+        const res = await client.post<DryRunResult>('/api/v1/secrets/migrate', body);
+        if (res.candidates.length === 0) {
+          log(`No secrets to migrate from '${opts.from as string}' to '${opts.to as string}'.`);
+          return;
+        }
+        log(`Dry run — ${String(res.candidates.length)} secret(s) would be migrated from '${opts.from as string}' → '${opts.to as string}':`);
+        for (const c of res.candidates) log(`  - ${c.name}`);
+        return;
+      }
+
+      const res = await client.post<MigrateResult>('/api/v1/secrets/migrate', body);
+
+      if (res.migrated.length > 0) {
+        log(`Migrated ${String(res.migrated.length)} secret(s) from '${opts.from as string}' → '${opts.to as string}':`);
+        for (const m of res.migrated) log(`  ✓ ${m.name}`);
+      }
+      if (res.skipped.length > 0) {
+        log(`Skipped ${String(res.skipped.length)}:`);
+        for (const s of res.skipped) log(`  - ${s.name}: ${s.reason}`);
+      }
+      if (res.failed.length > 0) {
+        log(`Failed ${String(res.failed.length)}:`);
+        for (const f of res.failed) log(`  ✗ ${f.name}: ${f.error}`);
+        process.exitCode = 1;
+      }
+      if (res.migrated.length === 0 && res.skipped.length === 0 && res.failed.length === 0) {
+        log(`No secrets to migrate from '${opts.from as string}' to '${opts.to as string}'.`);
+      }
+    });
+
+  return cmd;
+}
--- a/src/cli/src/commands/rbac-bindings.ts
+++ b/src/cli/src/commands/rbac-bindings.ts
@@ -0,0 +1,49 @@
+/**
+ * Parse one `--roleBindings <kv>` (or `create mcptoken --bind <kv>`) entry into a role-binding object the API accepts.
+ *
+ * Accepted forms:
+ *   role:view,resource:servers                       → resource binding (unscoped)
+ *   role:view,resource:servers,name:my-ha            → resource binding (name-scoped)
+ *   action:logs                                       → operation binding (role:run is implied)
+ *
+ * Whitespace around keys/values is trimmed. Keys must be one of: role, resource, name, action.
+ */
+export type RoleBindingEntry =
+  | { role: string; resource: string; name?: string }
+  | { role: 'run'; action: string };
+
+export function parseRoleBinding(entry: string): RoleBindingEntry {
+  const pairs: Record<string, string> = {};
+  for (const part of entry.split(',')) {
+    const colonIdx = part.indexOf(':');
+    if (colonIdx === -1) {
+      throw new Error(`Invalid roleBindings entry '${entry}': expected key:value pairs separated by commas`);
+    }
+    const key = part.slice(0, colonIdx).trim();
+    const value = part.slice(colonIdx + 1).trim();
+    if (!key || !value) {
+      throw new Error(`Invalid roleBindings entry '${entry}': empty key or value`);
+    }
+    if (!['role', 'resource', 'name', 'action'].includes(key)) {
+      throw new Error(`Invalid roleBindings key '${key}' in '${entry}': expected one of role, resource, name, action`);
+    }
+    pairs[key] = value;
+  }
+
+  // Operation binding: presence of `action:` implies role:run
+  if (pairs['action'] !== undefined) {
+    if (pairs['resource'] !== undefined || pairs['name'] !== undefined) {
+      throw new Error(`Invalid roleBindings entry '${entry}': 'action' cannot be combined with 'resource' or 'name'`);
+    }
+    return { role: 'run', action: pairs['action'] };
+  }
+
+  // Resource binding
+  if (pairs['role'] === undefined || pairs['resource'] === undefined) {
+    throw new Error(`Invalid roleBindings entry '${entry}': need either 'action:…' or both 'role:…,resource:…'`);
+  }
+  if (pairs['name'] !== undefined) {
+    return { role: pairs['role'], resource: pairs['resource'], name: pairs['name'] };
+  }
+  return { role: pairs['role'], resource: pairs['resource'] };
+}
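Reviewer note: the accepted `--roleBindings` forms are easier to see with concrete inputs. The block below is a minimal sketch, not the shipped module: it restates the parser from this diff in condensed form (error messages shortened) so that it runs on its own, then feeds it the three documented forms. The sample entries are illustrative.

```typescript
// Condensed restatement of parseRoleBinding from the diff above (messages shortened).
type RoleBindingEntry =
  | { role: string; resource: string; name?: string }
  | { role: 'run'; action: string };

function parseRoleBinding(entry: string): RoleBindingEntry {
  const pairs: Record<string, string> = {};
  for (const part of entry.split(',')) {
    const colonIdx = part.indexOf(':'); // split on the FIRST colon only
    if (colonIdx === -1) throw new Error(`Invalid roleBindings entry '${entry}'`);
    const key = part.slice(0, colonIdx).trim();
    const value = part.slice(colonIdx + 1).trim();
    if (!key || !value) throw new Error(`Invalid roleBindings entry '${entry}': empty key or value`);
    if (!['role', 'resource', 'name', 'action'].includes(key)) {
      throw new Error(`Invalid roleBindings key '${key}' in '${entry}'`);
    }
    pairs[key] = value;
  }
  const { role, resource, name, action } = pairs;
  if (action !== undefined) {
    // Operation binding: role:run is implied, resource/name are rejected.
    if (resource !== undefined || name !== undefined) {
      throw new Error(`Invalid roleBindings entry '${entry}': 'action' cannot be combined`);
    }
    return { role: 'run', action };
  }
  if (role === undefined || resource === undefined) {
    throw new Error(`Invalid roleBindings entry '${entry}'`);
  }
  return name !== undefined ? { role, resource, name } : { role, resource };
}

// The three documented forms, one per repeated --roleBindings flag.
const parsed = [
  'role:view,resource:servers',
  'role:view, resource:servers, name:my-ha', // whitespace is trimmed
  'action:logs',
].map(parseRoleBinding);
console.log(JSON.stringify(parsed, null, 2));
```

Two behaviours follow from the code as written. Only the first colon in each part separates key from value, so a value may itself contain colons. An explicit `role:` given together with `action:` is silently ignored, because the operation branch always returns `role: 'run'`.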
--- a/src/cli/src/commands/rotate.ts
+++ b/src/cli/src/commands/rotate.ts
@@ -0,0 +1,50 @@
+/**
+ * `mcpctl rotate secretbackend <name>` — force an immediate token rotation on
+ * a wizard-provisioned OpenBao backend.
+ *
+ * Hits `POST /api/v1/secretbackends/:id/rotate` after resolving name → id.
+ * Gated server-side by the `rotate-secretbackend` operation.
+ */
+import { Command } from 'commander';
+import type { ApiClient } from '../api-client.js';
+import { resolveNameOrId } from './shared.js';
+
+export interface RotateCommandDeps {
+  client: ApiClient;
+  log: (...args: unknown[]) => void;
+}
+
+export function createRotateCommand(deps: RotateCommandDeps): Command {
+  const { client, log } = deps;
+
+  const cmd = new Command('rotate')
+    .description('Force rotation of a credential-rotating resource (currently: secretbackend)');
+
+  cmd.command('secretbackend')
+    .alias('sb')
+    .description('Rotate the vault token on an OpenBao SecretBackend (wizard-provisioned)')
+    .argument('<name>', 'SecretBackend name or id')
+    .action(async (nameOrId: string) => {
+      const id = await resolveNameOrId(client, 'secretbackends', nameOrId);
+      const res = await client.post<{ ok?: boolean; tokenMeta?: Record<string, unknown>; error?: string }>(
+        `/api/v1/secretbackends/${id}/rotate`,
+        {},
+      );
+      if (res.ok !== true) {
+        throw new Error(`rotation failed: ${res.error ?? 'unknown error'}`);
+      }
+      log(`secretbackend '${nameOrId}' rotated.`);
+      const meta = res.tokenMeta ?? {};
+      if (typeof meta.generatedAt === 'string') {
+        log(`  generated:    ${meta.generatedAt}`);
+      }
+      if (typeof meta.nextRenewalAt === 'string') {
+        log(`  next renewal: ${meta.nextRenewalAt}`);
+      }
+      if (typeof meta.validUntil === 'string') {
+        log(`  valid until:  ${meta.validUntil}`);
+      }
+    });
+
+  return cmd;
+}
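Reviewer note: the three `typeof meta.x === 'string'` checks in the rotate action could also be written as a table. This is a sketch only. `formatTokenMeta` is a hypothetical helper that is not part of this change, and it assumes `tokenMeta` carries no other fields worth printing.

```typescript
// Hypothetical helper: renders the same three lines the rotate command logs,
// skipping any field that is absent or not a string.
function formatTokenMeta(meta: Record<string, unknown>): string[] {
  const rows: Array<[string, string]> = [
    ['generated:', 'generatedAt'],
    ['next renewal:', 'nextRenewalAt'],
    ['valid until:', 'validUntil'],
  ];
  const lines: string[] = [];
  for (const [label, key] of rows) {
    const value = meta[key];
    // padEnd(13) plus one space reproduces the column alignment used in the diff.
    if (typeof value === 'string') lines.push(`  ${label.padEnd(13)} ${value}`);
  }
  return lines;
}

// Example with made-up timestamps; validUntil is omitted to show the skip.
for (const line of formatTokenMeta({
  generatedAt: '2025-01-01T00:00:00Z',
  nextRenewalAt: '2025-01-02T00:00:00Z',
})) {
  console.log(line);
}
```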
--- a/src/cli/src/commands/shared.ts
+++ b/src/cli/src/commands/shared.ts
@@ -27,6 +27,15 @@ export const RESOURCE_ALIASES: Record<string, string> = {
  proxymodel: 'proxymodels',
  proxymodels: 'proxymodels',
  pm: 'proxymodels',
+  mcptoken: 'mcptokens',
+  mcptokens: 'mcptokens',
+  token: 'mcptokens',
+  tokens: 'mcptokens',
+  secretbackend: 'secretbackends',
+  secretbackends: 'secretbackends',
+  sb: 'secretbackends',
+  llm: 'llms',
+  llms: 'llms',
  all: 'all',
 };

@@ -72,6 +81,21 @@ export function stripInternalFields(obj: Record<string, unknown>): Record<string
    delete result[key];
  }

+  // McpToken-specific: promote projectName → project; drop secret/derived fields
+  if ('tokenHash' in result || 'tokenPrefix' in result) {
+    delete result.tokenHash;
+    delete result.tokenPrefix;
+    delete result.lastUsedAt;
+    delete result.revokedAt;
+    delete result.status;
+    delete result.ownerEmail;
+    if (typeof result.projectName === 'string') {
+      result.project = result.projectName;
+      delete result.projectName;
+      delete result.projectId;
+    }
+  }
+
  // Rename linkTarget → link for cleaner YAML
  if ('linkTarget' in result) {
    result.link = result.linkTarget;
--- a/src/cli/src/commands/status.ts
+++ b/src/cli/src/commands/status.ts
@@ -1,5 +1,11 @@
 import { Command } from 'commander';
 import http from 'node:http';
+import https from 'node:https';
+
+/** Pick the http or https driver based on the URL scheme. */
+function httpDriverFor(url: string): typeof http | typeof https {
+  return new URL(url).protocol === 'https:' ? https : http;
+}
 import { loadConfig } from '../config/index.js';
 import type { ConfigLoaderDeps } from '../config/index.js';
 import { loadCredentials } from '../auth/index.js';
@@ -45,10 +51,16 @@ export interface StatusCommandDeps {

 function defaultCheckHealth(url: string): Promise<boolean> {
  return new Promise((resolve) => {
-    const req = http.get(`${url}/health`, { timeout: 3000 }, (res) => {
-      resolve(res.statusCode !== undefined && res.statusCode >= 200 && res.statusCode < 400);
-      res.resume();
-    });
+    let req: http.ClientRequest;
+    try {
+      req = httpDriverFor(url).get(`${url}/health`, { timeout: 3000 }, (res) => {
+        resolve(res.statusCode !== undefined && res.statusCode >= 200 && res.statusCode < 400);
+        res.resume();
+      });
+    } catch {
+      resolve(false);
+      return;
+    }
    req.on('error', () => resolve(false));
    req.on('timeout', () => {
      req.destroy();
@@ -63,26 +75,32 @@ function defaultCheckHealth(url: string): Promise<boolean> {
 */
 function defaultCheckLlm(mcplocalUrl: string): Promise<string> {
  return new Promise((resolve) => {
-    const req = http.get(`${mcplocalUrl}/llm/health`, { timeout: 45000 }, (res) => {
-      const chunks: Buffer[] = [];
-      res.on('data', (chunk: Buffer) => chunks.push(chunk));
-      res.on('end', () => {
-        try {
-          const body = JSON.parse(Buffer.concat(chunks).toString('utf-8')) as { status: string; error?: string };
-          if (body.status === 'ok') {
-            resolve('ok');
-          } else if (body.status === 'not configured') {
-            resolve('not configured');
-          } else if (body.error) {
-            resolve(body.error.slice(0, 80));
-          } else {
-            resolve(body.status);
+    let req: http.ClientRequest;
+    try {
+      req = httpDriverFor(mcplocalUrl).get(`${mcplocalUrl}/llm/health`, { timeout: 45000 }, (res) => {
+        const chunks: Buffer[] = [];
+        res.on('data', (chunk: Buffer) => chunks.push(chunk));
+        res.on('end', () => {
+          try {
+            const body = JSON.parse(Buffer.concat(chunks).toString('utf-8')) as { status: string; error?: string };
+            if (body.status === 'ok') {
+              resolve('ok');
+            } else if (body.status === 'not configured') {
+              resolve('not configured');
+            } else if (body.error) {
+              resolve(body.error.slice(0, 80));
+            } else {
+              resolve(body.status);
+            }
+          } catch {
+            resolve('invalid response');
          }
-        } catch {
-          resolve('invalid response');
-        }
+        });
      });
-    });
+    } catch {
+      resolve('mcplocal unreachable');
+      return;
+    }
    req.on('error', () => resolve('mcplocal unreachable'));
    req.on('timeout', () => { req.destroy(); resolve('timeout'); });
  });
@@ -90,18 +108,24 @@ function defaultCheckLlm(mcplocalUrl: string): Promise<string> {

 function defaultFetchModels(mcplocalUrl: string): Promise<string[]> {
  return new Promise((resolve) => {
-    const req = http.get(`${mcplocalUrl}/llm/models`, { timeout: 5000 }, (res) => {
-      const chunks: Buffer[] = [];
-      res.on('data', (chunk: Buffer) => chunks.push(chunk));
-      res.on('end', () => {
-        try {
-          const body = JSON.parse(Buffer.concat(chunks).toString('utf-8')) as { models?: string[] };
-          resolve(body.models ?? []);
-        } catch {
-          resolve([]);
-        }
+    let req: http.ClientRequest;
+    try {
+      req = httpDriverFor(mcplocalUrl).get(`${mcplocalUrl}/llm/models`, { timeout: 5000 }, (res) => {
+        const chunks: Buffer[] = [];
+        res.on('data', (chunk: Buffer) => chunks.push(chunk));
+        res.on('end', () => {
+          try {
+            const body = JSON.parse(Buffer.concat(chunks).toString('utf-8')) as { models?: string[] };
+            resolve(body.models ?? []);
+          } catch {
+            resolve([]);
+          }
+        });
      });
-    });
+    } catch {
+      resolve([]);
+      return;
+    }
    req.on('error', () => resolve([]));
    req.on('timeout', () => { req.destroy(); resolve([]); });
  });
@@ -109,18 +133,24 @@ function defaultFetchModels(mcplocalUrl: string): Promise<string[]> {

 function defaultFetchProviders(mcplocalUrl: string): Promise<ProvidersInfo | null> {
  return new Promise((resolve) => {
-    const req = http.get(`${mcplocalUrl}/llm/providers`, { timeout: 5000 }, (res) => {
-      const chunks: Buffer[] = [];
-      res.on('data', (chunk: Buffer) => chunks.push(chunk));
-      res.on('end', () => {
-        try {
-          const body = JSON.parse(Buffer.concat(chunks).toString('utf-8')) as ProvidersInfo;
-          resolve(body);
-        } catch {
-          resolve(null);
-        }
+    let req: http.ClientRequest;
+    try {
+      req = httpDriverFor(mcplocalUrl).get(`${mcplocalUrl}/llm/providers`, { timeout: 5000 }, (res) => {
+        const chunks: Buffer[] = [];
+        res.on('data', (chunk: Buffer) => chunks.push(chunk));
+        res.on('end', () => {
+          try {
+            const body = JSON.parse(Buffer.concat(chunks).toString('utf-8')) as ProvidersInfo;
+            resolve(body);
+          } catch {
+            resolve(null);
+          }
+        });
      });
-    });
+    } catch {
+      resolve(null);
+      return;
+    }
    req.on('error', () => resolve(null));
    req.on('timeout', () => { req.destroy(); resolve(null); });
  });
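Reviewer note: the four probes above now share one shape: pick the driver from the URL scheme, guard the synchronous throw from a malformed URL, and resolve a fallback on every failure path. A sketch of how that shape could be factored out follows. `isHttpsUrl` and `getJsonOr` are hypothetical names that are not in this change, and the sketch branches on the scheme explicitly instead of calling `.get` on a union of the two modules.

```typescript
import http from 'node:http';
import https from 'node:https';

/** True for https: URLs. new URL() throws a TypeError on malformed input. */
function isHttpsUrl(url: string): boolean {
  return new URL(url).protocol === 'https:';
}

/**
 * Hypothetical helper: GET `url`, parse the body as JSON, and resolve `fallback`
 * on any failure (malformed URL, socket error, timeout, unparseable body).
 */
function getJsonOr<T>(url: string, timeoutMs: number, fallback: T): Promise<T> {
  return new Promise((resolve) => {
    const onResponse = (res: http.IncomingMessage): void => {
      const chunks: Buffer[] = [];
      res.on('data', (chunk: Buffer) => chunks.push(chunk));
      res.on('end', () => {
        try {
          resolve(JSON.parse(Buffer.concat(chunks).toString('utf-8')) as T);
        } catch {
          resolve(fallback);
        }
      });
    };
    let req: http.ClientRequest;
    try {
      req = isHttpsUrl(url)
        ? https.get(url, { timeout: timeoutMs }, onResponse)
        : http.get(url, { timeout: timeoutMs }, onResponse);
    } catch {
      resolve(fallback); // thrown synchronously, e.g. for a malformed URL
      return;
    }
    req.on('error', () => resolve(fallback));
    req.on('timeout', () => { req.destroy(); resolve(fallback); });
  });
}

// Usage sketch: a malformed URL never rejects, it resolves the fallback.
void getJsonOr<string[]>('not a url', 1000, []).then((models) => {
  console.log(models.length);
});
```

`defaultCheckLlm` maps failures to distinct strings ('timeout', 'mcplocal unreachable', 'invalid response'), so it would need a richer fallback than the single value sketched here.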
--- a/src/cli/src/commands/test-mcp.ts
+++ b/src/cli/src/commands/test-mcp.ts
@@ -0,0 +1,176 @@
+import { Command } from 'commander';
+import { McpHttpSession, McpProtocolError, McpTransportError, deriveBaseUrl, mcpHealthCheck } from '@mcpctl/shared';
+
+export interface TestMcpCommandDeps {
+  log: (...args: unknown[]) => void;
+  /**
+   * Inject a session factory for testing. The default creates a real `McpHttpSession`.
+   */
+  createSession?: (url: string, opts: { bearer?: string; timeoutMs?: number }) => {
+    initialize(): Promise<unknown>;
+    listTools(): Promise<Array<{ name: string }>>;
+    callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
+    close(): Promise<void>;
+  };
+  healthCheck?: (baseUrl: string) => Promise<boolean>;
+}
+
+export type TestMcpExitCode = 0 | 1 | 2;
+
+export interface TestMcpReport {
+  url: string;
+  health: 'ok' | 'fail' | 'skipped';
+  initialize: 'ok' | 'fail';
+  tools: string[] | null;
+  toolCall?: { name: string; result: unknown; isError?: boolean };
+  missingTools?: string[];
+  exitCode: TestMcpExitCode;
+  error?: string;
+}
+
+export function createTestCommand(deps: TestMcpCommandDeps): Command {
+  const { log } = deps;
+  const createSession = deps.createSession ?? ((url, opts) => new McpHttpSession(url, opts));
+  const healthCheck = deps.healthCheck ?? mcpHealthCheck;
+
+  const test = new Command('test').description('Utilities for testing MCP endpoints and config');
+
+  test
+    .command('mcp')
+    .description('Verify a Streamable-HTTP MCP endpoint: health, initialize, tools/list, optionally call a tool.')
+    .argument('<url>', 'Full URL of the MCP endpoint (e.g. https://mcp.example.com/projects/foo/mcp)')
+    .option('--token <bearer>', 'Bearer token (falls back to $MCPCTL_TOKEN when omitted)')
+    .option('--tool <name>', 'Invoke a specific tool after listing')
+    .option('--args <json>', 'JSON-encoded arguments for --tool', '{}')
+    .option('--expect-tools <list>', 'Comma-separated tool names that MUST appear; fails otherwise')
+    .option('--timeout <seconds>', 'Per-request timeout in seconds', '10')
+    .option('-o, --output <format>', 'Output format: text or json', 'text')
+    .option('--no-health', 'Skip the /healthz preflight check')
+    .action(async (url: string, opts: {
+      token?: string;
+      tool?: string;
+      args: string;
+      expectTools?: string;
+      timeout: string;
+      output: string;
+      health: boolean;
+    }) => {
+      const bearer = opts.token ?? process.env.MCPCTL_TOKEN;
+      const timeoutMs = Number(opts.timeout) * 1000;
+      if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
+        throw new Error(`--timeout must be a positive number of seconds (got '${opts.timeout}')`);
+      }
+
+      const report: TestMcpReport = {
+        url,
+        health: 'skipped',
+        initialize: 'fail',
+        tools: null,
+        exitCode: 1,
+      };
+
+      // 1. Health preflight
+      if (opts.health !== false) {
+        const baseUrl = deriveBaseUrl(url);
+        const ok = await healthCheck(baseUrl);
+        report.health = ok ? 'ok' : 'fail';
+        if (!ok) {
+          report.error = `healthz preflight failed at ${baseUrl}/healthz`;
+          return emit(report, opts.output, log);
+        }
+      }
+
+      const sessionOpts: { bearer?: string; timeoutMs: number } = { timeoutMs };
+      if (bearer !== undefined) sessionOpts.bearer = bearer;
+      const session = createSession(url, sessionOpts);
+
+      try {
+        // 2. Initialize
+        await session.initialize();
+        report.initialize = 'ok';
+
+        // 3. tools/list
+        const tools = await session.listTools();
+        report.tools = tools.map((t) => t.name);
+
+        // 4. --expect-tools check
+        if (opts.expectTools !== undefined && opts.expectTools.trim() !== '') {
+          const expected = opts.expectTools.split(',').map((s) => s.trim()).filter(Boolean);
+          const missing = expected.filter((name) => !tools.some((t) => t.name === name));
+          if (missing.length > 0) {
+            report.missingTools = missing;
+            report.exitCode = 2;
+            report.error = `Missing tools: ${missing.join(', ')}`;
+            return emit(report, opts.output, log);
+          }
+        }
+
+        // 5. Optional --tool call
+        if (opts.tool !== undefined) {
+          let parsedArgs: Record<string, unknown> = {};
+          try {
+            parsedArgs = JSON.parse(opts.args) as Record<string, unknown>;
+          } catch {
+            throw new Error(`--args must be valid JSON (got '${opts.args}')`);
+          }
+          const result = await session.callTool(opts.tool, parsedArgs);
+          const toolCall: TestMcpReport['toolCall'] = { name: opts.tool, result };
+          if (typeof result === 'object' && result !== null && 'isError' in result) {
+            toolCall.isError = Boolean((result as { isError?: boolean }).isError);
+          }
+          report.toolCall = toolCall;
+          if (toolCall.isError) {
+            report.exitCode = 2;
+            report.error = `Tool '${opts.tool}' returned isError=true`;
+            return emit(report, opts.output, log);
+          }
+        }
+
+        report.exitCode = 0;
+      } catch (err) {
+        if (err instanceof McpProtocolError) {
+          report.exitCode = 1;
+          report.error = `protocol error ${err.code}: ${err.message}`;
+        } else if (err instanceof McpTransportError) {
+          report.exitCode = 1;
+          report.error = `transport error (HTTP ${err.status}): ${err.message}`;
+        } else {
+          report.exitCode = 1;
+          report.error = err instanceof Error ? err.message : String(err);
+        }
+      } finally {
+        await session.close().catch(() => { /* best-effort */ });
+      }
+
+      return emit(report, opts.output, log);
+    });
+
+  return test;
+}
+
+function emit(report: TestMcpReport, output: string, log: (...args: unknown[]) => void): void {
+  if (output === 'json') {
+    log(JSON.stringify(report, null, 2));
+  } else {
+    log(`URL:        ${report.url}`);
+    log(`Health:     ${report.health}`);
+    log(`Initialize: ${report.initialize}`);
+    if (report.tools !== null) {
+      log(`Tools (${report.tools.length}): ${report.tools.slice(0, 10).join(', ')}${report.tools.length > 10 ? `, …(+${report.tools.length - 10})` : ''}`);
+    }
+    if (report.missingTools !== undefined) {
+      log(`Missing:    ${report.missingTools.join(', ')}`);
+    }
+    if (report.toolCall !== undefined) {
+      log(`Tool call:  ${report.toolCall.name} → ${report.toolCall.isError ? 'ERROR' : 'ok'}`);
+    }
+    if (report.error !== undefined) {
+      log(`Error:      ${report.error}`);
+    }
+    log(`Result:     ${report.exitCode === 0 ? 'PASS' : report.exitCode === 2 ? 'CONTRACT FAIL' : 'TRANSPORT/AUTH FAIL'}`);
+  }
+
+  if (report.exitCode !== 0) {
+    process.exitCode = report.exitCode;
+  }
+}
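Reviewer note: `mcpctl test mcp` encodes its outcome in three exit codes, which is what a CI wrapper would key on. The sketch below restates the mapping from `emit()` as a `switch`. `shouldRetry` is a hypothetical policy that is not part of this change, included only to show how a caller might treat the codes differently.

```typescript
type TestMcpExitCode = 0 | 1 | 2;

/** Same labels as the Result line printed by emit() in the diff above. */
function resultLabel(exitCode: TestMcpExitCode): string {
  switch (exitCode) {
    case 0:
      return 'PASS';
    case 2:
      return 'CONTRACT FAIL'; // missing --expect-tools entry, or tool returned isError
    default:
      return 'TRANSPORT/AUTH FAIL'; // everything else that exits 1
  }
}

/** Hypothetical CI policy: retry only failures that might be transient. */
function shouldRetry(exitCode: TestMcpExitCode): boolean {
  return exitCode === 1;
}

for (const code of [0, 1, 2] as const) {
  console.log(`${code}: ${resultLabel(code)} retry=${shouldRetry(code)}`);
}
```

As written in the diff, exit code 1 also covers a failed health preflight and an unparseable `--args` value, since both leave or set `exitCode` to 1. A wrapper that retries on 1 would therefore retry those cases too.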
--- a/src/cli/src/index.ts
+++ b/src/cli/src/index.ts
@@ -8,6 +8,7 @@ import { createDescribeCommand } from './commands/describe.js';
 import { createDeleteCommand } from './commands/delete.js';
 import { createLogsCommand } from './commands/logs.js';
 import { createApplyCommand } from './commands/apply.js';
+import { createTestCommand } from './commands/test-mcp.js';
 import { createCreateCommand } from './commands/create.js';
 import { createEditCommand } from './commands/edit.js';
 import { createBackupCommand } from './commands/backup.js';
@@ -17,6 +18,8 @@ import { createMcpCommand } from './commands/mcp.js';
 import { createPatchCommand } from './commands/patch.js';
 import { createConsoleCommand } from './commands/console/index.js';
 import { createCacheCommand } from './commands/cache.js';
+import { createMigrateCommand } from './commands/migrate.js';
+import { createRotateCommand } from './commands/rotate.js';
 import { ApiClient, ApiError } from './api-client.js';
 import { loadConfig } from './config/index.js';
 import { loadCredentials } from './auth/index.js';
@@ -99,6 +102,25 @@ export function createProgram(): Command {
      }
    }

+    // --project scoping for mcptokens
+    if (!nameOrId && resource === 'mcptokens' && projectName) {
+      return client.get<unknown[]>(`/api/v1/mcptokens?projectName=${encodeURIComponent(projectName)}`);
+    }
+
+    // Name-based lookup for mcptokens: names are unique only within a project
+    if (nameOrId && resource === 'mcptokens' && !/^c[a-z0-9]{24}$/.test(nameOrId)) {
+      if (!projectName) {
+        throw new Error('mcptoken names are scoped to a project — pass --project <name> or use the token id (cuid)');
+      }
+      const items = await client.get<Array<{ id: string; name: string }>>(
+        `/api/v1/mcptokens?projectName=${encodeURIComponent(projectName)}`,
+      );
+      const match = items.find((i) => i.name === nameOrId);
+      if (!match) throw new Error(`mcptoken '${nameOrId}' not found in project '${projectName}'`);
+      const item = await client.get(`/api/v1/mcptokens/${match.id}`);
+      return [item];
+    }
+
    if (nameOrId) {
      // Glob pattern — use query param filtering
      if (nameOrId.includes('*')) {
@@ -132,6 +154,19 @@ export function createProgram(): Command {
      return client.get(`/api/v1/${resource}/${match.id as string}`);
    }

+    // Mcptokens: names are project-scoped. CUIDs pass straight through.
+    if (resource === 'mcptokens' && !/^c[a-z0-9]{24}$/.test(nameOrId)) {
+      if (!projectName) {
+        throw new Error('mcptoken names are scoped to a project — pass --project <name> or use the token id (cuid)');
+      }
+      const items = await client.get<Array<Record<string, unknown>>>(
+        `/api/v1/mcptokens?projectName=${encodeURIComponent(projectName)}`,
+      );
+      const match = items.find((item) => item.name === nameOrId);
+      if (!match) throw new Error(`mcptoken '${nameOrId}' not found in project '${projectName}'`);
+      return client.get(`/api/v1/mcptokens/${match.id as string}`);
+    }
+
    let id: string;
    try {
      id = await resolveNameOrId(client, resource, nameOrId);
@@ -212,6 +247,20 @@ export function createProgram(): Command {
    mcplocalUrl: config.mcplocalUrl,
  }));

+  program.addCommand(createTestCommand({
+    log: (...args) => console.log(...args),
+  }));
+
+  program.addCommand(createMigrateCommand({
+    client,
+    log: (...args) => console.log(...args),
+  }));
+
+  program.addCommand(createRotateCommand({
+    client,
+    log: (...args) => console.log(...args),
+  }));
+
  return program;
 }

--- a/src/cli/tests/commands/create-secretbackend-wizard.test.ts
+++ b/src/cli/tests/commands/create-secretbackend-wizard.test.ts
@@ -0,0 +1,150 @@
+import { describe, it, expect, vi } from 'vitest';
+import { runSecretBackendOpenbaoWizard } from '../../src/commands/create-secretbackend-wizard.js';
+import type { ApiClient } from '../../src/api-client.js';
+import type { ConfigSetupPrompt } from '../../src/commands/config-setup.js';
+
+function mockClient(handlers: Record<string, (body?: unknown) => unknown>): ApiClient {
+  const call = (method: 'GET' | 'POST' | 'PUT' | 'DELETE') => async (path: string, body?: unknown) => {
+    const handler = handlers[`${method} ${path}`] ?? handlers[path];
+    if (handler === undefined) throw new Error(`unmocked ${method} ${path}`);
+    return handler(body);
+  };
+  return {
+    get: call('GET'),
+    post: call('POST'),
+    put: call('PUT'),
+    delete: call('DELETE'),
+  } as unknown as ApiClient;
+}
+
+function vaultFetch(responses: Array<{ match: RegExp; status: number; body?: unknown }>): ReturnType<typeof vi.fn> {
+  return vi.fn(async (url: string | URL, init?: RequestInit) => {
+    const key = `${init?.method ?? 'GET'} ${String(url)}`;
+    const match = responses.find((r) => r.match.test(key) || r.match.test(String(url)));
+    if (!match) throw new Error(`unexpected vault fetch: ${key}`);
+    const body = match.body !== undefined ? JSON.stringify(match.body) : '';
+    return new Response(body, { status: match.status });
+  });
+}
+
+function scriptedPrompt(answers: {
+  input?: Record<string, string>;
+  password?: Record<string, string>;
+  confirm?: Record<string, boolean>;
+}): ConfigSetupPrompt {
+  return {
+    async input(message, def) {
+      return answers.input?.[message] ?? def ?? '';
+    },
+    async password(message) {
+      return answers.password?.[message] ?? '';
+    },
+    async confirm(message, def) {
+      return answers.confirm?.[message] ?? def ?? true;
+    },
+    select: vi.fn(),
+  };
+}
+
+describe('runSecretBackendOpenbaoWizard', () => {
+  it('walks through provisioning and creates Secret + SecretBackend + triggers initial rotate', async () => {
+    const logs: string[] = [];
+    const log = (...args: unknown[]) => logs.push(args.map(String).join(' '));
+
+    const vaultResponses = [
+      { match: /GET .*\/v1\/sys\/health$/, status: 200, body: { initialized: true, sealed: false, standby: false, version: '2.5.2' } },
+      { match: /GET .*\/v1\/sys\/mounts$/, status: 200, body: { 'secret/': { type: 'kv', options: { version: '2' } } } },
+      { match: /PUT .*\/v1\/sys\/policies\/acl\/app-mcpd$/, status: 200 },
+      { match: /POST .*\/v1\/auth\/token\/roles\/app-mcpd-role$/, status: 200 },
+      { match: /POST .*\/v1\/auth\/token\/create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'hvs.AAA', accessor: 'acc-first', lease_duration: 2592000, renewable: true } } },
+      // smoke test: write / read / delete
+      { match: /POST .*\/v1\/secret\/data\/mcpd\/\.__mcpctl_wizard_smoke__$/, status: 200 },
+      { match: /GET .*\/v1\/secret\/data\/mcpd\/\.__mcpctl_wizard_smoke__$/, status: 200, body: { data: { data: { marker: 'mcpctl-smoke' } } } },
+      { match: /DELETE .*\/v1\/secret\/metadata\/mcpd\/\.__mcpctl_wizard_smoke__$/, status: 200 },
+    ];
+    const fetchFn = vaultFetch(vaultResponses);
+
+    const created: Record<string, unknown> = {};
+    const client = mockClient({
+      'POST /api/v1/secrets': (body) => { created.secret = body; return { id: 'sec-new', name: (body as { name: string }).name }; },
+      'POST /api/v1/secretbackends': (body) => { created.backend = body; return { id: 'backend-new', name: (body as { name: string }).name }; },
+      'POST /api/v1/secretbackends/backend-new/rotate': () => ({ ok: true, tokenMeta: { generatedAt: 'now' } }),
+      'POST /api/v1/secretbackends/backend-new/default': () => ({ id: 'backend-new' }),
+      'GET /api/v1/secretbackends': () => [{ name: 'default', isDefault: true }],
+      'POST /api/v1/secrets/migrate': () => ({ dryRun: true, candidates: [{ id: 's1', name: 'grafana-creds' }, { id: 's2', name: 'unifi-creds' }] }),
+    });
+
+    const prompt = scriptedPrompt({
+      input: {
+        'OpenBao URL': 'http://bao.example:8200',
+        'KV v2 mount': 'secret',
+        'Path prefix under mount': 'mcpd',
+        'Policy name': 'app-mcpd',
+        'Token role name': 'app-mcpd-role',
+      },
+      password: {
+        'OpenBao admin / root token': 'root.admin.token',
+      },
+      confirm: {
+        "Promote 'bao' to default backend?": true,
+      },
+    });
+
+    await runSecretBackendOpenbaoWizard(
+      { name: 'bao' },
+      { client, log, prompt, fetch: fetchFn as unknown as typeof fetch },
+    );
+
+    // Admin token is sent on the provisioning calls; spot-check the first vault request
+    const firstCallInit = fetchFn.mock.calls[0]![1] as RequestInit;
+    expect((firstCallInit.headers as Record<string, string>)['X-Vault-Token']).toBe('root.admin.token');
+
+    // Secret was created with the minted token value (hvs.AAA), not the admin token
+    expect(created.secret).toMatchObject({ name: 'bao-creds', data: { token: 'hvs.AAA' } });
+
+    // SecretBackend created with rotation config
+    expect(created.backend).toMatchObject({
+      name: 'bao',
+      type: 'openbao',
+      config: expect.objectContaining({
+        url: 'http://bao.example:8200',
+        auth: 'token',
+        tokenSecretRef: { name: 'bao-creds', key: 'token' },
+        rotation: expect.objectContaining({ enabled: true, tokenRole: 'app-mcpd-role' }),
+      }),
+    });
+
+    // Migration hint mentions both the candidate count and the concrete command
+    const fullLog = logs.join('\n');
+    expect(fullLog).toContain("You have 2 secret(s) on 'default'");
+    expect(fullLog).toContain('mcpctl --direct migrate secrets --from default --to bao');
+
+    // Admin token never appears in the log (critical)
+    expect(fullLog).not.toContain('root.admin.token');
+  });
+
+  it('rejects when admin token is empty', async () => {
+    const prompt = scriptedPrompt({
+      input: { 'OpenBao URL': 'http://x' },
+      password: { 'OpenBao admin / root token': '' },
+    });
+    await expect(runSecretBackendOpenbaoWizard(
+      { name: 'bao' },
+      { client: mockClient({}), log: () => {}, prompt, fetch: vi.fn() as unknown as typeof fetch },
+    )).rejects.toThrow(/admin token is required/);
+  });
+
+  it('rejects when vault is sealed', async () => {
+    const fetchFn = vaultFetch([
+      { match: /\/sys\/health$/, status: 200, body: { initialized: true, sealed: true, standby: false, version: '2.5.2' } },
+    ]);
+    const prompt = scriptedPrompt({
+      input: { 'OpenBao URL': 'http://x' },
+      password: { 'OpenBao admin / root token': 't' },
+    });
+    await expect(runSecretBackendOpenbaoWizard(
+      { name: 'bao' },
+      { client: mockClient({}), log: () => {}, prompt, fetch: fetchFn as unknown as typeof fetch },
+    )).rejects.toThrow(/not ready/);
+  });
+});
--- a/src/cli/tests/commands/create.test.ts
+++ b/src/cli/tests/commands/create.test.ts
@@ -318,8 +318,8 @@ describe('create command', () => {
        'rbac', 'developers',
        '--subject', 'User:alice@test.com',
        '--subject', 'Group:dev-team',
-        '--binding', 'edit:servers',
-        '--binding', 'view:instances',
+        '--roleBindings', 'role:edit,resource:servers',
+        '--roleBindings', 'role:view,resource:instances',
      ], { from: 'user' });

      expect(client.post).toHaveBeenCalledWith('/api/v1/rbac', {
@@ -342,7 +342,7 @@ describe('create command', () => {
      await cmd.parseAsync([
        'rbac', 'admins',
        '--subject', 'User:admin@test.com',
-        '--binding', 'edit:*',
+        '--roleBindings', 'role:edit,resource:*',
      ], { from: 'user' });

      expect(client.post).toHaveBeenCalledWith('/api/v1/rbac', {
@@ -371,18 +371,18 @@ describe('create command', () => {
      ).rejects.toThrow('Invalid subject format');
    });

-    it('throws on invalid binding format', async () => {
+    it('throws on invalid roleBindings format', async () => {
      const cmd = createCreateCommand({ client, log });
      await expect(
-        cmd.parseAsync(['rbac', 'bad', '--binding', 'no-colon'], { from: 'user' }),
-      ).rejects.toThrow('Invalid binding format');
+        cmd.parseAsync(['rbac', 'bad', '--roleBindings', 'no-colon'], { from: 'user' }),
+      ).rejects.toThrow(/Invalid roleBindings/);
    });

    it('throws on 409 without --force', async () => {
      vi.mocked(client.post).mockRejectedValueOnce(new ApiError(409, '{"error":"RBAC already exists"}'));
      const cmd = createCreateCommand({ client, log });
      await expect(
-        cmd.parseAsync(['rbac', 'developers', '--subject', 'User:a@b.com', '--binding', 'edit:servers'], { from: 'user' }),
+        cmd.parseAsync(['rbac', 'developers', '--subject', 'User:a@b.com', '--roleBindings', 'role:edit,resource:servers'], { from: 'user' }),
      ).rejects.toThrow('API error 409');
    });

@@ -393,7 +393,7 @@ describe('create command', () => {
      await cmd.parseAsync([
        'rbac', 'developers',
        '--subject', 'User:new@test.com',
-        '--binding', 'edit:*',
+        '--roleBindings', 'role:edit,resource:*',
        '--force',
      ], { from: 'user' });

@@ -404,15 +404,15 @@ describe('create command', () => {
      expect(output.join('\n')).toContain("rbac 'developers' updated");
    });

-    it('creates an RBAC definition with operation bindings', async () => {
+    it('creates an RBAC definition with operation bindings (action:… shorthand)', async () => {
      vi.mocked(client.post).mockResolvedValueOnce({ id: 'rbac-1', name: 'ops' });
      const cmd = createCreateCommand({ client, log });
      await cmd.parseAsync([
        'rbac', 'ops',
        '--subject', 'Group:ops-team',
-        '--binding', 'edit:servers',
-        '--operation', 'logs',
-        '--operation', 'backup',
+        '--roleBindings', 'role:edit,resource:servers',
+        '--roleBindings', 'action:logs',
+        '--roleBindings', 'action:backup',
      ], { from: 'user' });

      expect(client.post).toHaveBeenCalledWith('/api/v1/rbac', {
@@ -433,7 +433,7 @@ describe('create command', () => {
      await cmd.parseAsync([
        'rbac', 'ha-viewer',
        '--subject', 'User:alice@test.com',
-        '--binding', 'view:servers:my-ha',
+        '--roleBindings', 'role:view,resource:servers,name:my-ha',
      ], { from: 'user' });

      expect(client.post).toHaveBeenCalledWith('/api/v1/rbac', {
--- a/src/cli/tests/commands/describe.test.ts
+++ b/src/cli/tests/commands/describe.test.ts
@@ -108,6 +108,77 @@ describe('describe command', () => {
    expect(text).not.toContain('Gated:');
  });

+  it('shows project Llm reference without warning when the name matches a registered Llm', async () => {
+    const deps = makeDeps({
+      id: 'proj-1',
+      name: 'with-llm',
+      description: '',
+      ownerId: 'user-1',
+      proxyModel: 'default',
+      llmProvider: 'claude',
+      llmModel: 'claude-3-opus',
+      createdAt: '2025-01-01',
+    });
+    // /api/v1/llms returns a claude entry → no warning
+    deps.client = {
+      get: vi.fn(async (path: string) => {
+        if (path === '/api/v1/llms') return [{ name: 'claude' }];
+        return [];
+      }),
+    } as unknown as typeof deps.client;
+    const cmd = createDescribeCommand(deps);
+    await cmd.parseAsync(['node', 'test', 'project', 'proj-1']);
+    const text = deps.output.join('\n');
+    expect(text).toContain('LLM:');
+    expect(text).toContain('claude');
+    expect(text).not.toContain('warning:');
+  });
+
+  it('warns on describe project when llmProvider does not resolve to any registered Llm', async () => {
+    const deps = makeDeps({
+      id: 'proj-1',
+      name: 'orphan',
+      description: '',
+      ownerId: 'user-1',
+      proxyModel: 'default',
+      llmProvider: 'claude-ghost',
+      createdAt: '2025-01-01',
+    });
+    deps.client = {
+      get: vi.fn(async (path: string) => {
+        if (path === '/api/v1/llms') return [{ name: 'claude' }, { name: 'gpt-4o' }];
+        return [];
+      }),
+    } as unknown as typeof deps.client;
+    const cmd = createDescribeCommand(deps);
+    await cmd.parseAsync(['node', 'test', 'project', 'proj-1']);
+    const text = deps.output.join('\n');
+    expect(text).toContain('claude-ghost');
+    expect(text).toContain('warning:');
+    expect(text).toContain('fall back to registry default');
+  });
+
+  it('does not warn when llmProvider is "none" (explicit disable)', async () => {
+    const deps = makeDeps({
+      id: 'proj-1',
+      name: 'no-llm',
+      description: '',
+      ownerId: 'user-1',
+      proxyModel: 'default',
+      llmProvider: 'none',
+      createdAt: '2025-01-01',
+    });
+    deps.client = {
+      get: vi.fn(async () => []),
+    } as unknown as typeof deps.client;
+    const cmd = createDescribeCommand(deps);
+    await cmd.parseAsync(['node', 'test', 'project', 'proj-1']);
+    const text = deps.output.join('\n');
+    expect(text).toContain('LLM:');
+    expect(text).toContain('none');
+    expect(text).not.toContain('warning:');
+  });
+
  it('shows project Plugin Config defaulting to "default" when proxyModel is empty', async () => {
    const deps = makeDeps({
      id: 'proj-1',
--- a/src/cli/tests/commands/rbac-bindings.test.ts
+++ b/src/cli/tests/commands/rbac-bindings.test.ts
@@ -0,0 +1,54 @@
+import { describe, it, expect } from 'vitest';
+import { parseRoleBinding } from '../../src/commands/rbac-bindings.js';
+
+describe('parseRoleBinding', () => {
+  it('parses an unscoped resource binding', () => {
+    expect(parseRoleBinding('role:view,resource:servers')).toEqual({
+      role: 'view',
+      resource: 'servers',
+    });
+  });
+
+  it('parses a name-scoped resource binding', () => {
+    expect(parseRoleBinding('role:view,resource:servers,name:my-ha')).toEqual({
+      role: 'view',
+      resource: 'servers',
+      name: 'my-ha',
+    });
+  });
+
+  it('parses an operation binding via the action shorthand', () => {
+    expect(parseRoleBinding('action:logs')).toEqual({
+      role: 'run',
+      action: 'logs',
+    });
+  });
+
+  it('trims whitespace around keys and values', () => {
+    expect(parseRoleBinding('role: edit , resource: * ')).toEqual({
+      role: 'edit',
+      resource: '*',
+    });
+  });
+
+  it('rejects a pair with no colon', () => {
+    expect(() => parseRoleBinding('role=view')).toThrow(/key:value pairs/);
+  });
+
+  it('rejects an unknown key', () => {
+    expect(() => parseRoleBinding('role:view,resource:servers,scope:project')).toThrow(/Invalid roleBindings key 'scope'/);
+  });
+
+  it('rejects an empty value', () => {
+    expect(() => parseRoleBinding('role:view,resource:')).toThrow(/empty key or value/);
+  });
+
+  it('rejects action combined with resource/name', () => {
+    expect(() => parseRoleBinding('action:logs,resource:servers')).toThrow(/cannot be combined/);
+  });
+
+  it('requires both role and resource when action is absent', () => {
+    expect(() => parseRoleBinding('role:view')).toThrow(/need either 'action/);
+    expect(() => parseRoleBinding('resource:servers')).toThrow(/need either 'action/);
+  });
+});
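Reviewer note: the tests above pin down the `--roleBindings` grammar fairly tightly, so here is a minimal sketch of a parser that would satisfy them. It is not the actual contents of `rbac-bindings.ts`, which this diff does not show. The `RoleBinding` type, the `ALLOWED_KEYS` set and the exact error wording are assumptions, chosen only so that the regexes in the tests match.

```typescript
// Hypothetical sketch: the real parser lives in src/cli/src/commands/rbac-bindings.ts.
type RoleBinding =
  | { role: string; resource: string; name?: string }
  | { role: 'run'; action: string };

// Assumed key set, inferred from the keys the tests exercise.
const ALLOWED_KEYS = new Set(['role', 'resource', 'name', 'action']);

function parseRoleBinding(spec: string): RoleBinding {
  const fields = new Map<string, string>();
  for (const pair of spec.split(',')) {
    // Split on the first colon only, so values may themselves contain ':'.
    const idx = pair.indexOf(':');
    if (idx === -1) {
      throw new Error(`Invalid roleBindings entry '${pair.trim()}': expected key:value pairs`);
    }
    const key = pair.slice(0, idx).trim();
    const value = pair.slice(idx + 1).trim();
    if (key === '' || value === '') {
      throw new Error(`Invalid roleBindings entry '${pair.trim()}': empty key or value`);
    }
    if (!ALLOWED_KEYS.has(key)) {
      throw new Error(`Invalid roleBindings key '${key}'`);
    }
    fields.set(key, value);
  }

  // Operation shorthand: `action:<op>` implies role 'run' and excludes resource/name.
  const action = fields.get('action');
  if (action !== undefined) {
    if (fields.has('resource') || fields.has('name')) {
      throw new Error("'action' cannot be combined with 'resource' or 'name'");
    }
    return { role: 'run', action };
  }

  // Resource binding: both role and resource are required, name is optional.
  const role = fields.get('role');
  const resource = fields.get('resource');
  if (role === undefined || resource === undefined) {
    throw new Error("roleBindings need either 'action:<op>' or both 'role' and 'resource'");
  }
  const name = fields.get('name');
  return name === undefined ? { role, resource } : { role, resource, name };
}
```

With this shape, `role:view,resource:servers,name:my-ha` replaces the old `view:servers:my-ha` form, and `action:logs` replaces `--operation logs`.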
--- a/src/cli/tests/commands/test-mcp.test.ts
+++ b/src/cli/tests/commands/test-mcp.test.ts
@@ -0,0 +1,168 @@
+import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
+import { createTestCommand } from '../../src/commands/test-mcp.js';
+
+function makeSession(overrides: Partial<{
+  initialize: () => Promise<unknown>;
+  listTools: () => Promise<Array<{ name: string }>>;
+  callTool: (name: string, args: Record<string, unknown>) => Promise<unknown>;
+  close: () => Promise<void>;
+}> = {}) {
+  return {
+    initialize: overrides.initialize ?? vi.fn(async () => ({ protocolVersion: '2024-11-05' })),
+    listTools: overrides.listTools ?? vi.fn(async () => [{ name: 'echo' }, { name: 'search' }]),
+    callTool: overrides.callTool ?? vi.fn(async () => ({ content: [{ type: 'text', text: 'hi' }] })),
+    close: overrides.close ?? vi.fn(async () => { /* no-op */ }),
+  };
+}
+
+describe('mcpctl test mcp', () => {
+  const output: string[] = [];
+  const log = (...args: unknown[]) => {
+    output.push(args.map(String).join(' '));
+  };
+
+  beforeEach(() => {
+    output.length = 0;
+    process.exitCode = 0;
+  });
+
+  afterEach(() => {
+    process.exitCode = 0;
+  });
+
+  it('exits 0 on happy path (health + initialize + tools/list)', async () => {
+    const session = makeSession();
+    const cmd = createTestCommand({
+      log,
+      createSession: () => session,
+      healthCheck: async () => true,
+    });
+    await cmd.parseAsync(['mcp', 'https://mcp.example.com/projects/foo/mcp'], { from: 'user' });
+    expect(process.exitCode).toBe(0);
+    expect(session.initialize).toHaveBeenCalled();
+    expect(session.listTools).toHaveBeenCalled();
+    expect(output.join('\n')).toContain('Result:     PASS');
+  });
+
+  it('exits 1 when the /healthz preflight fails', async () => {
+    const cmd = createTestCommand({
+      log,
+      createSession: () => makeSession(),
+      healthCheck: async () => false,
+    });
+    await cmd.parseAsync(['mcp', 'https://mcp.example.com/projects/foo/mcp'], { from: 'user' });
+    expect(process.exitCode).toBe(1);
+    expect(output.join('\n')).toContain('healthz preflight failed');
+  });
+
+  it('exits 2 (contract fail) when --expect-tools are missing', async () => {
+    const cmd = createTestCommand({
+      log,
+      createSession: () => makeSession({
+        listTools: async () => [{ name: 'echo' }],
+      }),
+      healthCheck: async () => true,
+    });
+    await cmd.parseAsync(
+      ['mcp', 'https://mcp.example.com/projects/foo/mcp', '--expect-tools', 'echo,search'],
+      { from: 'user' },
+    );
+    expect(process.exitCode).toBe(2);
+    expect(output.join('\n')).toContain('Missing:    search');
+    expect(output.join('\n')).toContain('CONTRACT FAIL');
+  });
+
+  it('exits 0 when --expect-tools all match', async () => {
+    const cmd = createTestCommand({
+      log,
+      createSession: () => makeSession({
+        listTools: async () => [{ name: 'echo' }, { name: 'search' }, { name: 'x' }],
+      }),
+      healthCheck: async () => true,
+    });
+    await cmd.parseAsync(
+      ['mcp', 'https://mcp.example.com/projects/foo/mcp', '--expect-tools', 'echo,search'],
+      { from: 'user' },
+    );
+    expect(process.exitCode).toBe(0);
+  });
+
+  it('exits 1 on transport/auth failure (initialize throws)', async () => {
+    const cmd = createTestCommand({
+      log,
+      createSession: () => makeSession({
+        initialize: async () => { throw new Error('HTTP 401: unauthorized'); },
+      }),
+      healthCheck: async () => true,
+    });
+    await cmd.parseAsync(['mcp', 'https://mcp.example.com/projects/foo/mcp'], { from: 'user' });
+    expect(process.exitCode).toBe(1);
+    expect(output.join('\n')).toContain('Error:');
+    expect(output.join('\n')).toContain('TRANSPORT/AUTH FAIL');
+  });
+
+  it('invokes --tool with --args and reports isError', async () => {
+    const callTool = vi.fn(async () => ({ content: [{ type: 'text', text: 'oops' }], isError: true }));
+    const cmd = createTestCommand({
+      log,
+      createSession: () => makeSession({ callTool }),
+      healthCheck: async () => true,
+    });
+    await cmd.parseAsync(
+      ['mcp', 'https://mcp.example.com/projects/foo/mcp', '--tool', 'echo', '--args', '{"msg":"hi"}'],
+      { from: 'user' },
+    );
+    expect(callTool).toHaveBeenCalledWith('echo', { msg: 'hi' });
+    expect(process.exitCode).toBe(2);
+  });
+
+  it('outputs a JSON report with -o json', async () => {
+    const cmd = createTestCommand({
+      log,
+      createSession: () => makeSession(),
+      healthCheck: async () => true,
+    });
+    await cmd.parseAsync(
+      ['mcp', 'https://mcp.example.com/projects/foo/mcp', '-o', 'json'],
+      { from: 'user' },
+    );
+    const parsed = JSON.parse(output.join('\n')) as { exitCode: number; tools: string[] };
+    expect(parsed.exitCode).toBe(0);
+    expect(parsed.tools).toEqual(['echo', 'search']);
+  });
+
+  it('reads $MCPCTL_TOKEN when --token is not given', async () => {
+    let observedBearer: string | undefined;
+    const cmd = createTestCommand({
+      log,
+      createSession: (_url, opts) => {
+        observedBearer = opts.bearer;
+        return makeSession();
+      },
+      healthCheck: async () => true,
+    });
+    const prev = process.env.MCPCTL_TOKEN;
+    process.env.MCPCTL_TOKEN = 'mcpctl_pat_fromenv';
+    try {
+      await cmd.parseAsync(['mcp', 'https://mcp.example.com/projects/foo/mcp'], { from: 'user' });
+    } finally {
+      if (prev === undefined) delete process.env.MCPCTL_TOKEN;
+      else process.env.MCPCTL_TOKEN = prev;
+    }
+    expect(observedBearer).toBe('mcpctl_pat_fromenv');
+  });
+
+  it('rejects --args that is not valid JSON', async () => {
+    const cmd = createTestCommand({
+      log,
+      createSession: () => makeSession(),
+      healthCheck: async () => true,
+    });
+    await cmd.parseAsync(
+      ['mcp', 'https://mcp.example.com/projects/foo/mcp', '--tool', 'echo', '--args', 'not-json'],
+      { from: 'user' },
+    );
+    expect(process.exitCode).toBe(1);
+    expect(output.join('\n')).toContain('must be valid JSON');
+  });
+});
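Reviewer note: the tests above imply a three-way exit-code contract for `mcpctl test mcp`: 0 for a pass, 1 for transport/auth failures (failed healthz preflight, `initialize` throwing), and 2 for contract failures (missing `--expect-tools` entries, a tool call returning `isError`). A rough sketch of that classification follows. The `ProbeOutcome` interface and `classifyProbe` function are hypothetical names, not taken from `test-mcp.ts`, and the invalid `--args` case (which the tests also map to exit 1) is not modeled here.

```typescript
// Hypothetical summary of what one probe run observed; field names are assumptions.
interface ProbeOutcome {
  healthOk: boolean;        // result of the /healthz preflight
  transportError?: string;  // set when initialize or tools/list threw
  tools: string[];          // tool names returned by tools/list
  expectedTools: string[];  // names passed via --expect-tools (may be empty)
  toolCallIsError: boolean; // true when the --tool call returned isError
}

type ExitCode = 0 | 1 | 2;

function classifyProbe(outcome: ProbeOutcome): { exitCode: ExitCode; missing: string[] } {
  // Transport/auth problems take precedence: nothing else can be trusted.
  if (!outcome.healthOk || outcome.transportError !== undefined) {
    return { exitCode: 1, missing: [] };
  }
  // Contract check: every expected tool must be present; extra tools are fine.
  const available = new Set(outcome.tools);
  const missing = outcome.expectedTools.filter((t) => !available.has(t));
  if (missing.length > 0 || outcome.toolCallIsError) {
    return { exitCode: 2, missing };
  }
  return { exitCode: 0, missing: [] };
}
```

Keeping 1 and 2 distinct lets CI tell "the endpoint is unreachable or rejected the token" apart from "the endpoint answered but does not expose what we expect".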
--- a/src/db/prisma/schema.prisma
+++ b/src/db/prisma/schema.prisma
@@ -25,6 +25,7 @@ model User {
  auditLogs          AuditLog[]
  ownedProjects      Project[]
  groupMemberships   GroupMember[]
+  mcpTokens          McpToken[]

  @@index([email])
 }
@@ -110,17 +111,90 @@ model McpTemplate {
  @@index([name])
 }

+// ── Secret Backends ──
+//
+// Pluggable storage for Secret.data. Default is `plaintext` (data stored in
+// Secret.data JSON). Other drivers (e.g. `openbao`) store only a reference in
+// Secret.externalRef and fetch actual values from the external system at read
+// time. A `plaintext` row is seeded on first startup so the system always has
+// a viable backend; additional backends are user-managed via
+// `mcpctl create secretbackend`.
+
+model SecretBackend {
+  id          String   @id @default(cuid())
+  name        String   @unique
+  type        String                                  // plaintext | openbao | (future: vault, aws-sm, ...)
+  config      Json     @default("{}")                 // type-specific: url, mount, namespace, tokenSecretRef
+  // Runtime metadata for auto-rotating backend credentials (openbao token
+  // auth). Fields: generatedAt, nextRenewalAt, validUntil, lastRotationAt,
+  // lastRotationError, rotatable (true only for wizard-provisioned tokens).
+  // Empty object for backends that don't use rotation (plaintext, kubernetes
+  // auth, or static tokens). Managed entirely by the rotator service.
+  tokenMeta   Json     @default("{}")
+  isDefault   Boolean  @default(false)                // exactly one row has isDefault=true
+  description String   @default("")
+  version     Int      @default(1)
+  createdAt   DateTime @default(now())
+  updatedAt   DateTime @updatedAt
+
+  secrets Secret[]
+
+  @@index([name])
+  @@index([isDefault])
+}
+
 // ── Secrets ──

 model Secret {
-  id        String   @id @default(cuid())
-  name      String   @unique
-  data      Json     @default("{}")
-  version   Int      @default(1)
-  createdAt DateTime @default(now())
-  updatedAt DateTime @updatedAt
+  id          String   @id @default(cuid())
+  name        String   @unique
+  // FK to SecretBackend. Default empty string lets `prisma db push` add the
+  // column to pre-existing rows without a data-loss reset; `bootstrapSecretBackends`
+  // then points any empty-string values at the seeded `default` plaintext backend
+  // on next mcpd startup. New rows written by SecretService always carry a
+  // valid FK immediately.
+  backendId   String   @default("")
+  data        Json     @default("{}")                 // populated by plaintext backend only
+  externalRef String   @default("")                   // populated by non-plaintext backends (e.g. "mount/path#v3")
+  version     Int      @default(1)
+  createdAt   DateTime @default(now())
+  updatedAt   DateTime @updatedAt
+
+  backend SecretBackend @relation(fields: [backendId], references: [id])
+  llms    Llm[]

  @@index([name])
+  @@index([backendId])
+}
+
+// ── LLMs ──
+//
+// Server-managed LLM providers. Clients (agent, HTTP-mode mcplocal) send
+// OpenAI-format requests to `mcpd /api/v1/llms/:name/infer` — mcpd attaches the
+// provider API key server-side so credentials never leave the cluster.
+// Credentials are stored by reference: `apiKeySecret` points at a Secret, and
+// `apiKeySecretKey` names the key within that secret's data.
+
+model Llm {
+  id              String   @id @default(cuid())
+  name            String   @unique
+  type            String                                  // anthropic | openai | deepseek | vllm | ollama | gemini-cli
+  model           String                                  // e.g. claude-3-5-sonnet-20241022
+  url             String   @default("")                   // endpoint (empty for provider default)
+  tier            String   @default("fast")               // fast | heavy
+  description     String   @default("")
+  apiKeySecretId  String?                                 // FK to Secret
+  apiKeySecretKey String?                                 // key inside the Secret's data
+  extraConfig     Json     @default("{}")                 // per-type extras
+  version         Int      @default(1)
+  createdAt       DateTime @default(now())
+  updatedAt       DateTime @updatedAt
+
+  apiKeySecret Secret? @relation(fields: [apiKeySecretId], references: [id], onDelete: SetNull)
+
+  @@index([name])
+  @@index([tier])
+  @@index([apiKeySecretId])
 }

 // ── Groups ──
@@ -187,6 +261,7 @@ model Project {
  servers        ProjectServer[]
  prompts        Prompt[]
  promptRequests PromptRequest[]
+  mcpTokens      McpToken[]

  @@index([name])
  @@index([ownerId])
@@ -204,6 +279,36 @@ model ProjectServer {
  @@unique([projectId, serverId])
 }

+// ── MCP Tokens (bearer credentials for HTTP-mode mcplocal) ──
+//
+// Raw value format: `mcpctl_pat_<32 base62 chars>`. The raw value is shown
+// exactly once at create time; only the SHA-256 hash is persisted. Tokens are
+// scoped to exactly one project — they're only valid at
+// `/projects/<that-project>/mcp`. Creator's RBAC is the ceiling; the service
+// rejects bindings that exceed what the creator themselves can do.
+
+model McpToken {
+  id          String    @id @default(cuid())
+  name        String
+  projectId   String
+  tokenHash   String    @unique
+  tokenPrefix String
+  ownerId     String
+  description String    @default("")
+  createdAt   DateTime  @default(now())
+  expiresAt   DateTime?
+  lastUsedAt  DateTime?
+  revokedAt   DateTime?
+
+  project Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
+  owner   User    @relation(fields: [ownerId], references: [id], onDelete: Cascade)
+
+  @@unique([name, projectId])
+  @@index([tokenHash])
+  @@index([projectId])
+  @@index([ownerId])
+}
+
 // ── MCP Instances (running containers) ──

 model McpInstance {
@@ -288,6 +393,8 @@ model AuditEvent {
  correlationId  String?
  parentEventId  String?
  userName       String?
+  tokenName      String?
+  tokenSha       String?
  payload        Json
  createdAt      DateTime @default(now())

@@ -297,6 +404,7 @@ model AuditEvent {
  @@index([timestamp])
  @@index([eventKind])
  @@index([userName])
+  @@index([tokenSha])
 }

 // ── Backup Pending Queue ──
--- a/src/db/src/scripts/pre-migrate-bootstrap.ts
+++ b/src/db/src/scripts/pre-migrate-bootstrap.ts
@@ -0,0 +1,105 @@
+/**
+ * Self-healing pre-migration step for the SecretBackend rollout (Phase 0).
+ *
+ * Why this exists: `prisma db push` applies schema changes sequentially. When
+ * a cluster upgrades from a pre-SecretBackend DB:
+ *   1. `Secret.backendId` column is added with `DEFAULT ''`
+ *   2. `SecretBackend` table is created (empty)
+ *   3. The FK `Secret.backendId → SecretBackend.id` is added — and FAILS
+ *      because every Secret row now has `backendId = ''` which references no
+ *      row in SecretBackend.
+ *
+ * This script runs AFTER a failed `prisma db push` attempt:
+ *   - If SecretBackend table doesn't exist yet → noop (fresh install case;
+ *     db push will create everything and the FK succeeds because there are
+ *     no Secret rows to violate it).
+ *   - If SecretBackend exists but is empty → insert a default plaintext row.
+ *   - If any Secret rows have `backendId = ''` → point them at the default.
+ *
+ * Idempotent: safe to run multiple times. No-op on a fully-migrated cluster.
+ * Never throws; logs and exits 0 even on errors so the subsequent
+ * `prisma db push` retry is still attempted.
+ */
+import { PrismaClient, Prisma } from '@prisma/client';
+
+const DEFAULT_ID = 'cdefault000backend00000001';
+
+async function main(): Promise<void> {
+  const prisma = new PrismaClient();
+  try {
+    // Does the SecretBackend table exist yet? We check by querying the
+    // information_schema rather than catching Prisma's error — cleaner, and
+    // lets us distinguish "table missing" from "query succeeded but empty".
+    const tableExists = await prisma.$queryRaw<Array<{ exists: boolean }>>`
+      SELECT EXISTS (
+        SELECT 1 FROM information_schema.tables
+        WHERE table_schema = 'public' AND table_name = 'SecretBackend'
+      ) AS exists
+    `;
+    if (!tableExists[0]?.exists) {
+      console.log('bootstrap: SecretBackend table not present yet — skipping');
+      return;
+    }
+
+    // Ensure at least one row exists, marked isDefault.
+    const existingDefault = await prisma.$queryRaw<Array<{ id: string }>>`
+      SELECT id FROM "SecretBackend" WHERE "isDefault" = true LIMIT 1
+    `;
+    let defaultId: string;
+    if (existingDefault.length === 0) {
+      await prisma.$executeRaw`
+        INSERT INTO "SecretBackend"
+          ("id", "name", "type", "config", "isDefault", "description", "version", "createdAt", "updatedAt")
+        VALUES (
+          ${DEFAULT_ID},
+          'default',
+          'plaintext',
+          '{}'::jsonb,
+          true,
+          'Default in-database plaintext backend. Seeded by pre-migrate-bootstrap.',
+          1,
+          CURRENT_TIMESTAMP,
+          CURRENT_TIMESTAMP
+        )
+        ON CONFLICT (name) DO NOTHING
+      `;
+      // Re-read — if there was an existing row with the same name but no
+      // isDefault flag we need its id, not the one we tried to insert.
+      const afterInsert = await prisma.$queryRaw<Array<{ id: string }>>`
+        SELECT id FROM "SecretBackend" WHERE name = 'default' LIMIT 1
+      `;
+      if (afterInsert.length === 0) {
+        console.log('bootstrap: could not establish a default SecretBackend — bailing');
+        return;
+      }
+      defaultId = afterInsert[0]!.id;
+      // Make sure it's flagged default.
+      await prisma.$executeRaw`
+        UPDATE "SecretBackend" SET "isDefault" = true WHERE id = ${defaultId}
+      `;
+      console.log(`bootstrap: seeded default SecretBackend (id=${defaultId})`);
+    } else {
+      defaultId = existingDefault[0]!.id;
+    }
+
+    // Backfill Secret.backendId for any rows left with an empty value.
+    // $executeRaw returns the number of affected rows.
+    const updated = await prisma.$executeRaw(
+      Prisma.sql`UPDATE "Secret" SET "backendId" = ${defaultId} WHERE "backendId" = ''`,
+    );
+    if (updated > 0) {
+      console.log(`bootstrap: backfilled ${updated} Secret row(s) with default backendId`);
+    }
+  } catch (err) {
+    // Never fail the deploy — worst case prisma db push tries again anyway.
+    // Log the error so it's visible in pod logs.
+    console.error('bootstrap: non-fatal error:', err instanceof Error ? err.message : err);
+  } finally {
+    await prisma.$disconnect();
+  }
+}
+
+main().catch((err: unknown) => {
+  console.error('bootstrap: fatal error (ignored):', err);
+  // Intentionally exit 0 — we don't want to block the deploy on this.
+});
--- a/src/db/src/seed/index.ts
+++ b/src/db/src/seed/index.ts
@@ -8,7 +8,8 @@ export interface TemplateEnvEntry {
 }

 export interface HealthCheckSpec {
-  tool: string;
+  /** When set, probe sends initialize + tools/call (readiness). When omitted, probe sends tools/list only (liveness). */
+  tool?: string;
  arguments?: Record<string, unknown>;
  intervalSeconds?: number;
  timeoutSeconds?: number;
--- a/src/mcpd/src/bootstrap/secret-backends.ts
+++ b/src/mcpd/src/bootstrap/secret-backends.ts
@@ -0,0 +1,53 @@
+/**
+ * Bootstrap the `plaintext` SecretBackend + backfill existing Secret rows.
+ *
+ * Runs on every mcpd startup. Idempotent:
+ *   - if no SecretBackend exists, create `default` (type `plaintext`, isDefault=true)
+ *   - if any Secret has an empty backendId (left over from the schema migration), point it at `default`
+ *   - if no backend is currently flagged default, promote `default`
+ *
+ * Safe to run repeatedly; never destroys configuration.
+ */
+import type { PrismaClient } from '@prisma/client';
+
+/** Well-known name for the always-present plaintext backend. */
+export const DEFAULT_PLAINTEXT_BACKEND_NAME = 'default';
+
+export async function bootstrapSecretBackends(prisma: PrismaClient): Promise<void> {
+  let plaintext = await prisma.secretBackend.findUnique({
+    where: { name: DEFAULT_PLAINTEXT_BACKEND_NAME },
+  });
+
+  if (plaintext === null) {
+    plaintext = await prisma.secretBackend.create({
+      data: {
+        name: DEFAULT_PLAINTEXT_BACKEND_NAME,
+        type: 'plaintext',
+        isDefault: true,
+        description: 'Default in-database plaintext backend. Seeded on first startup.',
+      },
+    });
+  }
+
+  const currentDefault = await prisma.secretBackend.findFirst({ where: { isDefault: true } });
+  if (currentDefault === null) {
+    await prisma.secretBackend.update({
+      where: { id: plaintext.id },
+      data: { isDefault: true },
+    });
+  }
+
+  // Backfill any secrets left with an empty backendId after the schema migration.
+  // `findMany({ where: { backendId: '' } })` catches rows that existed before
+  // the column was added and had a default-empty value assigned.
+  const orphans = await prisma.secret.findMany({
+    where: { backendId: '' },
+    select: { id: true },
+  });
+  if (orphans.length > 0) {
+    await prisma.secret.updateMany({
+      where: { id: { in: orphans.map((o) => o.id) } },
+      data: { backendId: plaintext.id },
+    });
+  }
+}
--- a/src/mcpd/src/main.ts
+++ b/src/mcpd/src/main.ts
@@ -18,7 +18,22 @@ import {
  UserRepository,
  GroupRepository,
  AuditEventRepository,
+  McpTokenRepository,
 } from './repositories/index.js';
+import { SecretBackendRepository } from './repositories/secret-backend.repository.js';
+import { SecretBackendService } from './services/secret-backend.service.js';
+import { SecretMigrateService } from './services/secret-migrate.service.js';
+import { bootstrapSecretBackends } from './bootstrap/secret-backends.js';
+import { registerSecretBackendRoutes } from './routes/secret-backends.js';
+import { registerSecretMigrateRoutes } from './routes/secret-migrate.js';
+import { SecretBackendRotator } from './services/secret-backend-rotator.service.js';
+import { SecretBackendRotatorLoop } from './services/secret-backend-rotator-loop.js';
+import { registerSecretBackendRotateRoutes } from './routes/secret-backend-rotate.js';
+import { LlmRepository } from './repositories/llm.repository.js';
+import { LlmService } from './services/llm.service.js';
+import { LlmAdapterRegistry } from './services/llm/dispatcher.js';
+import { registerLlmRoutes } from './routes/llms.js';
+import { registerLlmInferRoutes } from './routes/llm-infer.js';
 import { PromptRepository } from './repositories/prompt.repository.js';
 import { PromptRequestRepository } from './repositories/prompt-request.repository.js';
 import { bootstrapSystemProject } from './bootstrap/system-project.js';
@@ -43,6 +58,7 @@ import {
  UserService,
  GroupService,
  AuditEventService,
+  McpTokenService,
 } from './services/index.js';
 import type { RbacAction } from './services/index.js';
 import type { UpdateRbacDefinitionInput } from './validation/rbac-definition.schema.js';
@@ -62,6 +78,7 @@ import {
  registerUserRoutes,
  registerGroupRoutes,
  registerAuditEventRoutes,
+  registerMcpTokenRoutes,
 } from './routes/index.js';
 import { registerPromptRoutes } from './routes/prompts.js';
 import { registerGitBackupRoutes } from './routes/git-backup.js';
@@ -90,11 +107,26 @@ function mapUrlToPermission(method: string, url: string): PermissionCheck {
  if (segment === 'backup') return { kind: 'operation', operation: 'backup' };
  if (segment === 'restore') return { kind: 'operation', operation: 'restore' };
  if (segment === 'audit-logs' && method === 'DELETE') return { kind: 'operation', operation: 'audit-purge' };
+  // /api/v1/secrets/migrate is a bulk cross-backend operation — treat as op, not a plain secret write.
+  if (url.startsWith('/api/v1/secrets/migrate')) return { kind: 'operation', operation: 'migrate-secrets' };
+  // /api/v1/secretbackends/:id/rotate — manual rotation trigger. Operation so
+  // only explicitly-granted callers can force it (the loop itself bypasses
+  // RBAC by calling the rotator in-process).
+  if (/^\/api\/v1\/secretbackends\/[^/?]+\/rotate/.test(url)) {
+    return { kind: 'operation', operation: 'rotate-secretbackend' };
+  }
+
+  // /api/v1/llms/:name/infer → `run:llms:<name>` (not the default create:llms).
+  const inferMatch = url.match(/^\/api\/v1\/llms\/([^/?]+)\/infer/);
+  if (inferMatch?.[1]) {
+    return { kind: 'resource', resource: 'llms', action: 'run', resourceName: inferMatch[1] };
+  }

  const resourceMap: Record<string, string | undefined> = {
    'servers': 'servers',
    'instances': 'instances',
    'secrets': 'secrets',
+    'secretbackends': 'secretbackends',
    'projects': 'projects',
    'templates': 'templates',
    'users': 'users',
@@ -104,6 +136,8 @@ function mapUrlToPermission(method: string, url: string): PermissionCheck {
    'mcp': 'servers',
    'prompts': 'prompts',
    'promptrequests': 'promptrequests',
+    'mcptokens': 'mcptokens',
+    'llms': 'llms',
  };

  const resource = resourceMap[segment];
@@ -116,6 +150,12 @@ function mapUrlToPermission(method: string, url: string): PermissionCheck {
    return { kind: 'resource', resource: 'promptrequests', action: 'delete', resourceName: approveMatch[1] };
  }

+  // Special case: /api/v1/mcptokens/:id/revoke → treated as 'delete' on the token.
+  const revokeMatch = url.match(/^\/api\/v1\/mcptokens\/([^/?]+)\/revoke/);
+  if (revokeMatch?.[1]) {
+    return { kind: 'resource', resource: 'mcptokens', action: 'delete', resourceName: revokeMatch[1] };
+  }
+
  // Special case: /api/v1/projects/:name/prompts/visible → view prompts
  const visiblePromptsMatch = url.match(/^\/api\/v1\/projects\/([^/?]+)\/prompts\/visible/);
  if (visiblePromptsMatch?.[1]) {
@@ -200,7 +240,7 @@ async function migrateAdminRole(rbacRepo: InstanceType<typeof RbacDefinitionRepo
    // Add operation bindings (idempotent — only for wildcard admin)
    const hasWildcard = bindings.some((b) => b['role'] === 'admin' && b['resource'] === '*');
    if (hasWildcard) {
-      const ops = ['impersonate', 'logs', 'backup', 'restore', 'audit-purge'];
+      const ops = ['impersonate', 'logs', 'backup', 'restore', 'audit-purge', 'migrate-secrets', 'rotate-secretbackend'];
      for (const op of ops) {
        if (!newBindings.some((b) => b['action'] === op)) {
          newBindings.push({ role: 'run', action: op });
@@ -251,6 +291,8 @@ async function main(): Promise<void> {
  // Repositories
  const serverRepo = new McpServerRepository(prisma);
  const secretRepo = new SecretRepository(prisma);
+  const secretBackendRepo = new SecretBackendRepository(prisma);
+  const llmRepo = new LlmRepository(prisma);
  const instanceRepo = new McpInstanceRepository(prisma);
  const projectRepo = new ProjectRepository(prisma);
  const auditLogRepo = new AuditLogRepository(prisma);
@@ -259,14 +301,22 @@ async function main(): Promise<void> {
  const rbacDefinitionRepo = new RbacDefinitionRepository(prisma);
  const userRepo = new UserRepository(prisma);
  const groupRepo = new GroupRepository(prisma);
+  const mcpTokenRepo = new McpTokenRepository(prisma);
+
+  // SecretBackend bootstrap: ensure a `plaintext` default row exists and any
+  // pre-existing `Secret` rows are pointed at it. Idempotent; safe on every startup.
+  await bootstrapSecretBackends(prisma);

  // CUID detection for RBAC name resolution
  const CUID_RE = /^c[^\s-]{8,}$/i;
  const nameResolvers: Record<string, { findById(id: string): Promise<{ name: string } | null> }> = {
    servers: serverRepo,
    secrets: secretRepo,
+    secretbackends: secretBackendRepo,
    projects: projectRepo,
    groups: groupRepo,
+    mcptokens: mcpTokenRepo,
+    llms: llmRepo,
  };

  // Migrate legacy 'admin' role → granular roles
@@ -279,9 +329,39 @@ async function main(): Promise<void> {

  // Services
  const serverService = new McpServerService(serverRepo);
-  const instanceService = new InstanceService(instanceRepo, serverRepo, orchestrator, secretRepo);
+  // SecretBackend service — needs a lazy bridge to the yet-to-be-constructed
+  // SecretService because the OpenBao driver's auth token lives in a plaintext
+  // Secret. The bridge defers the resolve until after `secretService` is
+  // assigned, breaking the circular dependency at construction time.
+  const secretResolverBridge = {
+    resolve: async (name: string, key: string): Promise<string> => secretService.resolve(name, key),
+  };
+  const secretBackendService = new SecretBackendService(secretBackendRepo, {
+    plaintext: {
+      listAllPlaintext: async () => {
+        const rows = await prisma.secret.findMany({
+          where: { backend: { type: 'plaintext' } },
+          select: { name: true, data: true },
+        });
+        return rows.map((r) => ({ name: r.name, data: r.data as Record<string, string> }));
+      },
+    },
+    secretRefResolver: secretResolverBridge,
+  });
+  const secretService = new SecretService(secretRepo, secretBackendService);
+  const secretMigrateService = new SecretMigrateService(secretRepo, secretBackendService);
+  const secretBackendRotator = new SecretBackendRotator({
+    backends: secretBackendService,
+    secrets: secretService,
+  });
+  const secretBackendRotatorLoop = new SecretBackendRotatorLoop({
+    backends: secretBackendService,
+    rotator: secretBackendRotator,
+  });
+  const llmService = new LlmService(llmRepo, secretService);
+  const llmAdapters = new LlmAdapterRegistry();
+  const instanceService = new InstanceService(instanceRepo, serverRepo, orchestrator, secretService);
  serverService.setInstanceService(instanceService);
-  const secretService = new SecretService(secretRepo);
  const projectService = new ProjectService(projectRepo, serverRepo);
  const auditLogService = new AuditLogService(auditLogRepo);
  const auditEventService = new AuditEventService(auditEventRepo);
@@ -292,6 +372,7 @@ async function main(): Promise<void> {
  const mcpProxyService = new McpProxyService(instanceRepo, serverRepo, orchestrator);
  const rbacDefinitionService = new RbacDefinitionService(rbacDefinitionRepo);
  const rbacService = new RbacService(rbacDefinitionRepo, prisma);
+  const mcpTokenService = new McpTokenService(mcpTokenRepo, projectRepo, rbacDefinitionRepo, rbacService);
  const userService = new UserService(userRepo);
  const groupService = new GroupService(groupRepo, userRepo);
  const promptRepo = new PromptRepository(prisma);
@@ -300,12 +381,30 @@ async function main(): Promise<void> {
  promptRuleRegistry.register(systemPromptVarsRule);
  const promptService = new PromptService(promptRepo, promptRequestRepo, projectRepo, promptRuleRegistry);
  const backupService = new BackupService(serverRepo, projectRepo, secretRepo, userRepo, groupRepo, rbacDefinitionRepo, promptRepo, templateRepo);
-  const restoreService = new RestoreService(serverRepo, projectRepo, secretRepo, userRepo, groupRepo, rbacDefinitionRepo, promptRepo, templateRepo);
+  const restoreService = new RestoreService(serverRepo, projectRepo, secretRepo, secretService, userRepo, groupRepo, rbacDefinitionRepo, promptRepo, templateRepo);

-  // Auth middleware for global hooks
-  const authMiddleware = createAuthMiddleware({
-    findSession: (token) => authService.findSession(token),
-  });
+  // Shared auth dependencies. Both the global auth hook and the per-route
+  // preHandler on /api/v1/mcp/proxy must know how to resolve both session
+  // bearers AND mcpctl_pat_ bearers, or mcplocal→mcpd proxy calls with a
+  // McpToken will 401 at the route layer even though the global hook accepts them.
+  const authDeps = {
+    findSession: (token: string) => authService.findSession(token),
+    findMcpToken: async (tokenHash: string) => {
+      const row = await mcpTokenRepo.findByHash(tokenHash);
+      if (row === null) return null;
+      return {
+        tokenId: row.id,
+        tokenName: row.name,
+        tokenSha: row.tokenHash,
+        projectId: row.projectId,
+        projectName: row.project.name,
+        ownerId: row.ownerId,
+        expiresAt: row.expiresAt,
+        revokedAt: row.revokedAt,
+      };
+    },
+  };
+  const authMiddleware = createAuthMiddleware(authDeps);

  // Server
  const app = await createServer(config, {
@@ -329,6 +428,8 @@ async function main(): Promise<void> {
    const url = request.url;
    // Skip auth for health, auth, and root
    if (url.startsWith('/api/v1/auth/') || url === '/healthz' || url === '/health') return;
+    // Introspection authenticates via the McpToken bearer itself — route handles its own auth.
+    if (url.startsWith('/api/v1/mcptokens/introspect')) return;
    if (!url.startsWith('/api/v1/')) return;

    // Run auth middleware
@@ -351,9 +452,28 @@ async function main(): Promise<void> {
    const saHeader = request.headers['x-service-account'];
    const serviceAccountName = typeof saHeader === 'string' ? saHeader : undefined;

+    // McpToken principal (set by authMiddleware when the bearer was mcpctl_pat_…)
+    const mcpTokenSha = request.mcpToken?.tokenSha;
+
+    // Second layer of project-scope enforcement: a McpToken principal can only
+    // hit resources inside its bound project.
+    if (request.mcpToken !== undefined) {
+      const projectMatch = url.match(/^\/api\/v1\/projects\/([^/?]+)/);
+      if (projectMatch?.[1]) {
+        let targetProjectName = projectMatch[1];
+        if (CUID_RE.test(targetProjectName)) {
+          const entity = await projectRepo.findById(targetProjectName);
+          if (entity) targetProjectName = entity.name;
+        }
+        if (targetProjectName !== request.mcpToken.projectName) {
+          return reply.code(403).send({ error: 'Token is not valid for this project' });
+        }
+      }
+    }
+
    let allowed: boolean;
    if (check.kind === 'operation') {
-      allowed = await rbacService.canRunOperation(request.userId, check.operation, serviceAccountName);
+      allowed = await rbacService.canRunOperation(request.userId, check.operation, serviceAccountName, mcpTokenSha);
    } else {
      // Resolve CUID → human name for name-scoped RBAC bindings
      if (check.resourceName !== undefined && CUID_RE.test(check.resourceName)) {
@@ -363,10 +483,10 @@ async function main(): Promise<void> {
          if (entity) check.resourceName = entity.name;
        }
      }
-      allowed = await rbacService.canAccess(request.userId, check.action, check.resource, check.resourceName, serviceAccountName);
+      allowed = await rbacService.canAccess(request.userId, check.action, check.resource, check.resourceName, serviceAccountName, mcpTokenSha);
      // Compute scope for list filtering (used by preSerialization hook)
      if (allowed && check.resourceName === undefined) {
-        request.rbacScope = await rbacService.getAllowedScope(request.userId, check.action, check.resource, serviceAccountName);
+        request.rbacScope = await rbacService.getAllowedScope(request.userId, check.action, check.resource, serviceAccountName, mcpTokenSha);
      }
    }
    if (!allowed) {
@@ -378,6 +498,27 @@ async function main(): Promise<void> {
  registerMcpServerRoutes(app, serverService, instanceService);
  registerTemplateRoutes(app, templateService);
  registerSecretRoutes(app, secretService);
+  registerSecretBackendRoutes(app, secretBackendService);
+  registerSecretBackendRotateRoutes(app, secretBackendRotator);
+  registerSecretMigrateRoutes(app, secretMigrateService);
+  registerLlmRoutes(app, llmService);
+  registerLlmInferRoutes(app, {
+    llmService,
+    adapters: llmAdapters,
+    onInferenceEvent: (event) => {
+      app.log.info({
+        event: 'llm_inference_call',
+        llm: event.llmName,
+        model: event.model,
+        type: event.type,
+        userId: event.userId,
+        tokenSha: event.tokenSha,
+        streaming: event.streaming,
+        durationMs: event.durationMs,
+        status: event.status,
+      });
+    },
+  });
  registerInstanceRoutes(app, instanceService);
  registerProjectRoutes(app, projectService);
  registerAuditLogRoutes(app, auditLogService);
@@ -388,11 +529,12 @@ async function main(): Promise<void> {
  registerMcpProxyRoutes(app, {
    mcpProxyService,
    auditLogService,
-    authDeps: { findSession: (token) => authService.findSession(token) },
+    authDeps,
  });
  registerRbacRoutes(app, rbacDefinitionService);
  registerUserRoutes(app, userService);
  registerGroupRoutes(app, groupService);
+  registerMcpTokenRoutes(app, { tokenService: mcpTokenService, projectRepo });
  registerPromptRoutes(app, promptService, projectRepo);

  // ── Git-based backup ──
@@ -505,20 +647,31 @@ async function main(): Promise<void> {
    }
  }, RECONCILE_INTERVAL_MS);

-  // Health probe runner — periodic MCP tool-call probes (like k8s livenessProbe)
+  // Health probe runner — periodic MCP probes (like k8s livenessProbe).
+  // Without explicit healthCheck.tool, probes send tools/list through
+  // McpProxyService so they traverse the exact production call path.
  const healthProbeRunner = new HealthProbeRunner(
    instanceRepo,
    serverRepo,
    orchestrator,
    { info: (msg) => app.log.info(msg), error: (obj, msg) => app.log.error(obj, msg) },
+    mcpProxyService,
  );
  healthProbeRunner.start(15_000);

+  // SecretBackend token rotator: acts on wizard-provisioned openbao backends
+  // only and is a no-op for the rest. Errors inside the loop are logged and
+  // surfaced in `describe secretbackend`; they never kill the process.
+  secretBackendRotatorLoop.start().catch((err: unknown) => {
+    app.log.error({ err }, 'secret-backend rotator loop failed to start');
+  });
+
  // Graceful shutdown
  setupGracefulShutdown(app, {
    disconnectDb: async () => {
      clearInterval(reconcileTimer);
      healthProbeRunner.stop();
+      secretBackendRotatorLoop.stop();
      gitBackup.stop();
      await prisma.$disconnect();
    },
--- a/src/mcpd/src/middleware/auth.ts
+++ b/src/mcpd/src/middleware/auth.ts
@@ -1,13 +1,41 @@
 import type { FastifyRequest, FastifyReply } from 'fastify';
+import { isMcpToken, hashToken } from '@mcpctl/shared';
+
+export interface McpTokenPrincipal {
+  tokenId: string;
+  tokenName: string;
+  tokenSha: string;
+  projectId: string;
+  projectName: string;
+  ownerId: string;
+}
+
+export interface McpTokenLookup {
+  tokenId: string;
+  tokenName: string;
+  tokenSha: string;
+  projectId: string;
+  projectName: string;
+  ownerId: string;
+  expiresAt: Date | null;
+  revokedAt: Date | null;
+}

 export interface AuthDeps {
  findSession: (token: string) => Promise<{ userId: string; expiresAt: Date } | null>;
+  /**
+   * Look up an McpToken by SHA-256 hash. Optional — when absent, Bearer tokens
+   * that look like `mcpctl_pat_…` are rejected (401).
+   */
+  findMcpToken?: (tokenHash: string) => Promise<McpTokenLookup | null>;
 }

 declare module 'fastify' {
  interface FastifyRequest {
    userId?: string;
    rbacScope?: { wildcard: boolean; names: Set<string> };
+    /** Set by the auth hook when the caller authenticated via an McpToken bearer (prefix `mcpctl_pat_`). */
+    mcpToken?: McpTokenPrincipal;
  }
 }

@@ -25,6 +53,37 @@ export function createAuthMiddleware(deps: AuthDeps) {
      return;
    }

+    // Dispatch on the prefix: `mcpctl_pat_…` → McpToken path; anything else → session path.
+    if (isMcpToken(token)) {
+      if (deps.findMcpToken === undefined) {
+        reply.code(401).send({ error: 'McpToken auth not enabled' });
+        return;
+      }
+      const row = await deps.findMcpToken(hashToken(token));
+      if (row === null) {
+        reply.code(401).send({ error: 'Invalid token' });
+        return;
+      }
+      if (row.revokedAt !== null) {
+        reply.code(401).send({ error: 'Token revoked' });
+        return;
+      }
+      if (row.expiresAt !== null && row.expiresAt < new Date()) {
+        reply.code(401).send({ error: 'Token expired' });
+        return;
+      }
+      request.userId = row.ownerId;
+      request.mcpToken = {
+        tokenId: row.tokenId,
+        tokenName: row.tokenName,
+        tokenSha: row.tokenSha,
+        projectId: row.projectId,
+        projectName: row.projectName,
+        ownerId: row.ownerId,
+      };
+      return;
+    }
+
    const session = await deps.findSession(token);
    if (session === null) {
      reply.code(401).send({ error: 'Invalid token' });
--- a/src/mcpd/src/repositories/audit-event.repository.ts
+++ b/src/mcpd/src/repositories/audit-event.repository.ts
@@ -30,6 +30,8 @@ export class AuditEventRepository implements IAuditEventRepository {
      correlationId: e.correlationId ?? null,
      parentEventId: e.parentEventId ?? null,
      userName: e.userName ?? null,
+      tokenName: e.tokenName ?? null,
+      tokenSha: e.tokenSha ?? null,
      payload: e.payload as Prisma.InputJsonValue,
    }));
    const result = await this.prisma.auditEvent.createMany({ data });
@@ -132,6 +134,8 @@ function buildWhere(filter?: AuditEventFilter): Prisma.AuditEventWhereInput {
  if (filter.serverName !== undefined) where.serverName = filter.serverName;
  if (filter.correlationId !== undefined) where.correlationId = filter.correlationId;
  if (filter.userName !== undefined) where.userName = filter.userName;
+  if (filter.tokenName !== undefined) where.tokenName = filter.tokenName;
+  if (filter.tokenSha !== undefined) where.tokenSha = filter.tokenSha;

  if (filter.from !== undefined || filter.to !== undefined) {
    const timestamp: Prisma.DateTimeFilter = {};
--- a/src/mcpd/src/repositories/index.ts
+++ b/src/mcpd/src/repositories/index.ts
@@ -15,3 +15,5 @@ export type { IGroupRepository, GroupWithMembers } from './group.repository.js';
 export { GroupRepository } from './group.repository.js';
 export type { IAuditEventRepository, AuditEventFilter, AuditEventCreateInput } from './interfaces.js';
 export { AuditEventRepository } from './audit-event.repository.js';
+export type { IMcpTokenRepository, McpTokenFilter, McpTokenWithRelations, CreateMcpTokenRepoInput } from './interfaces.js';
+export { McpTokenRepository } from './mcp-token.repository.js';
--- a/src/mcpd/src/repositories/interfaces.ts
+++ b/src/mcpd/src/repositories/interfaces.ts
@@ -1,6 +1,6 @@
-import type { McpServer, McpInstance, AuditLog, AuditEvent, Secret, InstanceStatus } from '@prisma/client';
+import type { McpServer, McpInstance, AuditLog, AuditEvent, McpToken, Secret, InstanceStatus } from '@prisma/client';
 import type { CreateMcpServerInput, UpdateMcpServerInput } from '../validation/mcp-server.schema.js';
-import type { CreateSecretInput, UpdateSecretInput } from '../validation/secret.schema.js';
+import type { SecretRepoCreateInput, SecretRepoUpdateInput } from './secret.repository.js';

 export interface IMcpServerRepository {
  findAll(): Promise<McpServer[]>;
@@ -24,8 +24,9 @@ export interface ISecretRepository {
  findAll(): Promise<Secret[]>;
  findById(id: string): Promise<Secret | null>;
  findByName(name: string): Promise<Secret | null>;
-  create(data: CreateSecretInput): Promise<Secret>;
-  update(id: string, data: UpdateSecretInput): Promise<Secret>;
+  findByBackend(backendId: string): Promise<Secret[]>;
+  create(data: SecretRepoCreateInput): Promise<Secret>;
+  update(id: string, data: SecretRepoUpdateInput): Promise<Secret>;
  delete(id: string): Promise<void>;
 }

@@ -57,6 +58,8 @@ export interface AuditEventFilter {
  serverName?: string;
  correlationId?: string;
  userName?: string;
+  tokenName?: string;
+  tokenSha?: string;
  from?: Date;
  to?: Date;
  limit?: number;
@@ -74,6 +77,8 @@ export interface AuditEventCreateInput {
  correlationId?: string;
  parentEventId?: string;
  userName?: string;
+  tokenName?: string;
+  tokenSha?: string;
  payload: Record<string, unknown>;
 }

@@ -95,3 +100,37 @@ export interface IAuditEventRepository {
  listSessions(filter?: { projectName?: string; userName?: string; from?: Date; to?: Date; limit?: number; offset?: number }): Promise<AuditSessionSummary[]>;
  countSessions(filter?: { projectName?: string; userName?: string; from?: Date; to?: Date }): Promise<number>;
 }
+
+// ── MCP Tokens ──
+
+export interface McpTokenFilter {
+  projectId?: string;
+  ownerId?: string;
+  includeRevoked?: boolean;
+}
+
+export interface CreateMcpTokenRepoInput {
+  name: string;
+  projectId: string;
+  ownerId: string;
+  tokenHash: string;
+  tokenPrefix: string;
+  description?: string;
+  expiresAt?: Date | null;
+}
+
+export type McpTokenWithRelations = McpToken & {
+  project: { id: string; name: string };
+  owner: { id: string; email: string };
+};
+
+export interface IMcpTokenRepository {
+  findAll(filter?: McpTokenFilter): Promise<McpTokenWithRelations[]>;
+  findById(id: string): Promise<McpTokenWithRelations | null>;
+  findByHash(tokenHash: string): Promise<McpTokenWithRelations | null>;
+  findByNameAndProject(name: string, projectId: string): Promise<McpTokenWithRelations | null>;
+  create(data: CreateMcpTokenRepoInput): Promise<McpTokenWithRelations>;
+  revoke(id: string): Promise<McpTokenWithRelations>;
+  touchLastUsed(id: string): Promise<void>;
+  delete(id: string): Promise<void>;
+}
--- a/src/mcpd/src/repositories/llm.repository.ts
+++ b/src/mcpd/src/repositories/llm.repository.ts
@@ -0,0 +1,89 @@
+import type { PrismaClient, Llm, Prisma } from '@prisma/client';
+
+export interface CreateLlmInput {
+  name: string;
+  type: string;
+  model: string;
+  url?: string;
+  tier?: string;
+  description?: string;
+  apiKeySecretId?: string | null;
+  apiKeySecretKey?: string | null;
+  extraConfig?: Record<string, unknown>;
+}
+
+export interface UpdateLlmInput {
+  model?: string;
+  url?: string;
+  tier?: string;
+  description?: string;
+  apiKeySecretId?: string | null;
+  apiKeySecretKey?: string | null;
+  extraConfig?: Record<string, unknown>;
+}
+
+export interface ILlmRepository {
+  findAll(): Promise<Llm[]>;
+  findById(id: string): Promise<Llm | null>;
+  findByName(name: string): Promise<Llm | null>;
+  findByTier(tier: string): Promise<Llm[]>;
+  create(data: CreateLlmInput): Promise<Llm>;
+  update(id: string, data: UpdateLlmInput): Promise<Llm>;
+  delete(id: string): Promise<void>;
+}
+
+export class LlmRepository implements ILlmRepository {
+  constructor(private readonly prisma: PrismaClient) {}
+
+  async findAll(): Promise<Llm[]> {
+    return this.prisma.llm.findMany({ orderBy: { name: 'asc' } });
+  }
+
+  async findById(id: string): Promise<Llm | null> {
+    return this.prisma.llm.findUnique({ where: { id } });
+  }
+
+  async findByName(name: string): Promise<Llm | null> {
+    return this.prisma.llm.findUnique({ where: { name } });
+  }
+
+  async findByTier(tier: string): Promise<Llm[]> {
+    return this.prisma.llm.findMany({ where: { tier }, orderBy: { name: 'asc' } });
+  }
+
+  async create(data: CreateLlmInput): Promise<Llm> {
+    return this.prisma.llm.create({
+      data: {
+        name: data.name,
+        type: data.type,
+        model: data.model,
+        url: data.url ?? '',
+        tier: data.tier ?? 'fast',
+        description: data.description ?? '',
+        apiKeySecretId: data.apiKeySecretId ?? null,
+        apiKeySecretKey: data.apiKeySecretKey ?? null,
+        extraConfig: (data.extraConfig ?? {}) as Prisma.InputJsonValue,
+      },
+    });
+  }
+
+  async update(id: string, data: UpdateLlmInput): Promise<Llm> {
+    const updateData: Prisma.LlmUpdateInput = {};
+    if (data.model !== undefined) updateData.model = data.model;
+    if (data.url !== undefined) updateData.url = data.url;
+    if (data.tier !== undefined) updateData.tier = data.tier;
+    if (data.description !== undefined) updateData.description = data.description;
+    if (data.apiKeySecretId !== undefined) {
+      updateData.apiKeySecret = data.apiKeySecretId === null
+        ? { disconnect: true }
+        : { connect: { id: data.apiKeySecretId } };
+    }
+    if (data.apiKeySecretKey !== undefined) updateData.apiKeySecretKey = data.apiKeySecretKey;
+    if (data.extraConfig !== undefined) updateData.extraConfig = data.extraConfig as Prisma.InputJsonValue;
+    return this.prisma.llm.update({ where: { id }, data: updateData });
+  }
+
+  async delete(id: string): Promise<void> {
+    await this.prisma.llm.delete({ where: { id } });
+  }
+}
--- a/src/mcpd/src/repositories/mcp-token.repository.ts
+++ b/src/mcpd/src/repositories/mcp-token.repository.ts
@@ -0,0 +1,83 @@
+import type { PrismaClient } from '@prisma/client';
+import type {
+  IMcpTokenRepository,
+  McpTokenFilter,
+  McpTokenWithRelations,
+  CreateMcpTokenRepoInput,
+} from './interfaces.js';
+
+const INCLUDE_RELATIONS = {
+  project: { select: { id: true, name: true } },
+  owner: { select: { id: true, email: true } },
+} as const;
+
+export class McpTokenRepository implements IMcpTokenRepository {
+  constructor(private readonly prisma: PrismaClient) {}
+
+  async findAll(filter?: McpTokenFilter): Promise<McpTokenWithRelations[]> {
+    const where: Record<string, unknown> = {};
+    if (filter?.projectId !== undefined) where['projectId'] = filter.projectId;
+    if (filter?.ownerId !== undefined) where['ownerId'] = filter.ownerId;
+    if (!filter?.includeRevoked) where['revokedAt'] = null;
+    return this.prisma.mcpToken.findMany({
+      where,
+      include: INCLUDE_RELATIONS,
+      orderBy: { createdAt: 'desc' },
+    }) as Promise<McpTokenWithRelations[]>;
+  }
+
+  async findById(id: string): Promise<McpTokenWithRelations | null> {
+    return this.prisma.mcpToken.findUnique({
+      where: { id },
+      include: INCLUDE_RELATIONS,
+    }) as Promise<McpTokenWithRelations | null>;
+  }
+
+  async findByHash(tokenHash: string): Promise<McpTokenWithRelations | null> {
+    return this.prisma.mcpToken.findUnique({
+      where: { tokenHash },
+      include: INCLUDE_RELATIONS,
+    }) as Promise<McpTokenWithRelations | null>;
+  }
+
+  async findByNameAndProject(name: string, projectId: string): Promise<McpTokenWithRelations | null> {
+    return this.prisma.mcpToken.findUnique({
+      where: { name_projectId: { name, projectId } },
+      include: INCLUDE_RELATIONS,
+    }) as Promise<McpTokenWithRelations | null>;
+  }
+
+  async create(data: CreateMcpTokenRepoInput): Promise<McpTokenWithRelations> {
+    return this.prisma.mcpToken.create({
+      data: {
+        name: data.name,
+        projectId: data.projectId,
+        ownerId: data.ownerId,
+        tokenHash: data.tokenHash,
+        tokenPrefix: data.tokenPrefix,
+        description: data.description ?? '',
+        expiresAt: data.expiresAt ?? null,
+      },
+      include: INCLUDE_RELATIONS,
+    }) as Promise<McpTokenWithRelations>;
+  }
+
+  async revoke(id: string): Promise<McpTokenWithRelations> {
+    return this.prisma.mcpToken.update({
+      where: { id },
+      data: { revokedAt: new Date() },
+      include: INCLUDE_RELATIONS,
+    }) as Promise<McpTokenWithRelations>;
+  }
+
+  async touchLastUsed(id: string): Promise<void> {
+    await this.prisma.mcpToken.update({
+      where: { id },
+      data: { lastUsedAt: new Date() },
+    });
+  }
+
+  async delete(id: string): Promise<void> {
+    await this.prisma.mcpToken.delete({ where: { id } });
+  }
+}
--- a/src/mcpd/src/repositories/secret-backend.repository.ts
+++ b/src/mcpd/src/repositories/secret-backend.repository.ts
@@ -0,0 +1,105 @@
+import type { PrismaClient, SecretBackend, Prisma } from '@prisma/client';
+
+export interface CreateSecretBackendInput {
+  name: string;
+  type: string;
+  config?: Record<string, unknown>;
+  isDefault?: boolean;
+  description?: string;
+}
+
+export interface UpdateSecretBackendInput {
+  config?: Record<string, unknown>;
+  isDefault?: boolean;
+  description?: string;
+  tokenMeta?: Record<string, unknown>;
+}
+
+export interface ISecretBackendRepository {
+  findAll(): Promise<SecretBackend[]>;
+  findById(id: string): Promise<SecretBackend | null>;
+  findByName(name: string): Promise<SecretBackend | null>;
+  findDefault(): Promise<SecretBackend | null>;
+  create(data: CreateSecretBackendInput): Promise<SecretBackend>;
+  update(id: string, data: UpdateSecretBackendInput): Promise<SecretBackend>;
+  /**
+   * Atomically clear `isDefault` on every row except the one with the given
+   * `id`, then mark that row as default. Used by `setDefault`.
+   */
+  setAsDefault(id: string): Promise<SecretBackend>;
+  delete(id: string): Promise<void>;
+  /** Count secrets that still reference this backend — used to guard delete. */
+  countReferencingSecrets(backendId: string): Promise<number>;
+}
+
+export class SecretBackendRepository implements ISecretBackendRepository {
+  constructor(private readonly prisma: PrismaClient) {}
+
+  async findAll(): Promise<SecretBackend[]> {
+    return this.prisma.secretBackend.findMany({ orderBy: { name: 'asc' } });
+  }
+
+  async findById(id: string): Promise<SecretBackend | null> {
+    return this.prisma.secretBackend.findUnique({ where: { id } });
+  }
+
+  async findByName(name: string): Promise<SecretBackend | null> {
+    return this.prisma.secretBackend.findUnique({ where: { name } });
+  }
+
+  async findDefault(): Promise<SecretBackend | null> {
+    return this.prisma.secretBackend.findFirst({ where: { isDefault: true } });
+  }
+
+  async create(data: CreateSecretBackendInput): Promise<SecretBackend> {
+    return this.prisma.$transaction(async (tx) => {
+      if (data.isDefault === true) {
+        await tx.secretBackend.updateMany({ where: { isDefault: true }, data: { isDefault: false } });
+      }
+      return tx.secretBackend.create({
+        data: {
+          name: data.name,
+          type: data.type,
+          config: (data.config ?? {}) as Prisma.InputJsonValue,
+          isDefault: data.isDefault ?? false,
+          description: data.description ?? '',
+        },
+      });
+    });
+  }
+
+  async update(id: string, data: UpdateSecretBackendInput): Promise<SecretBackend> {
+    return this.prisma.$transaction(async (tx) => {
+      if (data.isDefault === true) {
+        await tx.secretBackend.updateMany({
+          where: { isDefault: true, NOT: { id } },
+          data: { isDefault: false },
+        });
+      }
+      const updateData: Prisma.SecretBackendUpdateInput = {};
+      if (data.config !== undefined) updateData.config = data.config as Prisma.InputJsonValue;
+      if (data.isDefault !== undefined) updateData.isDefault = data.isDefault;
+      if (data.description !== undefined) updateData.description = data.description;
+      if (data.tokenMeta !== undefined) updateData.tokenMeta = data.tokenMeta as Prisma.InputJsonValue;
+      return tx.secretBackend.update({ where: { id }, data: updateData });
+    });
+  }
+
+  async setAsDefault(id: string): Promise<SecretBackend> {
+    return this.prisma.$transaction(async (tx) => {
+      await tx.secretBackend.updateMany({
+        where: { isDefault: true, NOT: { id } },
+        data: { isDefault: false },
+      });
+      return tx.secretBackend.update({ where: { id }, data: { isDefault: true } });
+    });
+  }
+
+  async delete(id: string): Promise<void> {
+    await this.prisma.secretBackend.delete({ where: { id } });
+  }
+
+  async countReferencingSecrets(backendId: string): Promise<number> {
+    return this.prisma.secret.count({ where: { backendId } });
+  }
+}
--- a/src/mcpd/src/repositories/secret.repository.ts
+++ b/src/mcpd/src/repositories/secret.repository.ts
@@ -1,6 +1,18 @@
-import { type PrismaClient, type Secret } from '@prisma/client';
+import { type PrismaClient, type Secret, type Prisma } from '@prisma/client';
 import type { ISecretRepository } from './interfaces.js';
-import type { CreateSecretInput, UpdateSecretInput } from '../validation/secret.schema.js';
+
+export interface SecretRepoCreateInput {
+  name: string;
+  backendId: string;
+  data?: Record<string, string>;
+  externalRef?: string;
+}
+
+export interface SecretRepoUpdateInput {
+  data?: Record<string, string>;
+  externalRef?: string;
+  backendId?: string;
+}

 export class SecretRepository implements ISecretRepository {
  constructor(private readonly prisma: PrismaClient) {}
@@ -17,20 +29,29 @@ export class SecretRepository implements ISecretRepository {
    return this.prisma.secret.findUnique({ where: { name } });
  }

-  async create(data: CreateSecretInput): Promise<Secret> {
+  async findByBackend(backendId: string): Promise<Secret[]> {
+    return this.prisma.secret.findMany({ where: { backendId }, orderBy: { name: 'asc' } });
+  }
+
+  async create(data: SecretRepoCreateInput): Promise<Secret> {
    return this.prisma.secret.create({
      data: {
        name: data.name,
-        data: data.data,
+        backendId: data.backendId,
+        data: (data.data ?? {}) as Prisma.InputJsonValue,
+        externalRef: data.externalRef ?? '',
      },
    });
  }

-  async update(id: string, data: UpdateSecretInput): Promise<Secret> {
-    return this.prisma.secret.update({
-      where: { id },
-      data: { data: data.data },
-    });
+  async update(id: string, data: SecretRepoUpdateInput): Promise<Secret> {
+    const updateData: Prisma.SecretUpdateInput = {};
+    if (data.data !== undefined) updateData.data = data.data as Prisma.InputJsonValue;
+    if (data.externalRef !== undefined) updateData.externalRef = data.externalRef;
+    if (data.backendId !== undefined) {
+      updateData.backend = { connect: { id: data.backendId } };
+    }
+    return this.prisma.secret.update({ where: { id }, data: updateData });
  }

  async delete(id: string): Promise<void> {
--- a/src/mcpd/src/routes/index.ts
+++ b/src/mcpd/src/routes/index.ts
@@ -18,3 +18,5 @@ export { registerRbacRoutes } from './rbac-definitions.js';
 export { registerUserRoutes } from './users.js';
 export { registerGroupRoutes } from './groups.js';
 export { registerAuditEventRoutes } from './audit-events.js';
+export { registerMcpTokenRoutes } from './mcp-tokens.js';
+export type { McpTokenRouteDeps } from './mcp-tokens.js';
--- a/src/mcpd/src/routes/llm-infer.ts
+++ b/src/mcpd/src/routes/llm-infer.ts
@@ -0,0 +1,145 @@
+/**
+ * POST /api/v1/llms/:name/infer
+ *
+ * OpenAI-compatible chat completions endpoint. The RBAC check runs in the
+ * global hook — this URL maps to `run:llms:<name>`, not the default
+ * `create:llms`. See `main.ts:mapUrlToPermission`.
+ *
+ * Non-streaming: resolves the Llm, dispatches to the right provider adapter,
+ * returns the OpenAI chat.completion JSON.
+ *
+ * Streaming (`stream: true`): pipes adapter-emitted chunks back as
+ * `text/event-stream`. Adapters translate provider-native SSE into OpenAI
+ * `chat.completion.chunk`s so clients can use any OpenAI SDK unchanged.
+ */
+import type { FastifyInstance, FastifyReply } from 'fastify';
+import type { LlmService } from '../services/llm.service.js';
+import type { LlmAdapterRegistry } from '../services/llm/dispatcher.js';
+import { NotFoundError } from '../services/mcp-server.service.js';
+import type { OpenAiChatRequest, InferContext } from '../services/llm/types.js';
+
+export interface LlmInferDeps {
+  llmService: LlmService;
+  adapters: LlmAdapterRegistry;
+  /** Optional hook to emit audit events — consumer may ignore. */
+  onInferenceEvent?: (event: InferenceAuditEvent) => void;
+}
+
+export interface InferenceAuditEvent {
+  kind: 'llm_inference_call';
+  llmName: string;
+  model: string;
+  type: string;
+  userId?: string | undefined;
+  tokenSha?: string | undefined;
+  streaming: boolean;
+  durationMs: number;
+  status: number;
+}
+
+export function registerLlmInferRoutes(
+  app: FastifyInstance,
+  deps: LlmInferDeps,
+): void {
+  app.post<{ Params: { name: string }; Body: OpenAiChatRequest }>(
+    '/api/v1/llms/:name/infer',
+    async (request, reply) => {
+      const started = Date.now();
+      let llm;
+      try {
+        llm = await deps.llmService.getByName(request.params.name);
+      } catch (err) {
+        if (err instanceof NotFoundError) {
+          reply.code(404);
+          return { error: err.message };
+        }
+        throw err;
+      }
+
+      const body = (request.body ?? {}) as OpenAiChatRequest;
+      if (!body.messages || body.messages.length === 0) {
+        reply.code(400);
+        return { error: 'messages is required' };
+      }
+
+      // Resolve API key (may be empty string for providers that don't take one).
+      let apiKey = '';
+      if (llm.apiKeyRef !== null) {
+        try {
+          apiKey = await deps.llmService.resolveApiKey(llm.name);
+        } catch (err) {
+          reply.code(500);
+          return { error: `Failed to resolve API key: ${err instanceof Error ? err.message : String(err)}` };
+        }
+      }
+
+      const ctx: InferContext = {
+        body,
+        modelOverride: llm.model,
+        apiKey,
+        url: llm.url,
+        extraConfig: llm.extraConfig,
+      };
+
+      const adapter = deps.adapters.get(llm.type);
+      const streaming = body.stream === true;
+
+      const audit = (status: number): void => {
+        if (deps.onInferenceEvent === undefined) return;
+        deps.onInferenceEvent({
+          kind: 'llm_inference_call',
+          llmName: llm.name,
+          model: llm.model,
+          type: llm.type,
+          userId: request.userId,
+          tokenSha: request.mcpToken?.tokenSha,
+          streaming,
+          durationMs: Date.now() - started,
+          status,
+        });
+      };
+
+      if (!streaming) {
+        try {
+          const result = await adapter.infer(ctx);
+          reply.code(result.status);
+          audit(result.status);
+          return result.body;
+        } catch (err) {
+          audit(502);
+          reply.code(502);
+          return { error: err instanceof Error ? err.message : String(err) };
+        }
+      }
+
+      // Streaming path — set SSE headers and pipe chunks.
+      reply.raw.writeHead(200, {
+        'Content-Type': 'text/event-stream',
+        'Cache-Control': 'no-cache',
+        Connection: 'keep-alive',
+        'X-Accel-Buffering': 'no',
+      });
+      try {
+        for await (const chunk of adapter.stream(ctx)) {
+          writeSseChunk(reply, chunk.data);
+          if (chunk.done === true) break;
+        }
+        audit(200);
+      } catch (err) {
+        const payload = JSON.stringify({
+          error: { message: err instanceof Error ? err.message : String(err) },
+        });
+        writeSseChunk(reply, payload);
+        writeSseChunk(reply, '[DONE]');
+        audit(502);
+      } finally {
+        reply.raw.end();
+      }
+      return reply;
+    },
+  );
+}
+
+function writeSseChunk(reply: FastifyReply, data: string): void {
+  reply.raw.write(`data: ${data}\n\n`);
+}
--- a/src/mcpd/src/routes/llms.ts
+++ b/src/mcpd/src/routes/llms.ts
@@ -0,0 +1,85 @@
+import type { FastifyInstance } from 'fastify';
+import type { LlmService } from '../services/llm.service.js';
+import { NotFoundError, ConflictError } from '../services/mcp-server.service.js';
+
+export function registerLlmRoutes(
+  app: FastifyInstance,
+  service: LlmService,
+): void {
+  app.get('/api/v1/llms', async () => {
+    return service.list();
+  });
+
+  // Accepts either CUID or human name. Used both by the CLI (which usually
+  // resolves to CUID first) and by FailoverRouter's RBAC pre-check (which
+  // hands over the user-facing name to avoid an extra round-trip).
+  app.get<{ Params: { id: string } }>('/api/v1/llms/:id', async (request, reply) => {
+    try {
+      return await getByIdOrName(service, request.params.id);
+    } catch (err) {
+      if (err instanceof NotFoundError) {
+        reply.code(404);
+        return { error: err.message };
+      }
+      throw err;
+    }
+  });
+
+  // No explicit HEAD handler: Fastify auto-derives HEAD from GET, which runs
+  // the same RBAC hook + lookup and drops the body. That's exactly what
+  // FailoverRouter wants for its "can the caller still view this Llm?" probe.
+
+  app.post('/api/v1/llms', async (request, reply) => {
+    try {
+      const row = await service.create(request.body);
+      reply.code(201);
+      return row;
+    } catch (err) {
+      if (err instanceof ConflictError) {
+        reply.code(409);
+        return { error: err.message };
+      }
+      throw err;
+    }
+  });
+
+  app.put<{ Params: { id: string } }>('/api/v1/llms/:id', async (request, reply) => {
+    try {
+      return await service.update(request.params.id, request.body);
+    } catch (err) {
+      if (err instanceof NotFoundError) {
+        reply.code(404);
+        return { error: err.message };
+      }
+      throw err;
+    }
+  });
+
+  app.delete<{ Params: { id: string } }>('/api/v1/llms/:id', async (request, reply) => {
+    try {
+      await service.delete(request.params.id);
+      reply.code(204);
+      return null;
+    } catch (err) {
+      if (err instanceof NotFoundError) {
+        reply.code(404);
+        return { error: err.message };
+      }
+      throw err;
+    }
+  });
+}
+
+const CUID_RE = /^c[a-z0-9]{24}$/i;
+
+/**
+ * Dispatch on the input's shape: anything that looks like a CUID goes to
+ * getById, everything else to getByName. Lets the same URL serve both
+ * `mcpctl describe llm <name>` and the FailoverRouter's name-based RBAC check.
+ */
+async function getByIdOrName(service: LlmService, idOrName: string) {
+  if (CUID_RE.test(idOrName)) {
+    return service.getById(idOrName);
+  }
+  return service.getByName(idOrName);
+}
--- a/src/mcpd/src/routes/mcp-tokens.ts
+++ b/src/mcpd/src/routes/mcp-tokens.ts
@@ -0,0 +1,142 @@
+import type { FastifyInstance, FastifyReply, FastifyRequest } from 'fastify';
+import { isMcpToken } from '@mcpctl/shared';
+import type { McpTokenService } from '../services/mcp-token.service.js';
+import { PermissionCeilingError } from '../services/mcp-token.service.js';
+import { NotFoundError, ConflictError } from '../services/mcp-server.service.js';
+import type { IProjectRepository } from '../repositories/project.repository.js';
+
+export interface McpTokenRouteDeps {
+  tokenService: McpTokenService;
+  projectRepo: IProjectRepository;
+}
+
+export function registerMcpTokenRoutes(app: FastifyInstance, deps: McpTokenRouteDeps): void {
+  const { tokenService, projectRepo } = deps;
+
+  // ── List ─────────────────────────────────────────────────────────────
+  app.get<{ Querystring: { projectId?: string; projectName?: string; includeRevoked?: string } }>(
+    '/api/v1/mcptokens',
+    async (request) => {
+      const { projectId, projectName, includeRevoked } = request.query;
+
+      // Allow filtering by project name for CLI ergonomics.
+      let resolvedProjectId = projectId;
+      if (resolvedProjectId === undefined && projectName !== undefined) {
+        const project = await projectRepo.findByName(projectName);
+        if (project === null) throw new NotFoundError(`Project not found: ${projectName}`);
+        resolvedProjectId = project.id;
+      }
+
+      const filter: { projectId?: string; includeRevoked?: boolean } = {};
+      if (resolvedProjectId !== undefined) filter.projectId = resolvedProjectId;
+      if (includeRevoked === 'true') filter.includeRevoked = true;
+
+      const rows = await tokenService.list(filter);
+      return rows.map(toListResponse);
+    },
+  );
+
+  // ── Describe ─────────────────────────────────────────────────────────
+  app.get<{ Params: { id: string } }>('/api/v1/mcptokens/:id', async (request) => {
+    const row = await tokenService.getById(request.params.id);
+    return toListResponse(row);
+  });
+
+  // ── Create ───────────────────────────────────────────────────────────
+  app.post('/api/v1/mcptokens', async (request, reply) => {
+    const userId = request.userId;
+    if (userId === undefined) {
+      reply.code(401);
+      return { error: 'Not authenticated' };
+    }
+
+    try {
+      // Accept projectName OR projectId for CLI ergonomics.
+      const body = (request.body ?? {}) as Record<string, unknown>;
+      if (typeof body['projectName'] === 'string' && typeof body['projectId'] !== 'string') {
+        const project = await projectRepo.findByName(body['projectName']);
+        if (project === null) throw new NotFoundError(`Project not found: ${body['projectName']}`);
+        body['projectId'] = project.id;
+      }
+
+      const result = await tokenService.create(userId, body);
+      reply.code(201);
+      return {
+        ...toListResponse(result.mcpToken),
+        token: result.raw,
+      };
+    } catch (err) {
+      if (err instanceof NotFoundError) {
+        reply.code(404);
+        return { error: err.message };
+      }
+      if (err instanceof ConflictError) {
+        reply.code(409);
+        return { error: err.message };
+      }
+      if (err instanceof PermissionCeilingError) {
+        reply.code(403);
+        return { error: err.message };
+      }
+      throw err;
+    }
+  });
+
+  // ── Revoke (soft-delete) ────────────────────────────────────────────
+  app.post<{ Params: { id: string } }>('/api/v1/mcptokens/:id/revoke', async (request) => {
+    const row = await tokenService.revoke(request.params.id);
+    return toListResponse(row);
+  });
+
+  // ── Delete (hard) ────────────────────────────────────────────────────
+  app.delete<{ Params: { id: string } }>('/api/v1/mcptokens/:id', async (request, reply) => {
+    await tokenService.delete(request.params.id);
+    reply.code(204);
+  });
+
+  // ── Introspect ───────────────────────────────────────────────────────
+  // Called by mcplocal's HTTP-mode auth preHandler to resolve a raw bearer
+  // to principal info. Accepts a McpToken bearer directly — bypasses the
+  // session-auth path.
+  app.get('/api/v1/mcptokens/introspect', async (request: FastifyRequest, reply: FastifyReply) => {
+    const header = request.headers.authorization;
+    if (header === undefined || !header.startsWith('Bearer ')) {
+      reply.code(401);
+      return { ok: false, error: 'Missing Authorization' };
+    }
+    const token = header.slice(7);
+    if (!isMcpToken(token)) {
+      reply.code(401);
+      return { ok: false, error: 'Not a mcptoken bearer' };
+    }
+    const result = await tokenService.introspectRaw(token);
+    if (!result.ok) {
+      reply.code(401);
+    }
+    return result;
+  });
+}
+
+function toListResponse(row: import('../repositories/interfaces.js').McpTokenWithRelations): Record<string, unknown> {
+  return {
+    id: row.id,
+    name: row.name,
+    projectId: row.projectId,
+    projectName: row.project.name,
+    tokenPrefix: row.tokenPrefix,
+    ownerId: row.ownerId,
+    ownerEmail: row.owner.email,
+    description: row.description,
+    createdAt: row.createdAt,
+    expiresAt: row.expiresAt,
+    lastUsedAt: row.lastUsedAt,
+    revokedAt: row.revokedAt,
+    status: statusOf(row),
+  };
+}
+
+function statusOf(row: import('../repositories/interfaces.js').McpTokenWithRelations): 'active' | 'revoked' | 'expired' {
+  if (row.revokedAt !== null) return 'revoked';
+  if (row.expiresAt !== null && row.expiresAt < new Date()) return 'expired';
+  return 'active';
+}
--- a/src/mcpd/src/routes/secret-backend-rotate.ts
+++ b/src/mcpd/src/routes/secret-backend-rotate.ts
@@ -0,0 +1,29 @@
+/**
+ * POST /api/v1/secretbackends/:id/rotate — force an immediate rotation.
+ *
+ * Used by the wizard (final verify step) + operators troubleshooting a
+ * stale backend. RBAC handled in the global hook via the operation
+ * `rotate-secretbackend` (see `main.ts:mapUrlToPermission`).
+ */
+import type { FastifyInstance } from 'fastify';
+import type { SecretBackendRotator } from '../services/secret-backend-rotator.service.js';
+import { NotFoundError } from '../services/mcp-server.service.js';
+
+export function registerSecretBackendRotateRoutes(
+  app: FastifyInstance,
+  rotator: SecretBackendRotator,
+): void {
+  app.post<{ Params: { id: string } }>('/api/v1/secretbackends/:id/rotate', async (request, reply) => {
+    try {
+      const tokenMeta = await rotator.rotateOne(request.params.id);
+      return { ok: true, tokenMeta };
+    } catch (err) {
+      if (err instanceof NotFoundError) {
+        reply.code(404);
+        return { error: err.message };
+      }
+      reply.code(502);
+      return { error: err instanceof Error ? err.message : String(err) };
+    }
+  });
+}
--- a/src/mcpd/src/routes/secret-backends.ts
+++ b/src/mcpd/src/routes/secret-backends.ts
@@ -0,0 +1,89 @@
+import type { FastifyInstance } from 'fastify';
+import type { SecretBackendService } from '../services/secret-backend.service.js';
+import { SecretBackendInUseError } from '../services/secret-backend.service.js';
+import { NotFoundError, ConflictError } from '../services/mcp-server.service.js';
+
+export function registerSecretBackendRoutes(
+  app: FastifyInstance,
+  service: SecretBackendService,
+): void {
+  app.get('/api/v1/secretbackends', async () => {
+    const rows = await service.list();
+    return rows.map(redactConfig);
+  });
+
+  app.get<{ Params: { id: string } }>('/api/v1/secretbackends/:id', async (request) => {
+    const row = await service.getById(request.params.id);
+    return redactConfig(row);
+  });
+
+  app.post('/api/v1/secretbackends', async (request, reply) => {
+    try {
+      const row = await service.create(request.body as {
+        name: string;
+        type: string;
+        config?: Record<string, unknown>;
+        isDefault?: boolean;
+        description?: string;
+      });
+      reply.code(201);
+      return redactConfig(row);
+    } catch (err) {
+      if (err instanceof ConflictError) {
+        reply.code(409);
+        return { error: err.message };
+      }
+      throw err;
+    }
+  });
+
+  app.put<{ Params: { id: string } }>('/api/v1/secretbackends/:id', async (request) => {
+    const row = await service.update(request.params.id, request.body as {
+      config?: Record<string, unknown>;
+      isDefault?: boolean;
+      description?: string;
+    });
+    return redactConfig(row);
+  });
+
+  app.post<{ Params: { id: string } }>('/api/v1/secretbackends/:id/default', async (request) => {
+    const row = await service.setDefault(request.params.id);
+    return redactConfig(row);
+  });
+
+  app.delete<{ Params: { id: string } }>('/api/v1/secretbackends/:id', async (request, reply) => {
+    try {
+      await service.delete(request.params.id);
+      reply.code(204);
+      return null;
+    } catch (err) {
+      if (err instanceof SecretBackendInUseError) {
+        reply.code(409);
+        return { error: err.message };
+      }
+      if (err instanceof NotFoundError) {
+        reply.code(404);
+        return { error: err.message };
+      }
+      throw err;
+    }
+  });
+}
+
+/**
+ * Mask every string value in `config` whose key looks like a credential
+ * (matches token, secret, password, or key) with '***'; other values pass
+ * through unchanged. Prevents accidental exposure via GET responses.
+ */
+function redactConfig<T extends { config: unknown }>(row: T): T {
+  const config = (row.config ?? {}) as Record<string, unknown>;
+  const cleaned: Record<string, unknown> = {};
+  for (const [k, v] of Object.entries(config)) {
+    if (/token|secret|password|key/i.test(k) && typeof v === 'string') {
+      cleaned[k] = '***';
+    } else {
+      cleaned[k] = v;
+    }
+  }
+  return { ...row, config: cleaned };
+}
--- a/src/mcpd/src/routes/secret-migrate.ts
+++ b/src/mcpd/src/routes/secret-migrate.ts
@@ -0,0 +1,41 @@
+import type { FastifyInstance } from 'fastify';
+import type { SecretMigrateService } from '../services/secret-migrate.service.js';
+
+export function registerSecretMigrateRoutes(
+  app: FastifyInstance,
+  service: SecretMigrateService,
+): void {
+  /**
+   * POST /api/v1/secrets/migrate
+   *   body: { from: string, to: string, names?: string[], keepSource?: boolean, dryRun?: boolean }
+   *   RBAC: operation `migrate-secrets` (role:run).
+   */
+  app.post<{
+    Body: {
+      from: string;
+      to: string;
+      names?: string[];
+      keepSource?: boolean;
+      dryRun?: boolean;
+    };
+  }>('/api/v1/secrets/migrate', async (request, reply) => {
+    const { from, to, names, keepSource, dryRun } = (request.body ?? {}) as Partial<typeof request.body>;
+    if (!from || !to) {
+      reply.code(400);
+      return { error: 'from and to are required' };
+    }
+
+    if (dryRun === true) {
+      const options: Parameters<SecretMigrateService['dryRun']>[0] = { from, to };
+      if (names !== undefined) options.names = names;
+      if (keepSource !== undefined) options.keepSource = keepSource;
+      const secrets = await service.dryRun(options);
+      return { dryRun: true, candidates: secrets.map((s) => ({ id: s.id, name: s.name })) };
+    }
+
+    const options: Parameters<SecretMigrateService['migrate']>[0] = { from, to };
+    if (names !== undefined) options.names = names;
+    if (keepSource !== undefined) options.keepSource = keepSource;
+    return service.migrate(options);
+  });
+}
--- a/src/mcpd/src/services/audit-event.service.ts
+++ b/src/mcpd/src/services/audit-event.service.ts
@@ -9,6 +9,8 @@ export interface AuditEventQueryParams {
  serverName?: string;
  correlationId?: string;
  userName?: string;
+  tokenName?: string;
+  tokenSha?: string;
  from?: string;
  to?: string;
  limit?: number;
@@ -71,6 +73,8 @@ export class AuditEventService {
    if (params.serverName !== undefined) filter.serverName = params.serverName;
    if (params.correlationId !== undefined) filter.correlationId = params.correlationId;
    if (params.userName !== undefined) filter.userName = params.userName;
+    if (params.tokenName !== undefined) filter.tokenName = params.tokenName;
+    if (params.tokenSha !== undefined) filter.tokenSha = params.tokenSha;
    if (params.from !== undefined) filter.from = new Date(params.from);
    if (params.to !== undefined) filter.to = new Date(params.to);
    if (params.limit !== undefined) filter.limit = params.limit;
--- a/src/mcpd/src/services/backup/restore-service.ts
+++ b/src/mcpd/src/services/backup/restore-service.ts
@@ -6,6 +6,7 @@ import type { IRbacDefinitionRepository } from '../../repositories/rbac-definiti
 import type { IPromptRepository } from '../../repositories/prompt.repository.js';
 import type { ITemplateRepository } from '../../repositories/template.repository.js';
 import type { RbacRoleBinding } from '../../validation/rbac-definition.schema.js';
+import type { SecretService } from '../secret.service.js';
 import { decrypt } from './crypto.js';
 import type { BackupBundle } from './backup-service.js';

@@ -41,6 +42,7 @@ export class RestoreService {
    private serverRepo: IMcpServerRepository,
    private projectRepo: IProjectRepository,
    private secretRepo: ISecretRepository,
+    private secretService: SecretService,
    private userRepo?: IUserRepository,
    private groupRepo?: IGroupRepository,
    private rbacRepo?: IRbacDefinitionRepository,
@@ -125,16 +127,13 @@ export class RestoreService {
            result.secretsSkipped++;
            continue;
          }
-          // overwrite
-          await this.secretRepo.update(existing.id, { data: secret.data });
+          // overwrite — route through SecretService so backend dispatch applies.
+          await this.secretService.update(existing.id, { data: secret.data });
          result.secretsCreated++;
          continue;
        }

-        await this.secretRepo.create({
-          name: secret.name,
-          data: secret.data,
-        });
+        await this.secretService.create({ name: secret.name, data: secret.data });
        result.secretsCreated++;
      } catch (err) {
        result.errors.push(`Failed to restore secret "${secret.name}": ${err instanceof Error ? err.message : String(err)}`);
--- a/src/mcpd/src/services/env-resolver.ts
+++ b/src/mcpd/src/services/env-resolver.ts
@@ -1,42 +1,44 @@
 import type { McpServer } from '@prisma/client';
-import type { ISecretRepository } from '../repositories/interfaces.js';
 import type { ServerEnvEntry } from '../validation/mcp-server.schema.js';

+/**
+ * Minimal dependency surface for the env resolver: anything that can turn a
+ * (secretName, key) pair into a string. Matches `SecretService.resolve()` so
+ * resolution now flows through the configured SecretBackend driver instead
+ * of reading `Secret.data` directly.
+ */
+export interface SecretResolver {
+  resolve(secretName: string, key: string): Promise<string>;
+}
+
 /**
 * Resolve a server's env entries into a flat key-value map.
 * - Inline `value` entries are used directly.
- * - `valueFrom.secretRef` entries are looked up from the secret repository.
+ * - `valueFrom.secretRef` entries are looked up through the resolver.
 * Throws if a referenced secret or key is missing.
 */
 export async function resolveServerEnv(
  server: McpServer,
-  secretRepo: ISecretRepository,
+  resolver: SecretResolver,
 ): Promise<Record<string, string>> {
  const entries = server.env as ServerEnvEntry[];
  if (!entries || entries.length === 0) return {};

  const result: Record<string, string> = {};
-  const secretCache = new Map<string, Record<string, string>>();

  for (const entry of entries) {
    if (entry.value !== undefined) {
      result[entry.name] = entry.value;
    } else if (entry.valueFrom?.secretRef) {
      const { name: secretName, key } = entry.valueFrom.secretRef;
-
-      if (!secretCache.has(secretName)) {
-        const secret = await secretRepo.findByName(secretName);
-        if (!secret) {
-          throw new Error(`Secret '${secretName}' not found (referenced by server '${server.name}' env '${entry.name}')`);
-        }
-        secretCache.set(secretName, secret.data as Record<string, string>);
+      try {
+        result[entry.name] = await resolver.resolve(secretName, key);
+      } catch (err) {
+        const msg = err instanceof Error ? err.message : String(err);
+        throw new Error(
+          `Cannot resolve secret for server '${server.name}' env '${entry.name}': ${msg}`,
+        );
      }
-
-      const data = secretCache.get(secretName)!;
-      if (!(key in data)) {
-        throw new Error(`Key '${key}' not found in secret '${secretName}' (referenced by server '${server.name}' env '${entry.name}')`);
-      }
-      result[entry.name] = data[key]!;
    }
  }

--- a/src/mcpd/src/services/health-probe.service.ts
+++ b/src/mcpd/src/services/health-probe.service.ts
@@ -1,15 +1,24 @@
 import type { McpServer, McpInstance } from '@prisma/client';
 import type { IMcpInstanceRepository, IMcpServerRepository } from '../repositories/interfaces.js';
 import type { McpOrchestrator } from './orchestrator.js';
+import type { McpProxyService } from './mcp-proxy-service.js';

 export interface HealthCheckSpec {
-  tool: string;
+  /** When set, probe sends initialize + tools/call (readiness). When omitted, probe sends tools/list only (liveness). */
+  tool?: string;
  arguments?: Record<string, unknown>;
  intervalSeconds?: number;
  timeoutSeconds?: number;
  failureThreshold?: number;
 }

+/** Default liveness probe applied to any RUNNING instance whose server has no explicit healthCheck. */
+export const DEFAULT_HEALTH_CHECK: HealthCheckSpec = {
+  intervalSeconds: 30,
+  timeoutSeconds: 8,
+  failureThreshold: 3,
+};
+
 export interface ProbeResult {
  healthy: boolean;
  latencyMs: number;
@@ -39,6 +48,8 @@ export class HealthProbeRunner {
    private serverRepo: IMcpServerRepository,
    private orchestrator: McpOrchestrator,
    private logger?: { info: (msg: string) => void; error: (obj: unknown, msg: string) => void },
+    /** Used for liveness probes (no explicit tool) — routes tools/list through the real production path. */
+    private mcpProxyService?: McpProxyService,
  ) {}

  /** Start the periodic probe loop. Runs every `tickIntervalMs` (default 15s). */
@@ -75,8 +86,8 @@ export class HealthProbeRunner {
        server = s;
      }

-      const healthCheck = server.healthCheck as HealthCheckSpec | null;
-      if (!healthCheck) continue;
+      // Any server without an explicit healthCheck gets the default liveness probe.
+      const healthCheck: HealthCheckSpec = (server.healthCheck as HealthCheckSpec | null) ?? DEFAULT_HEALTH_CHECK;

      const intervalMs = (healthCheck.intervalSeconds ?? 60) * 1000;
      const state = this.probeStates.get(inst.id);
@@ -111,10 +122,18 @@ export class HealthProbeRunner {
    let result: ProbeResult;

    try {
-      if (server.transport === 'SSE' || server.transport === 'STREAMABLE_HTTP') {
-        result = await this.probeHttp(instance, server, healthCheck, timeoutMs);
+      if (healthCheck.tool === undefined) {
+        // Liveness probe: send tools/list through the real production path.
+        // Mirrors exactly what mcplocal/client calls do, so synthetic and real
+        // failures converge on the same signal.
+        result = await this.probeLiveness(server, timeoutMs);
      } else {
-        result = await this.probeStdio(instance, server, healthCheck, timeoutMs);
+        const readinessCheck = healthCheck as HealthCheckSpec & { tool: string };
+        if (server.transport === 'SSE' || server.transport === 'STREAMABLE_HTTP') {
+          result = await this.probeHttp(instance, server, readinessCheck, timeoutMs);
+        } else {
+          result = await this.probeStdio(instance, server, readinessCheck, timeoutMs);
+        }
      }
    } catch (err) {
      result = {
@@ -169,11 +188,47 @@ export class HealthProbeRunner {
    return result;
  }

+  /**
+   * Liveness probe — sends tools/list via McpProxyService so the probe traverses
+   * the exact code path production clients use. Works uniformly across every
+   * transport (STDIO exec/attach, SSE, Streamable HTTP, external).
+   */
+  private async probeLiveness(server: McpServer, timeoutMs: number): Promise<ProbeResult> {
+    const start = Date.now();
+    if (!this.mcpProxyService) {
+      return { healthy: false, latencyMs: 0, message: 'mcpProxyService not wired — cannot run default liveness probe' };
+    }
+
+    const deadline = new Promise<ProbeResult>((resolve) => {
+      setTimeout(() => resolve({
+        healthy: false,
+        latencyMs: timeoutMs,
+        message: `Liveness probe timed out after ${timeoutMs}ms`,
+      }), timeoutMs);
+    });
+
+    const probe = this.mcpProxyService.execute({ serverId: server.id, method: 'tools/list' })
+      .then((response): ProbeResult => {
+        const latencyMs = Date.now() - start;
+        if (response.error) {
+          return { healthy: false, latencyMs, message: response.error.message ?? 'tools/list error' };
+        }
+        return { healthy: true, latencyMs, message: 'ok' };
+      })
+      .catch((err: unknown): ProbeResult => ({
+        healthy: false,
+        latencyMs: Date.now() - start,
+        message: err instanceof Error ? err.message : String(err),
+      }));
+
+    return Promise.race([probe, deadline]);
+  }
+
  /** Probe an HTTP/SSE MCP server by sending a JSON-RPC tool call. */
  private async probeHttp(
    instance: McpInstance,
    server: McpServer,
-    healthCheck: HealthCheckSpec,
+    healthCheck: HealthCheckSpec & { tool: string },
    timeoutMs: number,
  ): Promise<ProbeResult> {
    if (!instance.containerId) {
@@ -205,7 +260,7 @@ export class HealthProbeRunner {
   */
  private async probeStreamableHttp(
    baseUrl: string,
-    healthCheck: HealthCheckSpec,
+    healthCheck: HealthCheckSpec & { tool: string },
    timeoutMs: number,
  ): Promise<ProbeResult> {
    const start = Date.now();
@@ -274,7 +329,7 @@ export class HealthProbeRunner {
   */
  private async probeSse(
    baseUrl: string,
-    healthCheck: HealthCheckSpec,
+    healthCheck: HealthCheckSpec & { tool: string },
    timeoutMs: number,
  ): Promise<ProbeResult> {
    const start = Date.now();
@@ -415,7 +470,7 @@ export class HealthProbeRunner {
  private async probeStdio(
    instance: McpInstance,
    server: McpServer,
-    healthCheck: HealthCheckSpec,
+    healthCheck: HealthCheckSpec & { tool: string },
    timeoutMs: number,
  ): Promise<ProbeResult> {
    if (!instance.containerId) {
--- a/src/mcpd/src/services/index.ts
+++ b/src/mcpd/src/services/index.ts
@@ -34,3 +34,5 @@ export { UserService } from './user.service.js';
 export { GroupService } from './group.service.js';
 export { AuditEventService } from './audit-event.service.js';
 export type { AuditEventQueryParams } from './audit-event.service.js';
+export { McpTokenService, PermissionCeilingError } from './mcp-token.service.js';
+export type { CreateMcpTokenResult, IntrospectResult } from './mcp-token.service.js';
--- a/src/mcpd/src/services/instance.service.ts
+++ b/src/mcpd/src/services/instance.service.ts
@@ -1,8 +1,8 @@
 import type { McpInstance } from '@prisma/client';
-import type { IMcpInstanceRepository, IMcpServerRepository, ISecretRepository } from '../repositories/interfaces.js';
+import type { IMcpInstanceRepository, IMcpServerRepository } from '../repositories/interfaces.js';
 import type { McpOrchestrator, ContainerSpec, ContainerInfo } from './orchestrator.js';
 import { NotFoundError } from './mcp-server.service.js';
-import { resolveServerEnv } from './env-resolver.js';
+import { resolveServerEnv, type SecretResolver } from './env-resolver.js';

 /** Runner images for package-based MCP servers, keyed by runtime name. */
 const RUNNER_IMAGES: Record<string, string> = {
@@ -26,7 +26,7 @@ export class InstanceService {
    private instanceRepo: IMcpInstanceRepository,
    private serverRepo: IMcpServerRepository,
    private orchestrator: McpOrchestrator,
-    private secretRepo?: ISecretRepository,
+    private secretResolver?: SecretResolver,
  ) {}

  async list(serverId?: string): Promise<McpInstance[]> {
@@ -284,9 +284,9 @@ export class InstanceService {
      }

      // Resolve env vars from inline values and secret refs
-      if (this.secretRepo) {
+      if (this.secretResolver) {
        try {
-          const resolvedEnv = await resolveServerEnv(server, this.secretRepo);
+          const resolvedEnv = await resolveServerEnv(server, this.secretResolver);
          if (Object.keys(resolvedEnv).length > 0) {
            spec.env = resolvedEnv;
          }
--- a/src/mcpd/src/services/llm.service.ts
+++ b/src/mcpd/src/services/llm.service.ts
@@ -0,0 +1,180 @@
+/**
+ * LlmService — CRUD over `Llm` rows plus credential resolution.
+ *
+ * Credentials are stored by reference: the row carries `(apiKeySecretId,
+ * apiKeySecretKey)`. Callers that need the raw key (the inference proxy, once
+ * it lands in Phase 2) call `resolveApiKey()`, which reads through the
+ * SecretService (whose own backend dispatch transparently hits plaintext or
+ * OpenBao as configured).
+ *
+ * The CLI/API accepts `apiKeyRef: { name, key }` — the service translates
+ * that to the FK pair.
+ */
+import type { Llm } from '@prisma/client';
+import type { ILlmRepository } from '../repositories/llm.repository.js';
+import type { SecretService } from './secret.service.js';
+import {
+  CreateLlmSchema,
+  UpdateLlmSchema,
+  type CreateLlmInput,
+  type ApiKeyRef,
+} from '../validation/llm.schema.js';
+import { NotFoundError, ConflictError } from './mcp-server.service.js';
+
+/** Shape returned by API layer — merges DB row with a human-readable apiKeyRef. */
+export interface LlmView {
+  id: string;
+  name: string;
+  type: string;
+  model: string;
+  url: string;
+  tier: string;
+  description: string;
+  apiKeyRef: ApiKeyRef | null;
+  extraConfig: Record<string, unknown>;
+  version: number;
+  createdAt: Date;
+  updatedAt: Date;
+}
+
+export class LlmService {
+  constructor(
+    private readonly repo: ILlmRepository,
+    private readonly secrets: SecretService,
+  ) {}
+
+  async list(): Promise<LlmView[]> {
+    const rows = await this.repo.findAll();
+    return Promise.all(rows.map((r) => this.toView(r)));
+  }
+
+  async getById(id: string): Promise<LlmView> {
+    const row = await this.repo.findById(id);
+    if (row === null) throw new NotFoundError(`Llm not found: ${id}`);
+    return this.toView(row);
+  }
+
+  async getByName(name: string): Promise<LlmView> {
+    const row = await this.repo.findByName(name);
+    if (row === null) throw new NotFoundError(`Llm not found: ${name}`);
+    return this.toView(row);
+  }
+
+  async create(input: unknown): Promise<LlmView> {
+    const data = CreateLlmSchema.parse(input);
+    const existing = await this.repo.findByName(data.name);
+    if (existing !== null) throw new ConflictError(`Llm already exists: ${data.name}`);
+
+    const apiKeyFields = await this.resolveApiKeyRefToIds(data.apiKeyRef);
+    const row = await this.repo.create({
+      name: data.name,
+      type: data.type,
+      model: data.model,
+      url: data.url ?? '',
+      tier: data.tier,
+      description: data.description,
+      apiKeySecretId: apiKeyFields.id,
+      apiKeySecretKey: apiKeyFields.key,
+      extraConfig: data.extraConfig,
+    });
+    return this.toView(row);
+  }
+
+  async update(id: string, input: unknown): Promise<LlmView> {
+    const data = UpdateLlmSchema.parse(input);
+    await this.getById(id);
+
+    const updateFields: Parameters<ILlmRepository['update']>[1] = {};
+    if (data.model !== undefined) updateFields.model = data.model;
+    if (data.url !== undefined) updateFields.url = data.url;
+    if (data.tier !== undefined) updateFields.tier = data.tier;
+    if (data.description !== undefined) updateFields.description = data.description;
+    if (data.extraConfig !== undefined) updateFields.extraConfig = data.extraConfig;
+
+    // apiKeyRef: null → explicit unlink; object → replace; undefined → leave alone.
+    if (data.apiKeyRef !== undefined) {
+      if (data.apiKeyRef === null) {
+        updateFields.apiKeySecretId = null;
+        updateFields.apiKeySecretKey = null;
+      } else {
+        const resolved = await this.resolveApiKeyRefToIds(data.apiKeyRef);
+        updateFields.apiKeySecretId = resolved.id;
+        updateFields.apiKeySecretKey = resolved.key;
+      }
+    }
+
+    const row = await this.repo.update(id, updateFields);
+    return this.toView(row);
+  }
+
+  async delete(id: string): Promise<void> {
+    await this.getById(id);
+    await this.repo.delete(id);
+  }
+
+  /**
+   * Return the raw API key string for a given Llm. Called by the inference
+   * proxy in Phase 2. Throws NotFoundError if the Llm has no apiKeyRef, or the
+   * referenced secret/key doesn't exist.
+   */
+  async resolveApiKey(llmName: string): Promise<string> {
+    const row = await this.repo.findByName(llmName);
+    if (row === null) throw new NotFoundError(`Llm not found: ${llmName}`);
+    if (row.apiKeySecretId === null || row.apiKeySecretKey === null) {
+      throw new NotFoundError(`Llm '${llmName}' has no apiKeyRef configured`);
+    }
+    const secret = await this.secrets.getById(row.apiKeySecretId);
+    const data = await this.secrets.resolveData(secret);
+    const value = data[row.apiKeySecretKey];
+    if (value === undefined) {
+      throw new NotFoundError(`Secret '${secret.name}' has no key '${row.apiKeySecretKey}'`);
+    }
+    return value;
+  }
+
+  private async resolveApiKeyRefToIds(ref: ApiKeyRef | undefined): Promise<{ id: string | null; key: string | null }> {
+    if (ref === undefined) return { id: null, key: null };
+    const secret = await this.secrets.getByName(ref.name);
+    return { id: secret.id, key: ref.key };
+  }
+
+  private async toView(row: Llm): Promise<LlmView> {
+    let apiKeyRef: ApiKeyRef | null = null;
+    if (row.apiKeySecretId !== null && row.apiKeySecretKey !== null) {
+      const secret = await this.secrets.getById(row.apiKeySecretId).catch(() => null);
+      if (secret !== null) {
+        apiKeyRef = { name: secret.name, key: row.apiKeySecretKey };
+      }
+    }
+    return {
+      id: row.id,
+      name: row.name,
+      type: row.type,
+      model: row.model,
+      url: row.url,
+      tier: row.tier,
+      description: row.description,
+      apiKeyRef,
+      extraConfig: row.extraConfig as Record<string, unknown>,
+      version: row.version,
+      createdAt: row.createdAt,
+      updatedAt: row.updatedAt,
+    };
+  }
+
+  // ── Backup/restore helpers ──
+
+  async upsertByName(input: CreateLlmInput): Promise<LlmView> {
+    const existing = await this.repo.findByName(input.name);
+    if (existing !== null) {
+      return this.update(existing.id, input);
+    }
+    return this.create(input);
+  }
+
+  async deleteByName(name: string): Promise<void> {
+    const row = await this.repo.findByName(name);
+    if (row === null) return;
+    await this.delete(row.id);
+  }
+}
--- a/src/mcpd/src/services/llm/adapters/anthropic.ts
+++ b/src/mcpd/src/services/llm/adapters/anthropic.ts
@@ -0,0 +1,256 @@
+/**
+ * Anthropic adapter — translates between OpenAI chat/completions format and
+ * the Anthropic Messages API (`POST /v1/messages`).
+ *
+ * Key differences we translate:
+ *   - OpenAI `role: 'system'` messages become a top-level `system` string.
+ *   - Anthropic returns `content: [{ type: 'text', text }]` — we join into
+ *     OpenAI's `content: "…"` string.
+ *   - Streaming: Anthropic emits a sequence of
+ *     `message_start / content_block_{start,delta,stop} / message_delta /
+ *     message_stop` events. We translate those to OpenAI
+ *     `chat.completion.chunk` deltas.
+ *
+ * This adapter implements the subset needed for plain-text chat — tool-use
+ * translation is intentionally left out for this phase; agents that need tool
+ * calling should target an OpenAI-compatible provider until the translator
+ * covers it.
+ */
+import type {
+  LlmAdapter,
+  InferContext,
+  NonStreamingResult,
+  StreamingChunk,
+  AdapterDeps,
+  OpenAiMessage,
+} from '../types.js';
+
+const DEFAULT_ANTHROPIC_URL = 'https://api.anthropic.com';
+const ANTHROPIC_VERSION = '2023-06-01';
+
+interface AnthropicMessageResponse {
+  id: string;
+  model: string;
+  role: 'assistant';
+  content: Array<{ type: 'text'; text: string } | { type: string; [k: string]: unknown }>;
+  stop_reason?: string;
+  usage?: { input_tokens: number; output_tokens: number };
+}
+
+export class AnthropicAdapter implements LlmAdapter {
+  readonly kind = 'anthropic';
+  private readonly fetchImpl: typeof globalThis.fetch;
+
+  constructor(deps: AdapterDeps = {}) {
+    this.fetchImpl = deps.fetch ?? globalThis.fetch;
+  }
+
+  async infer(ctx: InferContext): Promise<NonStreamingResult> {
+    const url = (ctx.url !== '' ? ctx.url : DEFAULT_ANTHROPIC_URL).replace(/\/+$/, '');
+    const body = this.toAnthropicRequest(ctx, false);
+    const res = await this.fetchImpl(`${url}/v1/messages`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    if (!res.ok) {
+      const text = await res.text().catch(() => '');
+      return {
+        status: res.status,
+        body: { error: { message: `anthropic: HTTP ${String(res.status)} ${text}` } },
+      };
+    }
+    const anth = await res.json() as AnthropicMessageResponse;
+    return { status: 200, body: this.toOpenAiResponse(anth) };
+  }
+
+  async *stream(ctx: InferContext): AsyncGenerator<StreamingChunk> {
+    const url = (ctx.url !== '' ? ctx.url : DEFAULT_ANTHROPIC_URL).replace(/\/+$/, '');
+    const body = this.toAnthropicRequest(ctx, true);
+    const res = await this.fetchImpl(`${url}/v1/messages`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    if (!res.ok || res.body === null) {
+      const text = await res.text().catch(() => '');
+      throw new Error(`anthropic stream: HTTP ${String(res.status)} ${text}`);
+    }
+
+    const id = `chatcmpl-${cryptoNonce()}`;
+    const model = body.model;
+    const created = Math.floor(Date.now() / 1000);
+
+    // Parse Anthropic SSE. Each event is `event: <name>\ndata: <json>\n\n`.
+    const decoder = new TextDecoder();
+    let buf = '';
+    const reader = res.body.getReader();
+    let emittedFirst = false;
+
+    const baseChunk = (delta: Record<string, unknown>, finishReason?: string): string => {
+      const chunk = {
+        id,
+        object: 'chat.completion.chunk',
+        created,
+        model,
+        choices: [{
+          index: 0,
+          delta,
+          finish_reason: finishReason ?? null,
+        }],
+      };
+      return JSON.stringify(chunk);
+    };
+
+    try {
+      // eslint-disable-next-line no-constant-condition
+      while (true) {
+        const { value, done } = await reader.read();
+        if (done) break;
+        buf += decoder.decode(value, { stream: true });
+
+        let idx: number;
+        while ((idx = buf.indexOf('\n\n')) !== -1) {
+          const rawEvent = buf.slice(0, idx);
+          buf = buf.slice(idx + 2);
+          const parsed = parseSseEvent(rawEvent);
+          if (parsed === null) continue;
+          const { event, data } = parsed;
+
+          if (event === 'content_block_delta') {
+            const textDelta = (data as { delta?: { type?: string; text?: string } }).delta;
+            if (textDelta?.type === 'text_delta' && typeof textDelta.text === 'string') {
+              if (!emittedFirst) {
+                yield { data: baseChunk({ role: 'assistant', content: '' }) };
+                emittedFirst = true;
+              }
+              yield { data: baseChunk({ content: textDelta.text }) };
+            }
+          } else if (event === 'message_delta') {
+            const stopReason = (data as { delta?: { stop_reason?: string } }).delta?.stop_reason;
+            if (typeof stopReason === 'string') {
+              yield { data: baseChunk({}, mapStopReason(stopReason)) };
+            }
+          } else if (event === 'message_stop') {
+            yield { data: '[DONE]', done: true };
+            return;
+          } else if (event === 'error') {
+            throw new Error(`anthropic stream error: ${JSON.stringify(data)}`);
+          }
+        }
+      }
+    } finally {
+      reader.releaseLock();
+    }
+    // Anthropic closed without message_stop — give consumer a clean end.
+    yield { data: '[DONE]', done: true };
+  }
+
+  private headers(ctx: InferContext): Record<string, string> {
+    return {
+      'Content-Type': 'application/json',
+      'x-api-key': ctx.apiKey,
+      'anthropic-version': ANTHROPIC_VERSION,
+    };
+  }
+
+  /** Translate the OpenAI request to the Anthropic Messages shape. */
+  private toAnthropicRequest(ctx: InferContext, stream: boolean): {
+    model: string;
+    max_tokens: number;
+    messages: Array<{ role: 'user' | 'assistant'; content: string }>;
+    system?: string;
+    stream?: boolean;
+    temperature?: number;
+    top_p?: number;
+    stop_sequences?: string[];
+  } {
+    const { body } = ctx;
+    const systemParts: string[] = [];
+    const messages: Array<{ role: 'user' | 'assistant'; content: string }> = [];
+
+    for (const msg of body.messages) {
+      const text = normaliseContent(msg);
+      if (msg.role === 'system') {
+        systemParts.push(text);
+      } else if (msg.role === 'user' || msg.role === 'assistant') {
+        messages.push({ role: msg.role, content: text });
+      }
+      // `tool` role messages are dropped — tool translation is out of scope
+      // for this phase.
+    }
+
+    const out: ReturnType<typeof this.toAnthropicRequest> = {
+      model: body.model !== '' ? body.model : ctx.modelOverride,
+      max_tokens: typeof body.max_tokens === 'number' ? body.max_tokens : 1024,
+      messages,
+    };
+    if (systemParts.length > 0) out.system = systemParts.join('\n\n');
+    if (stream) out.stream = true;
+    if (typeof body.temperature === 'number') out.temperature = body.temperature;
+    if (typeof body.top_p === 'number') out.top_p = body.top_p;
+    if (body.stop !== undefined) {
+      out.stop_sequences = Array.isArray(body.stop) ? body.stop : [body.stop];
+    }
+    return out;
+  }
+
+  private toOpenAiResponse(anth: AnthropicMessageResponse): Record<string, unknown> {
+    const text = anth.content
+      .map((c) => (c.type === 'text' && typeof (c as { text?: unknown }).text === 'string'
+        ? (c as { text: string }).text
+        : ''))
+      .join('');
+    return {
+      id: `chatcmpl-${anth.id}`,
+      object: 'chat.completion',
+      created: Math.floor(Date.now() / 1000),
+      model: anth.model,
+      choices: [{
+        index: 0,
+        message: { role: 'assistant', content: text },
+        finish_reason: mapStopReason(anth.stop_reason ?? 'end_turn'),
+      }],
+      usage: anth.usage ? {
+        prompt_tokens: anth.usage.input_tokens,
+        completion_tokens: anth.usage.output_tokens,
+        total_tokens: anth.usage.input_tokens + anth.usage.output_tokens,
+      } : undefined,
+    };
+  }
+}
+
+function normaliseContent(msg: OpenAiMessage): string {
+  if (typeof msg.content === 'string') return msg.content;
+  return msg.content
+    .map((part) => (typeof part.text === 'string' ? part.text : ''))
+    .join('');
+}
+
+function mapStopReason(r: string): string {
+  // Anthropic → OpenAI finish_reason
+  if (r === 'end_turn' || r === 'stop_sequence') return 'stop';
+  if (r === 'max_tokens') return 'length';
+  if (r === 'tool_use') return 'tool_calls';
+  return r;
+}
+
+function parseSseEvent(raw: string): { event: string; data: unknown } | null {
+  let event = '';
+  let dataLine = '';
+  for (const line of raw.split('\n')) {
+    if (line.startsWith('event:')) event = line.slice(6).trim();
+    else if (line.startsWith('data:')) dataLine += line.slice(5).trim();
+  }
+  if (dataLine === '') return null;
+  try {
+    return { event, data: JSON.parse(dataLine) as unknown };
+  } catch {
+    return null;
+  }
+}
+
+function cryptoNonce(): string {
+  // Not security-sensitive — just a short randomish id.
+  return Math.random().toString(36).slice(2, 10);
+}
--- a/src/mcpd/src/services/llm/adapters/openai-passthrough.ts
+++ b/src/mcpd/src/services/llm/adapters/openai-passthrough.ts
@@ -0,0 +1,112 @@
+/**
+ * OpenAI-passthrough adapter.
+ *
+ * Covers any provider that already speaks OpenAI chat/completions on the
+ * wire: `openai`, `vllm`, `deepseek`, `ollama` (with their openai-compatible
+ * endpoint enabled). The adapter forwards the request body verbatim and
+ * streams the response straight through — no wire translation.
+ *
+ * Defaults when `url` is empty:
+ *   - openai → https://api.openai.com
+ *   - deepseek → https://api.deepseek.com
+ *   - vllm/ollama → must be configured; these have no canonical public URL.
+ */
+import type { LlmAdapter, InferContext, NonStreamingResult, StreamingChunk, AdapterDeps } from '../types.js';
+
+const DEFAULT_URLS: Record<string, string> = {
+  openai: 'https://api.openai.com',
+  deepseek: 'https://api.deepseek.com',
+};
+
+export class OpenAiPassthroughAdapter implements LlmAdapter {
+  readonly kind: string;
+  private readonly fetchImpl: typeof globalThis.fetch;
+
+  constructor(kind: 'openai' | 'vllm' | 'deepseek' | 'ollama', deps: AdapterDeps = {}) {
+    this.kind = kind;
+    this.fetchImpl = deps.fetch ?? globalThis.fetch;
+  }
+
+  async infer(ctx: InferContext): Promise<NonStreamingResult> {
+    const url = this.endpointUrl(ctx.url);
+    const body = this.prepareBody(ctx, false);
+    const res = await this.fetchImpl(`${url}/v1/chat/completions`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    const json = await res.json() as unknown;
+    return { status: res.status, body: json };
+  }
+
+  async *stream(ctx: InferContext): AsyncGenerator<StreamingChunk> {
+    const url = this.endpointUrl(ctx.url);
+    const body = this.prepareBody(ctx, true);
+    const res = await this.fetchImpl(`${url}/v1/chat/completions`, {
+      method: 'POST',
+      headers: this.headers(ctx),
+      body: JSON.stringify(body),
+    });
+    if (!res.ok || res.body === null) {
+      const text = await res.text().catch(() => '');
+      throw new Error(`${this.kind} stream: HTTP ${String(res.status)} ${text}`);
+    }
+
+    // Re-frame the provider's SSE stream into our `StreamingChunk` shape.
+    // OpenAI-compat providers already emit `data: {...}` + `data: [DONE]` —
+    // we just unwrap the `data: ` prefix, forward payloads, and emit a
+    // single terminal `done` chunk so the consumer always gets one.
+    const decoder = new TextDecoder();
+    let buf = '';
+    const reader = res.body.getReader();
+    try {
+      // eslint-disable-next-line no-constant-condition
+      while (true) {
+        const { value, done } = await reader.read();
+        if (done) break;
+        buf += decoder.decode(value, { stream: true });
+        let idx: number;
+        while ((idx = buf.indexOf('\n\n')) !== -1) {
+          const event = buf.slice(0, idx);
+          buf = buf.slice(idx + 2);
+          for (const line of event.split('\n')) {
+            if (!line.startsWith('data:')) continue;
+            const payload = line.slice(5).trim();
+            if (payload === '') continue;
+            if (payload === '[DONE]') {
+              yield { data: '[DONE]', done: true };
+              return;
+            }
+            yield { data: payload };
+          }
+        }
+      }
+    } finally {
+      reader.releaseLock();
+    }
+    // Provider closed without emitting [DONE] — give the consumer a clean end.
+    yield { data: '[DONE]', done: true };
+  }
+
+  private endpointUrl(url: string): string {
+    if (url !== '') return url.replace(/\/+$/, '');
+    const def = DEFAULT_URLS[this.kind];
+    if (def === undefined) {
+      throw new Error(`${this.kind}: url is required (no default endpoint for this provider)`);
+    }
+    return def;
+  }
+
+  private headers(ctx: InferContext): Record<string, string> {
+    const headers: Record<string, string> = { 'Content-Type': 'application/json' };
+    if (ctx.apiKey !== '') headers['Authorization'] = `Bearer ${ctx.apiKey}`;
+    return headers;
+  }
+
+  private prepareBody(ctx: InferContext, stream: boolean): Record<string, unknown> {
+    const out: Record<string, unknown> = { ...ctx.body };
+    if (out.model === undefined || out.model === '') out.model = ctx.modelOverride;
+    out.stream = stream;
+    return out;
+  }
+}
--- a/src/mcpd/src/services/llm/dispatcher.ts
+++ b/src/mcpd/src/services/llm/dispatcher.ts
@@ -0,0 +1,52 @@
+/**
+ * Adapter dispatcher for the inference proxy.
+ *
+ * `getAdapter(type)` returns the right adapter instance for an Llm's `type`
+ * column. Adapters are cached per-type — they carry no per-request state.
+ * The caller (the infer route) supplies the resolved API key + request body
+ * through `InferContext`, so a single adapter instance serves every Llm of
+ * that type.
+ */
+import type { LlmAdapter, AdapterDeps } from './types.js';
+import { OpenAiPassthroughAdapter } from './adapters/openai-passthrough.js';
+import { AnthropicAdapter } from './adapters/anthropic.js';
+
+export class UnsupportedProviderError extends Error {
+  constructor(type: string) {
+    super(`Unsupported LLM provider: ${type}`);
+    this.name = 'UnsupportedProviderError';
+  }
+}
+
+export class LlmAdapterRegistry {
+  private readonly cache = new Map<string, LlmAdapter>();
+
+  constructor(private readonly deps: AdapterDeps = {}) {}
+
+  get(type: string): LlmAdapter {
+    const cached = this.cache.get(type);
+    if (cached !== undefined) return cached;
+    const adapter = this.build(type);
+    this.cache.set(type, adapter);
+    return adapter;
+  }
+
+  private build(type: string): LlmAdapter {
+    switch (type) {
+      case 'openai':
+      case 'vllm':
+      case 'deepseek':
+      case 'ollama':
+        return new OpenAiPassthroughAdapter(type, this.deps);
+      case 'anthropic':
+        return new AnthropicAdapter(this.deps);
+      case 'gemini-cli':
+        // Intentionally deferred — gemini-cli requires the binary on the mcpd
+        // pod filesystem and subprocess lifecycle management. Flagged as
+        // homelab-only in the plan; not landing in this phase.
+        throw new UnsupportedProviderError(`${type} (subprocess providers are not supported in the proxy yet)`);
+      default:
+        throw new UnsupportedProviderError(type);
+    }
+  }
+}
--- a/src/mcpd/src/services/llm/types.ts
+++ b/src/mcpd/src/services/llm/types.ts
@@ -0,0 +1,70 @@
+/**
+ * Shared types for the LLM inference proxy.
+ *
+ * The wire format on the mcpctl side is OpenAI's chat/completions v1 — it's
+ * the de-facto lingua franca and every client library already speaks it.
+ * Provider-specific adapters translate to/from that shape.
+ */
+
+export interface OpenAiMessage {
+  role: 'system' | 'user' | 'assistant' | 'tool';
+  content: string | Array<{ type: string; text?: string; [k: string]: unknown }>;
+  name?: string;
+  tool_call_id?: string;
+  tool_calls?: Array<{ id: string; type: 'function'; function: { name: string; arguments: string } }>;
+}
+
+export interface OpenAiChatRequest {
+  model: string;
+  messages: OpenAiMessage[];
+  stream?: boolean;
+  temperature?: number;
+  max_tokens?: number;
+  top_p?: number;
+  stop?: string | string[];
+  tools?: Array<{ type: 'function'; function: { name: string; description?: string; parameters?: Record<string, unknown> } }>;
+  tool_choice?: unknown;
+  // Passthrough: unknown extras forwarded as-is.
+  [k: string]: unknown;
+}
+
+export interface InferContext {
+  /** Normalised OpenAI-format body. Adapters read/transform from here. */
+  body: OpenAiChatRequest;
+  /** The Llm row's `model` field, used when the request body has an empty model. */
+  modelOverride: string;
+  /** The resolved API key, or empty string for providers that don't take one. */
+  apiKey: string;
+  /** Target URL from the Llm row (may be empty for provider-default). */
+  url: string;
+  /** Arbitrary config from the Llm row (e.g. vllm gpu settings). */
+  extraConfig: Record<string, unknown>;
+}
+
+export interface NonStreamingResult {
+  status: number;
+  /** OpenAI chat.completion response body. */
+  body: unknown;
+}
+
+export interface StreamingChunk {
+  /** Raw SSE data payload. Consumer emits `data: <payload>\n\n`. */
+  data: string;
+  /** Mark the end of stream — consumer emits `data: [DONE]\n\n`. */
+  done?: boolean;
+}
+
+export interface LlmAdapter {
+  readonly kind: string;
+  /** Non-streaming request. Returns the final chat.completion body. */
+  infer(ctx: InferContext): Promise<NonStreamingResult>;
+  /**
+   * Streaming request. Yields OpenAI-format SSE chunks. Adapters translate
+   * provider-native stream formats into OpenAI `chat.completion.chunk`s.
+   */
+  stream(ctx: InferContext): AsyncGenerator<StreamingChunk>;
+}
+
+export interface AdapterDeps {
+  fetch?: typeof globalThis.fetch;
+}
--- a/src/mcpd/src/services/mcp-proxy-service.ts
+++ b/src/mcpd/src/services/mcp-proxy-service.ts
@@ -5,7 +5,7 @@ import { NotFoundError } from './mcp-server.service.js';
 import { InvalidStateError } from './instance.service.js';
 import { sendViaSse } from './transport/sse-client.js';
 import { sendViaStdio } from './transport/stdio-client.js';
-import { PersistentStdioClient } from './transport/persistent-stdio.js';
+import { PersistentStdioClient, type StdioMode } from './transport/persistent-stdio.js';

 /**
 * Build the spawn command for a runtime inside its runner container.
@@ -35,6 +35,18 @@ export interface McpProxyResponse {
  error?: { code: number; message: string; data?: unknown };
 }

+function formatError(err: unknown): string {
+  if (err instanceof Error) return err.message || err.toString();
+  if (err && typeof err === 'object') {
+    try {
+      return JSON.stringify(err);
+    } catch {
+      return Object.prototype.toString.call(err);
+    }
+  }
+  return String(err);
+}
+
 /**
 * Parses a streamable-http SSE response body to extract the JSON-RPC payload.
 * Streamable-http returns `event: message\ndata: {...}\n\n` format.
@@ -140,28 +152,48 @@ export class McpProxyService {
      }
      const packageName = server.packageName as string | null;
      const command = server.command as string[] | null;
+      const dockerImage = server.dockerImage as string | null;

-      if (!packageName && (!command || command.length === 0)) {
+      // Decide STDIO mode (first match wins):
+      //   - command set      → exec the given command in the container.
+      //   - packageName set  → exec via runtime runner (npx/uvx).
+      //   - dockerImage only → attach to PID 1 (image entrypoint IS the MCP server).
+      //   - nothing          → unreachable, reject.
+      const runtime = (server.runtime as string | null) ?? 'node';
+      let mode: StdioMode;
+      if (command && command.length > 0) {
+        mode = { kind: 'exec', command };
+      } else if (packageName) {
+        mode = { kind: 'exec', command: buildRuntimeSpawnCmd(runtime, packageName) };
+      } else if (dockerImage) {
+        mode = { kind: 'attach' };
+      } else {
        throw new InvalidStateError(
-          `Server '${server.name}' (${server.id}) uses STDIO transport with a docker image ` +
-          `but has no command. Set 'command' to the image's entrypoint ` +
-          `(e.g. mcpctl edit server ${server.name} --command node --command build/index.js)`
+          `Server '${server.name}' (${server.id}) uses STDIO transport but has no ` +
+          `packageName, command, or dockerImage. Configure one of these.`,
        );
      }

-      // Build the spawn command based on runtime
-      const runtime = (server.runtime as string | null) ?? 'node';
-      const spawnCmd = command && command.length > 0
-        ? command
-        : buildRuntimeSpawnCmd(runtime, packageName!);
-
      // Try persistent connection first
      try {
-        return await this.sendViaPersistentStdio(instance.containerId, spawnCmd, method, params);
-      } catch {
-        // Persistent failed — fall back to one-shot
+        return await this.sendViaPersistentStdio(instance.containerId, mode, method, params);
+      } catch (err) {
        this.removeClient(instance.containerId);
-        return sendViaStdio(this.orchestrator, instance.containerId, packageName, method, params, 120_000, command, runtime);
+        // Fall back to one-shot exec when we have a command to run.
+        // Attach mode has no equivalent one-shot fallback — surface the error.
+        if (mode.kind === 'exec') {
+          return sendViaStdio(this.orchestrator, instance.containerId, packageName, method, params, 120_000, command, runtime);
+        }
+        const detail = formatError(err);
+        console.error(`[mcp-proxy] attach to ${instance.containerId} failed:`, err);
+        return {
+          jsonrpc: '2.0',
+          id: 1,
+          error: {
+            code: -32000,
+            message: `STDIO attach to '${instance.containerId}' failed: ${detail}`,
+          },
+        };
      }
    }

@@ -178,16 +210,17 @@ export class McpProxyService {

  /**
   * Send via a persistent STDIO connection (reused across calls).
+   * Mode is exec (run a command in the container) or attach (talk to PID 1).
   */
  private async sendViaPersistentStdio(
    containerId: string,
-    command: string[],
+    mode: StdioMode,
    method: string,
    params?: Record<string, unknown>,
  ): Promise<McpProxyResponse> {
    let client = this.stdioClients.get(containerId);
    if (!client) {
-      client = new PersistentStdioClient(this.orchestrator!, containerId, command);
+      client = new PersistentStdioClient(this.orchestrator!, containerId, mode);
      this.stdioClients.set(containerId, client);
    }
    return client.send(method, params);
--- a/src/mcpd/src/services/mcp-token.service.ts
+++ b/src/mcpd/src/services/mcp-token.service.ts
@@ -0,0 +1,222 @@
+import { generateToken, hashToken } from '@mcpctl/shared';
+import type { McpToken } from '@prisma/client';
+import type { IMcpTokenRepository, McpTokenWithRelations, McpTokenFilter } from '../repositories/interfaces.js';
+import type { IRbacDefinitionRepository } from '../repositories/rbac-definition.repository.js';
+import type { IProjectRepository } from '../repositories/project.repository.js';
+import { CreateMcpTokenSchema } from '../validation/mcp-token.schema.js';
+import { isResourceBinding, type RbacRoleBinding, type RbacSubject } from '../validation/rbac-definition.schema.js';
+import type { RbacService, Permission } from './rbac.service.js';
+import { ROLE_ACTIONS_FOR_CEILING } from './rbac.service.js';
+import { NotFoundError, ConflictError } from './mcp-server.service.js';
+
+/** Thrown when the requesting user tries to mint a token with bindings they cannot grant themselves. */
+export class PermissionCeilingError extends Error {
+  constructor(message: string) {
+    super(message);
+    this.name = 'PermissionCeilingError';
+  }
+}
+
+export interface CreateMcpTokenResult {
+  /** The database row (with project/owner relations). */
+  mcpToken: McpTokenWithRelations;
+  /** The raw bearer token — shown exactly once. */
+  raw: string;
+}
+
+export interface IntrospectResult {
+  ok: boolean;
+  tokenId?: string;
+  tokenName?: string;
+  tokenSha?: string;
+  projectId?: string;
+  projectName?: string;
+  ownerId?: string;
+  expired?: boolean;
+  revoked?: boolean;
+}
+
+export class McpTokenService {
+  constructor(
+    private readonly tokenRepo: IMcpTokenRepository,
+    private readonly projectRepo: IProjectRepository,
+    private readonly rbacRepo: IRbacDefinitionRepository,
+    private readonly rbacService: RbacService,
+  ) {}
+
+  async list(filter?: McpTokenFilter): Promise<McpTokenWithRelations[]> {
+    return this.tokenRepo.findAll(filter);
+  }
+
+  async getById(id: string): Promise<McpTokenWithRelations> {
+    const row = await this.tokenRepo.findById(id);
+    if (row === null) throw new NotFoundError(`McpToken not found: ${id}`);
+    return row;
+  }
+
+  /** Hash + lookup a raw bearer. Returns `ok: true` with token/project details if valid and active; `ok: false` if missing, revoked, or expired. */
+  async introspectRaw(raw: string): Promise<IntrospectResult> {
+    const hash = hashToken(raw);
+    const row = await this.tokenRepo.findByHash(hash);
+    if (row === null) return { ok: false };
+
+    const now = new Date();
+    const revoked = row.revokedAt !== null;
+    const expired = row.expiresAt !== null && row.expiresAt < now;
+
+    if (revoked || expired) {
+      return {
+        ok: false,
+        tokenId: row.id,
+        tokenName: row.name,
+        tokenSha: row.tokenHash,
+        revoked,
+        expired,
+      };
+    }
+
+    // Best-effort last-used tracking (don't block on this).
+    this.tokenRepo.touchLastUsed(row.id).catch(() => { /* ignore */ });
+
+    return {
+      ok: true,
+      tokenId: row.id,
+      tokenName: row.name,
+      tokenSha: row.tokenHash,
+      projectId: row.projectId,
+      projectName: row.project.name,
+      ownerId: row.ownerId,
+      expired: false,
+      revoked: false,
+    };
+  }
+
+  async create(creatorUserId: string, input: unknown): Promise<CreateMcpTokenResult> {
+    const data = CreateMcpTokenSchema.parse(input);
+
+    const project = await this.projectRepo.findById(data.projectId);
+    if (project === null) throw new NotFoundError(`Project not found: ${data.projectId}`);
+
+    const existing = await this.tokenRepo.findByNameAndProject(data.name, data.projectId);
+    if (existing !== null && existing.revokedAt === null) {
+      throw new ConflictError(`McpToken already exists: ${data.name} in project ${project.name}`);
+    }
+
+    // Resolve the effective bindings:
+    //   base = rbacMode === 'clone' ? snapshot(creator) : []
+    //   effective = base + explicit bindings
+    const basePerms = data.rbacMode === 'clone'
+      ? await this.rbacService.getPermissions(creatorUserId)
+      : [];
+    const baseBindings = basePerms.map(permissionToBinding);
+    const effectiveBindings: RbacRoleBinding[] = [...baseBindings, ...data.bindings];
+
+    // Creator ceiling: every effective binding must be within what creator can do.
+    // Cloned bindings are trivially satisfied; explicit ones may not be.
+    for (const binding of data.bindings) {
+      const violation = await this.checkCeiling(creatorUserId, binding);
+      if (violation !== null) throw new PermissionCeilingError(violation);
+    }
+
+    // Generate the token
+    const { raw, hash, prefix } = generateToken();
+
+    // Normalize expiresAt
+    let expiresAt: Date | null = null;
+    if (data.expiresAt !== undefined && data.expiresAt !== null) {
+      expiresAt = typeof data.expiresAt === 'string' ? new Date(data.expiresAt) : data.expiresAt;
+    }
+
+    const createArgs: {
+      name: string;
+      projectId: string;
+      ownerId: string;
+      tokenHash: string;
+      tokenPrefix: string;
+      description?: string;
+      expiresAt: Date | null;
+    } = {
+      name: data.name,
+      projectId: data.projectId,
+      ownerId: creatorUserId,
+      tokenHash: hash,
+      tokenPrefix: prefix,
+      expiresAt,
+    };
+    if (data.description !== undefined) createArgs.description = data.description;
+    const row = await this.tokenRepo.create(createArgs);
+
+    // If the token has bindings, auto-create an RbacDefinition so the token is a real RBAC principal.
+    if (effectiveBindings.length > 0) {
+      const subject: RbacSubject = { kind: 'McpToken', name: hash };
+      await this.rbacRepo.create({
+        name: rbacDefNameFor(row),
+        subjects: [subject],
+        roleBindings: effectiveBindings,
+      });
+    }
+
+    return { mcpToken: row, raw };
+  }
+
+  async revoke(id: string): Promise<McpTokenWithRelations> {
+    const existing = await this.getById(id);
+    const row = await this.tokenRepo.revoke(id);
+    // Remove the RBAC definition so the token's bindings stop resolving immediately.
+    await this.deleteRbacDefinitionFor(existing).catch(() => { /* ignore */ });
+    return row;
+  }
+
+  async delete(id: string): Promise<void> {
+    const existing = await this.getById(id);
+    await this.deleteRbacDefinitionFor(existing).catch(() => { /* ignore */ });
+    await this.tokenRepo.delete(id);
+  }
+
+  private async deleteRbacDefinitionFor(row: McpToken): Promise<void> {
+    const name = rbacDefNameFor(row);
+    const existing = await this.rbacRepo.findByName(name);
+    if (existing === null) return;
+    await this.rbacRepo.delete(existing.id);
+  }
+
+  /**
+   * For a single requested binding, return null if the creator can grant it,
+   * or a human-readable reason string if they cannot.
+   */
+  private async checkCeiling(creatorUserId: string, binding: RbacRoleBinding): Promise<string | null> {
+    if (isResourceBinding(binding)) {
+      const grantedActions = ROLE_ACTIONS_FOR_CEILING[binding.role] ?? [];
+      for (const action of grantedActions) {
+        const ok = await this.rbacService.canAccess(
+          creatorUserId,
+          action,
+          binding.resource,
+          binding.name,
+        );
+        if (!ok) {
+          return `Ceiling violation: you do not have permission '${action}' on ${binding.resource}${binding.name !== undefined ? `/${binding.name}` : ''}`;
+        }
+      }
+      return null;
+    }
+    // Operation binding
+    const ok = await this.rbacService.canRunOperation(creatorUserId, binding.action);
+    if (!ok) return `Ceiling violation: you cannot run operation '${binding.action}'`;
+    return null;
+  }
+}
+
+function permissionToBinding(p: Permission): RbacRoleBinding {
+  if ('resource' in p) {
+    return p.name !== undefined
+      ? { role: p.role as RbacRoleBinding extends { role: infer R } ? R : never, resource: p.resource, name: p.name } as RbacRoleBinding
+      : { role: p.role, resource: p.resource } as RbacRoleBinding;
+  }
+  return { role: 'run', action: p.action };
+}
+
+function rbacDefNameFor(row: { id: string }): string {
+  // Must match the regex in CreateRbacDefinitionSchema (lowercase alphanumeric with hyphens).
+  return `mcptoken-${row.id.toLowerCase()}`;
+}
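The creator-ceiling rule used in `checkCeiling` above can be sketched in isolation as follows. This is a minimal illustration, not the service's code: `ROLE_ACTIONS_SKETCH`, `ResourceBindingSketch`, `CanAccessSketch` and `checkCeilingSketch` are hypothetical names, the role table is an assumed subset (only the `expose` entry is visible in the diff), and the async `rbacService.canAccess` call is replaced by a synchronous callback.

```typescript
// Hypothetical stand-ins for the real RBAC types; only the shape matters here.
type ActionSketch = 'view' | 'expose';

interface ResourceBindingSketch {
  role: string;
  resource: string;
  name?: string;
}

// Assumed subset of the role-to-actions table (the diff shows only `expose`).
const ROLE_ACTIONS_SKETCH: Record<string, readonly ActionSketch[]> = {
  view: ['view'],
  expose: ['expose', 'view'],
};

// Synchronous stand-in for rbacService.canAccess(creatorUserId, action, resource, name).
type CanAccessSketch = (action: ActionSketch, resource: string, name?: string) => boolean;

/**
 * Returns null when the creator holds every action the binding's role implies,
 * otherwise a human-readable reason naming the first missing action.
 */
function checkCeilingSketch(binding: ResourceBindingSketch, canAccess: CanAccessSketch): string | null {
  const actions = ROLE_ACTIONS_SKETCH[binding.role] ?? [];
  for (const action of actions) {
    if (!canAccess(action, binding.resource, binding.name)) {
      const target = binding.name !== undefined ? `${binding.resource}/${binding.name}` : binding.resource;
      return `Ceiling violation: you do not have permission '${action}' on ${target}`;
    }
  }
  return null;
}

// A creator who can only view servers.
const viewOnlyCreator: CanAccessSketch = (action, resource) => action === 'view' && resource === 'servers';
```

With `viewOnlyCreator`, a `view` binding on `servers` passes and an `expose` binding is rejected on its first missing action. Note that a role absent from the table maps to an empty action list and therefore passes; the sketch assumes, as the original appears to, that schema validation rejects unknown roles before this check runs.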
--- a/src/mcpd/src/services/rbac.service.ts
+++ b/src/mcpd/src/services/rbac.service.ts
@@ -38,6 +38,9 @@ const ROLE_ACTIONS: Record<string, readonly RbacAction[]> = {
  expose: ['expose', 'view'],
 };

+/** Exported alias for permission-ceiling checks elsewhere (e.g. McpTokenService). */
+export const ROLE_ACTIONS_FOR_CEILING = ROLE_ACTIONS;
+
 export class RbacService {
  constructor(
    private readonly rbacRepo: IRbacDefinitionRepository,
@@ -50,8 +53,8 @@ export class RbacService {
   *   If provided, name-scoped bindings only match when their name equals this.
   *   If omitted (listing), name-scoped bindings still grant access.
   */
-  async canAccess(userId: string, action: RbacAction, resource: string, resourceName?: string, serviceAccountName?: string): Promise<boolean> {
-    const permissions = await this.getPermissions(userId, serviceAccountName);
+  async canAccess(userId: string, action: RbacAction, resource: string, resourceName?: string, serviceAccountName?: string, mcpTokenSha?: string): Promise<boolean> {
+    const permissions = await this.getPermissions(userId, serviceAccountName, mcpTokenSha);
    const normalized = normalizeResource(resource);

    for (const perm of permissions) {
@@ -73,8 +76,8 @@ export class RbacService {
   * Check whether a user is allowed to perform a named operation.
   * Operations require an explicit 'run' role binding with a matching action.
   */
-  async canRunOperation(userId: string, operation: string, serviceAccountName?: string): Promise<boolean> {
-    const permissions = await this.getPermissions(userId, serviceAccountName);
+  async canRunOperation(userId: string, operation: string, serviceAccountName?: string, mcpTokenSha?: string): Promise<boolean> {
+    const permissions = await this.getPermissions(userId, serviceAccountName, mcpTokenSha);

    for (const perm of permissions) {
      if ('action' in perm && perm.role === 'run' && perm.action === operation) {
@@ -90,8 +93,8 @@ export class RbacService {
   * Returns wildcard:true if any matching binding is unscoped (no name constraint).
   * Returns wildcard:false with a set of allowed names if all bindings are name-scoped.
   */
-  async getAllowedScope(userId: string, action: RbacAction, resource: string, serviceAccountName?: string): Promise<AllowedScope> {
-    const permissions = await this.getPermissions(userId, serviceAccountName);
+  async getAllowedScope(userId: string, action: RbacAction, resource: string, serviceAccountName?: string, mcpTokenSha?: string): Promise<AllowedScope> {
+    const permissions = await this.getPermissions(userId, serviceAccountName, mcpTokenSha);
    const normalized = normalizeResource(resource);
    const names = new Set<string>();

@@ -113,13 +116,13 @@ export class RbacService {
  /**
   * Collect all permissions for a user across all matching RbacDefinitions.
   */
-  async getPermissions(userId: string, serviceAccountName?: string): Promise<Permission[]> {
+  async getPermissions(userId: string, serviceAccountName?: string, mcpTokenSha?: string): Promise<Permission[]> {
    // 1. Resolve user email
    const user = await this.prisma.user.findUnique({
      where: { id: userId },
      select: { email: true },
    });
-    if (user === null && serviceAccountName === undefined) return [];
+    if (user === null && serviceAccountName === undefined && mcpTokenSha === undefined) return [];

    // 2. Resolve group names the user belongs to
    let groupNames: string[] = [];
@@ -142,6 +145,7 @@ export class RbacService {
        if (s.kind === 'User') return user !== null && s.name === user.email;
        if (s.kind === 'Group') return groupNames.includes(s.name);
        if (s.kind === 'ServiceAccount') return serviceAccountName !== undefined && s.name === serviceAccountName;
+        if (s.kind === 'McpToken') return mcpTokenSha !== undefined && s.name === mcpTokenSha;
        return false;
      });

--- a/src/mcpd/src/services/secret-backend-rotator-loop.ts
+++ b/src/mcpd/src/services/secret-backend-rotator-loop.ts
@@ -0,0 +1,129 @@
+/**
+ * Background loop that drives `SecretBackendRotator` on a 24h cadence.
+ *
+ * - On `start()`: scan all rotatable backends. For each that is overdue
+ *   (never rotated OR last rotation > 24h ago), kick rotation immediately.
+ *   Then schedule a per-backend setTimeout for the next tick.
+ * - On `stop()`: clear every pending timer (an in-flight rotation is not cancelled).
+ *   Called from the graceful-shutdown hook so restarts don't leak timers.
+ *
+ * Jitter (±10 min by default) keeps multiple mcpd replicas from hammering
+ * OpenBao simultaneously if someone scales the Deployment up.
+ *
+ * Failures are swallowed with a warn log — the next scheduled tick will
+ * retry. The rotator service itself writes `lastRotationError` to the row
+ * so operators see the failure in `describe`.
+ */
+import type { SecretBackend } from '@prisma/client';
+import type { SecretBackendService } from './secret-backend.service.js';
+import type { SecretBackendRotator } from './secret-backend-rotator.service.js';
+
+export interface SecretBackendRotatorLoopDeps {
+  backends: SecretBackendService;
+  rotator: SecretBackendRotator;
+  /** Millisecond jitter applied to the base interval (24h unless config.rotation.intervalHours overrides it); defaults to ±600_000 (10 min). */
+  jitterMs?: number;
+  /** Override in tests. */
+  setTimeout?: (cb: () => void, ms: number) => NodeJS.Timeout;
+  clearTimeout?: (t: NodeJS.Timeout) => void;
+  log?: { info: (msg: string) => void; warn: (msg: string) => void };
+}
+
+const DEFAULT_INTERVAL_MS = 24 * 3600 * 1000;
+const DEFAULT_JITTER_MS = 10 * 60 * 1000;
+
+export class SecretBackendRotatorLoop {
+  private readonly timers = new Map<string, NodeJS.Timeout>();
+  private readonly setT: (cb: () => void, ms: number) => NodeJS.Timeout;
+  private readonly clearT: (t: NodeJS.Timeout) => void;
+  private readonly log: { info: (msg: string) => void; warn: (msg: string) => void };
+  private stopped = false;
+
+  constructor(private readonly deps: SecretBackendRotatorLoopDeps) {
+    this.setT = deps.setTimeout ?? ((cb, ms) => global.setTimeout(cb, ms));
+    this.clearT = deps.clearTimeout ?? ((t) => global.clearTimeout(t));
+    this.log = deps.log ?? {
+      // eslint-disable-next-line no-console
+      info: (m) => console.log(`[rotator] ${m}`),
+      // eslint-disable-next-line no-console
+      warn: (m) => console.warn(`[rotator] ${m}`),
+    };
+  }
+
+  async start(): Promise<void> {
+    const backends = (await this.deps.backends.list())
+      .filter((b) => this.deps.rotator.isRotatable(b));
+
+    if (backends.length === 0) {
+      this.log.info('no rotatable backends registered — loop idle');
+      return;
+    }
+    this.log.info(`starting rotation loop for ${String(backends.length)} backend(s)`);
+
+    for (const b of backends) {
+      if (this.deps.rotator.isOverdue(b)) {
+        this.log.info(`backend '${b.name}' is overdue — rotating now`);
+        this.runOnce(b.id, b.name).catch((err) => {
+          this.log.warn(`initial rotation of '${b.name}' failed: ${err instanceof Error ? err.message : String(err)}`);
+        });
+      }
+      this.schedule(b);
+    }
+  }
+
+  stop(): void {
+    this.stopped = true;
+    for (const [, t] of this.timers) this.clearT(t);
+    this.timers.clear();
+    this.log.info('rotation loop stopped');
+  }
+
+  /** Test hook — force a rotation + rescheduling for one backend. */
+  async rotateNow(backendId: string): Promise<void> {
+    const backend = await this.deps.backends.getById(backendId);
+    await this.runOnce(backendId, backend.name);
+    this.schedule(backend);
+  }
+
+  private schedule(backend: SecretBackend): void {
+    if (this.stopped) return;
+    // Clear any existing timer for this backend
+    const prev = this.timers.get(backend.id);
+    if (prev !== undefined) this.clearT(prev);
+
+    const delay = this.nextDelayMs(backend);
+    const t = this.setT(() => {
+      this.runOnce(backend.id, backend.name)
+        .catch((err) => this.log.warn(`scheduled rotation of '${backend.name}' failed: ${err instanceof Error ? err.message : String(err)}`))
+        .finally(() => {
+          // Re-fetch so the next delay calc sees the latest row (nextDelayMs reads config.rotation.intervalHours).
+          if (this.stopped) return;
+          this.deps.backends.getById(backend.id)
+            .then((b) => this.schedule(b))
+            .catch((err) => this.log.warn(`re-schedule lookup for '${backend.name}' failed: ${err instanceof Error ? err.message : String(err)}`));
+        });
+    }, delay);
+    this.timers.set(backend.id, t);
+  }
+
+  private async runOnce(backendId: string, name: string): Promise<void> {
+    try {
+      await this.deps.rotator.rotateOne(backendId);
+      this.log.info(`rotated '${name}' successfully`);
+    } catch (err) {
+      // Error already recorded in tokenMeta by the rotator; rethrow so the caller logs it.
+      throw err;
+    }
+  }
+
+  private nextDelayMs(backend: SecretBackend): number {
+    const cfg = backend.config as { rotation?: { intervalHours?: number } };
+    const baseMs = cfg.rotation?.intervalHours !== undefined
+      ? cfg.rotation.intervalHours * 3600 * 1000
+      : DEFAULT_INTERVAL_MS;
+    const jitter = this.deps.jitterMs ?? DEFAULT_JITTER_MS;
+    // Uniform in [-jitter, +jitter]
+    const offset = (Math.random() * 2 - 1) * jitter;
+    return Math.max(60_000, Math.floor(baseMs + offset));
+  }
+}
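The delay calculation in `nextDelayMs` above can be exercised on its own once the random source is injectable. The sketch below mirrors that arithmetic under stated assumptions: `jitteredDelayMs`, `MIN_DELAY_MS` and the `random` parameter are hypothetical names introduced for illustration, and the real method reads the interval from the backend row instead of taking it as an argument.

```typescript
const DEFAULT_INTERVAL_MS = 24 * 3600 * 1000; // 24h base cadence
const DEFAULT_JITTER_MS = 10 * 60 * 1000;     // jitter of up to 10 min either way
const MIN_DELAY_MS = 60_000;                  // never schedule sooner than one minute

/**
 * Base interval plus a uniform offset in [-jitterMs, +jitterMs), floored to an
 * integer and clamped to MIN_DELAY_MS. `random` must return a value in [0, 1).
 */
function jitteredDelayMs(
  intervalHours: number | undefined,
  jitterMs: number = DEFAULT_JITTER_MS,
  random: () => number = Math.random,
): number {
  const baseMs = intervalHours !== undefined
    ? intervalHours * 3600 * 1000
    : DEFAULT_INTERVAL_MS;
  const offset = (random() * 2 - 1) * jitterMs;
  return Math.max(MIN_DELAY_MS, Math.floor(baseMs + offset));
}
```

Passing `() => 0.5` as `random` yields a zero offset, which makes the result deterministic in tests; the clamp only matters for very short configured intervals.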
--- a/src/mcpd/src/services/secret-backend-rotator.service.ts
+++ b/src/mcpd/src/services/secret-backend-rotator.service.ts
@@ -0,0 +1,186 @@
+/**
+ * Rotator for wizard-provisioned OpenBao backends.
+ *
+ * Flow on every tick:
+ *   1. Read the CURRENT mcpd token from its backing plaintext Secret.
+ *   2. Use that token to mint a SUCCESSOR via `auth/token/create/<role>`
+ *      (the `app-mcpd` policy grants the caller exactly this path).
+ *   3. Verify the successor with `auth/token/lookup-self`.
+ *   4. Persist the successor in the same Secret (overwriting the old value).
+ *   5. Revoke the predecessor by accessor (best-effort; old tokens expire on
+ *      their own anyway).
+ *   6. Update `tokenMeta` on the SecretBackend row with the new timestamps.
+ *
+ * On any failure: old token remains in place, `tokenMeta.lastRotationError`
+ * is populated, the exception is re-thrown. Old tokens still have ~29 days
+ * of remaining TTL by design (ttl=720h, rotation cadence=24h), so a few
+ * days of rotation failures are survivable without a user outage.
+ */
+import type { SecretBackend } from '@prisma/client';
+import {
+  mintRoleToken,
+  lookupSelf,
+  revokeAccessor,
+  type VaultDeps,
+  type MintedToken,
+} from '@mcpctl/shared';
+import type { SecretBackendService } from './secret-backend.service.js';
+import type { SecretService } from './secret.service.js';
+
+/** Shape of `SecretBackend.config` we require for rotation. */
+export interface RotatableOpenBaoConfig {
+  url: string;
+  auth?: 'token';
+  mount?: string;
+  pathPrefix?: string;
+  namespace?: string;
+  tokenSecretRef: { name: string; key: string };
+  rotation: {
+    enabled: true;
+    tokenRole: string;
+    intervalHours?: number;
+  };
+}
+
+/** Shape we store in `SecretBackend.tokenMeta`. */
+export interface TokenMeta {
+  generatedAt?: string;
+  nextRenewalAt?: string;
+  validUntil?: string;
+  lastRotationAt?: string;
+  lastRotationError?: string | null;
+  currentAccessor?: string;
+  rotatable?: boolean;
+}
+
+export interface SecretBackendRotatorDeps {
+  backends: SecretBackendService;
+  secrets: SecretService;
+  fetch?: typeof globalThis.fetch;
+  now?: () => Date;
+}
+
+export class SecretBackendRotator {
+  private readonly now: () => Date;
+
+  constructor(private readonly deps: SecretBackendRotatorDeps) {
+    this.now = deps.now ?? (() => new Date());
+  }
+
+  /** True iff this backend is a wizard-provisioned token-auth openbao with rotation enabled. */
+  isRotatable(backend: SecretBackend): boolean {
+    if (backend.type !== 'openbao') return false;
+    const cfg = backend.config as Partial<RotatableOpenBaoConfig>;
+    return (cfg.auth ?? 'token') === 'token'
+      && cfg.rotation?.enabled === true
+      && typeof cfg.rotation?.tokenRole === 'string'
+      && typeof cfg.tokenSecretRef?.name === 'string';
+  }
+
+  /**
+   * Execute one rotation pass on the given backend. Returns the freshly
+   * recorded `tokenMeta`. Throws on any failure — callers decide whether to
+   * log + move on (loop) or propagate (manual trigger).
+   */
+  async rotateOne(backendId: string): Promise<TokenMeta> {
+    const backend = await this.deps.backends.getById(backendId);
+    if (!this.isRotatable(backend)) {
+      throw new Error(`SecretBackend '${backend.name}' is not rotatable (need type=openbao, auth=token, rotation.enabled=true)`);
+    }
+    const cfg = backend.config as unknown as RotatableOpenBaoConfig;
+    const meta = (backend.tokenMeta as unknown as TokenMeta | null | undefined) ?? {};
+
+    const vaultDeps: VaultDeps = {};
+    if (this.deps.fetch !== undefined) vaultDeps.fetch = this.deps.fetch;
+    if (cfg.namespace !== undefined) vaultDeps.namespace = cfg.namespace;
+
+    // 1. Read current token from the backing plaintext Secret.
+    const secretRow = await this.deps.secrets.getByName(cfg.tokenSecretRef.name);
+    const data = await this.deps.secrets.resolveData(secretRow);
+    const currentToken = data[cfg.tokenSecretRef.key];
+    if (currentToken === undefined || currentToken === '') {
+      const err = new Error(`rotation: current token missing at ${cfg.tokenSecretRef.name}/${cfg.tokenSecretRef.key}`);
+      await this.recordError(backendId, meta, err.message);
+      throw err;
+    }
+    const oldAccessor = meta.currentAccessor;
+
+    let minted: MintedToken;
+    try {
+      // 2. Mint successor.
+      minted = await mintRoleToken(cfg.url, currentToken, cfg.rotation.tokenRole, vaultDeps);
+      if (!minted.renewable) {
+        throw new Error(`minted token from role '${cfg.rotation.tokenRole}' is not renewable — check the token role's renewable + period settings`);
+      }
+
+      // 3. Verify successor works (belt-and-suspenders — if bao returned a token
+      //    that can't auth back, we'd lock ourselves out on persist).
+      await lookupSelf(cfg.url, minted.clientToken, vaultDeps);
+
+      // 4. Persist successor in the same Secret. Update in-place — we keep
+      //    the other keys (if any) intact.
+      const nextData = { ...data, [cfg.tokenSecretRef.key]: minted.clientToken };
+      await this.deps.secrets.update(secretRow.id, { data: nextData });
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err);
+      await this.recordError(backendId, meta, msg);
+      throw err;
+    }
+
+    // 5. Revoke predecessor (best-effort — old tokens expire anyway).
+    if (oldAccessor !== undefined && oldAccessor !== '') {
+      try {
+        await revokeAccessor(cfg.url, minted.clientToken, oldAccessor, vaultDeps);
+      } catch (err) {
+        // Log but don't fail the rotation — the new token is already live.
+        const msg = err instanceof Error ? err.message : String(err);
+        // eslint-disable-next-line no-console
+        console.warn(`rotation: revoke old accessor '${oldAccessor}' on backend '${backend.name}' failed (continuing): ${msg}`);
+      }
+    }
+
+    // 6. Record success in tokenMeta.
+    const now = this.now();
+    const intervalHours = cfg.rotation.intervalHours ?? 24;
+    const nextMeta: TokenMeta = {
+      generatedAt: now.toISOString(),
+      nextRenewalAt: new Date(now.getTime() + intervalHours * 3600 * 1000).toISOString(),
+      validUntil: minted.leaseDuration > 0
+        ? new Date(now.getTime() + minted.leaseDuration * 1000).toISOString()
+        : undefined as unknown as string, // typed but optional; undefined drops on JSON round-trip
+      lastRotationAt: now.toISOString(),
+      lastRotationError: null,
+      currentAccessor: minted.accessor,
+      rotatable: true,
+    };
+    // Strip undefined so JSON is clean.
+    const cleanMeta: Record<string, unknown> = {};
+    for (const [k, v] of Object.entries(nextMeta)) {
+      if (v !== undefined) cleanMeta[k] = v;
+    }
+    await this.deps.backends.updateTokenMeta(backendId, cleanMeta);
+    return nextMeta;
+  }
+
+  /** Is this backend overdue for rotation? Used by the loop on startup. */
+  isOverdue(backend: SecretBackend): boolean {
+    const meta = (backend.tokenMeta as unknown as TokenMeta | null | undefined) ?? {};
+    if (meta.lastRotationAt === undefined) return true;
+    const last = new Date(meta.lastRotationAt).getTime();
+    if (Number.isNaN(last)) return true;
+    const cfg = backend.config as Partial<RotatableOpenBaoConfig>;
+    const intervalHours = cfg.rotation?.intervalHours ?? 24;
+    return this.now().getTime() - last > intervalHours * 3600 * 1000;
+  }
+
+  private async recordError(backendId: string, prev: TokenMeta, message: string): Promise<void> {
+    const nextMeta: Record<string, unknown> = { ...prev, lastRotationError: message };
+    try {
+      await this.deps.backends.updateTokenMeta(backendId, nextMeta);
+    } catch (inner) {
+      // Don't mask the original error — just log the DB failure.
+      // eslint-disable-next-line no-console
+      console.warn(`rotation: failed to persist lastRotationError (${message}): ${inner instanceof Error ? inner.message : String(inner)}`);
+    }
+  }
+}
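The overdue test in `isOverdue` and the TTL runway quoted in the file header (ttl=720h, cadence=24h) reduce to simple arithmetic. The sketch below restates both as pure functions; `isOverdueSketch` and `remainingRunwayHours` are hypothetical names, and the runway figure assumes the stored token is never renewed between rotation attempts.

```typescript
const HOUR_MS = 3600 * 1000;

/**
 * Same decision as isOverdue: a backend is overdue when it has never rotated,
 * when the stored timestamp is unparseable, or when more than one interval
 * has passed since the last successful rotation.
 */
function isOverdueSketch(lastRotationAt: string | undefined, intervalHours: number, now: Date): boolean {
  if (lastRotationAt === undefined) return true;
  const last = new Date(lastRotationAt).getTime();
  if (Number.isNaN(last)) return true;
  return now.getTime() - last > intervalHours * HOUR_MS;
}

/**
 * Hours of TTL left on the current token after `elapsedTicks` rotation
 * intervals have passed since it was minted (assumes no renewal in between).
 */
function remainingRunwayHours(ttlHours: number, intervalHours: number, elapsedTicks: number): number {
  return Math.max(0, ttlHours - intervalHours * elapsedTicks);
}
```

With the defaults, `remainingRunwayHours(720, 24, 1)` is 696 hours (29 days) at the first rotation tick, and `remainingRunwayHours(720, 24, 7)` is 552 hours (23 days) after a week of failed rotations.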
--- a/src/mcpd/src/services/secret-backend.service.ts
+++ b/src/mcpd/src/services/secret-backend.service.ts
@@ -0,0 +1,98 @@
+import type { SecretBackend } from '@prisma/client';
+import type { ISecretBackendRepository } from '../repositories/secret-backend.repository.js';
+import type { SecretBackendDriver } from './secret-backends/types.js';
+import { createDriver, type DriverFactoryDeps } from './secret-backends/factory.js';
+import { NotFoundError, ConflictError } from './mcp-server.service.js';
+
+export class SecretBackendInUseError extends Error {
+  constructor(backendName: string, count: number) {
+    super(`SecretBackend '${backendName}' is still referenced by ${String(count)} secret(s); migrate them first`);
+    this.name = 'SecretBackendInUseError';
+  }
+}
+
+export class SecretBackendService {
+  private driverCache = new Map<string, SecretBackendDriver>(); // keyed by backend id
+
+  constructor(
+    private readonly repo: ISecretBackendRepository,
+    private readonly driverDeps: DriverFactoryDeps,
+  ) {}
+
+  async list(): Promise<SecretBackend[]> {
+    return this.repo.findAll();
+  }
+
+  async getById(id: string): Promise<SecretBackend> {
+    const row = await this.repo.findById(id);
+    if (row === null) throw new NotFoundError(`SecretBackend not found: ${id}`);
+    return row;
+  }
+
+  async getByName(name: string): Promise<SecretBackend> {
+    const row = await this.repo.findByName(name);
+    if (row === null) throw new NotFoundError(`SecretBackend not found: ${name}`);
+    return row;
+  }
+
+  async getDefault(): Promise<SecretBackend> {
+    const row = await this.repo.findDefault();
+    if (row === null) {
+      throw new Error('No default SecretBackend configured. This shouldn\'t happen — the plaintext row should have been seeded on startup.');
+    }
+    return row;
+  }
+
+  async create(input: {
+    name: string;
+    type: string;
+    config?: Record<string, unknown>;
+    isDefault?: boolean;
+    description?: string;
+  }): Promise<SecretBackend> {
+    if (!input.name || !input.type) throw new Error('name and type are required');
+    const existing = await this.repo.findByName(input.name);
+    if (existing !== null) throw new ConflictError(`SecretBackend already exists: ${input.name}`);
+    return this.repo.create(input);
+  }
+
+  async update(id: string, input: { config?: Record<string, unknown>; isDefault?: boolean; description?: string }): Promise<SecretBackend> {
+    await this.getById(id);
+    const row = await this.repo.update(id, input);
+    this.driverCache.delete(id); // config may have changed; rebuild lazily
+    return row;
+  }
+
+  /**
+   * Replace `tokenMeta` on a backend row. Called exclusively by the rotator
+   * service every time it mints or fails to mint a successor token. The field
+   * is runtime state (not user-managed config) so it bypasses the normal
+   * update path + doesn't invalidate the driver cache.
+   */
+  async updateTokenMeta(id: string, tokenMeta: Record<string, unknown>): Promise<SecretBackend> {
+    return this.repo.update(id, { tokenMeta });
+  }
+
+  async setDefault(id: string): Promise<SecretBackend> {
+    await this.getById(id);
+    return this.repo.setAsDefault(id);
+  }
+
+  async delete(id: string): Promise<void> {
+    const row = await this.getById(id);
+    const count = await this.repo.countReferencingSecrets(id);
+    if (count > 0) throw new SecretBackendInUseError(row.name, count);
+    if (row.isDefault) throw new Error(`Cannot delete the default SecretBackend '${row.name}'; promote another one first`);
+    await this.repo.delete(id);
+    this.driverCache.delete(id);
+  }
+
+  /** Get the driver for a given backend id, creating + caching on first call. */
+  driverFor(backend: SecretBackend): SecretBackendDriver {
+    const cached = this.driverCache.get(backend.id);
+    if (cached) return cached;
+    const driver = createDriver(backend, this.driverDeps);
+    this.driverCache.set(backend.id, driver);
+    return driver;
+  }
+}
--- a/src/mcpd/src/services/secret-backends/factory.ts
+++ b/src/mcpd/src/services/secret-backends/factory.ts
@@ -0,0 +1,59 @@
+/**
+ * Build a `SecretBackendDriver` from a `SecretBackend` row.
+ *
+ * Lives separate from the service because it's the only place aware of every
+ * driver type — adding a new backend means adding one case here and one
+ * driver file. Everything else (service, routes, CLI) is type-agnostic.
+ */
+import type { SecretBackend } from '@prisma/client';
+import type { SecretBackendDriver, SecretRefResolver } from './types.js';
+import { PlaintextDriver, type PlaintextDriverDeps } from './plaintext.js';
+import { OpenBaoDriver, type OpenBaoConfig } from './openbao.js';
+
+export interface DriverFactoryDeps {
+  plaintext: PlaintextDriverDeps;
+  /** Resolves `{secretName, key}` against the plaintext backend — used by remote drivers' auth. */
+  secretRefResolver: SecretRefResolver;
+  /** Overridable for tests. */
+  fetch?: typeof globalThis.fetch;
+}
+
+export function createDriver(row: SecretBackend, deps: DriverFactoryDeps): SecretBackendDriver {
+  switch (row.type) {
+    case 'plaintext':
+      return new PlaintextDriver(deps.plaintext);
+
+    case 'openbao': {
+      const cfg = row.config as unknown as OpenBaoConfig;
+      if (!cfg.url) {
+        throw new Error(`SecretBackend '${row.name}' (openbao): config.url is required`);
+      }
+      const auth = cfg.auth ?? 'token';
+      if (auth === 'token') {
+        const t = cfg as Extract<OpenBaoConfig, { auth?: 'token' }>;
+        if (!t.tokenSecretRef?.name || !t.tokenSecretRef?.key) {
+          throw new Error(
+            `SecretBackend '${row.name}' (openbao token auth): config.tokenSecretRef {name, key} is required`,
+          );
+        }
+      } else if (auth === 'kubernetes') {
+        const k = cfg as Extract<OpenBaoConfig, { auth: 'kubernetes' }>;
+        if (!k.role) {
+          throw new Error(
+            `SecretBackend '${row.name}' (openbao kubernetes auth): config.role is required`,
+          );
+        }
+      } else {
+        throw new Error(`SecretBackend '${row.name}' (openbao): unknown auth '${String(auth)}'`);
+      }
+      const driverDeps: { fetch?: typeof globalThis.fetch; secretRefResolver: SecretRefResolver } = {
+        secretRefResolver: deps.secretRefResolver,
+      };
+      if (deps.fetch !== undefined) driverDeps.fetch = deps.fetch;
+      return new OpenBaoDriver(cfg, driverDeps);
+    }
+
+    default:
+      throw new Error(`Unknown SecretBackend type: ${row.type}`);
+  }
+}
--- a/src/mcpd/src/services/secret-backends/openbao.ts
+++ b/src/mcpd/src/services/secret-backends/openbao.ts
@@ -0,0 +1,247 @@
+/**
+ * OpenBao (MPL 2.0 fork of HashiCorp Vault) driver for the KV v2 secrets engine.
+ *
+ * Uses the plain HTTP API — no third-party client — so we don't pick up a
+ * Vault SDK licensing headache. Endpoints touched:
+ *
+ *   POST   <url>/v1/<mount>/data/<path>     -- write
+ *   GET    <url>/v1/<mount>/data/<path>     -- read latest
+ *   DELETE <url>/v1/<mount>/metadata/<path> -- full delete (all versions)
+ *   LIST   <url>/v1/<mount>/metadata/        -- for migration
+ *   POST   <url>/v1/auth/<mount>/login      -- kubernetes auth
+ *
+ * Auth strategies (`config.auth`):
+ *   - `token` (default): static token loaded once via the injected
+ *     SecretRefResolver from a Secret on the plaintext backend
+ *     (`tokenSecretRef = { name, key }`). Cached for the driver's lifetime —
+ *     no expiry handling.
+ *   - `kubernetes`: log in to OpenBao's Kubernetes auth method using the
+ *     pod's projected ServiceAccount token. OpenBao returns a client token +
+ *     lease TTL; we cache it and renew lazily on TTL expiry, with a 60s
+ *     grace window. No static credentials in the database — the bao-side
+ *     role binds to the mcpd ServiceAccount + namespace.
+ *
+ * Path layout inside OpenBao:
+ *   <mount>/<pathPrefix>/<secretName>
+ * `mount` and `pathPrefix` come from the backend's `config` JSON; defaults are
+ * `secret` and `mcpctl/`.
+ */
+import { readFile } from 'node:fs/promises';
+import type { SecretBackendDriver, SecretData, ExternalRef, SecretRefResolver } from './types.js';
+
+export interface OpenBaoConfigBase {
+  url: string;
+  mount?: string;
+  pathPrefix?: string;
+  namespace?: string;
+}
+
+export interface OpenBaoConfigToken extends OpenBaoConfigBase {
+  auth?: 'token';
+  tokenSecretRef: { name: string; key: string };
+}
+
+export interface OpenBaoConfigKubernetes extends OpenBaoConfigBase {
+  auth: 'kubernetes';
+  /** Vault role to login as (configured server-side at `auth/<authMount>/role/<role>`). */
+  role: string;
+  /** Auth method mount path. Defaults to `kubernetes`. */
+  authMount?: string;
+  /**
+   * Filesystem path to the projected ServiceAccount token. Defaults to
+   * `/var/run/secrets/kubernetes.io/serviceaccount/token` (the standard
+   * mount). Override only for tests or non-default projections.
+   */
+  serviceAccountTokenPath?: string;
+}
+
+export type OpenBaoConfig = OpenBaoConfigToken | OpenBaoConfigKubernetes;
+
+export interface OpenBaoDriverDeps {
+  /** Injected HTTP fetcher — mockable in tests. */
+  fetch?: typeof globalThis.fetch;
+  /** Required only for `auth: 'token'`. */
+  secretRefResolver?: SecretRefResolver;
+  /** Override for the SA-token reader; tests use this to supply a fake JWT. */
+  readServiceAccountToken?: (path: string) => Promise<string>;
+  /** Clock for cache TTL — overridable in tests. */
+  now?: () => number;
+}
+
+const SA_TOKEN_DEFAULT_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token';
+const TOKEN_RENEW_GRACE_MS = 60_000;
+
+export class OpenBaoDriver implements SecretBackendDriver {
+  readonly kind = 'openbao';
+
+  private readonly url: string;
+  private readonly mount: string;
+  private readonly pathPrefix: string;
+  private readonly namespace: string | undefined;
+  private readonly authStrategy: 'token' | 'kubernetes';
+  private readonly tokenSecretRef: { name: string; key: string } | undefined;
+  private readonly k8sRole: string | undefined;
+  private readonly k8sAuthMount: string;
+  private readonly k8sTokenPath: string;
+  private readonly fetchImpl: typeof globalThis.fetch;
+  private readonly resolver: SecretRefResolver | undefined;
+  private readonly readSaToken: (path: string) => Promise<string>;
+  private readonly nowFn: () => number;
+
+  // Cached vault token + when (epoch ms) it should be considered expired and refetched.
+  private cachedToken: string | undefined;
+  private cachedTokenExpiresAt: number = Number.POSITIVE_INFINITY;
+
+  constructor(config: OpenBaoConfig, deps: OpenBaoDriverDeps) {
+    this.url = config.url.replace(/\/+$/, '');
+    this.mount = (config.mount ?? 'secret').replace(/^\/|\/$/g, '');
+    this.pathPrefix = (config.pathPrefix ?? 'mcpctl').replace(/^\/|\/$/g, '');
+    if (config.namespace !== undefined) this.namespace = config.namespace;
+
+    this.authStrategy = config.auth ?? 'token';
+    if (this.authStrategy === 'kubernetes') {
+      const k = config as OpenBaoConfigKubernetes;
+      if (!k.role) throw new Error('openbao kubernetes auth: `role` is required');
+      this.k8sRole = k.role;
+      this.k8sAuthMount = (k.authMount ?? 'kubernetes').replace(/^\/|\/$/g, '');
+      this.k8sTokenPath = k.serviceAccountTokenPath ?? SA_TOKEN_DEFAULT_PATH;
+    } else {
+      const t = config as OpenBaoConfigToken;
+      if (!t.tokenSecretRef) throw new Error('openbao token auth: `tokenSecretRef` is required');
+      if (deps.secretRefResolver === undefined) {
+        throw new Error('openbao token auth: secretRefResolver dependency is required');
+      }
+      this.tokenSecretRef = t.tokenSecretRef;
+      this.k8sAuthMount = 'kubernetes';
+      this.k8sTokenPath = SA_TOKEN_DEFAULT_PATH;
+    }
+
+    this.fetchImpl = deps.fetch ?? globalThis.fetch;
+    if (deps.secretRefResolver !== undefined) this.resolver = deps.secretRefResolver;
+    this.readSaToken = deps.readServiceAccountToken ?? ((path) => readFile(path, 'utf-8').then((s) => s.trim()));
+    this.nowFn = deps.now ?? (() => Date.now());
+  }
+
+  async read(input: { name: string; externalRef: ExternalRef; data: SecretData }): Promise<SecretData> {
+    const path = this.pathFor(input.name);
+    const res = await this.request('GET', `/v1/${this.mount}/data/${path}`);
+    if (res.status === 404) {
+      throw new Error(`OpenBao: secret '${input.name}' not found at ${path}`);
+    }
+    if (!res.ok) throw new Error(`OpenBao read ${path}: HTTP ${res.status}`);
+    const body = await res.json() as { data?: { data?: SecretData } };
+    return body.data?.data ?? {};
+  }
+
+  async write(input: { name: string; data: SecretData }): Promise<{ externalRef: ExternalRef; storedData: SecretData }> {
+    const path = this.pathFor(input.name);
+    const res = await this.request('POST', `/v1/${this.mount}/data/${path}`, { data: input.data });
+    if (!res.ok) throw new Error(`OpenBao write ${path}: HTTP ${res.status}`);
+    return { externalRef: `${this.mount}/${path}`, storedData: {} };
+  }
+
+  async delete(input: { name: string; externalRef: ExternalRef }): Promise<void> {
+    const path = this.pathFor(input.name);
+    const res = await this.request('DELETE', `/v1/${this.mount}/metadata/${path}`);
+    if (!res.ok && res.status !== 404) {
+      throw new Error(`OpenBao delete ${path}: HTTP ${res.status}`);
+    }
+  }
+
+  async list(): Promise<Array<{ name: string; externalRef: ExternalRef }>> {
+    const listPath = this.pathPrefix === '' ? '' : `${this.pathPrefix}/`;
+    const res = await this.request('LIST', `/v1/${this.mount}/metadata/${listPath}`);
+    if (res.status === 404) return [];
+    if (!res.ok) throw new Error(`OpenBao list: HTTP ${res.status}`);
+    const body = await res.json() as { data?: { keys?: string[] } };
+    const keys = body.data?.keys ?? [];
+    return keys
+      .filter((k) => !k.endsWith('/'))
+      .map((k) => ({
+        name: k,
+        externalRef: `${this.mount}/${this.pathPrefix === '' ? '' : `${this.pathPrefix}/`}${k}`,
+      }));
+  }
+
+  async healthCheck(): Promise<{ ok: boolean; detail?: string }> {
+    try {
+      const res = await this.request('GET', '/v1/sys/health');
+      return { ok: res.ok, detail: `HTTP ${res.status}` };
+    } catch (err) {
+      return { ok: false, detail: err instanceof Error ? err.message : String(err) };
+    }
+  }
+
+  private pathFor(name: string): string {
+    const safe = encodeURIComponent(name);
+    return this.pathPrefix === '' ? safe : `${this.pathPrefix}/${safe}`;
+  }
+
+  private async getToken(): Promise<string> {
+    if (this.cachedToken !== undefined && this.nowFn() < this.cachedTokenExpiresAt - TOKEN_RENEW_GRACE_MS) {
+      return this.cachedToken;
+    }
+
+    if (this.authStrategy === 'token') {
+      // Token read from a plaintext Secret. Its TTL is not tracked here: cache until a 403 in request() forces a re-resolve.
+      const token = await this.resolver!.resolve(this.tokenSecretRef!.name, this.tokenSecretRef!.key);
+      this.cachedToken = token;
+      this.cachedTokenExpiresAt = Number.POSITIVE_INFINITY;
+      return token;
+    }
+
+    // Kubernetes auth: read the projected SA JWT, exchange it for a Vault token.
+    const jwt = await this.readSaToken(this.k8sTokenPath);
+    const loginUrl = `${this.url}/v1/auth/${this.k8sAuthMount}/login`;
+    const headers: Record<string, string> = { 'Content-Type': 'application/json' };
+    if (this.namespace !== undefined) headers['X-Vault-Namespace'] = this.namespace;
+    const res = await this.fetchImpl(loginUrl, {
+      method: 'POST',
+      headers,
+      body: JSON.stringify({ role: this.k8sRole, jwt }),
+    });
+    if (!res.ok) {
+      const text = await res.text().catch(() => '');
+      throw new Error(`OpenBao kubernetes login (role=${this.k8sRole!}): HTTP ${String(res.status)} ${text}`);
+    }
+    const body = await res.json() as { auth?: { client_token?: string; lease_duration?: number } };
+    const clientToken = body.auth?.client_token;
+    if (clientToken === undefined || clientToken === '') {
+      throw new Error(`OpenBao kubernetes login: response missing auth.client_token`);
+    }
+    // lease_duration is seconds; 0 means token doesn't expire (rare for k8s auth).
+    const leaseSec = body.auth?.lease_duration ?? 0;
+    this.cachedToken = clientToken;
+    this.cachedTokenExpiresAt = leaseSec > 0
+      ? this.nowFn() + leaseSec * 1000
+      : Number.POSITIVE_INFINITY;
+    return clientToken;
+  }
+
+  private async request(method: string, path: string, body?: unknown): Promise<Response> {
+    const token = await this.getToken();
+    const headers: Record<string, string> = { 'X-Vault-Token': token };
+    if (this.namespace !== undefined) headers['X-Vault-Namespace'] = this.namespace;
+    if (body !== undefined) headers['Content-Type'] = 'application/json';
+
+    const init: RequestInit = { method, headers };
+    if (body !== undefined) init.body = JSON.stringify(body);
+
+    const res = await this.fetchImpl(`${this.url}${path}`, init);
+
+    // If the server rejects the cached token (expiry between cache-check and request,
+    // clock skew, revocation, or rotation of the stored token), purge cache and retry once.
+    if (res.status === 403 && this.cachedToken !== undefined) {
+      this.cachedToken = undefined;
+      this.cachedTokenExpiresAt = 0;
+      const fresh = await this.getToken();
+      const retryHeaders: Record<string, string> = { 'X-Vault-Token': fresh };
+      if (this.namespace !== undefined) retryHeaders['X-Vault-Namespace'] = this.namespace;
+      if (body !== undefined) retryHeaders['Content-Type'] = 'application/json';
+      const retryInit: RequestInit = { method, headers: retryHeaders };
+      if (body !== undefined) retryInit.body = JSON.stringify(body);
+      return this.fetchImpl(`${this.url}${path}`, retryInit);
+    }
+    return res;
+  }
+}
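Review note: the cache check in `getToken()` reduces to two small pieces of arithmetic: turning `lease_duration` (seconds) into an absolute expiry, and treating the token as stale once it is within `TOKEN_RENEW_GRACE_MS` of that expiry. A minimal standalone sketch follows; `expiryFromLease` and `isCachedTokenUsable` are hypothetical helper names used only for illustration and are not part of the driver.

```typescript
// Same grace window as the driver: refresh one minute before expiry.
const TOKEN_RENEW_GRACE_MS = 60_000;

/** Convert a login response's lease_duration (seconds) into an expiry in epoch ms. */
function expiryFromLease(nowMs: number, leaseSec: number): number {
  // A lease of 0 is treated as "never expires", as in the driver.
  return leaseSec > 0 ? nowMs + leaseSec * 1000 : Number.POSITIVE_INFINITY;
}

/** True while the cached token is still outside the renewal grace window. */
function isCachedTokenUsable(nowMs: number, expiresAtMs: number): boolean {
  return nowMs < expiresAtMs - TOKEN_RENEW_GRACE_MS;
}

const issuedAt = 1_000_000;
const expiresAt = expiryFromLease(issuedAt, 3600); // one-hour lease
console.log(isCachedTokenUsable(issuedAt + 3_500_000, expiresAt)); // true
console.log(isCachedTokenUsable(issuedAt + 3_545_000, expiresAt)); // false (inside grace window)
```

Keeping the arithmetic in pure functions like these would make the boundary cases testable without the injected `now` and `fetch` dependencies, though the driver as written already supports that through `deps.now`.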
--- a/src/mcpd/src/services/secret-backends/plaintext.ts
+++ b/src/mcpd/src/services/secret-backends/plaintext.ts
@@ -0,0 +1,44 @@
+/**
+ * Plaintext backend driver — stores Secret.data directly in the DB column.
+ *
+ * This is the bootstrap/default backend. It always exists (seeded on startup)
+ * so the system can hold its own backends' auth credentials (e.g. OpenBao
+ * token) somewhere before the real backend is configured.
+ *
+ * The driver is deliberately almost a no-op: the service writes to and reads
+ * from `Secret.data` directly. We still route through the driver interface so
+ * the service layer can stay uniform.
+ */
+import type { SecretBackendDriver, SecretData, ExternalRef } from './types.js';
+
+export interface PlaintextDriverDeps {
+  /** Queries `prisma.secret.findMany(...)` for the `list` method (migration path). */
+  listAllPlaintext: () => Promise<Array<{ name: string; data: SecretData }>>;
+}
+
+export class PlaintextDriver implements SecretBackendDriver {
+  readonly kind = 'plaintext';
+
+  constructor(private readonly deps: PlaintextDriverDeps) {}
+
+  async read(input: { name: string; externalRef: ExternalRef; data: SecretData }): Promise<SecretData> {
+    return input.data;
+  }
+
+  async write(input: { name: string; data: SecretData }): Promise<{ externalRef: ExternalRef; storedData: SecretData }> {
+    return { externalRef: '', storedData: input.data };
+  }
+
+  async delete(_input: { name: string; externalRef: ExternalRef }): Promise<void> {
+    // The row deletion itself is the secret service's job; nothing remote to clean up here.
+  }
+
+  async list(): Promise<Array<{ name: string; externalRef: ExternalRef }>> {
+    const rows = await this.deps.listAllPlaintext();
+    return rows.map((r) => ({ name: r.name, externalRef: '' }));
+  }
+
+  async healthCheck(): Promise<{ ok: boolean; detail?: string }> {
+    return { ok: true, detail: 'plaintext backend (DB)' };
+  }
+}
--- a/src/mcpd/src/services/secret-backends/types.ts
+++ b/src/mcpd/src/services/secret-backends/types.ts
@@ -0,0 +1,68 @@
+/**
+ * SecretBackend driver interface.
+ *
+ * The plaintext backend stores `data` in the DB column directly.
+ * Remote backends (openbao, vault, cloud KV) store an opaque `externalRef`
+ * and fetch the actual data on demand.
+ *
+ * Drivers are built from a `SecretBackend` config row and keep no per-secret
+ * state. Secret management (CRUD, naming) stays in the service layer; drivers
+ * handle only the storage I/O.
+ */
+
+/**
+ * Opaque reference written by a driver on `write` and read back on `read`.
+ *
+ * For the plaintext driver this is unused (an empty string); the data itself
+ * lives in `Secret.data`. For openbao it's a string like `secret/mcpctl/mysecret`
+ * (`<mount>/<pathPrefix>/<name>`) recording where the secret was written.
+ */
+export type ExternalRef = string;
+
+/** The shape of secret data — a flat map of key → value. */
+export type SecretData = Record<string, string>;
+
+export interface SecretBackendDriver {
+  /** Human-readable identifier, included in errors. */
+  readonly kind: string;
+
+  /**
+   * Read the stored secret. For plaintext this is a no-op — the data is
+   * already in the Secret row and passed in here for symmetry. For remote
+   * backends this makes the network call.
+   */
+  read(input: { name: string; externalRef: ExternalRef; data: SecretData }): Promise<SecretData>;
+
+  /**
+   * Store a new secret (or a new version of an existing one). Returns the
+   * reference (or an empty string for plaintext) + the `data` object that
+   * should be persisted on the Secret row (empty for remote backends).
+   */
+  write(input: { name: string; data: SecretData }): Promise<{ externalRef: ExternalRef; storedData: SecretData }>;
+
+  /** Remove the secret from the backend. Idempotent — missing is OK. */
+  delete(input: { name: string; externalRef: ExternalRef }): Promise<void>;
+
+  /** List everything the backend knows about. Used for migration + drift detection. */
+  list(): Promise<Array<{ name: string; externalRef: ExternalRef }>>;
+
+  /** Optional: health probe. Used by `mcpctl describe secretbackend`. */
+  healthCheck?(): Promise<{ ok: boolean; detail?: string }>;
+}
+
+/** Stored config for a SecretBackend row; dispatched on `type`. */
+export interface BackendRow {
+  id: string;
+  name: string;
+  type: string;
+  config: Record<string, unknown>;
+}
+
+/**
+ * Dependency passed to the openbao driver so it can resolve its own auth
+ * token (stored in the plaintext backend — chicken-and-egg bootstrap).
+ * Implemented by the SecretService so we don't have a circular import.
+ */
+export interface SecretRefResolver {
+  resolve(secretName: string, key: string): Promise<string>;
+}
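Review note: the relationship between a secret name, the KV v2 URL paths the OpenBao driver requests, and the `ExternalRef` it returns can be sketched as below. This mirrors `pathFor()` and the `data/` and `metadata/` paths used in `read`, `write`, and `delete`; `trimSlashes` and `kvPaths` are hypothetical names introduced for illustration only.

```typescript
/** Strip any leading and trailing slashes from a mount or prefix. */
function trimSlashes(s: string): string {
  return s.replace(/^\/+|\/+$/g, '');
}

/** Hypothetical helper: derive the KV v2 request paths and the stored reference. */
function kvPaths(mount: string, pathPrefix: string, name: string) {
  const m = trimSlashes(mount);
  const p = trimSlashes(pathPrefix);
  const safe = encodeURIComponent(name); // same escaping as pathFor()
  const rel = p === '' ? safe : `${p}/${safe}`;
  return {
    dataUrlPath: `/v1/${m}/data/${rel}`,         // GET to read, POST to write a version
    metadataUrlPath: `/v1/${m}/metadata/${rel}`, // DELETE removes all versions
    externalRef: `${m}/${rel}`,                  // what write() returns
  };
}

console.log(kvPaths('secret', 'mcpctl', 'my secret'));
```

Note that the reference omits the `data/` segment, so it identifies the logical secret rather than a specific API endpoint.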
--- a/src/mcpd/src/services/secret-migrate.service.ts
+++ b/src/mcpd/src/services/secret-migrate.service.ts
@@ -0,0 +1,113 @@
+/**
+ * Move secrets from one SecretBackend to another.
+ *
+ * Per-secret atomicity: for each secret we
+ *   1. resolve the data via the source driver,
+ *   2. write it to the destination driver,
+ *   3. update the Secret row (flip backendId + set new externalRef, clear data),
+ *   4. optionally delete from source.
+ *
+ * If the process dies between 2 and 3, the destination has an orphan entry
+ * but the row still points at the source; a re-run only picks up rows still
+ * on the source and rewrites the orphan. We never run a batch-wide
+ * transaction because each remote driver write is an HTTP call that can't roll back.
+ */
+import type { Secret } from '@prisma/client';
+import type { ISecretRepository } from '../repositories/interfaces.js';
+import type { SecretBackendService } from './secret-backend.service.js';
+
+export interface MigrateOptions {
+  /** Source backend name. */
+  from: string;
+  /** Destination backend name. */
+  to: string;
+  /** If provided, only migrate secrets with these names. Otherwise migrate all. */
+  names?: string[];
+  /** Leave the source copy intact after migration. Default false. */
+  keepSource?: boolean;
+}
+
+export interface MigrateResult {
+  migrated: Array<{ name: string }>;
+  skipped: Array<{ name: string; reason: string }>;
+  failed: Array<{ name: string; error: string }>;
+}
+
+export class SecretMigrateService {
+  constructor(
+    private readonly secretRepo: ISecretRepository,
+    private readonly backends: SecretBackendService,
+  ) {}
+
+  async migrate(opts: MigrateOptions): Promise<MigrateResult> {
+    const source = await this.backends.getByName(opts.from);
+    const dest = await this.backends.getByName(opts.to);
+    if (source.id === dest.id) {
+      return { migrated: [], skipped: [], failed: [{ name: '*', error: 'source and destination are the same backend' }] };
+    }
+
+    const sourceDriver = this.backends.driverFor(source);
+    const destDriver = this.backends.driverFor(dest);
+
+    let secrets = await this.secretRepo.findByBackend(source.id);
+    if (opts.names && opts.names.length > 0) {
+      const wanted = new Set(opts.names);
+      secrets = secrets.filter((s) => wanted.has(s.name));
+    }
+
+    const result: MigrateResult = { migrated: [], skipped: [], failed: [] };
+    for (const secret of secrets) {
+      try {
+        // Skip if somehow already on destination (re-run safety).
+        if (secret.backendId === dest.id) {
+          result.skipped.push({ name: secret.name, reason: 'already on destination' });
+          continue;
+        }
+
+        const data = await sourceDriver.read({
+          name: secret.name,
+          externalRef: secret.externalRef,
+          data: secret.data as Record<string, string>,
+        });
+        const written = await destDriver.write({ name: secret.name, data });
+
+        await this.secretRepo.update(secret.id, {
+          backendId: dest.id,
+          data: written.storedData,
+          externalRef: written.externalRef,
+        });
+
+        if (opts.keepSource !== true) {
+          await sourceDriver.delete({ name: secret.name, externalRef: secret.externalRef })
+            .catch((err: unknown) => {
+              // Destination is intact; source cleanup is best-effort. Record under `skipped` and continue.
+              const msg = err instanceof Error ? err.message : String(err);
+              result.skipped.push({ name: secret.name, reason: `migrated OK; source cleanup failed: ${msg}` });
+            });
+        }
+
+        result.migrated.push({ name: secret.name });
+      } catch (err) {
+        const msg = err instanceof Error ? err.message : String(err);
+        result.failed.push({ name: secret.name, error: msg });
+      }
+    }
+
+    return result;
+  }
+
+  /** List which secrets would be touched by a migrate run, without performing it. */
+  async dryRun(opts: MigrateOptions): Promise<Array<Secret>> {
+    const source = await this.backends.getByName(opts.from);
+    let secrets = await this.secretRepo.findByBackend(source.id);
+    if (opts.names && opts.names.length > 0) {
+      const wanted = new Set(opts.names);
+      secrets = secrets.filter((s) => wanted.has(s.name));
+    }
+    return secrets;
+  }
+}
+
+export interface SecretMigrateRouteDeps {
+  migrateService: SecretMigrateService;
+}
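Review note: because a failed source cleanup is recorded under `skipped` while the secret is still pushed to `migrated`, a caller that reports on a `MigrateResult` may want to separate the two cases. A minimal sketch of such a summary follows; `summarizeMigration` is a hypothetical helper, not something this change adds, and the exit-code convention is an assumption.

```typescript
// Result shape copied from MigrateResult in the service above.
interface MigrateResult {
  migrated: Array<{ name: string }>;
  skipped: Array<{ name: string; reason: string }>;
  failed: Array<{ name: string; error: string }>;
}

/** Hypothetical helper: one-line summary plus an exit code suitable for CI. */
function summarizeMigration(result: MigrateResult): { line: string; exitCode: number } {
  const migratedNames = new Set(result.migrated.map((m) => m.name));
  // A secret whose source cleanup failed appears in both `migrated` and `skipped`.
  const cleanupPending = result.skipped.filter((s) => migratedNames.has(s.name));
  const trulySkipped = result.skipped.length - cleanupPending.length;
  const line =
    `migrated=${result.migrated.length} skipped=${trulySkipped} ` +
    `cleanup-pending=${cleanupPending.length} failed=${result.failed.length}`;
  return { line, exitCode: result.failed.length > 0 ? 1 : 0 };
}

const exampleResult: MigrateResult = {
  migrated: [{ name: 'db-creds' }, { name: 'api-key' }],
  skipped: [{ name: 'api-key', reason: 'migrated OK; source cleanup failed: HTTP 500' }],
  failed: [{ name: 'legacy', error: 'OpenBao write mcpctl/legacy: HTTP 403' }],
};
console.log(summarizeMigration(exampleResult).line);
```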
--- a/src/mcpd/src/services/secret.service.ts
+++ b/src/mcpd/src/services/secret.service.ts
@@ -1,10 +1,23 @@
+/**
+ * SecretService — CRUD over `Secret` rows.
+ *
+ * Dispatches storage I/O through the `SecretBackendService`: on create/update
+ * the default backend's driver writes, and the resulting {externalRef,
+ * storedData} is persisted on the row. On read (`resolveData`) the row's
+ * `backendId` selects the driver, which fetches the actual data.
+ */
 import type { Secret } from '@prisma/client';
 import type { ISecretRepository } from '../repositories/interfaces.js';
+import type { SecretBackendService } from './secret-backend.service.js';
 import { CreateSecretSchema, UpdateSecretSchema } from '../validation/secret.schema.js';
 import { NotFoundError, ConflictError } from './mcp-server.service.js';
+import type { SecretRefResolver } from './secret-backends/types.js';

-export class SecretService {
-  constructor(private readonly repo: ISecretRepository) {}
+export class SecretService implements SecretRefResolver {
+  constructor(
+    private readonly repo: ISecretRepository,
+    private readonly backends: SecretBackendService,
+  ) {}

  async list(): Promise<Secret[]> {
    return this.repo.findAll();
@@ -26,47 +39,79 @@ export class SecretService {
    return secret;
  }

+  /** Return the secret's actual data by dispatching through its backend driver. */
+  async resolveData(secret: Secret): Promise<Record<string, string>> {
+    const backend = await this.backends.getById(secret.backendId);
+    const driver = this.backends.driverFor(backend);
+    return driver.read({
+      name: secret.name,
+      externalRef: secret.externalRef,
+      data: secret.data as Record<string, string>,
+    });
+  }
+
+  /** Convenience: resolve {secretName, key} → string. Implements SecretRefResolver. */
+  async resolve(secretName: string, key: string): Promise<string> {
+    const secret = await this.getByName(secretName);
+    const data = await this.resolveData(secret);
+    const value = data[key];
+    if (value === undefined) {
+      throw new NotFoundError(`Secret '${secretName}' has no key '${key}'`);
+    }
+    return value;
+  }
+
  async create(input: unknown): Promise<Secret> {
    const data = CreateSecretSchema.parse(input);
-
    const existing = await this.repo.findByName(data.name);
    if (existing !== null) {
      throw new ConflictError(`Secret already exists: ${data.name}`);
    }
-
-    return this.repo.create(data);
+    const backend = await this.backends.getDefault();
+    const driver = this.backends.driverFor(backend);
+    const written = await driver.write({ name: data.name, data: data.data });
+    return this.repo.create({
+      name: data.name,
+      backendId: backend.id,
+      data: written.storedData,
+      externalRef: written.externalRef,
+    });
  }

  async update(id: string, input: unknown): Promise<Secret> {
    const data = UpdateSecretSchema.parse(input);
-
-    // Verify exists
-    await this.getById(id);
-
-    return this.repo.update(id, data);
+    const existing = await this.getById(id);
+    const backend = await this.backends.getById(existing.backendId);
+    const driver = this.backends.driverFor(backend);
+    const written = await driver.write({ name: existing.name, data: data.data });
+    return this.repo.update(id, {
+      data: written.storedData,
+      externalRef: written.externalRef,
+    });
  }

  async delete(id: string): Promise<void> {
-    // Verify exists
-    await this.getById(id);
+    const existing = await this.getById(id);
+    const backend = await this.backends.getById(existing.backendId);
+    const driver = this.backends.driverFor(backend);
+    await driver.delete({ name: existing.name, externalRef: existing.externalRef });
    await this.repo.delete(id);
  }

-  // ── Backup/restore helpers ──
+  // ── Backup/restore helpers (preserved) ──

  async upsertByName(data: Record<string, unknown>): Promise<Secret> {
    const name = data['name'] as string;
    const existing = await this.repo.findByName(name);
    if (existing !== null) {
-      const { name: _, ...updateFields } = data;
-      return this.repo.update(existing.id, updateFields as Parameters<ISecretRepository['update']>[1]);
+      return this.update(existing.id, data);
    }
-    return this.repo.create(data as Parameters<ISecretRepository['create']>[0]);
+    return this.create(data);
  }

  async deleteByName(name: string): Promise<void> {
    const existing = await this.repo.findByName(name);
    if (existing === null) return;
-    await this.repo.delete(existing.id);
+    await this.delete(existing.id);
  }
 }
--- a/src/mcpd/src/services/transport/persistent-stdio.ts
+++ b/src/mcpd/src/services/transport/persistent-stdio.ts
@@ -1,14 +1,24 @@
 import type { McpOrchestrator, InteractiveExec } from '../orchestrator.js';
 import type { McpProxyResponse } from '../mcp-proxy-service.js';

+export type StdioMode =
+  | { kind: 'exec'; command: string[] }
+  | { kind: 'attach' };
+
 /**
- * Persistent STDIO connection to an MCP server running inside a Docker container.
+ * Persistent STDIO connection to an MCP server running inside a container.
 *
- * Instead of cold-starting a new process per call (docker exec one-shot), this keeps
- * a long-running `docker exec -i <cmd>` session alive. The MCP init handshake runs
- * once, then tool calls are multiplexed over the same stdin/stdout pipe.
+ * Two modes:
+ *   exec   — start a new process in the container (`docker exec -i <cmd>` /
+ *            `kubectl exec -i`) and speak MCP to it. Used for runner-image
+ *            servers where mcpctl launches the MCP binary itself.
+ *   attach — attach to the container's PID 1 stdin/stdout. Used for
+ *            docker-image servers whose entrypoint IS the MCP server
+ *            (e.g. gitea-mcp-server, docmost-mcp).
 *
- * Falls back gracefully: if the process dies, the next call will reconnect.
+ * In both modes the MCP init handshake runs once; subsequent tool calls
+ * are multiplexed over the same pipe. If the session dies, the next call
+ * will reconnect.
 */
 export class PersistentStdioClient {
  private exec: InteractiveExec | null = null;
@@ -25,7 +35,7 @@ export class PersistentStdioClient {
  constructor(
    private readonly orchestrator: McpOrchestrator,
    private readonly containerId: string,
-    private readonly command: string[],
+    private readonly mode: StdioMode,
    private readonly timeoutMs = 120_000,
  ) {}

@@ -90,11 +100,18 @@ export class PersistentStdioClient {
  private async connect(): Promise<void> {
    this.close();

-    if (!this.orchestrator.execInteractive) {
-      throw new Error('Orchestrator does not support interactive exec');
+    let exec: InteractiveExec;
+    if (this.mode.kind === 'attach') {
+      if (!this.orchestrator.attachInteractive) {
+        throw new Error('Orchestrator does not support attach');
+      }
+      exec = await this.orchestrator.attachInteractive(this.containerId);
+    } else {
+      if (!this.orchestrator.execInteractive) {
+        throw new Error('Orchestrator does not support interactive exec');
+      }
+      exec = await this.orchestrator.execInteractive(this.containerId, this.mode.command);
    }
-
-    const exec = await this.orchestrator.execInteractive(this.containerId, this.command);
    this.exec = exec;
    this.buffer = '';

--- a/src/mcpd/src/validation/llm.schema.ts
+++ b/src/mcpd/src/validation/llm.schema.ts
@@ -0,0 +1,39 @@
+import { z } from 'zod';
+
+export const LLM_TYPES = ['anthropic', 'openai', 'deepseek', 'vllm', 'ollama', 'gemini-cli'] as const;
+export const LLM_TIERS = ['fast', 'heavy'] as const;
+
+/**
+ * Reference to a key inside a Secret. `name` is the Secret resource name;
+ * `key` is the JSON key inside that secret's `data` map. mcpd resolves the
+ * pair through SecretService at inference time, so credentials never leave
+ * the server.
+ */
+export const ApiKeyRefSchema = z.object({
+  name: z.string().min(1),
+  key: z.string().min(1),
+});
+
+export const CreateLlmSchema = z.object({
+  name: z.string().min(1).max(100).regex(/^[a-z0-9-]+$/, 'Name must be lowercase alphanumeric with hyphens'),
+  type: z.enum(LLM_TYPES),
+  model: z.string().min(1),
+  url: z.string().url().optional(),
+  tier: z.enum(LLM_TIERS).default('fast'),
+  description: z.string().max(500).default(''),
+  apiKeyRef: ApiKeyRefSchema.optional(),
+  extraConfig: z.record(z.unknown()).default({}),
+});
+
+export const UpdateLlmSchema = z.object({
+  model: z.string().min(1).optional(),
+  url: z.string().url().or(z.literal('')).optional(),
+  tier: z.enum(LLM_TIERS).optional(),
+  description: z.string().max(500).optional(),
+  apiKeyRef: ApiKeyRefSchema.nullable().optional(),
+  extraConfig: z.record(z.unknown()).optional(),
+});
+
+export type CreateLlmInput = z.infer<typeof CreateLlmSchema>;
+export type UpdateLlmInput = z.infer<typeof UpdateLlmSchema>;
+export type ApiKeyRef = z.infer<typeof ApiKeyRefSchema>;
--- a/src/mcpd/src/validation/mcp-token.schema.ts
+++ b/src/mcpd/src/validation/mcp-token.schema.ts
@@ -0,0 +1,21 @@
+import { z } from 'zod';
+import { RbacRoleBindingSchema } from './rbac-definition.schema.js';
+
+export const McpTokenRbacMode = z.enum(['empty', 'clone']);
+export type McpTokenRbacMode = z.infer<typeof McpTokenRbacMode>;
+
+export const CreateMcpTokenSchema = z.object({
+  name: z
+    .string()
+    .min(1)
+    .max(100)
+    .regex(/^[a-z0-9-]+$/, 'Name must be lowercase alphanumeric with hyphens'),
+  projectId: z.string().min(1),
+  description: z.string().optional(),
+  expiresAt: z.union([z.string().datetime(), z.date(), z.null()]).optional(),
+  rbacMode: McpTokenRbacMode.default('empty'),
+  /** Explicit bindings, added on top of the `rbacMode` base (empty or clone). */
+  bindings: z.array(RbacRoleBindingSchema).default([]),
+});
+
+export type CreateMcpTokenInput = z.infer<typeof CreateMcpTokenSchema>;
--- a/src/mcpd/src/validation/rbac-definition.schema.ts
+++ b/src/mcpd/src/validation/rbac-definition.schema.ts
@@ -1,7 +1,7 @@
 import { z } from 'zod';

 export const RBAC_ROLES = ['edit', 'view', 'create', 'delete', 'run', 'expose'] as const;
-export const RBAC_RESOURCES = ['*', 'servers', 'instances', 'secrets', 'projects', 'templates', 'users', 'groups', 'rbac', 'prompts', 'promptrequests'] as const;
+export const RBAC_RESOURCES = ['*', 'servers', 'instances', 'secrets', 'secretbackends', 'llms', 'projects', 'templates', 'users', 'groups', 'rbac', 'prompts', 'promptrequests', 'mcptokens'] as const;

 /** Singular→plural map for resource names. */
 const RESOURCE_ALIASES: Record<string, string> = {
@@ -14,6 +14,9 @@ const RESOURCE_ALIASES: Record<string, string> = {
  group: 'groups',
  prompt: 'prompts',
  promptrequest: 'promptrequests',
+  mcptoken: 'mcptokens',
+  secretbackend: 'secretbackends',
+  llm: 'llms',
 };

 /** Normalize a resource name to its canonical plural form. */
@@ -22,7 +25,7 @@ export function normalizeResource(resource: string): string {
 }

 export const RbacSubjectSchema = z.object({
-  kind: z.enum(['User', 'Group', 'ServiceAccount']),
+  kind: z.enum(['User', 'Group', 'ServiceAccount', 'McpToken']),
  name: z.string().min(1),
 });

--- a/src/mcpd/tests/auth.test.ts
+++ b/src/mcpd/tests/auth.test.ts
@@ -99,3 +99,76 @@ describe('auth middleware', () => {
    expect(findSession).toHaveBeenCalledWith('my-token');
  });
 });
+
+describe('auth middleware — McpToken dispatch', () => {
+  async function setupAppWithMcpToken(deps: Parameters<typeof createAuthMiddleware>[0]) {
+    app = Fastify({ logger: false });
+    const authMiddleware = createAuthMiddleware(deps);
+    app.addHook('preHandler', authMiddleware);
+    app.get('/protected', async (request) => ({
+      userId: request.userId,
+      mcpToken: request.mcpToken,
+    }));
+    return app.ready();
+  }
+
+  it('routes mcpctl_pat_ bearers to findMcpToken and skips findSession', async () => {
+    const findSession = vi.fn(async () => null);
+    const findMcpToken = vi.fn(async () => ({
+      tokenId: 'ctok1',
+      tokenName: 'mytok',
+      tokenSha: 'deadbeef',
+      projectId: 'cproj1',
+      projectName: 'myproj',
+      ownerId: 'cuser1',
+      expiresAt: null,
+      revokedAt: null,
+    }));
+    await setupAppWithMcpToken({ findSession, findMcpToken });
+    const res = await app.inject({
+      method: 'GET',
+      url: '/protected',
+      headers: { authorization: 'Bearer mcpctl_pat_abcdefghij' },
+    });
+    expect(res.statusCode).toBe(200);
+    expect(findSession).not.toHaveBeenCalled();
+    expect(findMcpToken).toHaveBeenCalledTimes(1);
+    const body = res.json<{ userId: string; mcpToken: { tokenName: string; projectName: string } }>();
+    expect(body.userId).toBe('cuser1');
+    expect(body.mcpToken.tokenName).toBe('mytok');
+    expect(body.mcpToken.projectName).toBe('myproj');
+  });
+
+  it('returns 401 for a revoked McpToken', async () => {
+    await setupAppWithMcpToken({
+      findSession: async () => null,
+      findMcpToken: async () => ({
+        tokenId: 'ctok1',
+        tokenName: 'mytok',
+        tokenSha: 'x',
+        projectId: 'p',
+        projectName: 'p',
+        ownerId: 'u',
+        expiresAt: null,
+        revokedAt: new Date(),
+      }),
+    });
+    const res = await app.inject({
+      method: 'GET',
+      url: '/protected',
+      headers: { authorization: 'Bearer mcpctl_pat_revoked' },
+    });
+    expect(res.statusCode).toBe(401);
+    expect(res.json<{ error: string }>().error).toContain('revoked');
+  });
+
+  it('returns 401 when a mcpctl_pat_ bearer arrives but findMcpToken is not configured', async () => {
+    await setupAppWithMcpToken({ findSession: async () => null });
+    const res = await app.inject({
+      method: 'GET',
+      url: '/protected',
+      headers: { authorization: 'Bearer mcpctl_pat_no-lookup-wired' },
+    });
+    expect(res.statusCode).toBe(401);
+  });
+});
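Review note: the three tests above pin down a dispatch rule: bearers starting with `mcpctl_pat_` go to `findMcpToken`, everything else goes to `findSession`. The middleware itself is not part of this hunk, so the following is only a sketch of that rule under stated assumptions; `classifyBearer` and `MCP_TOKEN_PREFIX` are hypothetical names, and the real middleware may parse the header differently.

```typescript
// Prefix taken from the test titles above.
const MCP_TOKEN_PREFIX = 'mcpctl_pat_';

type BearerKind = 'mcpToken' | 'session';

/** Hypothetical helper: decide which lookup a bearer credential should be sent to. */
function classifyBearer(
  authorization: string | undefined,
): { kind: BearerKind; token: string } | null {
  if (authorization === undefined) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorization);
  if (match === null) return null;
  const token = match[1];
  if (token === undefined) return null;
  const kind: BearerKind = token.startsWith(MCP_TOKEN_PREFIX) ? 'mcpToken' : 'session';
  return { kind, token };
}

console.log(classifyBearer('Bearer mcpctl_pat_abcdefghij'));
console.log(classifyBearer('Bearer my-token'));
```

Deciding on the prefix before any lookup is what lets the first test assert that `findSession` is never called for a PAT.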
--- a/src/mcpd/tests/backup.test.ts
+++ b/src/mcpd/tests/backup.test.ts
@@ -9,6 +9,25 @@ import type { IProjectRepository } from '../src/repositories/project.repository.
 import type { IUserRepository } from '../src/repositories/user.repository.js';
 import type { IGroupRepository } from '../src/repositories/group.repository.js';
 import type { IRbacDefinitionRepository } from '../src/repositories/rbac-definition.repository.js';
+import type { SecretService } from '../src/services/secret.service.js';
+
+/**
+ * Minimal SecretService shim over a mock repo — just the `.create()` / `.update()`
+ * methods that RestoreService calls. We don't need the backend-dispatch path
+ * here since the restore happy-path tests don't exercise remote backends.
+ */
+function mockSecretService(repo: ISecretRepository): SecretService {
+  return {
+    create: vi.fn(async (input: unknown) => {
+      const data = input as { name: string; data: Record<string, string> };
+      return repo.create({ name: data.name, backendId: 'backend-plaintext', data: data.data, externalRef: '' });
+    }),
+    update: vi.fn(async (id: string, input: unknown) => {
+      const data = input as { data: Record<string, string> };
+      return repo.update(id, { data: data.data });
+    }),
+  } as unknown as SecretService;
+}

 // Mock data
 const mockServers = [
@@ -295,7 +314,7 @@ describe('RestoreService', () => {
    (userRepo.findByEmail as ReturnType<typeof vi.fn>).mockResolvedValue(null);
    (groupRepo.findByName as ReturnType<typeof vi.fn>).mockResolvedValue(null);
    (rbacRepo.findByName as ReturnType<typeof vi.fn>).mockResolvedValue(null);
-    restoreService = new RestoreService(serverRepo, projectRepo, secretRepo, userRepo, groupRepo, rbacRepo);
+    restoreService = new RestoreService(serverRepo, projectRepo, secretRepo, mockSecretService(secretRepo), userRepo, groupRepo, rbacRepo);
  });

  const validBundle = {
@@ -576,7 +595,7 @@ describe('Backup Routes', () => {
    (rGroupRepo.findByName as ReturnType<typeof vi.fn>).mockResolvedValue(null);
    const rRbacRepo = mockRbacRepo();
    (rRbacRepo.findByName as ReturnType<typeof vi.fn>).mockResolvedValue(null);
-    restoreService = new RestoreService(rSRepo, rPrRepo, rSecRepo, rUserRepo, rGroupRepo, rRbacRepo);
+    restoreService = new RestoreService(rSRepo, rPrRepo, rSecRepo, mockSecretService(rSecRepo), rUserRepo, rGroupRepo, rRbacRepo);
  });

  async function buildApp() {
--- a/src/mcpd/tests/env-resolver.test.ts
+++ b/src/mcpd/tests/env-resolver.test.ts
@@ -1,6 +1,5 @@
 import { describe, it, expect, vi } from 'vitest';
-import { resolveServerEnv } from '../src/services/env-resolver.js';
-import type { ISecretRepository } from '../src/repositories/interfaces.js';
+import { resolveServerEnv, type SecretResolver } from '../src/services/env-resolver.js';
 import type { McpServer } from '@prisma/client';

 function makeServer(env: unknown[]): McpServer {
@@ -23,18 +22,16 @@ function makeServer(env: unknown[]): McpServer {
  } as McpServer;
 }

-function mockSecretRepo(secrets: Record<string, Record<string, string>>): ISecretRepository {
+/** A SecretResolver backed by a {secretName: {key: value}} map. */
+function mockResolver(secrets: Record<string, Record<string, string>>): SecretResolver {
  return {
-    findAll: vi.fn(async () => []),
-    findById: vi.fn(async () => null),
-    findByName: vi.fn(async (name: string) => {
+    resolve: vi.fn(async (name: string, key: string): Promise<string> => {
      const data = secrets[name];
-      if (!data) return null;
-      return { id: `sec-${name}`, name, data, version: 1, createdAt: new Date(), updatedAt: new Date() };
+      if (!data) throw new Error(`Secret '${name}' not found`);
+      const value = data[key];
+      if (value === undefined) throw new Error(`Key '${key}' not found in secret '${name}'`);
+      return value;
    }),
-    create: vi.fn(async () => ({} as never)),
-    update: vi.fn(async () => ({} as never)),
-    delete: vi.fn(async () => {}),
  };
 }

@@ -44,8 +41,7 @@ describe('resolveServerEnv', () => {
      { name: 'FOO', value: 'bar' },
      { name: 'BAZ', value: 'qux' },
    ]);
-    const repo = mockSecretRepo({});
-    const result = await resolveServerEnv(server, repo);
+    const result = await resolveServerEnv(server, mockResolver({}));
    expect(result).toEqual({ FOO: 'bar', BAZ: 'qux' });
  });

@@ -53,10 +49,8 @@ describe('resolveServerEnv', () => {
    const server = makeServer([
      { name: 'TOKEN', valueFrom: { secretRef: { name: 'ha-creds', key: 'HOMEASSISTANT_TOKEN' } } },
    ]);
-    const repo = mockSecretRepo({
-      'ha-creds': { HOMEASSISTANT_TOKEN: 'secret-token-123' },
-    });
-    const result = await resolveServerEnv(server, repo);
+    const resolver = mockResolver({ 'ha-creds': { HOMEASSISTANT_TOKEN: 'secret-token-123' } });
+    const result = await resolveServerEnv(server, resolver);
    expect(result).toEqual({ TOKEN: 'secret-token-123' });
  });

@@ -65,48 +59,42 @@ describe('resolveServerEnv', () => {
      { name: 'URL', value: 'https://ha.local' },
      { name: 'TOKEN', valueFrom: { secretRef: { name: 'creds', key: 'TOKEN' } } },
    ]);
-    const repo = mockSecretRepo({
-      creds: { TOKEN: 'my-token' },
-    });
-    const result = await resolveServerEnv(server, repo);
+    const resolver = mockResolver({ creds: { TOKEN: 'my-token' } });
+    const result = await resolveServerEnv(server, resolver);
    expect(result).toEqual({ URL: 'https://ha.local', TOKEN: 'my-token' });
  });

-  it('caches secret lookups', async () => {
+  it('calls the resolver once per env entry', async () => {
    const server = makeServer([
      { name: 'A', valueFrom: { secretRef: { name: 'shared', key: 'KEY_A' } } },
      { name: 'B', valueFrom: { secretRef: { name: 'shared', key: 'KEY_B' } } },
    ]);
-    const repo = mockSecretRepo({
-      shared: { KEY_A: 'val-a', KEY_B: 'val-b' },
-    });
-    const result = await resolveServerEnv(server, repo);
+    const resolver = mockResolver({ shared: { KEY_A: 'val-a', KEY_B: 'val-b' } });
+    const result = await resolveServerEnv(server, resolver);
    expect(result).toEqual({ A: 'val-a', B: 'val-b' });
-    expect(repo.findByName).toHaveBeenCalledTimes(1);
+    // The resolver is now called once per env entry. Caching moved to the SecretService
+    // layer, where lookups can be deduplicated so drivers are hit at most once per (name, key) pair.
+    expect(resolver.resolve).toHaveBeenCalledTimes(2);
  });

  it('throws when secret not found', async () => {
    const server = makeServer([
      { name: 'TOKEN', valueFrom: { secretRef: { name: 'missing', key: 'TOKEN' } } },
    ]);
-    const repo = mockSecretRepo({});
-    await expect(resolveServerEnv(server, repo)).rejects.toThrow("Secret 'missing' not found");
+    await expect(resolveServerEnv(server, mockResolver({}))).rejects.toThrow(/Secret 'missing' not found/);
  });

  it('throws when secret key not found', async () => {
    const server = makeServer([
      { name: 'TOKEN', valueFrom: { secretRef: { name: 'creds', key: 'NONEXISTENT' } } },
    ]);
-    const repo = mockSecretRepo({
-      creds: { OTHER_KEY: 'val' },
-    });
-    await expect(resolveServerEnv(server, repo)).rejects.toThrow("Key 'NONEXISTENT' not found in secret 'creds'");
+    const resolver = mockResolver({ creds: { OTHER_KEY: 'val' } });
+    await expect(resolveServerEnv(server, resolver)).rejects.toThrow(/Key 'NONEXISTENT' not found/);
  });

  it('returns empty map for empty env', async () => {
    const server = makeServer([]);
-    const repo = mockSecretRepo({});
-    const result = await resolveServerEnv(server, repo);
+    const result = await resolveServerEnv(server, mockResolver({}));
    expect(result).toEqual({});
  });
 });
--- a/src/mcpd/tests/llm-adapters.test.ts
+++ b/src/mcpd/tests/llm-adapters.test.ts
@@ -0,0 +1,210 @@
+import { describe, it, expect, vi } from 'vitest';
+import { OpenAiPassthroughAdapter } from '../src/services/llm/adapters/openai-passthrough.js';
+import { AnthropicAdapter } from '../src/services/llm/adapters/anthropic.js';
+import { LlmAdapterRegistry, UnsupportedProviderError } from '../src/services/llm/dispatcher.js';
+import type { InferContext } from '../src/services/llm/types.js';
+
+function mockFetch(responses: Array<{ match: RegExp; status: number; body?: unknown; text?: string }>): ReturnType<typeof vi.fn> {
+  return vi.fn(async (input: string | URL, _init?: RequestInit) => {
+    const url = String(input);
+    const match = responses.find((r) => r.match.test(url));
+    if (!match) throw new Error(`unexpected fetch: ${url}`);
+    const body = match.body !== undefined ? JSON.stringify(match.body) : (match.text ?? '');
+    return new Response(body, { status: match.status, headers: { 'Content-Type': 'application/json' } });
+  });
+}
+
+function makeCtx(overrides: Partial<InferContext> = {}): InferContext {
+  return {
+    body: { model: '', messages: [{ role: 'user', content: 'hello' }] },
+    modelOverride: 'default-model',
+    apiKey: 'test-key',
+    url: '',
+    extraConfig: {},
+    ...overrides,
+  };
+}
+
+// Helper to build a streaming Response from SSE events (each terminated by a blank line).
+function sseResponse(events: string[]): Response {
+  const body = events.join('\n\n') + '\n\n';
+  const stream = new ReadableStream<Uint8Array>({
+    start(controller) {
+      controller.enqueue(new TextEncoder().encode(body));
+      controller.close();
+    },
+  });
+  return new Response(stream, { status: 200, headers: { 'Content-Type': 'text/event-stream' } });
+}
+
+describe('OpenAiPassthroughAdapter', () => {
+  it('infer: POSTs to <url>/v1/chat/completions with Authorization + body', async () => {
+    const fetchFn = mockFetch([{
+      match: /\/v1\/chat\/completions$/,
+      status: 200,
+      body: { id: 'x', choices: [{ message: { role: 'assistant', content: 'hi' } }] },
+    }]);
+    const adapter = new OpenAiPassthroughAdapter('openai', { fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({ url: 'https://api.example.com' });
+    const res = await adapter.infer(ctx);
+    expect(res.status).toBe(200);
+    const [url, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect(url).toBe('https://api.example.com/v1/chat/completions');
+    expect(init.method).toBe('POST');
+    const headers = init.headers as Record<string, string>;
+    expect(headers['Authorization']).toBe('Bearer test-key');
+    const sent = JSON.parse(init.body as string) as { model: string; stream: boolean };
+    expect(sent.model).toBe('default-model');  // filled from modelOverride
+    expect(sent.stream).toBe(false);
+  });
+
+  it('infer: uses default URL for openai when url is empty', async () => {
+    const fetchFn = mockFetch([{ match: /api\.openai\.com/, status: 200, body: {} }]);
+    const adapter = new OpenAiPassthroughAdapter('openai', { fetch: fetchFn as unknown as typeof fetch });
+    await adapter.infer(makeCtx());
+    const [url] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect(url).toBe('https://api.openai.com/v1/chat/completions');
+  });
+
+  it('infer: throws for vllm when url is empty (no default)', async () => {
+    const adapter = new OpenAiPassthroughAdapter('vllm', { fetch: vi.fn() as unknown as typeof fetch });
+    await expect(adapter.infer(makeCtx())).rejects.toThrow(/no default endpoint/);
+  });
+
+  it('infer: omits Authorization when apiKey is empty', async () => {
+    const fetchFn = mockFetch([{ match: /ollama/, status: 200, body: {} }]);
+    const adapter = new OpenAiPassthroughAdapter('ollama', { fetch: fetchFn as unknown as typeof fetch });
+    await adapter.infer(makeCtx({ url: 'http://ollama:11434', apiKey: '' }));
+    const [, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    const headers = init.headers as Record<string, string>;
+    expect(headers['Authorization']).toBeUndefined();
+  });
+
+  it('stream: forwards SSE chunks and emits terminal [DONE]', async () => {
+    const fetchFn = vi.fn(async () => sseResponse([
+      'data: {"choices":[{"delta":{"content":"hi"}}]}',
+      'data: {"choices":[{"delta":{"content":"!"}}]}',
+      'data: [DONE]',
+    ]));
+    const adapter = new OpenAiPassthroughAdapter('openai', { fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({ url: 'http://example', body: { model: '', messages: [], stream: true } });
+    const chunks: { data: string; done?: boolean }[] = [];
+    for await (const c of adapter.stream(ctx)) chunks.push(c);
+    expect(chunks).toHaveLength(3);
+    expect(chunks[2]?.done).toBe(true);
+  });
+});
+
+describe('AnthropicAdapter', () => {
+  it('infer: translates system+user messages, posts to /v1/messages', async () => {
+    const fetchFn = mockFetch([{
+      match: /\/v1\/messages$/,
+      status: 200,
+      body: {
+        id: 'msg_01', model: 'claude-3-5-sonnet-20241022', role: 'assistant',
+        content: [{ type: 'text', text: 'howdy' }],
+        stop_reason: 'end_turn',
+        usage: { input_tokens: 5, output_tokens: 2 },
+      },
+    }]);
+    const adapter = new AnthropicAdapter({ fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({
+      body: {
+        model: '', messages: [
+          { role: 'system', content: 'be nice' },
+          { role: 'user', content: 'hi' },
+        ],
+      },
+      modelOverride: 'claude-3-5-sonnet-20241022',
+    });
+    const res = await adapter.infer(ctx);
+    expect(res.status).toBe(200);
+
+    const [url, init] = fetchFn.mock.calls[0] as [string, RequestInit];
+    expect(url).toBe('https://api.anthropic.com/v1/messages');
+    const headers = init.headers as Record<string, string>;
+    expect(headers['x-api-key']).toBe('test-key');
+    expect(headers['anthropic-version']).toBeDefined();
+
+    const sent = JSON.parse(init.body as string) as {
+      model: string; system: string; messages: Array<{ role: string; content: string }>; max_tokens: number;
+    };
+    expect(sent.model).toBe('claude-3-5-sonnet-20241022');
+    expect(sent.system).toBe('be nice');
+    expect(sent.messages).toEqual([{ role: 'user', content: 'hi' }]);
+    expect(sent.max_tokens).toBe(1024); // default
+
+    // The Anthropic response is translated into the OpenAI chat.completion shape.
+    const body = res.body as { choices: Array<{ message: { content: string }; finish_reason: string }>; usage: { total_tokens: number } };
+    expect(body.choices[0]!.message.content).toBe('howdy');
+    expect(body.choices[0]!.finish_reason).toBe('stop');
+    expect(body.usage.total_tokens).toBe(7);
+  });
+
+  it('infer: returns a synthetic error body on non-2xx', async () => {
+    const fetchFn = vi.fn(async () => new Response('boom', { status: 500 }));
+    const adapter = new AnthropicAdapter({ fetch: fetchFn as unknown as typeof fetch });
+    const res = await adapter.infer(makeCtx({ body: { model: '', messages: [{ role: 'user', content: 'x' }] } }));
+    expect(res.status).toBe(500);
+    const body = res.body as { error: { message: string } };
+    expect(body.error.message).toMatch(/HTTP 500/);
+  });
+
+  it('stream: translates anthropic event stream into OpenAI chunks', async () => {
+    const events = [
+      'event: message_start\ndata: {"type":"message_start","message":{"id":"m","content":[]}}',
+      'event: content_block_delta\ndata: {"type":"content_block_delta","delta":{"type":"text_delta","text":"he"}}',
+      'event: content_block_delta\ndata: {"type":"content_block_delta","delta":{"type":"text_delta","text":"llo"}}',
+      'event: message_delta\ndata: {"type":"message_delta","delta":{"stop_reason":"end_turn"}}',
+      'event: message_stop\ndata: {"type":"message_stop"}',
+    ];
+    const fetchFn = vi.fn(async () => sseResponse(events));
+    const adapter = new AnthropicAdapter({ fetch: fetchFn as unknown as typeof fetch });
+    const ctx = makeCtx({ body: { model: '', messages: [{ role: 'user', content: 'hi' }], stream: true } });
+
+    const chunks: { data: string; done?: boolean }[] = [];
+    for await (const c of adapter.stream(ctx)) chunks.push(c);
+
+    // Expect: role-prime, two text deltas, finish-reason, [DONE]
+    expect(chunks[chunks.length - 1]?.data).toBe('[DONE]');
+    expect(chunks[chunks.length - 1]?.done).toBe(true);
+
+    // First chunk is the role-prime (role: assistant, content: '').
+    const first = JSON.parse(chunks[0]!.data) as { choices: [{ delta: { role: string; content: string } }] };
+    expect(first.choices[0]!.delta.role).toBe('assistant');
+
+    // Next two chunks carry the text.
+    const d1 = JSON.parse(chunks[1]!.data) as { choices: [{ delta: { content: string } }] };
+    const d2 = JSON.parse(chunks[2]!.data) as { choices: [{ delta: { content: string } }] };
+    expect(d1.choices[0]!.delta.content).toBe('he');
+    expect(d2.choices[0]!.delta.content).toBe('llo');
+
+    // Finish-reason chunk.
+    const stopped = JSON.parse(chunks[3]!.data) as { choices: [{ finish_reason: string }] };
+    expect(stopped.choices[0]!.finish_reason).toBe('stop');
+  });
+});
+
+describe('LlmAdapterRegistry', () => {
+  it('returns the right adapter kind for each type', () => {
+    const reg = new LlmAdapterRegistry();
+    expect(reg.get('openai').kind).toBe('openai');
+    expect(reg.get('vllm').kind).toBe('vllm');
+    expect(reg.get('deepseek').kind).toBe('deepseek');
+    expect(reg.get('ollama').kind).toBe('ollama');
+    expect(reg.get('anthropic').kind).toBe('anthropic');
+  });
+
+  it('caches adapters between calls', () => {
+    const reg = new LlmAdapterRegistry();
+    const a = reg.get('openai');
+    const b = reg.get('openai');
+    expect(a).toBe(b);
+  });
+
+  it('rejects unsupported providers (gemini-cli is deferred)', () => {
+    const reg = new LlmAdapterRegistry();
+    expect(() => reg.get('gemini-cli')).toThrow(UnsupportedProviderError);
+    expect(() => reg.get('bogus')).toThrow(UnsupportedProviderError);
+  });
+});
--- a/src/mcpd/tests/llm-infer-route.test.ts
+++ b/src/mcpd/tests/llm-infer-route.test.ts
@@ -0,0 +1,208 @@
+import { describe, it, expect, vi, afterEach } from 'vitest';
+import Fastify from 'fastify';
+import type { FastifyInstance } from 'fastify';
+import { registerLlmInferRoutes } from '../src/routes/llm-infer.js';
+import { LlmAdapterRegistry } from '../src/services/llm/dispatcher.js';
+import { errorHandler } from '../src/middleware/error-handler.js';
+import type { LlmView } from '../src/services/llm.service.js';
+import { NotFoundError } from '../src/services/mcp-server.service.js';
+
+let app: FastifyInstance;
+
+function makeLlmView(overrides: Partial<LlmView> = {}): LlmView {
+  return {
+    id: 'llm-1',
+    name: 'claude',
+    type: 'anthropic',
+    model: 'claude-3-5-sonnet-20241022',
+    url: '',
+    tier: 'heavy',
+    description: '',
+    apiKeyRef: { name: 'anthropic-key', key: 'token' },
+    extraConfig: {},
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+afterEach(async () => {
+  if (app) await app.close();
+});
+
+function sseResponse(events: string[]): Response {
+  const body = events.join('\n\n') + '\n\n';
+  const stream = new ReadableStream<Uint8Array>({
+    start(controller) {
+      controller.enqueue(new TextEncoder().encode(body));
+      controller.close();
+    },
+  });
+  return new Response(stream, { status: 200 });
+}
+
+interface LlmServiceLike {
+  getByName: (name: string) => Promise<LlmView>;
+  resolveApiKey: (name: string) => Promise<string>;
+}
+
+async function setupApp(
+  llmService: LlmServiceLike,
+  adapters: LlmAdapterRegistry,
+  onInferenceEvent?: Parameters<typeof registerLlmInferRoutes>[1]['onInferenceEvent'],
+): Promise<FastifyInstance> {
+  app = Fastify({ logger: false });
+  app.setErrorHandler(errorHandler);
+  const deps: Parameters<typeof registerLlmInferRoutes>[1] = {
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    llmService: llmService as any,
+    adapters,
+  };
+  if (onInferenceEvent !== undefined) deps.onInferenceEvent = onInferenceEvent;
+  registerLlmInferRoutes(app, deps);
+  await app.ready();
+  return app;
+}
+
+describe('POST /api/v1/llms/:name/infer', () => {
+  it('returns 404 when the Llm does not exist', async () => {
+    const svc: LlmServiceLike = {
+      getByName: async () => { throw new NotFoundError('Llm not found: missing'); },
+      resolveApiKey: async () => '',
+    };
+    await setupApp(svc, new LlmAdapterRegistry());
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/missing/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(404);
+  });
+
+  it('returns 400 when messages is missing', async () => {
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView({ apiKeyRef: null }),
+      resolveApiKey: async () => '',
+    };
+    await setupApp(svc, new LlmAdapterRegistry());
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: {},
+    });
+    expect(res.statusCode).toBe(400);
+  });
+
+  it('dispatches non-streaming to the adapter and returns its JSON', async () => {
+    const fetchFn = vi.fn(async () => new Response(JSON.stringify({
+      id: 'msg_1', model: 'claude-3-5-sonnet-20241022', role: 'assistant',
+      content: [{ type: 'text', text: 'hello' }],
+      stop_reason: 'end_turn',
+      usage: { input_tokens: 1, output_tokens: 1 },
+    }), { status: 200 }));
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView(),
+      resolveApiKey: async () => 'sk-ant-xyz',
+    };
+    const events: unknown[] = [];
+    await setupApp(svc, adapters, (e) => events.push(e));
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(200);
+    const body = res.json<{ choices: Array<{ message: { content: string } }> }>();
+    expect(body.choices[0]!.message.content).toBe('hello');
+
+    // Exactly one audit event is emitted for the call.
+    expect(events).toHaveLength(1);
+    expect((events[0] as { kind: string }).kind).toBe('llm_inference_call');
+    expect((events[0] as { llmName: string }).llmName).toBe('claude');
+    expect((events[0] as { streaming: boolean }).streaming).toBe(false);
+    expect((events[0] as { status: number }).status).toBe(200);
+  });
+
+  it('500s when apiKey resolution fails', async () => {
+    const adapters = new LlmAdapterRegistry();
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView(),
+      resolveApiKey: async () => { throw new Error('secret not found'); },
+    };
+    await setupApp(svc, adapters);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(500);
+    expect(res.json<{ error: string }>().error).toMatch(/secret not found/);
+  });
+
+  it('skips apiKey resolution when the Llm has no apiKeyRef', async () => {
+    const fetchFn = vi.fn(async () => new Response(JSON.stringify({ id: 'x', choices: [] }), { status: 200 }));
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const resolveSpy = vi.fn();
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView({ type: 'ollama', url: 'http://ollama:11434', apiKeyRef: null }),
+      resolveApiKey: resolveSpy as unknown as LlmServiceLike['resolveApiKey'],
+    };
+    await setupApp(svc, adapters);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/ollama-local/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(200);
+    expect(resolveSpy).not.toHaveBeenCalled();
+  });
+
+  it('streams SSE chunks for stream: true', async () => {
+    const fetchFn = vi.fn(async () => sseResponse([
+      'event: content_block_delta\ndata: {"type":"content_block_delta","delta":{"type":"text_delta","text":"hi"}}',
+      'event: message_stop\ndata: {"type":"message_stop"}',
+    ]));
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView(),
+      resolveApiKey: async () => 'sk-ant-xyz',
+    };
+    const events: Array<{ streaming: boolean; status: number }> = [];
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    await setupApp(svc, adapters, ((e: any) => events.push(e)) as any);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/claude/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }], stream: true },
+    });
+    expect(res.statusCode).toBe(200);
+    expect(res.body).toContain('data:');
+    expect(res.body).toContain('[DONE]');
+    expect(events).toHaveLength(1);
+    expect(events[0]!.streaming).toBe(true);
+  });
+
+  it('502s on adapter errors (non-streaming)', async () => {
+    const fetchFn = vi.fn(async () => { throw new Error('upstream down'); });
+    const adapters = new LlmAdapterRegistry({ fetch: fetchFn as unknown as typeof fetch });
+    const svc: LlmServiceLike = {
+      getByName: async () => makeLlmView({ type: 'openai', url: 'http://example', apiKeyRef: null }),
+      resolveApiKey: async () => '',
+    };
+    await setupApp(svc, adapters);
+
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms/x/infer',
+      payload: { messages: [{ role: 'user', content: 'hi' }] },
+    });
+    expect(res.statusCode).toBe(502);
+    expect(res.json<{ error: string }>().error).toMatch(/upstream down/);
+  });
+});
--- a/src/mcpd/tests/llm-routes.test.ts
+++ b/src/mcpd/tests/llm-routes.test.ts
@@ -0,0 +1,194 @@
+import { describe, it, expect, vi, afterEach } from 'vitest';
+import Fastify from 'fastify';
+import type { FastifyInstance } from 'fastify';
+import { registerLlmRoutes } from '../src/routes/llms.js';
+import { LlmService } from '../src/services/llm.service.js';
+import { errorHandler } from '../src/middleware/error-handler.js';
+import type { ILlmRepository } from '../src/repositories/llm.repository.js';
+import type { Llm, Secret } from '@prisma/client';
+
+let app: FastifyInstance;
+
+function makeLlm(overrides: Partial<Llm> = {}): Llm {
+  return {
+    id: 'llm-1',
+    name: 'claude',
+    type: 'anthropic',
+    model: 'claude-3-5-sonnet-20241022',
+    url: '',
+    tier: 'heavy',
+    description: '',
+    apiKeySecretId: null,
+    apiKeySecretKey: null,
+    extraConfig: {},
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+function mockRepo(initial: Llm[] = []): ILlmRepository {
+  const rows = new Map(initial.map((r) => [r.id, r]));
+  return {
+    findAll: vi.fn(async () => [...rows.values()]),
+    findById: vi.fn(async (id: string) => rows.get(id) ?? null),
+    findByName: vi.fn(async (name: string) => {
+      for (const r of rows.values()) if (r.name === name) return r;
+      return null;
+    }),
+    findByTier: vi.fn(async () => []),
+    create: vi.fn(async (data) => {
+      const row = makeLlm({ id: 'new-id', name: data.name, type: data.type, model: data.model });
+      rows.set(row.id, row);
+      return row;
+    }),
+    update: vi.fn(async (id, data) => {
+      const existing = rows.get(id)!;
+      const next: Llm = {
+        ...existing,
+        ...(data.model !== undefined ? { model: data.model } : {}),
+      };
+      rows.set(id, next);
+      return next;
+    }),
+    delete: vi.fn(async (id) => { rows.delete(id); }),
+  };
+}
+
+function mockSecretService() {
+  const sec: Secret = {
+    id: 'sec-1', name: 'anthropic-key', backendId: 'b', data: {}, externalRef: '',
+    version: 1, createdAt: new Date(), updatedAt: new Date(),
+  };
+  return {
+    getById: vi.fn(async (id: string) => {
+      if (id === sec.id) return sec;
+      throw new Error('not found');
+    }),
+    getByName: vi.fn(async (name: string) => {
+      if (name === sec.name) return sec;
+      throw new Error('not found');
+    }),
+    resolveData: vi.fn(async () => ({ token: 'sk-ant-xyz' })),
+  };
+}
+
+afterEach(async () => {
+  if (app) await app.close();
+});
+
+async function createApp(repo: ILlmRepository): Promise<FastifyInstance> {
+  app = Fastify({ logger: false });
+  app.setErrorHandler(errorHandler);
+  // eslint-disable-next-line @typescript-eslint/no-explicit-any
+  const service = new LlmService(repo, mockSecretService() as any);
+  registerLlmRoutes(app, service);
+  await app.ready();
+  return app;
+}
+
+describe('Llm Routes', () => {
+  it('GET /api/v1/llms returns a list', async () => {
+    await createApp(mockRepo([makeLlm()]));
+    const res = await app.inject({ method: 'GET', url: '/api/v1/llms' });
+    expect(res.statusCode).toBe(200);
+    const body = res.json<Array<{ name: string }>>();
+    expect(body).toHaveLength(1);
+    expect(body[0]!.name).toBe('claude');
+  });
+
+  it('GET /api/v1/llms/:id returns 404 when missing', async () => {
+    await createApp(mockRepo());
+    const res = await app.inject({ method: 'GET', url: '/api/v1/llms/missing' });
+    expect(res.statusCode).toBe(404);
+  });
+
+  it('GET /api/v1/llms/:nameOrId resolves by human name when not a CUID', async () => {
+    await createApp(mockRepo([makeLlm({ id: 'llm-1', name: 'claude' })]));
+    const res = await app.inject({ method: 'GET', url: '/api/v1/llms/claude' });
+    expect(res.statusCode).toBe(200);
+    expect(res.json<{ name: string; id: string }>().name).toBe('claude');
+  });
+
+  it('HEAD /api/v1/llms/:name returns 200 for an existing Llm (failover RBAC pre-check)', async () => {
+    await createApp(mockRepo([makeLlm({ name: 'claude' })]));
+    const res = await app.inject({ method: 'HEAD', url: '/api/v1/llms/claude' });
+    expect(res.statusCode).toBe(200);
+  });
+
+  it('HEAD /api/v1/llms/:name returns 404 for a missing Llm', async () => {
+    await createApp(mockRepo());
+    const res = await app.inject({ method: 'HEAD', url: '/api/v1/llms/missing' });
+    expect(res.statusCode).toBe(404);
+  });
+
+  it('POST /api/v1/llms creates and returns 201', async () => {
+    await createApp(mockRepo());
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms',
+      payload: {
+        name: 'ollama-local',
+        type: 'ollama',
+        model: 'llama3',
+        url: 'http://localhost:11434',
+      },
+    });
+    expect(res.statusCode).toBe(201);
+    expect(res.json<{ name: string }>().name).toBe('ollama-local');
+  });
+
+  it('POST /api/v1/llms rejects bad input with 400', async () => {
+    await createApp(mockRepo());
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms',
+      payload: { name: '', type: 'anthropic', model: 'x' },
+    });
+    expect(res.statusCode).toBe(400);
+  });
+
+  it('POST /api/v1/llms returns 409 when name exists', async () => {
+    await createApp(mockRepo([makeLlm({ name: 'claude' })]));
+    const res = await app.inject({
+      method: 'POST',
+      url: '/api/v1/llms',
+      payload: { name: 'claude', type: 'anthropic', model: 'x' },
+    });
+    expect(res.statusCode).toBe(409);
+  });
+
+  it('PUT /api/v1/llms/:id updates model', async () => {
+    await createApp(mockRepo([makeLlm({ id: 'llm-1' })]));
+    const res = await app.inject({
+      method: 'PUT',
+      url: '/api/v1/llms/llm-1',
+      payload: { model: 'claude-3-opus' },
+    });
+    expect(res.statusCode).toBe(200);
+    expect(res.json<{ model: string }>().model).toBe('claude-3-opus');
+  });
+
+  it('PUT /api/v1/llms/:id returns 404 when missing', async () => {
+    await createApp(mockRepo());
+    const res = await app.inject({
+      method: 'PUT',
+      url: '/api/v1/llms/missing',
+      payload: { model: 'x' },
+    });
+    expect(res.statusCode).toBe(404);
+  });
+
+  it('DELETE /api/v1/llms/:id returns 204', async () => {
+    await createApp(mockRepo([makeLlm({ id: 'llm-1' })]));
+    const res = await app.inject({ method: 'DELETE', url: '/api/v1/llms/llm-1' });
+    expect(res.statusCode).toBe(204);
+  });
+
+  it('DELETE /api/v1/llms/:id returns 404 when missing', async () => {
+    await createApp(mockRepo());
+    const res = await app.inject({ method: 'DELETE', url: '/api/v1/llms/missing' });
+    expect(res.statusCode).toBe(404);
+  });
+});
--- a/src/mcpd/tests/llm-service.test.ts
+++ b/src/mcpd/tests/llm-service.test.ts
@@ -0,0 +1,232 @@
+import { describe, it, expect, vi } from 'vitest';
+import { LlmService } from '../src/services/llm.service.js';
+import type { ILlmRepository } from '../src/repositories/llm.repository.js';
+import type { Llm, Secret } from '@prisma/client';
+
+function makeLlm(overrides: Partial<Llm> = {}): Llm {
+  return {
+    id: 'llm-1',
+    name: 'claude',
+    type: 'anthropic',
+    model: 'claude-3-5-sonnet-20241022',
+    url: '',
+    tier: 'heavy',
+    description: '',
+    apiKeySecretId: null,
+    apiKeySecretKey: null,
+    extraConfig: {},
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+function makeSecret(overrides: Partial<Secret> = {}): Secret {
+  return {
+    id: 'sec-anthropic',
+    name: 'anthropic-key',
+    backendId: 'backend-plaintext',
+    data: {},
+    externalRef: '',
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+function mockRepo(initial: Llm[] = []): ILlmRepository {
+  const rows = new Map<string, Llm>(initial.map((r) => [r.id, r]));
+  return {
+    findAll: vi.fn(async () => [...rows.values()]),
+    findById: vi.fn(async (id: string) => rows.get(id) ?? null),
+    findByName: vi.fn(async (name: string) => {
+      for (const r of rows.values()) if (r.name === name) return r;
+      return null;
+    }),
+    findByTier: vi.fn(async (tier: string) => [...rows.values()].filter((r) => r.tier === tier)),
+    create: vi.fn(async (data) => {
+      const row = makeLlm({
+        id: `llm-${String(rows.size + 1)}`,
+        name: data.name,
+        type: data.type,
+        model: data.model,
+        url: data.url ?? '',
+        tier: data.tier ?? 'fast',
+        description: data.description ?? '',
+        apiKeySecretId: data.apiKeySecretId ?? null,
+        apiKeySecretKey: data.apiKeySecretKey ?? null,
+        extraConfig: (data.extraConfig ?? {}) as Llm['extraConfig'],
+      });
+      rows.set(row.id, row);
+      return row;
+    }),
+    update: vi.fn(async (id, data) => {
+      const existing = rows.get(id);
+      if (!existing) throw new Error('not found');
+      const next: Llm = {
+        ...existing,
+        ...(data.model !== undefined ? { model: data.model } : {}),
+        ...(data.url !== undefined ? { url: data.url } : {}),
+        ...(data.tier !== undefined ? { tier: data.tier } : {}),
+        ...(data.description !== undefined ? { description: data.description } : {}),
+        ...(data.apiKeySecretId !== undefined ? { apiKeySecretId: data.apiKeySecretId } : {}),
+        ...(data.apiKeySecretKey !== undefined ? { apiKeySecretKey: data.apiKeySecretKey } : {}),
+        ...(data.extraConfig !== undefined ? { extraConfig: data.extraConfig as Llm['extraConfig'] } : {}),
+      };
+      rows.set(id, next);
+      return next;
+    }),
+    delete: vi.fn(async (id) => { rows.delete(id); }),
+  };
+}
+
+function mockSecrets(secretByName: Record<string, Secret>, resolved: Record<string, string> = {}): {
+  getById: ReturnType<typeof vi.fn>;
+  getByName: ReturnType<typeof vi.fn>;
+  resolveData: ReturnType<typeof vi.fn>;
+} {
+  return {
+    getById: vi.fn(async (id: string) => {
+      for (const s of Object.values(secretByName)) if (s.id === id) return s;
+      throw new Error(`secret not found: ${id}`);
+    }),
+    getByName: vi.fn(async (name: string) => {
+      const s = secretByName[name];
+      if (!s) throw new Error(`secret not found: ${name}`);
+      return s;
+    }),
+    resolveData: vi.fn(async () => resolved),
+  };
+}
+
+describe('LlmService', () => {
+  it('create parses input and resolves apiKeyRef → secret id', async () => {
+    const repo = mockRepo();
+    const sec = makeSecret();
+    const secrets = mockSecrets({ 'anthropic-key': sec });
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, secrets as any);
+
+    const view = await svc.create({
+      name: 'claude',
+      type: 'anthropic',
+      model: 'claude-3-5-sonnet-20241022',
+      tier: 'heavy',
+      apiKeyRef: { name: 'anthropic-key', key: 'token' },
+    });
+
+    expect(view.name).toBe('claude');
+    expect(view.apiKeyRef).toEqual({ name: 'anthropic-key', key: 'token' });
+    expect(secrets.getByName).toHaveBeenCalledWith('anthropic-key');
+    expect(repo.create).toHaveBeenCalledWith(expect.objectContaining({
+      apiKeySecretId: sec.id,
+      apiKeySecretKey: 'token',
+    }));
+  });
+
+  it('create without apiKeyRef leaves FK columns null', async () => {
+    const repo = mockRepo();
+    const secrets = mockSecrets({});
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, secrets as any);
+
+    const view = await svc.create({
+      name: 'ollama-local',
+      type: 'ollama',
+      model: 'llama3',
+      url: 'http://localhost:11434',
+      tier: 'fast',
+    });
+
+    expect(view.apiKeyRef).toBeNull();
+    expect(secrets.getByName).not.toHaveBeenCalled();
+  });
+
+  it('create rejects duplicate name', async () => {
+    const repo = mockRepo([makeLlm({ name: 'claude' })]);
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, mockSecrets({}) as any);
+    await expect(svc.create({
+      name: 'claude', type: 'anthropic', model: 'x',
+    })).rejects.toThrow(/already exists/);
+  });
+
+  it('update with apiKeyRef null unlinks the secret', async () => {
+    const sec = makeSecret();
+    const repo = mockRepo([makeLlm({ apiKeySecretId: sec.id, apiKeySecretKey: 'token' })]);
+    const secrets = mockSecrets({ 'anthropic-key': sec });
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, secrets as any);
+
+    await svc.update('llm-1', { apiKeyRef: null });
+    expect(repo.update).toHaveBeenCalledWith('llm-1', expect.objectContaining({
+      apiKeySecretId: null,
+      apiKeySecretKey: null,
+    }));
+  });
+
+  it('resolveApiKey reads through SecretService', async () => {
+    const sec = makeSecret();
+    const repo = mockRepo([makeLlm({ apiKeySecretId: sec.id, apiKeySecretKey: 'token' })]);
+    const secrets = mockSecrets({ 'anthropic-key': sec }, { token: 'sk-ant-xyz' });
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, secrets as any);
+
+    const key = await svc.resolveApiKey('claude');
+    expect(key).toBe('sk-ant-xyz');
+  });
+
+  it('resolveApiKey throws when Llm has no apiKeyRef', async () => {
+    const repo = mockRepo([makeLlm()]);
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, mockSecrets({}) as any);
+    await expect(svc.resolveApiKey('claude')).rejects.toThrow(/no apiKeyRef/);
+  });
+
+  it('resolveApiKey throws when the secret key is missing', async () => {
+    const sec = makeSecret();
+    const repo = mockRepo([makeLlm({ apiKeySecretId: sec.id, apiKeySecretKey: 'missing-key' })]);
+    const secrets = mockSecrets({ 'anthropic-key': sec }, { token: 'x' });
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, secrets as any);
+    await expect(svc.resolveApiKey('claude')).rejects.toThrow(/no key 'missing-key'/);
+  });
+
+  it('list returns views with apiKeyRef rendered from secret name', async () => {
+    const sec = makeSecret();
+    const repo = mockRepo([makeLlm({ apiKeySecretId: sec.id, apiKeySecretKey: 'token' })]);
+    const secrets = mockSecrets({ 'anthropic-key': sec });
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, secrets as any);
+
+    const items = await svc.list();
+    expect(items).toHaveLength(1);
+    expect(items[0]!.apiKeyRef).toEqual({ name: 'anthropic-key', key: 'token' });
+  });
+
+  it('delete removes an existing Llm via the repository', async () => {
+    const repo = mockRepo([makeLlm()]);
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, mockSecrets({}) as any);
+    await svc.delete('llm-1');
+    expect(repo.delete).toHaveBeenCalledWith('llm-1');
+  });
+
+  it('validation: rejects invalid type', async () => {
+    const repo = mockRepo();
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, mockSecrets({}) as any);
+    await expect(svc.create({ name: 'x', type: 'bogus', model: 'y' })).rejects.toThrow();
+  });
+
+  it('validation: rejects invalid tier', async () => {
+    const repo = mockRepo();
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    const svc = new LlmService(repo, mockSecrets({}) as any);
+    await expect(svc.create({
+      name: 'x', type: 'openai', model: 'gpt-4', tier: 'warp-speed',
+    })).rejects.toThrow();
+  });
+});
--- a/src/mcpd/tests/mcp-token-service.test.ts
+++ b/src/mcpd/tests/mcp-token-service.test.ts
@@ -0,0 +1,246 @@
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { McpTokenService, PermissionCeilingError } from '../src/services/mcp-token.service.js';
+import { NotFoundError, ConflictError } from '../src/services/mcp-server.service.js';
+import type { IMcpTokenRepository, McpTokenWithRelations } from '../src/repositories/interfaces.js';
+import type { IProjectRepository } from '../src/repositories/project.repository.js';
+import type { IRbacDefinitionRepository } from '../src/repositories/rbac-definition.repository.js';
+import type { RbacService } from '../src/services/rbac.service.js';
+import { hashToken, isMcpToken, TOKEN_PREFIX } from '@mcpctl/shared';
+
+const PROJECT = { id: 'cproj1', name: 'myproj' };
+
+function makeRow(overrides: Partial<McpTokenWithRelations> = {}): McpTokenWithRelations {
+  return {
+    id: 'ctok1',
+    name: 'mytok',
+    projectId: PROJECT.id,
+    tokenHash: 'deadbeef',
+    tokenPrefix: 'mcpctl_pat_abcd',
+    ownerId: 'cuser1',
+    description: '',
+    createdAt: new Date(),
+    expiresAt: null,
+    lastUsedAt: null,
+    revokedAt: null,
+    project: PROJECT,
+    owner: { id: 'cuser1', email: 'alice@example.com' },
+    ...overrides,
+  };
+}
+
+function mockTokenRepo(): IMcpTokenRepository {
+  return {
+    findAll: vi.fn(async () => []),
+    findById: vi.fn(async () => null),
+    findByHash: vi.fn(async () => null),
+    findByNameAndProject: vi.fn(async () => null),
+    create: vi.fn(async (input) => makeRow({
+      name: input.name,
+      projectId: input.projectId,
+      tokenHash: input.tokenHash,
+      tokenPrefix: input.tokenPrefix,
+      ownerId: input.ownerId,
+      description: input.description ?? '',
+      expiresAt: input.expiresAt ?? null,
+    })),
+    revoke: vi.fn(async (id) => makeRow({ id, revokedAt: new Date() })),
+    touchLastUsed: vi.fn(async () => {}),
+    delete: vi.fn(async () => {}),
+  };
+}
+
+function mockProjectRepo(): IProjectRepository {
+  return {
+    findById: vi.fn(async (id) => (id === PROJECT.id ? PROJECT : null)),
+    findByName: vi.fn(async (name) => (name === PROJECT.name ? PROJECT : null)),
+    // minimal stubs for the rest — not exercised in these tests
+    findAll: vi.fn(async () => []),
+    create: vi.fn(),
+    update: vi.fn(),
+    delete: vi.fn(),
+    attachServer: vi.fn(),
+    detachServer: vi.fn(),
+    listServers: vi.fn(async () => []),
+  } as unknown as IProjectRepository;
+}
+
+function mockRbacRepo(): IRbacDefinitionRepository {
+  return {
+    findAll: vi.fn(async () => []),
+    findById: vi.fn(async () => null),
+    findByName: vi.fn(async () => null),
+    create: vi.fn(async () => ({ id: 'rbac-1', name: 'x', subjects: [], roleBindings: [], version: 1, createdAt: new Date(), updatedAt: new Date() })),
+    update: vi.fn(),
+    delete: vi.fn(async () => {}),
+  };
+}
+
+function mockRbacService(overrides: Partial<RbacService> = {}): RbacService {
+  return {
+    canAccess: vi.fn(async () => true),
+    canRunOperation: vi.fn(async () => true),
+    getAllowedScope: vi.fn(async () => ({ wildcard: true, names: new Set() })),
+    getPermissions: vi.fn(async () => []),
+    ...overrides,
+  } as unknown as RbacService;
+}
+
+describe('McpTokenService.create', () => {
+  let tokenRepo: ReturnType<typeof mockTokenRepo>;
+  let projectRepo: IProjectRepository;
+  let rbacRepo: ReturnType<typeof mockRbacRepo>;
+  let rbacService: RbacService;
+  let service: McpTokenService;
+
+  beforeEach(() => {
+    tokenRepo = mockTokenRepo();
+    projectRepo = mockProjectRepo();
+    rbacRepo = mockRbacRepo();
+    rbacService = mockRbacService();
+    service = new McpTokenService(tokenRepo, projectRepo, rbacRepo, rbacService);
+  });
+
+  it('creates a token with no bindings (rbacMode=empty, default)', async () => {
+    const result = await service.create('cuser1', {
+      name: 'mytok',
+      projectId: PROJECT.id,
+    });
+    expect(result.raw).toMatch(new RegExp(`^${TOKEN_PREFIX}`));
+    expect(isMcpToken(result.raw)).toBe(true);
+    expect(tokenRepo.create).toHaveBeenCalledTimes(1);
+    // Only the hash is persisted, never the raw token
+    const args = vi.mocked(tokenRepo.create).mock.calls[0]![0];
+    expect(args.tokenHash).toBe(hashToken(result.raw));
+    expect(args.tokenPrefix).toBe(result.raw.slice(0, 16));
+    // No RBAC definition should be created when there are no bindings
+    expect(rbacRepo.create).not.toHaveBeenCalled();
+  });
+
+  it('creates an RbacDefinition with subject McpToken:<sha> when bindings are given', async () => {
+    const result = await service.create('cuser1', {
+      name: 'mytok',
+      projectId: PROJECT.id,
+      bindings: [{ role: 'view', resource: 'servers' }],
+    });
+    expect(rbacRepo.create).toHaveBeenCalledTimes(1);
+    const defArgs = vi.mocked(rbacRepo.create).mock.calls[0]![0];
+    const subjects = defArgs.subjects as Array<{ kind: string; name: string }>;
+    expect(subjects).toEqual([{ kind: 'McpToken', name: hashToken(result.raw) }]);
+    expect(defArgs.roleBindings).toEqual([{ role: 'view', resource: 'servers' }]);
+  });
+
+  it('rejects bindings the creator does not have (ceiling violation)', async () => {
+    rbacService = mockRbacService({
+      canAccess: vi.fn(async () => false),
+    } as Partial<RbacService>);
+    service = new McpTokenService(tokenRepo, projectRepo, rbacRepo, rbacService);
+
+    await expect(
+      service.create('cuser1', {
+        name: 'mytok',
+        projectId: PROJECT.id,
+        bindings: [{ role: 'edit', resource: 'servers' }],
+      }),
+    ).rejects.toThrow(PermissionCeilingError);
+    expect(tokenRepo.create).not.toHaveBeenCalled();
+  });
+
+  it('clones the creator\'s permissions when rbacMode=clone', async () => {
+    rbacService = mockRbacService({
+      getPermissions: vi.fn(async () => [
+        { role: 'view', resource: 'servers' },
+        { role: 'run', action: 'logs' },
+      ]),
+    } as Partial<RbacService>);
+    service = new McpTokenService(tokenRepo, projectRepo, rbacRepo, rbacService);
+
+    await service.create('cuser1', {
+      name: 'mytok',
+      projectId: PROJECT.id,
+      rbacMode: 'clone',
+    });
+    expect(rbacRepo.create).toHaveBeenCalledTimes(1);
+    const defArgs = vi.mocked(rbacRepo.create).mock.calls[0]![0];
+    expect(defArgs.roleBindings).toEqual([
+      { role: 'view', resource: 'servers' },
+      { role: 'run', action: 'logs' },
+    ]);
+  });
+
+  it('throws NotFoundError if project does not exist', async () => {
+    await expect(
+      service.create('cuser1', { name: 'mytok', projectId: 'nope' }),
+    ).rejects.toThrow(NotFoundError);
+  });
+
+  it('throws ConflictError if active token with same name in same project exists', async () => {
+    vi.mocked(tokenRepo.findByNameAndProject).mockResolvedValueOnce(makeRow());
+    await expect(
+      service.create('cuser1', { name: 'mytok', projectId: PROJECT.id }),
+    ).rejects.toThrow(ConflictError);
+  });
+});
+
+describe('McpTokenService.introspectRaw', () => {
+  let tokenRepo: ReturnType<typeof mockTokenRepo>;
+  let service: McpTokenService;
+
+  beforeEach(() => {
+    tokenRepo = mockTokenRepo();
+    service = new McpTokenService(tokenRepo, mockProjectRepo(), mockRbacRepo(), mockRbacService());
+  });
+
+  it('returns ok=false for unknown tokens', async () => {
+    const result = await service.introspectRaw(`${TOKEN_PREFIX}unknown`);
+    expect(result.ok).toBe(false);
+    expect(result.tokenName).toBeUndefined();
+  });
+
+  it('returns ok=true and principal info for active tokens, and updates lastUsedAt', async () => {
+    const raw = `${TOKEN_PREFIX}aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa`;
+    const hash = hashToken(raw);
+    vi.mocked(tokenRepo.findByHash).mockResolvedValueOnce(makeRow({ tokenHash: hash }));
+    const result = await service.introspectRaw(raw);
+    expect(result.ok).toBe(true);
+    expect(result.projectName).toBe(PROJECT.name);
+    expect(result.tokenName).toBe('mytok');
+    expect(tokenRepo.touchLastUsed).toHaveBeenCalled();
+  });
+
+  it('rejects revoked tokens', async () => {
+    const raw = `${TOKEN_PREFIX}bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb`;
+    vi.mocked(tokenRepo.findByHash).mockResolvedValueOnce(makeRow({ tokenHash: hashToken(raw), revokedAt: new Date() }));
+    const result = await service.introspectRaw(raw);
+    expect(result.ok).toBe(false);
+    expect(result.revoked).toBe(true);
+  });
+
+  it('rejects expired tokens', async () => {
+    const raw = `${TOKEN_PREFIX}cccccccccccccccccccccccccccccccc`;
+    const past = new Date(Date.now() - 60_000);
+    vi.mocked(tokenRepo.findByHash).mockResolvedValueOnce(makeRow({ tokenHash: hashToken(raw), expiresAt: past }));
+    const result = await service.introspectRaw(raw);
+    expect(result.ok).toBe(false);
+    expect(result.expired).toBe(true);
+  });
+});
+
+describe('McpTokenService.revoke', () => {
+  it('marks revokedAt and removes the auto-created RbacDefinition', async () => {
+    const tokenRepo = mockTokenRepo();
+    const rbacRepo = mockRbacRepo();
+    const service = new McpTokenService(tokenRepo, mockProjectRepo(), rbacRepo, mockRbacService());
+
+    const row = makeRow();
+    vi.mocked(tokenRepo.findById).mockResolvedValue(row);
+    vi.mocked(rbacRepo.findByName).mockResolvedValue({
+      id: 'rbac-ctok1', name: 'mcptoken-ctok1', subjects: [], roleBindings: [], version: 1, createdAt: new Date(), updatedAt: new Date(),
+    });
+
+    await service.revoke('ctok1');
+
+    expect(tokenRepo.revoke).toHaveBeenCalledWith('ctok1');
+    expect(rbacRepo.findByName).toHaveBeenCalledWith('mcptoken-ctok1');
+    expect(rbacRepo.delete).toHaveBeenCalledWith('rbac-ctok1');
+  });
+});
--- a/src/mcpd/tests/persistent-stdio.test.ts
+++ b/src/mcpd/tests/persistent-stdio.test.ts
@@ -0,0 +1,111 @@
+import { describe, it, expect, vi } from 'vitest';
+import { PassThrough } from 'node:stream';
+import { PersistentStdioClient } from '../src/services/transport/persistent-stdio.js';
+import type { InteractiveExec, McpOrchestrator } from '../src/services/orchestrator.js';
+
+function makeFakeExec(): {
+  iexec: InteractiveExec;
+  written: string[];
+  emit: (line: unknown) => void;
+} {
+  const stdout = new PassThrough();
+  const written: string[] = [];
+  const iexec: InteractiveExec = {
+    stdout,
+    write(data) { written.push(data); },
+    close() { stdout.destroy(); },
+  };
+  const emit = (msg: unknown) => {
+    stdout.write(JSON.stringify(msg) + '\n');
+  };
+  return { iexec, written, emit };
+}
+
+function makeOrchestrator(overrides: Partial<McpOrchestrator> = {}): McpOrchestrator {
+  return {
+    pullImage: vi.fn(),
+    createContainer: vi.fn(),
+    stopContainer: vi.fn(),
+    removeContainer: vi.fn(),
+    inspectContainer: vi.fn(),
+    getContainerLogs: vi.fn(),
+    execInContainer: vi.fn(),
+    ping: vi.fn(),
+    ...overrides,
+  } as McpOrchestrator;
+}
+
+describe('PersistentStdioClient', () => {
+  it('exec mode calls execInteractive with the command', async () => {
+    const fake = makeFakeExec();
+    const execInteractive = vi.fn(async () => fake.iexec);
+    const orch = makeOrchestrator({ execInteractive });
+
+    const client = new PersistentStdioClient(
+      orch,
+      'container-1',
+      { kind: 'exec', command: ['node', 'index.js'] },
+    );
+
+    // Drive the handshake: respond to the first init request (id=1)
+    // then to the subsequent tools/list request (id=2).
+    const sendPromise = client.send('tools/list');
+    await new Promise((r) => setTimeout(r, 10));
+
+    const init = JSON.parse(fake.written[0]!);
+    expect(init.method).toBe('initialize');
+    fake.emit({ jsonrpc: '2.0', id: init.id, result: { capabilities: {} } });
+    await new Promise((r) => setTimeout(r, 150));
+
+    // Second written msg is notifications/initialized; third is tools/list
+    const toolsReq = JSON.parse(fake.written[2]!);
+    expect(toolsReq.method).toBe('tools/list');
+    fake.emit({ jsonrpc: '2.0', id: toolsReq.id, result: { tools: [] } });
+
+    const res = await sendPromise;
+    expect(res.result).toEqual({ tools: [] });
+    expect(execInteractive).toHaveBeenCalledWith('container-1', ['node', 'index.js']);
+    client.close();
+  });
+
+  it('attach mode calls attachInteractive and never execInteractive', async () => {
+    const fake = makeFakeExec();
+    const attachInteractive = vi.fn(async () => fake.iexec);
+    const execInteractive = vi.fn();
+    const orch = makeOrchestrator({ attachInteractive, execInteractive });
+
+    const client = new PersistentStdioClient(
+      orch,
+      'container-gitea',
+      { kind: 'attach' },
+    );
+
+    const sendPromise = client.send('tools/list');
+    await new Promise((r) => setTimeout(r, 10));
+
+    const init = JSON.parse(fake.written[0]!);
+    fake.emit({ jsonrpc: '2.0', id: init.id, result: { capabilities: {} } });
+    await new Promise((r) => setTimeout(r, 150));
+
+    const req = JSON.parse(fake.written[2]!);
+    fake.emit({ jsonrpc: '2.0', id: req.id, result: { tools: [{ name: 'list_repos' }] } });
+
+    const res = await sendPromise;
+    expect((res.result as { tools: unknown[] }).tools).toHaveLength(1);
+    expect(attachInteractive).toHaveBeenCalledWith('container-gitea');
+    expect(execInteractive).not.toHaveBeenCalled();
+    client.close();
+  });
+
+  it('attach mode throws if orchestrator does not support attach', async () => {
+    const orch = makeOrchestrator({}); // no attachInteractive
+    const client = new PersistentStdioClient(orch, 'c', { kind: 'attach' });
+    await expect(client.send('tools/list')).rejects.toThrow(/attach/i);
+  });
+
+  it('exec mode throws if orchestrator does not support execInteractive', async () => {
+    const orch = makeOrchestrator({}); // no execInteractive
+    const client = new PersistentStdioClient(orch, 'c', { kind: 'exec', command: ['x'] });
+    await expect(client.send('tools/list')).rejects.toThrow(/interactive exec/i);
+  });
+});
--- a/src/mcpd/tests/secret-backend-rotator.test.ts
+++ b/src/mcpd/tests/secret-backend-rotator.test.ts
@@ -0,0 +1,276 @@
+import { describe, it, expect, vi } from 'vitest';
+import { SecretBackendRotator } from '../src/services/secret-backend-rotator.service.js';
+import type { SecretBackend, Secret } from '@prisma/client';
+
+function makeBackend(overrides: Partial<SecretBackend> = {}): SecretBackend {
+  return {
+    id: 'backend-1',
+    name: 'bao',
+    type: 'openbao',
+    config: {
+      url: 'http://bao.example:8200',
+      auth: 'token',
+      mount: 'secret',
+      pathPrefix: 'mcpd',
+      tokenSecretRef: { name: 'bao-creds', key: 'token' },
+      rotation: { enabled: true, tokenRole: 'app-mcpd-role', intervalHours: 24 },
+    } as unknown as SecretBackend['config'],
+    tokenMeta: { rotatable: true } as unknown as SecretBackend['tokenMeta'],
+    isDefault: false,
+    description: '',
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+function makeSecret(overrides: Partial<Secret> = {}): Secret {
+  return {
+    id: 'sec-1',
+    name: 'bao-creds',
+    backendId: 'backend-plaintext',
+    data: { token: 'old.token.value' },
+    externalRef: '',
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    ...overrides,
+  };
+}
+
+interface MockState {
+  backend: SecretBackend;
+  secret: Secret;
+  secretData: Record<string, string>;
+  lastTokenMeta: Record<string, unknown> | null;
+  lastSecretUpdate: Record<string, unknown> | null;
+}
+
+function mockDeps(state: MockState, vaultResponses: Array<{ match: RegExp; status: number; body?: unknown }>) {
+  const fetchFn = vi.fn(async (url: string | URL, init?: RequestInit) => {
+    const key = `${init?.method ?? 'GET'} ${String(url)}`;
+    const match = vaultResponses.find((r) => r.match.test(key) || r.match.test(String(url)));
+    if (!match) throw new Error(`unexpected vault call: ${key}`);
+    const body = match.body !== undefined ? JSON.stringify(match.body) : '';
+    return new Response(body, { status: match.status });
+  });
+
+  const backends = {
+    getById: vi.fn(async (id: string) => {
+      if (id === state.backend.id) return state.backend;
+      throw new Error(`not found: ${id}`);
+    }),
+    updateTokenMeta: vi.fn(async (id: string, meta: Record<string, unknown>) => {
+      expect(id).toBe(state.backend.id);
+      state.lastTokenMeta = meta;
+      state.backend = { ...state.backend, tokenMeta: meta as unknown as SecretBackend['tokenMeta'] };
+      return state.backend;
+    }),
+  };
+
+  const secrets = {
+    getByName: vi.fn(async (name: string) => {
+      if (name === state.secret.name) return state.secret;
+      throw new Error(`secret not found: ${name}`);
+    }),
+    resolveData: vi.fn(async () => ({ ...state.secretData })),
+    update: vi.fn(async (id: string, input: { data: Record<string, string> }) => {
+      expect(id).toBe(state.secret.id);
+      state.secretData = { ...input.data };
+      state.lastSecretUpdate = input as unknown as Record<string, unknown>;
+      return state.secret;
+    }),
+  };
+
+  return { fetchFn, backends, secrets };
+}
+
+describe('SecretBackendRotator', () => {
+  it('isRotatable: true for wizard-provisioned openbao', () => {
+    const state: MockState = {
+      backend: makeBackend(),
+      secret: makeSecret(),
+      secretData: { token: 'x' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const r = new SecretBackendRotator({
+      backends: backends as never,
+      // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    } as any);
+    // `r` above is built without `secrets`; assert on a rotator with both deps filled.
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+    });
+    expect(rotator.isRotatable(state.backend)).toBe(true);
+    expect(r).toBeDefined();
+  });
+
+  it('isRotatable: false for kubernetes-auth openbao', () => {
+    const state: MockState = {
+      backend: makeBackend({
+        config: {
+          url: 'http://bao', auth: 'kubernetes', role: 'r',
+          rotation: { enabled: true, tokenRole: 'app-mcpd-role' },
+        } as unknown as SecretBackend['config'],
+      }),
+      secret: makeSecret(),
+      secretData: {},
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const rotator = new SecretBackendRotator({ backends: backends as never, secrets: secrets as never });
+    expect(rotator.isRotatable(state.backend)).toBe(false);
+  });
+
+  it('rotateOne: mints → verifies → persists → revokes old → updates tokenMeta', async () => {
+    const state: MockState = {
+      backend: makeBackend({ tokenMeta: { rotatable: true, currentAccessor: 'old-accessor' } as unknown as SecretBackend['tokenMeta'] }),
+      secret: makeSecret({ data: { token: 'old.token.value' } as Secret['data'] }),
+      secretData: { token: 'old.token.value' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /POST .*auth\/token\/create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'new.token.value', accessor: 'new-accessor', lease_duration: 720 * 3600, renewable: true } } },
+      { match: /GET .*auth\/token\/lookup-self$/, status: 200, body: { data: { accessor: 'new-accessor', ttl: 720 * 3600 } } },
+      { match: /POST .*auth\/token\/revoke-accessor$/, status: 200 },
+    ]);
+
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+      now: () => new Date('2026-04-20T10:00:00Z'),
+    });
+
+    const meta = await rotator.rotateOne(state.backend.id);
+
+    // Correct order of HTTP calls: create (with OLD token) → lookup (with NEW token) → revoke (with NEW token)
+    const calls = fetchFn.mock.calls.map((c) => `${(c[1] as RequestInit).method ?? 'GET'} ${String(c[0])}`);
+    expect(calls[0]).toMatch(/POST .*create\/app-mcpd-role/);
+    expect(calls[1]).toMatch(/GET .*lookup-self/);
+    expect(calls[2]).toMatch(/POST .*revoke-accessor/);
+    expect((fetchFn.mock.calls[0]![1] as RequestInit).headers).toMatchObject({ 'X-Vault-Token': 'old.token.value' });
+    expect((fetchFn.mock.calls[1]![1] as RequestInit).headers).toMatchObject({ 'X-Vault-Token': 'new.token.value' });
+    expect((fetchFn.mock.calls[2]![1] as RequestInit).headers).toMatchObject({ 'X-Vault-Token': 'new.token.value' });
+
+    // Secret now holds the new token. Only the HTTP call order is asserted above; persist-before-revoke ordering is not checked directly.
+    expect(state.secretData.token).toBe('new.token.value');
+
+    // tokenMeta carries fresh timestamps + accessor
+    expect(meta.currentAccessor).toBe('new-accessor');
+    expect(meta.lastRotationError).toBeNull();
+    expect(meta.generatedAt).toBe('2026-04-20T10:00:00.000Z');
+    expect(meta.nextRenewalAt).toBe('2026-04-21T10:00:00.000Z');
+    expect(meta.validUntil).toBe('2026-05-20T10:00:00.000Z');
+    expect(state.lastTokenMeta?.rotatable).toBe(true);
+  });
+
+  it('rotateOne: on mint failure, records lastRotationError and keeps old token', async () => {
+    const state: MockState = {
+      backend: makeBackend(),
+      secret: makeSecret({ data: { token: 'old.token' } as Secret['data'] }),
+      secretData: { token: 'old.token' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /create\/app-mcpd-role$/, status: 403, body: { errors: ['permission denied'] } },
+    ]);
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+    });
+
+    await expect(rotator.rotateOne(state.backend.id)).rejects.toThrow(/HTTP 403/);
+
+    // Secret was NOT updated
+    expect(state.secretData.token).toBe('old.token');
+    expect(secrets.update).not.toHaveBeenCalled();
+    // tokenMeta records the error
+    expect(state.lastTokenMeta?.lastRotationError).toMatch(/HTTP 403/);
+  });
+
+  it('rotateOne: rejects when minted token is not renewable', async () => {
+    const state: MockState = {
+      backend: makeBackend(),
+      secret: makeSecret({ data: { token: 'old' } as Secret['data'] }),
+      secretData: { token: 'old' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'new', accessor: 'a', lease_duration: 100, renewable: false } } },
+    ]);
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+    });
+    await expect(rotator.rotateOne(state.backend.id)).rejects.toThrow(/not renewable/);
+    expect(state.secretData.token).toBe('old');
+  });
+
+  it('rotateOne: continues despite revoke-accessor failure (old token expires anyway)', async () => {
+    const state: MockState = {
+      backend: makeBackend({ tokenMeta: { rotatable: true, currentAccessor: 'old-accessor' } as unknown as SecretBackend['tokenMeta'] }),
+      secret: makeSecret({ data: { token: 'old' } as Secret['data'] }),
+      secretData: { token: 'old' },
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { fetchFn, backends, secrets } = mockDeps(state, [
+      { match: /create\/app-mcpd-role$/, status: 200, body: { auth: { client_token: 'new', accessor: 'new-a', lease_duration: 3600, renewable: true } } },
+      { match: /lookup-self$/, status: 200, body: { data: { accessor: 'new-a', ttl: 3600 } } },
+      { match: /revoke-accessor$/, status: 502 },
+    ]);
+    const rotator = new SecretBackendRotator({
+      backends: backends as never,
+      secrets: secrets as never,
+      fetch: fetchFn as unknown as typeof fetch,
+    });
+    const meta = await rotator.rotateOne(state.backend.id);
+    expect(state.secretData.token).toBe('new');
+    expect(meta.lastRotationError).toBeNull();
+  });
+
+  it('isOverdue: true when lastRotationAt missing or >24h old', () => {
+    const state: MockState = {
+      backend: makeBackend({ tokenMeta: { rotatable: true } as unknown as SecretBackend['tokenMeta'] }),
+      secret: makeSecret(),
+      secretData: {},
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const now = () => new Date('2026-04-20T10:00:00Z');
+    const r = new SecretBackendRotator({ backends: backends as never, secrets: secrets as never, now });
+
+    expect(r.isOverdue(state.backend)).toBe(true);
+
+    const fresh = { ...state.backend, tokenMeta: { rotatable: true, lastRotationAt: '2026-04-20T09:00:00Z' } as unknown as SecretBackend['tokenMeta'] };
+    expect(r.isOverdue(fresh)).toBe(false);
+
+    const stale = { ...state.backend, tokenMeta: { rotatable: true, lastRotationAt: '2026-04-18T10:00:00Z' } as unknown as SecretBackend['tokenMeta'] };
+    expect(r.isOverdue(stale)).toBe(true);
+  });
+
+  it('rotateOne: throws when backend is not rotatable', async () => {
+    const state: MockState = {
+      backend: makeBackend({ type: 'plaintext', config: {} as SecretBackend['config'] }),
+      secret: makeSecret(),
+      secretData: {},
+      lastTokenMeta: null,
+      lastSecretUpdate: null,
+    };
+    const { backends, secrets } = mockDeps(state, []);
+    const r = new SecretBackendRotator({ backends: backends as never, secrets: secrets as never });
+    await expect(r.rotateOne(state.backend.id)).rejects.toThrow(/not rotatable/);
+  });
+});
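Reviewer sketch (not part of the patch): the `isOverdue` test above pins down three cases, namely a missing `lastRotationAt`, a one-hour-old rotation, and a two-day-old rotation. A minimal standalone function that satisfies those cases could look like the following. The name `isOverdueSketch`, its parameters, and the treatment of unparseable timestamps are assumptions for illustration; the real `SecretBackendRotator.isOverdue` body is not shown in this section.

```typescript
// Hypothetical helper; NOT the SecretBackendRotator implementation.
const DAY_MS = 24 * 60 * 60 * 1000;

function isOverdueSketch(
  lastRotationAt: string | undefined,
  now: Date,
  maxAgeMs: number = DAY_MS,
): boolean {
  // Never rotated: treat as overdue so the startup pass picks it up.
  if (lastRotationAt === undefined) return true;
  const last = Date.parse(lastRotationAt);
  // Assumption: an unparseable timestamp is treated like a missing one.
  if (Number.isNaN(last)) return true;
  // Overdue only when strictly older than the allowed age.
  return now.getTime() - last > maxAgeMs;
}
```

Passing `now` in explicitly mirrors the `now` dependency the test injects, which keeps the check deterministic.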
--- a/src/mcpd/tests/secret-backends.test.ts
+++ b/src/mcpd/tests/secret-backends.test.ts
@@ -0,0 +1,244 @@
+import { describe, it, expect, vi } from 'vitest';
+import { PlaintextDriver } from '../src/services/secret-backends/plaintext.js';
+import { OpenBaoDriver } from '../src/services/secret-backends/openbao.js';
+
+describe('PlaintextDriver', () => {
+  const driver = new PlaintextDriver({ listAllPlaintext: async () => [{ name: 'a', data: { k: 'v' } }] });
+
+  it('read returns the data passed in', async () => {
+    const result = await driver.read({ name: 's', externalRef: '', data: { token: 'abc' } });
+    expect(result).toEqual({ token: 'abc' });
+  });
+
+  it('write returns storedData = input, externalRef = empty', async () => {
+    const result = await driver.write({ name: 's', data: { k: 'v' } });
+    expect(result).toEqual({ externalRef: '', storedData: { k: 'v' } });
+  });
+
+  it('list delegates to the injected dep', async () => {
+    const list = await driver.list();
+    expect(list).toEqual([{ name: 'a', externalRef: '' }]);
+  });
+
+  it('delete is a no-op', async () => {
+    await expect(driver.delete({ name: 's', externalRef: '' })).resolves.toBeUndefined();
+  });
+});
+
+describe('OpenBaoDriver', () => {
+  function makeFetch(responses: Array<{ url: RegExp; status: number; body?: unknown }>): ReturnType<typeof vi.fn> {
+    return vi.fn(async (url: string | URL, _init?: RequestInit) => {
+      const urlStr = String(url);
+      const match = responses.find((r) => r.url.test(urlStr));
+      if (!match) throw new Error(`unexpected fetch: ${urlStr}`);
+      return new Response(match.body ? JSON.stringify(match.body) : '', { status: match.status });
+    });
+  }
+
+  const resolver = { resolve: vi.fn(async () => 'test-vault-token') };
+
+  it('write sends POST to .../data/<path> with {data: ...}', async () => {
+    const fetchFn = makeFetch([{ url: /\/v1\/secret\/data\/mcpctl\/mytoken$/, status: 200 }]);
+    const driver = new OpenBaoDriver(
+      { url: 'http://bao.example:8200', tokenSecretRef: { name: 'bao', key: 'token' } },
+      { fetch: fetchFn as unknown as typeof fetch, secretRefResolver: resolver },
+    );
+    const result = await driver.write({ name: 'mytoken', data: { api_key: 'secret-xyz' } });
+    expect(result.externalRef).toBe('secret/mcpctl/mytoken');
+    expect(result.storedData).toEqual({});
+    expect(fetchFn).toHaveBeenCalledTimes(1);
+    const [, init] = fetchFn.mock.calls[0] as [unknown, RequestInit];
+    expect(init.method).toBe('POST');
+    expect(JSON.parse(init.body as string)).toEqual({ data: { api_key: 'secret-xyz' } });
+    const headers = init.headers as Record<string, string>;
+    expect(headers['X-Vault-Token']).toBe('test-vault-token');
+  });
+
+  it('read returns body.data.data', async () => {
+    const fetchFn = makeFetch([{
+      url: /\/v1\/secret\/data\/mcpctl\/mytoken$/,
+      status: 200,
+      body: { data: { data: { api_key: 'secret-xyz' } } },
+    }]);
+    const driver = new OpenBaoDriver(
+      { url: 'http://bao.example:8200', tokenSecretRef: { name: 'bao', key: 'token' } },
+      { fetch: fetchFn as unknown as typeof fetch, secretRefResolver: resolver },
+    );
+    const result = await driver.read({ name: 'mytoken', externalRef: 'secret/mcpctl/mytoken', data: {} });
+    expect(result).toEqual({ api_key: 'secret-xyz' });
+  });
+
+  it('read throws when the path 404s', async () => {
+    const fetchFn = makeFetch([{ url: /\/data\//, status: 404 }]);
+    const driver = new OpenBaoDriver(
+      { url: 'http://bao.example:8200', tokenSecretRef: { name: 'bao', key: 'token' } },
+      { fetch: fetchFn as unknown as typeof fetch, secretRefResolver: resolver },
+    );
+    await expect(driver.read({ name: 'missing', externalRef: '', data: {} })).rejects.toThrow(/not found/);
+  });
+
+  it('delete swallows 404', async () => {
+    const fetchFn = makeFetch([{ url: /\/metadata\//, status: 404 }]);
+    const driver = new OpenBaoDriver(
+      { url: 'http://bao.example:8200', tokenSecretRef: { name: 'bao', key: 'token' } },
+      { fetch: fetchFn as unknown as typeof fetch, secretRefResolver: resolver },
+    );
+    await expect(driver.delete({ name: 'gone', externalRef: '' })).resolves.toBeUndefined();
+  });
+
+  it('list returns names from the metadata LIST call', async () => {
+    const fetchFn = makeFetch([{
+      url: /\/v1\/secret\/metadata\/mcpctl\/$/,
+      status: 200,
+      body: { data: { keys: ['token1', 'token2', 'sub-folder/'] } },
+    }]);
+    const driver = new OpenBaoDriver(
+      { url: 'http://bao.example:8200', tokenSecretRef: { name: 'bao', key: 'token' } },
+      { fetch: fetchFn as unknown as typeof fetch, secretRefResolver: resolver },
+    );
+    const result = await driver.list();
+    // Sub-folders (trailing slash) are excluded; only leaf keys are returned.
+    expect(result).toEqual([
+      { name: 'token1', externalRef: 'secret/mcpctl/token1' },
+      { name: 'token2', externalRef: 'secret/mcpctl/token2' },
+    ]);
+  });
+
+  it('caches the vault token after first resolve', async () => {
+    const fetchFn = makeFetch([
+      { url: /\/v1\/secret\/data\/mcpctl\//, status: 200, body: { data: { data: { k: 'v' } } } },
+    ]);
+    const singleResolver = { resolve: vi.fn(async () => 'test-vault-token') };
+    const driver = new OpenBaoDriver(
+      { url: 'http://bao.example:8200', tokenSecretRef: { name: 'bao', key: 'token' } },
+      { fetch: fetchFn as unknown as typeof fetch, secretRefResolver: singleResolver },
+    );
+    await driver.read({ name: 'a', externalRef: '', data: {} });
+    await driver.read({ name: 'a', externalRef: '', data: {} });
+    expect(singleResolver.resolve).toHaveBeenCalledTimes(1);
+  });
+
+  it('propagates X-Vault-Namespace when configured', async () => {
+    const fetchFn = makeFetch([{ url: /\/v1\/secret\/data\/mcpctl\//, status: 200 }]);
+    const driver = new OpenBaoDriver(
+      { url: 'http://bao.example:8200', namespace: 'myteam', tokenSecretRef: { name: 'bao', key: 'token' } },
+      { fetch: fetchFn as unknown as typeof fetch, secretRefResolver: resolver },
+    );
+    await driver.write({ name: 'x', data: { k: 'v' } });
+    const [, init] = fetchFn.mock.calls[0] as [unknown, RequestInit];
+    const headers = init.headers as Record<string, string>;
+    expect(headers['X-Vault-Namespace']).toBe('myteam');
+  });
+
+  describe('kubernetes auth', () => {
+    it('exchanges the SA JWT for a vault client token via /v1/auth/kubernetes/login', async () => {
+      const calls: Array<{ url: string; init: RequestInit }> = [];
+      const fetchFn = vi.fn(async (url: string | URL, init?: RequestInit) => {
+        const u = String(url);
+        calls.push({ url: u, init: init ?? {} });
+        if (u.endsWith('/v1/auth/kubernetes/login')) {
+          return new Response(JSON.stringify({
+            auth: { client_token: 'vault.client.token.xyz', lease_duration: 3600 },
+          }), { status: 200 });
+        }
+        return new Response(JSON.stringify({}), { status: 200 });
+      });
+
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl' },
+        {
+          fetch: fetchFn as unknown as typeof fetch,
+          readServiceAccountToken: async () => 'eyJ.fake.sa.jwt',
+        },
+      );
+      await driver.write({ name: 'x', data: { k: 'v' } });
+
+      // Two calls: login + write
+      expect(calls).toHaveLength(2);
+      expect(calls[0]!.url).toBe('http://bao.example:8200/v1/auth/kubernetes/login');
+      expect(JSON.parse(calls[0]!.init.body as string)).toEqual({ role: 'mcpctl', jwt: 'eyJ.fake.sa.jwt' });
+
+      // Write uses the returned client token
+      const writeHeaders = calls[1]!.init.headers as Record<string, string>;
+      expect(writeHeaders['X-Vault-Token']).toBe('vault.client.token.xyz');
+    });
+
+    it('caches the vault token across requests and renews after lease expiry', async () => {
+      let nowMs = 1_000_000_000_000;
+      let loginCount = 0;
+      const fetchFn = vi.fn(async (url: string | URL) => {
+        const u = String(url);
+        if (u.endsWith('/v1/auth/kubernetes/login')) {
+          loginCount++;
+          // 600s lease leaves 540s of cached window after the 60s grace.
+          return new Response(JSON.stringify({
+            auth: { client_token: `tok-${String(loginCount)}`, lease_duration: 600 },
+          }), { status: 200 });
+        }
+        return new Response(JSON.stringify({}), { status: 200 });
+      });
+
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl' },
+        {
+          fetch: fetchFn as unknown as typeof fetch,
+          readServiceAccountToken: async () => 'jwt',
+          now: () => nowMs,
+        },
+      );
+
+      await driver.write({ name: 'a', data: { k: 'v' } });
+      await driver.write({ name: 'b', data: { k: 'v' } });
+      expect(loginCount).toBe(1); // both writes share the cached token
+
+      // Advance 600s, past the 540s (lease minus grace) window → driver logs in again
+      nowMs += 600_000;
+      await driver.write({ name: 'c', data: { k: 'v' } });
+      expect(loginCount).toBe(2);
+    });
+
+    it('honours custom authMount path', async () => {
+      const calls: string[] = [];
+      const fetchFn = vi.fn(async (url: string | URL) => {
+        calls.push(String(url));
+        if (String(url).includes('/login')) {
+          return new Response(JSON.stringify({ auth: { client_token: 't', lease_duration: 3600 } }), { status: 200 });
+        }
+        return new Response(JSON.stringify({}), { status: 200 });
+      });
+
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl', authMount: 'kubernetes/cluster-a' },
+        { fetch: fetchFn as unknown as typeof fetch, readServiceAccountToken: async () => 'jwt' },
+      );
+      await driver.write({ name: 'x', data: {} });
+      expect(calls[0]).toBe('http://bao.example:8200/v1/auth/kubernetes/cluster-a/login');
+    });
+
+    it('throws on login failure with a clear error', async () => {
+      const fetchFn = vi.fn(async () => new Response('permission denied', { status: 403 }));
+      const driver = new OpenBaoDriver(
+        { url: 'http://bao.example:8200', auth: 'kubernetes', role: 'mcpctl' },
+        { fetch: fetchFn as unknown as typeof fetch, readServiceAccountToken: async () => 'jwt' },
+      );
+      await expect(driver.read({ name: 'x', externalRef: '', data: {} }))
+        .rejects.toThrow(/kubernetes login.*role=mcpctl.*HTTP 403/);
+    });
+
+    it('rejects construction when role is missing', () => {
+      expect(() => new OpenBaoDriver(
+        // eslint-disable-next-line @typescript-eslint/no-explicit-any
+        { url: 'http://bao.example:8200', auth: 'kubernetes' } as any,
+        { fetch: vi.fn() as unknown as typeof fetch, readServiceAccountToken: async () => 'jwt' },
+      )).toThrow(/role.*required/);
+    });
+
+    it('rejects token-auth construction when tokenSecretRef is missing', () => {
+      expect(() => new OpenBaoDriver(
+        // eslint-disable-next-line @typescript-eslint/no-explicit-any
+        { url: 'http://bao.example:8200' } as any,
+        { fetch: vi.fn() as unknown as typeof fetch, secretRefResolver: resolver },
+      )).toThrow(/tokenSecretRef.*required/);
+    });
+  });
+});
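Reviewer sketch (not part of the patch): the kubernetes-auth caching test relies on a 600s lease minus a 60s grace period leaving 540s of cached window. The arithmetic can be illustrated with a small cache that takes an injected clock. `LeaseCachedToken`, its constructor arguments, and the synchronous `login` callback are assumptions made to keep the sketch short; the driver's real login is asynchronous and its caching code is not shown in this section.

```typescript
// Hypothetical sketch; NOT the OpenBaoDriver implementation.
interface Lease {
  token: string;
  leaseSeconds: number;
}

class LeaseCachedToken {
  private token: string | null = null;
  private expiresAtMs = 0;
  private readonly login: () => Lease;
  private readonly now: () => number;
  private readonly graceMs: number;

  constructor(login: () => Lease, now: () => number = Date.now, graceMs: number = 60_000) {
    this.login = login;
    this.now = now;
    this.graceMs = graceMs;
  }

  get(): string {
    // Serve the cached token while we are still inside (lease - grace).
    if (this.token !== null && this.now() < this.expiresAtMs) return this.token;
    // Otherwise log in again and restart the window from the current time.
    const lease = this.login();
    this.token = lease.token;
    this.expiresAtMs = this.now() + lease.leaseSeconds * 1000 - this.graceMs;
    return this.token;
  }
}
```

With a 600s lease the cached token is reused for 540s and replaced on the first call after that, which is the behaviour the test asserts via `loginCount`.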
--- a/src/mcpd/tests/secret-routes.test.ts
+++ b/src/mcpd/tests/secret-routes.test.ts
@@ -3,43 +3,68 @@ import Fastify from 'fastify';
 import type { FastifyInstance } from 'fastify';
 import { registerSecretRoutes } from '../src/routes/secrets.js';
 import { SecretService } from '../src/services/secret.service.js';
+import { SecretBackendService } from '../src/services/secret-backend.service.js';
 import { errorHandler } from '../src/middleware/error-handler.js';
 import type { ISecretRepository } from '../src/repositories/interfaces.js';
+import type { ISecretBackendRepository } from '../src/repositories/secret-backend.repository.js';
+import type { SecretBackend } from '@prisma/client';

 let app: FastifyInstance;

-function mockRepo(): ISecretRepository {
-  let lastCreated: Record<string, unknown> | null = null;
+const PLAINTEXT_BACKEND: SecretBackend = {
+  id: 'backend-plaintext',
+  name: 'default',
+  type: 'plaintext',
+  config: {},
+  isDefault: true,
+  description: '',
+  version: 1,
+  createdAt: new Date(),
+  updatedAt: new Date(),
+};
+
+function makeSecret(overrides: Partial<{ id: string; name: string; data: Record<string, string>; externalRef: string; backendId: string }> = {}) {
  return {
-    findAll: vi.fn(async () => [
-      { id: '1', name: 'ha-creds', data: { TOKEN: 'abc' }, version: 1, createdAt: new Date(), updatedAt: new Date() },
-    ]),
+    id: overrides.id ?? 'sec-1',
+    name: overrides.name ?? 'ha-creds',
+    backendId: overrides.backendId ?? PLAINTEXT_BACKEND.id,
+    data: overrides.data ?? { TOKEN: 'abc' },
+    externalRef: overrides.externalRef ?? '',
+    version: 1,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+  };
+}
+
+function mockRepo(): ISecretRepository {
+  let lastCreated: ReturnType<typeof makeSecret> | null = null;
+  return {
+    findAll: vi.fn(async () => [makeSecret()]),
    findById: vi.fn(async (id: string) => {
-      if (lastCreated && (lastCreated as { id: string }).id === id) return lastCreated as never;
+      if (lastCreated && lastCreated.id === id) return lastCreated;
      return null;
    }),
    findByName: vi.fn(async () => null),
+    findByBackend: vi.fn(async () => []),
    create: vi.fn(async (data) => {
-      const secret = {
+      const secret = makeSecret({
        id: 'new-id',
        name: data.name,
        data: data.data ?? {},
-        version: 1,
-        createdAt: new Date(),
-        updatedAt: new Date(),
-      };
+        externalRef: data.externalRef ?? '',
+        backendId: data.backendId,
+      });
      lastCreated = secret;
      return secret;
    }),
    update: vi.fn(async (id, data) => {
-      const secret = {
+      const secret = makeSecret({
        id,
-        name: 'ha-creds',
+        name: lastCreated?.name ?? 'ha-creds',
        data: data.data,
-        version: 2,
-        createdAt: new Date(),
-        updatedAt: new Date(),
-      };
+        externalRef: data.externalRef,
+        backendId: data.backendId ?? PLAINTEXT_BACKEND.id,
+      });
      lastCreated = secret;
      return secret;
    }),
@@ -47,14 +72,32 @@ function mockRepo(): ISecretRepository {
  };
 }

+function mockBackendRepo(): ISecretBackendRepository {
+  return {
+    findAll: vi.fn(async () => [PLAINTEXT_BACKEND]),
+    findById: vi.fn(async (id) => (id === PLAINTEXT_BACKEND.id ? PLAINTEXT_BACKEND : null)),
+    findByName: vi.fn(async (name) => (name === PLAINTEXT_BACKEND.name ? PLAINTEXT_BACKEND : null)),
+    findDefault: vi.fn(async () => PLAINTEXT_BACKEND),
+    create: vi.fn(async () => PLAINTEXT_BACKEND),
+    update: vi.fn(async () => PLAINTEXT_BACKEND),
+    setAsDefault: vi.fn(async () => PLAINTEXT_BACKEND),
+    delete: vi.fn(async () => {}),
+    countReferencingSecrets: vi.fn(async () => 0),
+  };
+}
+
 afterEach(async () => {
  if (app) await app.close();
 });

-function createApp(repo: ISecretRepository) {
+async function createApp(repo: ISecretRepository) {
  app = Fastify({ logger: false });
  app.setErrorHandler(errorHandler);
-  const service = new SecretService(repo);
+  const backends = new SecretBackendService(mockBackendRepo(), {
+    plaintext: { listAllPlaintext: async () => [] },
+    secretRefResolver: { resolve: async () => '' },
+  });
+  const service = new SecretService(repo, backends);
  registerSecretRoutes(app, service);
  return app.ready();
 }
@@ -129,7 +172,7 @@ describe('Secret Routes', () => {
  describe('PUT /api/v1/secrets/:id', () => {
    it('updates a secret', async () => {
      const repo = mockRepo();
-      vi.mocked(repo.findById).mockResolvedValue({ id: '1', name: 'ha-creds' } as never);
+      vi.mocked(repo.findById).mockResolvedValue(makeSecret({ id: '1' }) as never);
      await createApp(repo);
      const res = await app.inject({
        method: 'PUT',
@@ -154,7 +197,7 @@ describe('Secret Routes', () => {
  describe('DELETE /api/v1/secrets/:id', () => {
    it('deletes a secret and returns 204', async () => {
      const repo = mockRepo();
-      vi.mocked(repo.findById).mockResolvedValue({ id: '1', name: 'ha-creds' } as never);
+      vi.mocked(repo.findById).mockResolvedValue(makeSecret({ id: '1' }) as never);
      await createApp(repo);
      const res = await app.inject({ method: 'DELETE', url: '/api/v1/secrets/1' });
      expect(res.statusCode).toBe(204);
--- a/src/mcpd/tests/services/health-probe.test.ts
+++ b/src/mcpd/tests/services/health-probe.test.ts
@@ -1,8 +1,9 @@
 import { describe, it, expect, vi, beforeEach } from 'vitest';
-import { HealthProbeRunner } from '../../src/services/health-probe.service.js';
+import { HealthProbeRunner, DEFAULT_HEALTH_CHECK } from '../../src/services/health-probe.service.js';
 import type { HealthCheckSpec } from '../../src/services/health-probe.service.js';
 import type { IMcpInstanceRepository, IMcpServerRepository } from '../../src/repositories/interfaces.js';
-import type { McpOrchestrator, ExecResult } from '../../src/services/orchestrator.js';
+import type { McpOrchestrator } from '../../src/services/orchestrator.js';
+import type { McpProxyService, McpProxyResponse } from '../../src/services/mcp-proxy-service.js';
 import type { McpInstance, McpServer } from '@prisma/client';

 function makeInstance(overrides: Partial<McpInstance> = {}): McpInstance {
@@ -87,20 +88,30 @@ function mockOrchestrator(): McpOrchestrator {
  };
 }

+function mockMcpProxyService(): McpProxyService {
+  return {
+    execute: vi.fn(async (): Promise<McpProxyResponse> => ({ jsonrpc: '2.0', id: 1, result: { tools: [] } })),
+    closeAll: vi.fn(),
+    removeClient: vi.fn(),
+  } as unknown as McpProxyService;
+}
+
 describe('HealthProbeRunner', () => {
  let instanceRepo: IMcpInstanceRepository;
  let serverRepo: IMcpServerRepository;
  let orchestrator: McpOrchestrator;
+  let mcpProxyService: McpProxyService;
  let runner: HealthProbeRunner;

  beforeEach(() => {
    instanceRepo = mockInstanceRepo();
    serverRepo = mockServerRepo();
    orchestrator = mockOrchestrator();
-    runner = new HealthProbeRunner(instanceRepo, serverRepo, orchestrator);
+    mcpProxyService = mockMcpProxyService();
+    runner = new HealthProbeRunner(instanceRepo, serverRepo, orchestrator, undefined, mcpProxyService);
  });

-  it('skips instances without healthCheck config', async () => {
+  it('applies default liveness probe when server has no healthCheck config', async () => {
    const instance = makeInstance();
    const server = makeServer({ healthCheck: null });

@@ -109,8 +120,67 @@ describe('HealthProbeRunner', () => {

    await runner.tick();

+    // No exec fallback — liveness goes through mcpProxyService
    expect(orchestrator.execInContainer).not.toHaveBeenCalled();
-    expect(instanceRepo.updateStatus).not.toHaveBeenCalled();
+    expect(mcpProxyService.execute).toHaveBeenCalledWith({ serverId: 'srv-1', method: 'tools/list' });
+    expect(instanceRepo.updateStatus).toHaveBeenCalledWith(
+      'inst-1',
+      'RUNNING',
+      expect.objectContaining({ healthStatus: 'healthy' }),
+    );
+  });
+
+  it('default liveness probe marks unhealthy when tools/list returns JSON-RPC error', async () => {
+    const instance = makeInstance();
+    const server = makeServer({
+      healthCheck: { intervalSeconds: 0, failureThreshold: 1 } as unknown as McpServer['healthCheck'],
+    });
+
+    vi.mocked(instanceRepo.findAll).mockResolvedValue([instance]);
+    vi.mocked(serverRepo.findById).mockResolvedValue(server);
+    vi.mocked(mcpProxyService.execute).mockResolvedValue({
+      jsonrpc: '2.0',
+      id: 1,
+      error: { code: -32603, message: 'Cannot connect to upstream' },
+    });
+
+    await runner.tick();
+
+    expect(instanceRepo.updateStatus).toHaveBeenCalledWith(
+      'inst-1',
+      'RUNNING',
+      expect.objectContaining({
+        healthStatus: 'unhealthy',
+        events: expect.arrayContaining([
+          expect.objectContaining({ type: 'Warning', message: expect.stringContaining('Cannot connect to upstream') }),
+        ]),
+      }),
+    );
+  });
+
+  it('default liveness probe marks unhealthy when mcpProxyService throws', async () => {
+    const instance = makeInstance();
+    const server = makeServer({
+      healthCheck: { intervalSeconds: 0, failureThreshold: 1 } as unknown as McpServer['healthCheck'],
+    });
+
+    vi.mocked(instanceRepo.findAll).mockResolvedValue([instance]);
+    vi.mocked(serverRepo.findById).mockResolvedValue(server);
+    vi.mocked(mcpProxyService.execute).mockRejectedValue(new Error('no running instance'));
+
+    await runner.tick();
+
+    expect(instanceRepo.updateStatus).toHaveBeenCalledWith(
+      'inst-1',
+      'RUNNING',
+      expect.objectContaining({ healthStatus: 'unhealthy' }),
+    );
+  });
+
+  it('DEFAULT_HEALTH_CHECK has no tool set so it acts as liveness', () => {
+    expect(DEFAULT_HEALTH_CHECK.tool).toBeUndefined();
+    expect(DEFAULT_HEALTH_CHECK.intervalSeconds).toBe(30);
+    expect(DEFAULT_HEALTH_CHECK.failureThreshold).toBe(3);
  });

  it('skips non-RUNNING instances', async () => {
--- a/src/mcplocal/package.json
+++ b/src/mcplocal/package.json
@@ -10,6 +10,7 @@
    "clean": "rimraf dist",
    "dev": "tsx watch src/index.ts",
    "start": "node dist/index.js",
+    "serve": "node dist/serve.js",
    "test": "vitest",
    "test:run": "vitest run",
    "test:smoke": "vitest run --config vitest.smoke.config.ts"
--- a/src/mcplocal/src/audit/collector.ts
+++ b/src/mcplocal/src/audit/collector.ts
@@ -10,11 +10,17 @@ import type { McpdClient } from '../http/mcpd-client.js';
 const BATCH_SIZE = 50;
 const FLUSH_INTERVAL_MS = 5_000;

+interface SessionPrincipal {
+  userName?: string;
+  tokenName?: string;
+  tokenSha?: string;
+}
+
 export class AuditCollector {
  private queue: AuditEvent[] = [];
  private flushTimer: ReturnType<typeof setInterval> | null = null;
  private flushing = false;
-  private sessionUserNames = new Map<string, string>();
+  private sessionPrincipals = new Map<string, SessionPrincipal>();

  constructor(
    private readonly mcpdClient: McpdClient,
@@ -25,15 +31,26 @@ export class AuditCollector {

  /** Register a userName for a session. All future events for this session auto-fill it. */
  setSessionUserName(sessionId: string, userName: string): void {
-    this.sessionUserNames.set(sessionId, userName);
+    const existing = this.sessionPrincipals.get(sessionId) ?? {};
+    this.sessionPrincipals.set(sessionId, { ...existing, userName });
  }

-  /** Queue an audit event. Auto-fills projectName and userName (from session map). */
+  /** Register McpToken identity for a session (HTTP-mode authenticated requests). */
+  setSessionMcpToken(sessionId: string, token: { tokenName: string; tokenSha: string }): void {
+    const existing = this.sessionPrincipals.get(sessionId) ?? {};
+    this.sessionPrincipals.set(sessionId, { ...existing, tokenName: token.tokenName, tokenSha: token.tokenSha });
+  }
+
+  /** Queue an audit event. Auto-fills projectName, userName, tokenName, and tokenSha. */
  emit(event: Omit<AuditEvent, 'projectName'>): void {
    const enriched: AuditEvent = { ...event, projectName: this.projectName };
-    if (!enriched.userName && enriched.sessionId) {
-      const name = this.sessionUserNames.get(enriched.sessionId);
-      if (name) enriched.userName = name;
+    if (enriched.sessionId) {
+      const principal = this.sessionPrincipals.get(enriched.sessionId);
+      if (principal) {
+        if (!enriched.userName && principal.userName) enriched.userName = principal.userName;
+        if (!enriched.tokenName && principal.tokenName) enriched.tokenName = principal.tokenName;
+        if (!enriched.tokenSha && principal.tokenSha) enriched.tokenSha = principal.tokenSha;
+      }
    }
    this.queue.push(enriched);
    if (this.queue.length >= BATCH_SIZE) {
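Reviewer sketch (not part of the patch): the hunk above replaces a `sessionId -> userName` map with a per-session principal record that is merged field by field and only fills fields the event does not already carry. The standalone class below illustrates that merge-then-enrich pattern. `PrincipalRegistry`, `EventSketch`, and the single `merge` method are hypothetical names; the actual `AuditCollector` exposes `setSessionUserName` and `setSessionMcpToken` instead.

```typescript
// Hypothetical sketch of the merge/enrich pattern; NOT the AuditCollector itself.
interface Principal {
  userName?: string;
  tokenName?: string;
  tokenSha?: string;
}

interface EventSketch extends Principal {
  sessionId?: string;
}

class PrincipalRegistry {
  private readonly bySession = new Map<string, Principal>();

  // Merge new identity fields into whatever is already known for the session.
  merge(sessionId: string, patch: Principal): void {
    const existing = this.bySession.get(sessionId) ?? {};
    this.bySession.set(sessionId, { ...existing, ...patch });
  }

  // Fill in missing identity fields; values already on the event win.
  enrich(event: EventSketch): EventSketch {
    const enriched: EventSketch = { ...event };
    if (!enriched.sessionId) return enriched;
    const principal = this.bySession.get(enriched.sessionId);
    if (!principal) return enriched;
    if (!enriched.userName && principal.userName) enriched.userName = principal.userName;
    if (!enriched.tokenName && principal.tokenName) enriched.tokenName = principal.tokenName;
    if (!enriched.tokenSha && principal.tokenSha) enriched.tokenSha = principal.tokenSha;
    return enriched;
  }
}
```

Because each setter spreads the existing record first, registering a token after a user name (or the reverse) keeps both.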
--- a/src/mcplocal/src/audit/types.ts
+++ b/src/mcplocal/src/audit/types.ts
@@ -32,5 +32,9 @@ export interface AuditEvent {
  correlationId?: string;
  parentEventId?: string;
  userName?: string;
+  /** Set when the session authenticated via an McpToken (HTTP-mode mcplocal). */
+  tokenName?: string;
+  /** SHA-256 hash of the McpToken that made the request. */
+  tokenSha?: string;
  payload: Record<string, unknown>;
 }
--- a/src/mcplocal/src/discovery.ts
+++ b/src/mcplocal/src/discovery.ts
@@ -1,4 +1,5 @@
 import type { McpdClient } from './http/mcpd-client.js';
+import { DISCOVERY_TIMEOUT_MS } from './http/mcpd-client.js';
 import type { McpRouter } from './router.js';
 import { McpdUpstream } from './upstream/mcpd.js';

@@ -45,14 +46,27 @@ export async function refreshProjectUpstreams(
    servers = await mcpdClient.get<McpdServer[]>(path);
  }

-  return syncUpstreams(router, mcpdClient, servers);
+  // Downstream upstream-proxy calls go through `mcpdClient` too. In HTTP-mode
+  // mcplocal the pod has no credentials of its own, so the default token on
+  // `mcpdClient` is an empty string — every /api/v1/mcp/proxy call would 401.
+  // Bind a per-request client with the caller's bearer so each McpdUpstream
+  // forwards the same identity that passed project discovery.
+  const upstreamClient = authToken ? mcpdClient.withToken(authToken) : mcpdClient;
+  return syncUpstreams(router, upstreamClient, servers);
 }

 /**
 * Fetch a project's LLM config (llmProvider, llmModel) from mcpd.
- * These are the project-level "recommendations" — local overrides take priority.
+ *
+ * Phase 4 redefines `llmProvider` semantically: it names a centralized `Llm`
+ * resource (see `mcpctl get llms`) — NOT a local provider. Consumers should
+ * resolve it through mcpd's inference proxy when reachable. The field remains
+ * a free-form string on the wire for backward compatibility; local overrides
+ * in `~/.mcpctl/config.json` still take priority, and unknown names fall
+ * through to the registry default.
 */
 export interface ProjectLlmConfig {
+  /** Name of an `Llm` resource on mcpd, or 'none' to disable LLM features. */
  llmProvider?: string;
  llmModel?: string;
  proxyModel?: string;
@@ -60,6 +74,31 @@ export interface ProjectLlmConfig {
  serverOverrides?: Record<string, { proxyModel?: string }>;
 }

+/**
+ * Resolve a project's `llmProvider` against mcpd's Llm registry. Returns:
+ *   - 'registered'  — an Llm with this name exists
+ *   - 'disabled'    — value is 'none'
+ *   - 'unregistered'— no Llm matches (consumer should fall back to registry default)
+ *   - 'unreachable' — mcpd couldn't be queried
+ */
+export type LlmReferenceStatus = 'registered' | 'disabled' | 'unregistered' | 'unreachable';
+
+export async function resolveProjectLlmReference(
+  mcpdClient: McpdClient,
+  llmProvider: string | undefined,
+): Promise<LlmReferenceStatus> {
+  if (llmProvider === undefined || llmProvider === '') return 'unregistered';
+  if (llmProvider === 'none') return 'disabled';
+  try {
+    await mcpdClient.get(`/api/v1/llms/${encodeURIComponent(llmProvider)}`);
+    return 'registered';
+  } catch (err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    if (msg.includes('404') || msg.toLowerCase().includes('not found')) return 'unregistered';
+    return 'unreachable';
+  }
+}
+
 export async function fetchProjectLlmConfig(
  mcpdClient: McpdClient,
  projectName: string,
@@ -96,6 +135,10 @@ export async function fetchProjectLlmConfig(
 function syncUpstreams(router: McpRouter, mcpdClient: McpdClient, servers: McpdServer[]): string[] {
  const registered: string[] = [];

+  // Discovery-class calls (`*/list`) go through a short-timeout client so a single
+  // unreachable upstream cannot stall session init for the full tool-call window.
+  const discoveryClient = mcpdClient.withTimeout(DISCOVERY_TIMEOUT_MS);
+
  // Remove stale upstreams
  const currentNames = new Set(router.getUpstreamNames());
  const serverNames = new Set(servers.map((s) => s.name));
@@ -108,7 +151,7 @@ function syncUpstreams(router: McpRouter, mcpdClient: McpdClient, servers: McpdS
  // Add/update upstreams for each server
  for (const server of servers) {
    if (!currentNames.has(server.name)) {
-      const upstream = new McpdUpstream(server.id, server.name, mcpdClient, server.description);
+      const upstream = new McpdUpstream(server.id, server.name, mcpdClient, server.description, discoveryClient);
      router.addUpstream(upstream);
    }
    registered.push(server.name);
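Reviewer sketch (not part of the patch): `resolveProjectLlmReference` decides between `'unregistered'` and `'unreachable'` by inspecting the error message text. Extracting that branch into a pure function makes the matching rule easy to exercise in isolation. `classifyLookupError` is a hypothetical name; the matching logic is copied from the `catch` block in the diff, and the sample messages in the usage note are invented for illustration.

```typescript
// Hypothetical extraction of the catch-block logic from resolveProjectLlmReference.
type LookupFailure = 'unregistered' | 'unreachable';

function classifyLookupError(err: unknown): LookupFailure {
  const msg = err instanceof Error ? err.message : String(err);
  // Substring match: any message containing "404" or "not found" counts as a miss.
  if (msg.includes('404') || msg.toLowerCase().includes('not found')) return 'unregistered';
  // Everything else is treated as "mcpd could not be queried".
  return 'unreachable';
}
```

One thing a reader may want to keep in mind: because this is a substring check, a message such as `timeout after 4040ms` would also be classified as `'unregistered'`. Whether that can occur depends on how `McpdClient` formats its errors, which this section does not show.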
--- a/src/mcplocal/src/http/config.ts
+++ b/src/mcplocal/src/http/config.ts
@@ -64,6 +64,14 @@ export interface LlmProviderFileEntry {
  idleTimeoutMinutes?: number;
  /** vllm-managed: extra args for `vllm serve` */
  extraArgs?: string[];
+  /**
+   * If set, this local provider is allowed to substitute for the centralized
+   * Llm of this name when the mcpd inference proxy is unreachable.
+   * RBAC is still enforced — the caller must have view permission on the
+   * named Llm via mcpd before failover is permitted (fail-closed if mcpd
+   * itself can't be reached).
+   */
+  failoverFor?: string;
 }

 export interface ProjectLlmOverride {
--- a/src/mcplocal/src/http/mcpd-client.ts
+++ b/src/mcplocal/src/http/mcpd-client.ts
@@ -21,7 +21,14 @@ export class ConnectionError extends Error {
 }

 /** Default timeout for mcpd requests (ms). Prevents indefinite hangs on slow upstream tool calls. */
-const DEFAULT_TIMEOUT_MS = 30_000;
+export const DEFAULT_TIMEOUT_MS = 30_000;
+
+/**
+ * Discovery-class operations (tools/list, resources/list, prompts/list) should not share
+ * the full tool-call timeout budget — a single dead upstream would stall session init for
+ * the entire window. Override via `MCPLOCAL_DISCOVERY_TIMEOUT_MS`.
+ */
+export const DISCOVERY_TIMEOUT_MS = Number(process.env['MCPLOCAL_DISCOVERY_TIMEOUT_MS']) || 8_000;

 export class McpdClient {
  private readonly baseUrl: string;
@@ -45,6 +52,24 @@ export class McpdClient {
    return new McpdClient(this.baseUrl, this.token, { ...this.extraHeaders, ...headers }, this.timeoutMs);
  }

+  /**
+   * Create a new client with a different per-request timeout. Used by mcplocal's
+   * discovery path to avoid sharing the slow tool-call budget.
+   */
+  withTimeout(timeoutMs: number): McpdClient {
+    return new McpdClient(this.baseUrl, this.token, { ...this.extraHeaders }, timeoutMs);
+  }
+
+  /**
+   * Create a new client with a different Bearer token. The HTTP-mode mcplocal
+   * pod has no credentials of its own — each incoming client request carries
+   * its McpToken, and this method is how we thread that token through to the
+   * McpdUpstream instances created during project discovery.
+   */
+  withToken(token: string): McpdClient {
+    return new McpdClient(this.baseUrl, token, { ...this.extraHeaders }, this.timeoutMs);
+  }
+
  async get<T>(path: string): Promise<T> {
    return this.request<T>('GET', path);
  }
--- a/src/mcplocal/src/http/project-mcp-endpoint.ts
+++ b/src/mcplocal/src/http/project-mcp-endpoint.ts
@@ -62,21 +62,31 @@ export function registerProjectMcpEndpoint(app: FastifyInstance, mcpdClient: Mcp
      return existing.router;
    }

+    // HTTP-mode mcplocal has no pod-level credentials — the default
+    // `mcpdClient.token` is an empty string. Every downstream call from this
+    // request (upstream discovery, LLM config fetch, prompt index for
+    // begin_session) has to use the CALLER's McpToken as the bearer, or mcpd
+    // rejects with 401. Build one per-request client here and thread it
+    // everywhere instead of sprinkling `.withToken(authToken)` at each call site.
+    const requestClient = authToken ? mcpdClient.withToken(authToken) : mcpdClient;
+
    // Create new router or refresh existing one
    const router = existing?.router ?? new McpRouter();
    await refreshProjectUpstreams(router, mcpdClient, projectName, authToken);

    // Resolve project LLM model: local override → mcpd recommendation → global default
    const localOverride = loadProjectLlmOverride(projectName);
-    const mcpdConfig = await fetchProjectLlmConfig(mcpdClient, projectName);
+    const mcpdConfig = await fetchProjectLlmConfig(requestClient, projectName);
    const resolvedModel = localOverride?.model ?? mcpdConfig.llmModel ?? undefined;

    // If project llmProvider is "none", disable LLM for this project
    const llmDisabled = mcpdConfig.llmProvider === 'none' || localOverride?.provider === 'none';
    const effectiveRegistry = llmDisabled ? null : (providerRegistry ?? null);

-    // Configure prompt resources with SA-scoped client for RBAC
-    const saClient = mcpdClient.withHeaders({ 'X-Service-Account': `project:${projectName}` });
+    // Configure prompt resources with SA-scoped client for RBAC.
+    // Keep the X-Service-Account header for mcpd-side audit tagging, but carry
+    // the caller's bearer so auth passes (the principal resolves as McpToken:<sha>).
+    const saClient = requestClient.withHeaders({ 'X-Service-Account': `project:${projectName}` });
    router.setPromptConfig(saClient, projectName);

    // System prompt fetcher for LLM consumers (uses router's cached fetcher)
@@ -91,13 +101,23 @@ export function registerProjectMcpEndpoint(app: FastifyInstance, mcpdClient: Mcp
      complete: async () => '',
      available: () => false,
    };
-    // Build cache namespace: provider--model--proxymodel
+    // Build cache namespace: provider--model--proxymodel.
+    // Resolution order:
+    //   1. local ~/.mcpctl override
+    //   2. mcpdConfig.llmProvider (Phase 4: name of a centralized Llm)
+    //   3. local registry default (fast tier → active provider)
+    //   4. literal 'none'
+    // If (2) names a centralized Llm, the HTTP-mode proxy-model pipeline is
+    // meant to route through mcpd's /api/v1/llms/:name/infer, but the client
+    // does not integrate that path yet. Until then the value still works as a
+    // cache key, and the describe-project warning flags stale configs.
    const llmProvider = localOverride?.provider ?? mcpdConfig.llmProvider
      ?? effectiveRegistry?.getTierProviders('fast')[0]
      ?? effectiveRegistry?.getActiveName()
      ?? 'none';
    const llmModel = resolvedModel ?? 'default';
-    const cache = new FileCache(`${llmProvider}--${llmModel}--${proxyModelName}`);
+    const cacheConfig = process.env.MCPLOCAL_CACHE_DIR ? { dir: process.env.MCPLOCAL_CACHE_DIR } : undefined;
+    const cache = new FileCache(`${llmProvider}--${llmModel}--${proxyModelName}`, cacheConfig);
    router.setProxyModel(proxyModelName, llmAdapter, cache);

    // Per-server proxymodel overrides (if mcpd provides them)
@@ -200,6 +220,17 @@ export function registerProjectMcpEndpoint(app: FastifyInstance, mcpdClient: Mcp
          void ensureUserName().then((name) => {
            if (name) collector.setSessionUserName(id, name);
          });
+
+          // HTTP-mode mcplocal: if the token-auth preHandler attached an McpToken
+          // principal to the request, tag the session so audit events carry the
+          // tokenName/tokenSha alongside (or instead of) userName.
+          const principal = request.mcpToken;
+          if (principal) {
+            collector.setSessionMcpToken(id, {
+              tokenName: principal.tokenName,
+              tokenSha: principal.tokenSha,
+            });
+          }
        }

        // Audit: session_bind
@@ -388,7 +419,7 @@ export function registerProjectMcpEndpoint(app: FastifyInstance, mcpdClient: Mcp
    const llmAdapter = providerRegistry
      ? new LLMProviderAdapter(providerRegistry)
      : { complete: async () => '', available: () => false };
-    const cache = new FileCache('dynamic');
+    const cache = new FileCache('dynamic', process.env.MCPLOCAL_CACHE_DIR ? { dir: process.env.MCPLOCAL_CACHE_DIR } : undefined);

    if (serverName && serverProxyModel) {
      entry.router.setServerProxyModel(serverName, serverProxyModel, llmAdapter, cache);