Fixes the LiteLLM begin_session loop. LiteLLM's /mcp/ proxy doesn't
propagate the mcp-session-id header, so every tool call from qwen3
landed on a fresh upstream session. Fresh sessions always start gated,
so the only visible tool was begin_session, forever.
The session-id gate works fine for Claude Code (stdio, long-lived), but
breaks through session-stripping proxies. Identity that DOES survive:
the McpToken (always in the Authorization header). So now the gate
keys its ungate state on both:
- sessionId → per-session (unchanged; Claude Code path)
- tokenSha → per-token (NEW; service-token path)
Flow for an McpToken caller:
1. first begin_session succeeds → session ungated + tokenSha cached
2. next request lands on a new mcp-session-id (proxy stripped it)
3. SessionGate.createSession sees tokenSha, finds active token entry,
starts the new session ungated with the prior tags + retrievedPrompts
4. tools/list on the fresh session returns the full upstream set — no
more begin_session loop
Plumbing:
- AuditCollector.getSessionMcpTokenSha(sessionId) exposes the already-
tracked principal.
- PluginSessionContext gets getMcpTokenSha() so plugins can read the
token identity without knowing about the collector.
- SessionGate gains (tokenSha?: string) on createSession/ungate, plus
isTokenUngated and revokeToken. TTL defaults to 1hr; tunable via
MCPLOCAL_TOKEN_UNGATE_TTL_MS env var.
- Gate plugin passes ctx.getMcpTokenSha() at every ungate call site
(begin_session, gated-intercept, intercept-fallback).
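For illustration, the dual keying described above could look roughly like the sketch below. This is not the actual SessionGate: the class name, the isGated helper, the injectable clock, and the shape of the stored state are assumptions. Only createSession/ungate taking an optional tokenSha, isTokenUngated, revokeToken, and the 1hr default TTL come from this change.

```typescript
// Hypothetical sketch of a gate keyed on both sessionId and tokenSha.
interface GateState {
  tags: string[];
  retrievedPrompts: string[];
}

interface TokenEntry extends GateState {
  expiresAt: number; // epoch milliseconds
}

class SessionGateSketch {
  private readonly sessions = new Map<string, GateState | null>(); // null = gated
  private readonly tokens = new Map<string, TokenEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = 60 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // A new session starts ungated only when its token has a live entry.
  // An empty-string tokenSha is treated as "no token" (assumed behavior).
  createSession(sessionId: string, tokenSha?: string): void {
    const entry = tokenSha ? this.activeEntry(tokenSha) : undefined;
    this.sessions.set(
      sessionId,
      entry ? { tags: [...entry.tags], retrievedPrompts: [...entry.retrievedPrompts] } : null,
    );
  }

  // Ungates the session and, when a token is known, caches the state per token.
  ungate(sessionId: string, state: GateState, tokenSha?: string): void {
    this.sessions.set(sessionId, state);
    if (tokenSha) {
      this.tokens.set(tokenSha, { ...state, expiresAt: this.now() + this.ttlMs });
    }
  }

  isGated(sessionId: string): boolean {
    return !this.sessions.get(sessionId);
  }

  isTokenUngated(tokenSha: string): boolean {
    return this.activeEntry(tokenSha) !== undefined;
  }

  revokeToken(tokenSha: string): void {
    this.tokens.delete(tokenSha);
  }

  // Returns the token entry if present and not expired; drops expired entries.
  private activeEntry(tokenSha: string): TokenEntry | undefined {
    const entry = this.tokens.get(tokenSha);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.tokens.delete(tokenSha);
      return undefined;
    }
    return entry;
  }
}
```

In this sketch a session that arrives without a tokenSha behaves exactly as before, which is the property the STDIO-path test is meant to protect.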
Tests: 7 new cases in session-gate.test.ts covering cross-session
persistence, token isolation, STDIO-path unchanged, TTL expiry,
revokeToken, and the empty-string edge case. 21/21 pass; 690/690 in
mcplocal overall.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The proxy-path fix (5d10728) covered upstream tools/call routing via
McpdUpstream, but getOrCreateRouter in project-mcp-endpoint.ts had TWO
more mcpd-bound call sites that silently fell back to the pod's empty
default token:
1. fetchProjectLlmConfig(mcpdClient, projectName)
2. router.setPromptConfig(mcpdClient.withHeaders({...}))
   This is the client gate.ts begin_session uses via
   ctx.fetchPromptIndex() to hit /api/v1/projects/:name/prompts/visible
Symptom: in the k8s mcplocal pod, LiteLLM would initialize + tools/list
fine (showing begin_session), but tools/call begin_session returned
`{isError: true, content: "McpError: Authentication failed: invalid or
expired token"}`. Reproduced against the live cluster by driving
LiteLLM's /mcp/ endpoint with qwen3-thinking's exact payload.
Fix: build `requestClient = mcpdClient.withToken(authToken)` once at the
top of getOrCreateRouter and thread it through fetchProjectLlmConfig
and setPromptConfig. withHeaders still adds X-Service-Account for
mcpd-side audit tagging, but the bearer now carries the caller's
McpToken identity (resolves as McpToken:<sha> on mcpd).
Verified: unit tests pass (mock needed withToken/withTimeout stubs).
Next step: rebuild image + roll pod + retest LiteLLM→mcp flow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symptom: HTTP-mode mcplocal accepted the incoming mcpctl_pat_ bearer,
but every /api/v1/mcp/proxy call to mcpd for upstream discovery came
back with "Authentication failed: invalid or expired token" — because
those proxy calls were using the pod's DEFAULT McpdClient token,
which in a container with no ~/.mcpctl/credentials is the empty
string. The discovery GET was correct (explicit authOverride in
forward()), but syncUpstreams() then created McpdUpstream instances
bound to the original mcpdClient — so every tools/list to each
upstream went out with `Authorization: Bearer ` (empty) and mcpd's
auth hook rejected it.
Fix: add McpdClient.withToken(token) and have refreshProjectUpstreams
swap to `mcpdClient.withToken(authToken)` before handing the client to
syncUpstreams. This keeps the "pod has no identity" design: the token
used for downstream /api/v1/mcp/proxy calls is the caller's McpToken,
same as the one used for the initial discovery GET and for introspect.
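A minimal sketch of the withToken idea, assuming the client is treated as immutable and each with* call returns a copy. The class below is hypothetical (the real McpdClient is not reproduced here), and the header-building method and the X-Service-Account value are illustrative only.

```typescript
// Hypothetical stand-in for McpdClient: with* methods return a new instance
// so the pod-level client is never mutated by a single request.
class McpdClientSketch {
  readonly baseUrl: string;
  readonly token: string;
  readonly extraHeaders: Readonly<Record<string, string>>;

  constructor(baseUrl: string, token: string = "", extraHeaders: Record<string, string> = {}) {
    this.baseUrl = baseUrl;
    this.token = token;
    this.extraHeaders = { ...extraHeaders };
  }

  // Same base URL and headers, different bearer.
  withToken(token: string): McpdClientSketch {
    return new McpdClientSketch(this.baseUrl, token, this.extraHeaders);
  }

  // Same bearer, extra headers merged on top of the existing ones.
  withHeaders(headers: Record<string, string>): McpdClientSketch {
    return new McpdClientSketch(this.baseUrl, this.token, { ...this.extraHeaders, ...headers });
  }

  // Headers that would be sent on an outgoing request.
  headers(): Record<string, string> {
    return { ...this.extraHeaders, Authorization: `Bearer ${this.token}` };
  }
}

// The pod default has no credentials, so its bearer is empty.
const podClient = new McpdClientSketch("http://mcpd.example");
// Per-request client carrying the caller's McpToken.
const requestClient = podClient.withToken("mcpctl_pat_example");
```

Because withToken returns a copy, anything bound to requestClient sends the caller's bearer while podClient keeps its empty default.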
Tested: project-discovery.test.ts + mcpd-upstream.test.ts pass. Next:
rebuild + roll the mcplocal image and retry LiteLLM probe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symptom: LiteLLM → mcplocal → mcpd proxy calls for project-scoped MCP
tool discovery all 401'd with "Authentication failed: invalid or
expired token", even though the same mcpctl_pat_ bearer works against
/api/v1/mcptokens/introspect and /api/v1/projects/:name/servers. Result:
the new k8s mcplocal pod could accept the bearer and respond to
/projects/:name/mcp (initialize was 200), but every downstream upstream
discovery call through /api/v1/mcp/proxy failed.
Root cause: registerMcpProxyRoutes installs its own route-scoped
createAuthMiddleware with the `authDeps` parameter it receives. In
main.ts that was being constructed with only `findSession` — missing
the `findMcpToken` that the GLOBAL auth hook already had. So an
mcpctl_pat_ bearer got all the way to the proxy route and then was
handed to an old-shape middleware that knew nothing about the prefix.
Fix: extract authDeps (findSession + findMcpToken) to a named const
and reuse it for both the global hook and the proxy route. Comment at
the declaration site warns future additions to keep the two paths in
sync — they have to agree or McpToken bearers silently break on
whichever one drifts.
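The drift can be shown in a simplified sketch. The types and stub lookups below are assumptions, and the lookups take the raw bearer for brevity. Only the names createAuthMiddleware, authDeps, findSession, findMcpToken, and the mcpctl_pat_ prefix come from the codebase.

```typescript
// Simplified, hypothetical model of the two auth paths sharing one deps object.
type Principal = { kind: "session" | "mcpToken"; id: string };

interface AuthDeps {
  findSession: (bearer: string) => Principal | undefined;
  findMcpToken?: (bearer: string) => Principal | undefined;
}

const TOKEN_PREFIX = "mcpctl_pat_";

// Returns a resolver that dispatches on the bearer prefix.
function createAuthMiddleware(deps: AuthDeps): (bearer: string) => Principal | undefined {
  return (bearer) => {
    if (bearer.startsWith(TOKEN_PREFIX)) {
      // Without findMcpToken this path cannot resolve the bearer at all.
      return deps.findMcpToken?.(bearer);
    }
    return deps.findSession(bearer);
  };
}

// One named const, reused by both the global hook and the proxy route.
const authDeps: AuthDeps = {
  findSession: (b) => (b === "sess-1" ? { kind: "session", id: "user-1" } : undefined),
  findMcpToken: (b) => (b === "mcpctl_pat_x" ? { kind: "mcpToken", id: "tok-1" } : undefined),
};

const globalHook = createAuthMiddleware(authDeps);
const proxyRouteAuth = createAuthMiddleware(authDeps);

// The pre-fix shape: the proxy route only knew about sessions.
const driftedProxyAuth = createAuthMiddleware({ findSession: authDeps.findSession });
```

With the shared const, both resolvers accept the same bearers. The drifted variant rejects the McpToken that the global hook accepts, which is the failure described above.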
Verified against the live cluster: LiteLLM's discoverTools path no
longer 401s; mcplocal logs now show successful upstream proxy calls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier plan recommended an MCPLOCAL_MCPD_TOKEN env var so the pod
would have a ServiceAccount session into mcpd. It's unnecessary: the
pod forwards every inbound client bearer (mcpctl_pat_...) verbatim to
mcpd for all downstream calls — both introspect and project discovery.
mcpd's auth middleware dispatches on the prefix and resolves the
McpToken principal directly. No pod secret, no rotation story.
Updates:
- serve.ts header: explicit "identity model" section calling this out
so future readers don't restore the env var thinking it's missing.
- docs/mcptoken-implementation.md: drop the "mount MCPLOCAL_MCPD_TOKEN"
Pulumi guidance and the "dedicated ServiceAccount" follow-up item;
state the correct image URL (internal 10.0.0.194 registry) and the
gated-vs-ungated rule for LLM config mounts.
No runtime code changes — serve.ts never actually required the token;
this just fixes the documentation and the header comment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verifies that the HTTP-mode revocation lag stays ≤ 5s, in two ways:
1. Unit (tests/http/token-auth.test.ts, 8 cases): Fastify preHandler
with injected fetch stub exercises the positive/negative cache
directly — first call returns ok:true, we flip the stub to
revoked:true, wait past the short positive TTL, next call gets 401
with "revoked". Plus: non-Bearer 401, non-mcpctl_pat_ 401, wrong-
project 403, mcpd-unreachable 401, happy-path caching (1 fetch for N
requests within TTL), ok:false from mcpd 401.
2. End-to-end (smoke, run manually): added MCPLOCAL_TOKEN_POSITIVE_TTL_MS
and MCPLOCAL_TOKEN_NEGATIVE_TTL_MS env vars to serve.ts so the smoke
can shrink the 30s positive default for testing. Confirmed: with
positive TTL = 2s, the mcptoken.smoke.test.ts revocation case passes
against a local serve.js pointed at prod mcpd.
Operators get the same knobs in production — default behavior unchanged
(30s positive, 5s negative).
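A rough model of the positive/negative cache, assuming results are keyed by a token identifier and the clock is passed in explicitly. The class and method names are hypothetical and the real preHandler also performs the introspection fetch, which is left out here. Only the 30s/5s defaults come from this change.

```typescript
// Hypothetical introspection cache with separate TTLs for ok and not-ok results.
interface Introspection {
  ok: boolean;
  reason?: string;
}

class IntrospectionCacheSketch {
  private readonly entries = new Map<string, { result: Introspection; expiresAt: number }>();
  private readonly positiveTtlMs: number;
  private readonly negativeTtlMs: number;

  constructor(positiveTtlMs: number = 30000, negativeTtlMs: number = 5000) {
    this.positiveTtlMs = positiveTtlMs;
    this.negativeTtlMs = negativeTtlMs;
  }

  // Returns a cached result, or undefined when the caller must re-introspect.
  get(tokenKey: string, now: number): Introspection | undefined {
    const hit = this.entries.get(tokenKey);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(tokenKey);
      return undefined;
    }
    return hit.result;
  }

  // Stores a result; ok:true lives for the positive TTL, everything else for the negative one.
  put(tokenKey: string, result: Introspection, now: number): void {
    const ttl = result.ok ? this.positiveTtlMs : this.negativeTtlMs;
    this.entries.set(tokenKey, { result, expiresAt: now + ttl });
  }
}
```

In this model a revocation becomes visible once the positive entry expires, which is why the smoke test shrinks the positive TTL.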
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs found while trying to point MCPGW_URL=http://localhost:3200
(the systemd mcplocal) so we could get real smoke coverage before the
Pulumi stack for mcp.ad.itaz.eu lands:
1. describe.skipIf(!gatewayUp) was evaluated at parse time, before
beforeAll ran, so gatewayUp was always false and the whole suite
skipped. Switched to the vllm-managed.test.ts pattern: runtime
`if (!gatewayUp) return` at the start of each it().
2. The revocation 401 assertion only makes sense against the
containerized serve.ts entry, which has a 5s negative introspection
cache. Against systemd mcplocal the whole project router is cached
for minutes, so a deleted token with a warm session still succeeds.
Added IS_HTTP_MODE detection (hostname not localhost/127/0.0.0.0,
or MCPGW_IS_HTTP_MODE=true) and skip the assertion otherwise — still
revoking the token so cleanup runs identically.
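The IS_HTTP_MODE detection could be approximated as below. The function name is hypothetical, and reading "127" as "any hostname starting with 127." is an assumption, since the exact match rule is not spelled out above.

```typescript
// Hypothetical helper: decides whether the smoke target is the containerized
// HTTP-mode entry (remote host) or a local systemd mcplocal.
function isHttpMode(gatewayUrl: string, envFlag?: string): boolean {
  // Explicit override, mirroring MCPGW_IS_HTTP_MODE=true.
  if (envFlag === "true") return true;

  const hostname = new URL(gatewayUrl).hostname;
  const isLocal =
    hostname === "localhost" ||
    hostname === "0.0.0.0" ||
    hostname.startsWith("127."); // assumption: any 127.x loopback address counts as local
  return !isLocal;
}
```

A test can then skip only the 401 assertion when this returns false, while still revoking the token so cleanup is identical in both modes.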
Run against systemd mcplocal locally:
    MCPGW_URL=http://localhost:3200 pnpm --filter @mcpctl/mcplocal \
      exec vitest run --config vitest.smoke.config.ts mcptoken
→ 6/6 pass (revocation case explicitly deferred).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- status.ts + api-client.ts now dispatch on URL scheme so an https
mcpd URL no longer crashes with "Protocol https: not supported".
Caught by fulldeploy smoke runs — status.ts had `import http` only
and was synchronously throwing against https://mcpctl.ad.itaz.eu.
Each http.get call is wrapped so future scheme-mismatch errors also
degrade to "unreachable" instead of a stack trace.
- .dockerignore no longer excludes src/mcplocal/ (the new
Dockerfile.mcplocal needs those files).
- scripts/demo-mcp-call.py: standalone, stdlib-only Python demo that
makes an MCP request (initialize + tools/list, optional tools/call)
using an mcpctl_pat_ bearer. Counterpart to `mcpctl test mcp` for
showing external (e.g. vLLM) clients how the bearer flow works.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Delivers the final piece of the mcptoken stack: a containerized,
network-accessible mcplocal that serves Streamable-HTTP MCP to off-host
clients (the vLLM use case), authenticated by project-scoped McpTokens.
New binary (same package, new entry):
- src/mcplocal/src/serve.ts — HTTP-only entry. Reads MCPLOCAL_MCPD_URL,
MCPLOCAL_MCPD_TOKEN, MCPLOCAL_HTTP_HOST/PORT, MCPLOCAL_CACHE_DIR from
env. No StdioProxyServer, no --upstream.
- src/mcplocal/src/http/token-auth.ts — Fastify preHandler that
validates mcpctl_pat_ bearers via mcpd's /api/v1/mcptokens/introspect.
30s positive / 5s negative TTL. Rejects wrong-project with 403.
Shared HTTP MCP client:
- src/shared/src/mcp-http/ — reusable McpHttpSession with initialize,
listTools, callTool, close. Handles http+https, SSE, id correlation,
distinct McpProtocolError / McpTransportError. Plus mcpHealthCheck
and deriveBaseUrl helpers.
New CLI verb `mcpctl test mcp <url>`:
- Flags: --token (also $MCPCTL_TOKEN), --tool, --args (JSON),
--expect-tools, --timeout, -o text|json, --no-health.
- Exit codes: 0 PASS, 1 TRANSPORT/AUTH FAIL, 2 CONTRACT FAIL.
Container + deploy:
- deploy/Dockerfile.mcplocal (Node 20 alpine, multi-stage, pnpm
workspace, CMD node src/mcplocal/dist/serve.js, VOLUME
/var/lib/mcplocal/cache, HEALTHCHECK on :3200/healthz).
- scripts/build-mcplocal.sh mirrors build-mcpd.sh.
- fulldeploy.sh is now a 4-step pipeline that also builds + rolls out
mcplocal (gated on `kubectl get deployment/mcplocal` so the script
stays green before the Pulumi stack lands).
Audit + cache:
- project-mcp-endpoint.ts passes MCPLOCAL_CACHE_DIR into FileCache at
both construction sites and, when request.mcpToken is present, calls
collector.setSessionMcpToken(id, ...) so audit events carry the
tokenName/tokenSha.
Tests:
- 9 unit cases on `mcpctl test mcp` (happy path, health miss,
expect-tools hit/miss, transport throw, tool isError, json report,
$MCPCTL_TOKEN env fallback, invalid --args).
- Smoke test src/mcplocal/tests/smoke/mcptoken.smoke.test.ts —
gated on healthz($MCPGW_URL), skipped cleanly when unreachable.
Covers happy path, wrong-project 403, --expect-tools contract
failure, and revocation 401 within the negative-cache window.
1773/1773 workspace tests pass. Pulumi resources (Deployment, Service,
Ingress, PVC, Secret, NetworkPolicy) still need to land in
../kubernetes-deployment before the smoke gate flips on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the end-to-end CLI surface for McpTokens and the mcpd auth dispatch
that recognizes them.
mcpd auth middleware:
- Dispatch on the `mcpctl_pat_` bearer prefix. McpToken bearers resolve
through a new `findMcpToken(hash)` dep, populating `request.mcpToken`
and `request.userId = ownerId`. Everything else follows the existing
session path.
- Returns 401 for revoked / expired / unknown tokens.
- Global RBAC hook now threads `mcpTokenSha` into `canAccess` /
`canRunOperation` / `getAllowedScope`, and enforces a hard
project-scope check: an McpToken principal can only hit
`/api/v1/projects/<its-project>/...`.
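The project-scope rule can be pictured as a path-prefix check. This is a sketch only: the function name is hypothetical, and it ignores any exemptions the real hook may apply to other routes.

```typescript
// Hypothetical check: is the request path inside the token's own project?
function isWithinProjectScope(path: string, tokenProject: string): boolean {
  const prefix = `/api/v1/projects/${encodeURIComponent(tokenProject)}`;

  // Drop the query string before comparing.
  const queryStart = path.indexOf("?");
  const pathname = queryStart === -1 ? path : path.slice(0, queryStart);

  // Require an exact match or a "/" boundary so "demo" does not match "demo-2".
  return pathname === prefix || pathname.startsWith(prefix + "/");
}
```

The "/" boundary matters: a plain startsWith on the prefix alone would let a token for one project reach another project whose name merely begins with the same characters.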
CLI verbs:
- `mcpctl create mcptoken <name> -p <proj> [--rbac empty|clone]
[--bind role:view,resource:servers] [--ttl 30d|never|ISO]
[--description ...] [--force]` — returns the raw token once.
- `mcpctl get mcptokens [-p <proj>]` — table with
NAME/PROJECT/PREFIX/CREATED/LAST USED/EXPIRES/STATUS.
- `mcpctl get mcptoken <name> -p <proj>` and
`mcpctl describe mcptoken <name> -p <proj>` — describe surfaces the
auto-created RBAC bindings.
- `mcpctl delete mcptoken <name> -p <proj>`.
- `apply -f` support with `kind: mcptoken`. Tokens are immutable, so
apply creates if missing and skips if the name is already active.
Audit plumbing:
- `AuditEvent` / collector now carry optional `tokenName` / `tokenSha`.
`setSessionMcpToken` sits alongside `setSessionUserName`; both feed a
per-session principal map used at emit time.
- `AuditEventService` query accepts `tokenName` / `tokenSha` filters.
- Console `AuditEvent` type carries the new fields so a follow-up can
add a TOKEN column.
Completions regenerated. 1764/1764 tests pass workspace-wide.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BREAKING: `mcpctl create rbac` no longer accepts `--binding` or
`--operation`. Use `--roleBindings` instead with key:value pairs:
    # resource binding
    --roleBindings role:view,resource:servers
    --roleBindings role:view,resource:servers,name:my-ha
    # operation binding (role:run is implied by action:)
    --roleBindings action:logs
The on-disk YAML shape (`roleBindings: [{role, resource, name?}]` or
`{role:'run', action}`) is unchanged, so Git backups and existing
`apply -f` files continue to work. Only the command-line input format
changes.
The parser is extracted to src/cli/src/commands/rbac-bindings.ts so the
upcoming `mcpctl create mcptoken --bind <kv>` verb can reuse it.
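As a sketch of the key:value format (not the contents of rbac-bindings.ts), a parser along these lines would map the flag values onto the unchanged YAML shape. The function name, the type, and the error messages are assumptions.

```typescript
// Hypothetical parser for one --roleBindings value such as
// "role:view,resource:servers,name:my-ha" or "action:logs".
type RoleBinding =
  | { role: string; resource: string; name?: string }
  | { role: "run"; action: string };

function parseRoleBinding(input: string): RoleBinding {
  const fields = new Map<string, string>();
  for (const pair of input.split(",")) {
    const idx = pair.indexOf(":");
    if (idx <= 0 || idx === pair.length - 1) {
      throw new Error(`Invalid key:value pair: "${pair}"`);
    }
    fields.set(pair.slice(0, idx).trim(), pair.slice(idx + 1).trim());
  }

  // Operation binding: action implies role:run.
  const action = fields.get("action");
  if (action !== undefined) {
    const role = fields.get("role");
    if (role !== undefined && role !== "run") {
      throw new Error(`action bindings require role:run, got role:${role}`);
    }
    return { role: "run", action };
  }

  // Resource binding: role and resource are required, name is optional.
  const role = fields.get("role");
  const resource = fields.get("resource");
  if (!role || !resource) {
    throw new Error(`Binding needs role and resource, or action: "${input}"`);
  }
  const name = fields.get("name");
  return name !== undefined ? { role, resource, name } : { role, resource };
}
```

Each parsed value has the same shape as a roleBindings entry in the YAML, which is why existing `apply -f` files are unaffected.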
Completions, tests, and the new parser unit test all pass (406/406).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new McpToken Prisma model (project-scoped, SHA-256 hashed at rest,
optional expiry, revocable) plus backing repository, service, and REST
routes. Tokens are a first-class RBAC subject: new 'McpToken' kind is
added to the subject enum and the service auto-creates an RbacDefinition
with subject McpToken:<sha> when bindings are provided.
Creator-permission ceiling: the service rejects any requested binding
the creator cannot already satisfy themselves (re-uses
rbacService.canAccess / canRunOperation). rbacMode=clone snapshots the
creator's full permissions into the token.
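The ceiling check amounts to "every requested binding must already be allowed for the creator". A minimal sketch, assuming a simplified canAccess callback; the real rbacService.canAccess / canRunOperation signatures are not reproduced here, and the function name is hypothetical.

```typescript
// Hypothetical model of the creator-permission ceiling.
interface Binding {
  role: string;
  resource: string;
  name?: string;
}

// Stand-in for the RBAC lookup: can this subject already do what the binding grants?
type CanAccess = (subject: string, binding: Binding) => boolean;

// Throws if any requested binding exceeds what the creator can do.
function assertWithinCreatorCeiling(
  creator: string,
  requested: Binding[],
  canAccess: CanAccess,
): void {
  const denied = requested.filter((binding) => !canAccess(creator, binding));
  if (denied.length > 0) {
    const summary = denied.map((b) => `${b.role}:${b.resource}`).join(", ");
    throw new Error(`Creator ${creator} cannot grant: ${summary}`);
  }
}
```

Checking every binding before creating anything keeps a partially authorized request from producing a token with a subset of the requested permissions.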
Routes:
  POST   /api/v1/mcptokens             create (returns raw token once)
  GET    /api/v1/mcptokens             list (filter by project)
  GET    /api/v1/mcptokens/:id         describe (no secret in response)
  POST   /api/v1/mcptokens/:id/revoke  soft-delete + remove RbacDef
  DELETE /api/v1/mcptokens/:id         hard-delete
  GET    /api/v1/mcptokens/introspect  validate raw bearer (used by mcplocal)
Extends AuditEvent with optional tokenName/tokenSha fields (indexed) so
token-driven activity can be filtered later. Adds token helpers in
@mcpctl/shared: TOKEN_PREFIX='mcpctl_pat_', generateToken, hashToken,
isMcpToken, timingSafeEqualHex.
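One plausible shape for these helpers, built on Node's crypto module. The bodies are assumptions: the random length, the hex encoding of the secret part, and hashing the full raw token (prefix included) are illustrative choices, not confirmed details of @mcpctl/shared.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

const TOKEN_PREFIX = "mcpctl_pat_";

// Raw token: prefix plus random hex (length is an assumption).
function generateToken(): string {
  return TOKEN_PREFIX + randomBytes(32).toString("hex");
}

// SHA-256 hex digest of the raw token; this is what would be stored at rest.
function hashToken(raw: string): string {
  return createHash("sha256").update(raw).digest("hex");
}

// Cheap prefix test used to pick the auth path.
function isMcpToken(bearer: string): boolean {
  return bearer.startsWith(TOKEN_PREFIX);
}

// Constant-time comparison of two hex digests.
// timingSafeEqual throws on length mismatch, so lengths are compared first.
function timingSafeEqualHex(a: string, b: string): boolean {
  const bufA = Buffer.from(a, "hex");
  const bufB = Buffer.from(b, "hex");
  return bufA.length === bufB.length && timingSafeEqual(bufA, bufB);
}
```

Storing only the digest means a database leak does not expose usable bearers, and comparing digests in constant time avoids leaking how many leading characters matched.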
Follow-up PRs add the auth-hook dispatch on the prefix, the CLI verbs,
and the HTTP-mode mcplocal that calls /introspect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>