michal/mcpctl

feat: McpToken — HTTP-mode mcplocal, CLI verbs, audit plumbing #50

Merged

michal merged 12 commits from feat/mcptoken into main

2026-04-18 16:37:53 +00:00

Author	SHA1	Message	Date
Michal	39df459bb1	feat(mcplocal): per-McpToken gate-ungate cache so service tokens survive proxies All checks were successful CI/CD / lint (pull_request) Successful in 1m0s Details CI/CD / typecheck (pull_request) Successful in 1m51s Details CI/CD / test (pull_request) Successful in 1m3s Details CI/CD / build (pull_request) Successful in 2m13s Details CI/CD / smoke (pull_request) Successful in 4m49s Details CI/CD / publish (pull_request) Has been skipped Details Fixes the LiteLLM loop: LiteLLM's /mcp/ proxy doesn't propagate the mcp-session-id header, so every tool call from qwen3 landed on a fresh upstream session, which always started gated, so the only visible tool was begin_session — forever. The session-id gate works fine for Claude Code (stdio, long-lived), but breaks through session-stripping proxies. Identity that DOES survive: the McpToken (always in the Authorization header). So now the gate keys its ungate state on both: - sessionId → per-session (unchanged; Claude Code path) - tokenSha → per-token (NEW; service-token path) Flow for an McpToken caller: 1. first begin_session succeeds → session ungated + tokenSha cached 2. next request lands on a new mcp-session-id (proxy stripped it) 3. SessionGate.createSession sees tokenSha, finds active token entry, starts the new session ungated with the prior tags + retrievedPrompts 4. tools/list on the fresh session returns the full upstream set — no more begin_session loop Plumbing: - AuditCollector.getSessionMcpTokenSha(sessionId) exposes the already- tracked principal. - PluginSessionContext gets getMcpTokenSha() so plugins can read the token identity without knowing about the collector. - SessionGate gains (tokenSha?: string) on createSession/ungate, plus isTokenUngated and revokeToken. TTL defaults to 1hr; tunable via MCPLOCAL_TOKEN_UNGATE_TTL_MS env var. - Gate plugin passes ctx.getMcpTokenSha() at every ungate call site (begin_session, gated-intercept, intercept-fallback). Tests: 7 new cases in session-gate.test.ts covering cross-session persistence, token isolation, STDIO-path unchanged, TTL expiry, revokeToken, and the empty-string edge case. 21/21 pass; 690/690 in mcplocal overall. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 17:34:28 +01:00
Michal	75fe0533c1	fix(mcplocal): propagate caller's bearer to prompt-index and LLM-config calls All checks were successful CI/CD / typecheck (pull_request) Successful in 51s Details CI/CD / test (pull_request) Successful in 1m3s Details CI/CD / lint (pull_request) Successful in 2m27s Details CI/CD / build (pull_request) Successful in 2m11s Details CI/CD / smoke (pull_request) Successful in 4m56s Details CI/CD / publish (pull_request) Has been skipped Details The proxy-path fix (`5d10728`) covered upstream tools/call routing via McpdUpstream, but getOrCreateRouter in project-mcp-endpoint.ts had TWO more mcpd-bound call sites that silently fell back to the pod's empty default token: 1. fetchProjectLlmConfig(mcpdClient, projectName) 2. router.setPromptConfig(mcpdClient.withHeaders({...})) → which is what gate.ts begin_session uses via ctx.fetchPromptIndex() to hit /api/v1/projects/:name/prompts/visible Symptom: in the k8s mcplocal pod, LiteLLM would initialize + tools/list fine (showing begin_session), but tools/call begin_session returned `{isError: true, content: "McpError: Authentication failed: invalid or expired token"}`. Reproduced against the live cluster by driving LiteLLM's /mcp/ endpoint with qwen3-thinking's exact payload. Fix: build `requestClient = mcpdClient.withToken(authToken)` once at the top of getOrCreateRouter and thread it through fetchProjectLlmConfig and setPromptConfig. withHeaders still adds X-Service-Account for mcpd-side audit tagging, but the bearer now carries the caller's McpToken identity (resolves as McpToken:<sha> on mcpd). Verified: unit tests pass (mock needed withToken/withTimeout stubs). Next step: rebuild image + roll pod + retest LiteLLM→mcp flow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 04:44:27 +01:00
Michal	5d1072889f	fix(mcplocal): thread client bearer into per-upstream McpdClient Symptom: HTTP-mode mcplocal accepted the incoming mcpctl_pat_ bearer, but every /api/v1/mcp/proxy call to mcpd for upstream discovery came back with "Authentication failed: invalid or expired token" — because those proxy calls were using the pod's DEFAULT McpdClient token, which in a container with no ~/.mcpctl/credentials is the empty string. The discovery GET was correct (explicit authOverride in forward()), but syncUpstreams() then created McpdUpstream instances bound to the original mcpdClient — so every tools/list to each upstream went out with `Authorization: Bearer ` (empty) and mcpd's auth hook rejected it. Fix: add McpdClient.withToken(token) and have refreshProjectUpstreams swap to `mcpdClient.withToken(authToken)` before handing the client to syncUpstreams. This keeps the "pod has no identity" design: the token used for downstream /api/v1/mcp/proxy calls is the caller's McpToken, same as the one used for the initial discovery GET and for introspect. Tested: project-discovery.test.ts + mcpd-upstream.test.ts pass. Next: rebuild + roll the mcplocal image and retry LiteLLM probe. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 03:06:55 +01:00
Michal	dfc53cd15e	fix(mcpd): per-route /api/v1/mcp/proxy auth missed McpToken dispatch Symptom: LiteLLM → mcplocal → mcpd proxy calls for project-scoped MCP tool discovery all 401'd with "Authentication failed: invalid or expired token", even though the same mcpctl_pat_ bearer works against /api/v1/mcptokens/introspect and /api/v1/projects/:name/servers. Result: the new k8s mcplocal pod could accept the bearer and respond to /projects/:name/mcp (initialize was 200), but every downstream upstream discovery call through /api/v1/mcp/proxy failed. Root cause: registerMcpProxyRoutes installs its own route-scoped createAuthMiddleware with the `authDeps` parameter it receives. In main.ts that was being constructed with only `findSession` — missing the `findMcpToken` that the GLOBAL auth hook already had. So a mcpctl_pat_ bearer got all the way to the proxy route and then was handed to an old-shape middleware that knew nothing about the prefix. Fix: extract authDeps (findSession + findMcpToken) to a named const and reuse it for both the global hook and the proxy route. Comment at the declaration site warns future additions to keep the two paths in sync — they have to agree or McpToken bearers silently break on whichever one drifts. Verified against the live cluster: LiteLLM's discoverTools path no longer 401s; mcplocal logs now show successful upstream proxy calls. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 00:23:44 +01:00
Michal	1887d90821	docs: scrub MCPLOCAL_MCPD_TOKEN — pod has no persistent mcpd identity Some checks failed CI/CD / lint (pull_request) Successful in 50s Details CI/CD / test (pull_request) Successful in 1m4s Details CI/CD / typecheck (pull_request) Failing after 7m3s Details CI/CD / smoke (pull_request) Has been skipped Details CI/CD / build (pull_request) Has been skipped Details CI/CD / publish (pull_request) Has been skipped Details The earlier plan recommended an MCPLOCAL_MCPD_TOKEN env var so the pod would have a ServiceAccount session into mcpd. It's unnecessary: the pod forwards every inbound client bearer (mcpctl_pat_...) verbatim to mcpd for all downstream calls — both introspect and project discovery. mcpd's auth middleware dispatches on the prefix and resolves the McpToken principal directly. No pod secret, no rotation story. Updates: - serve.ts header: explicit "identity model" section calling this out so future readers don't restore the env var thinking it's missing. - docs/mcptoken-implementation.md: drop the "mount MCPLOCAL_MCPD_TOKEN" Pulumi guidance and the "dedicated ServiceAccount" follow-up item; state the correct image URL (internal 10.0.0.194 registry) and the gated-vs-ungated rule for LLM config mounts. No runtime code changes — serve.ts never actually required the token; this just fixes the documentation and the header comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 23:54:46 +01:00
Michal	3061a5f6ae	test+feat: token-auth unit coverage + env-tunable introspection TTLs Some checks failed CI/CD / lint (pull_request) Successful in 51s Details CI/CD / typecheck (pull_request) Successful in 51s Details CI/CD / test (pull_request) Successful in 1m3s Details CI/CD / smoke (pull_request) Failing after 3m24s Details CI/CD / build (pull_request) Successful in 4m45s Details CI/CD / publish (pull_request) Has been skipped Details Verifies the HTTP-mode revocation lag ≤ 5s two ways: 1. Unit (tests/http/token-auth.test.ts, 8 cases): Fastify preHandler with injected fetch stub exercises the positive/negative cache directly — first call returns ok:true, we flip the stub to revoked:true, wait past the short positive TTL, next call gets 401 with "revoked". Plus: non-Bearer 401, non-mcpctl_pat_ 401, wrong- project 403, mcpd-unreachable 401, happy-path caching (1 fetch for N requests within TTL), ok:false from mcpd 401. 2. End-to-end (smoke, run manually): added MCPLOCAL_TOKEN_POSITIVE_TTL_MS and MCPLOCAL_TOKEN_NEGATIVE_TTL_MS env vars to serve.ts so the smoke can shrink the 30s positive default for testing. Confirmed: with positive TTL = 2s, the mcptoken.smoke.test.ts revocation case passes against a local serve.js pointed at prod mcpd. Operators get the same knobs in production — default behavior unchanged (30s positive, 5s negative). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 23:25:06 +01:00
Michal	913678e400	fix(smoke): mcptoken — runtime gatewayUp gate + scope revocation case to HTTP-mode All checks were successful CI/CD / lint (pull_request) Successful in 52s Details CI/CD / test (pull_request) Successful in 1m4s Details CI/CD / typecheck (pull_request) Successful in 2m23s Details CI/CD / build (pull_request) Successful in 2m52s Details CI/CD / smoke (pull_request) Successful in 5m40s Details CI/CD / publish (pull_request) Has been skipped Details Two bugs found while trying to point MCPGW_URL=http://localhost:3200 (the systemd mcplocal) so we could get real smoke coverage before the Pulumi stack for mcp.ad.itaz.eu lands: 1. describe.skipIf(!gatewayUp) was evaluated at parse time, before beforeAll ran, so gatewayUp was always false and the whole suite skipped. Switched to the vllm-managed.test.ts pattern: runtime `if (!gatewayUp) return` at the start of each it(). 2. The revocation 401 assertion only makes sense against the containerized serve.ts entry, which has a 5s negative introspection cache. Against systemd mcplocal the whole project router is cached for minutes, so a deleted token with a warm session still succeeds. Added IS_HTTP_MODE detection (hostname not localhost/127/0.0.0.0, or MCPGW_IS_HTTP_MODE=true) and skip the assertion otherwise — still revoking the token so cleanup runs identically. Run against systemd mcplocal locally: MCPGW_URL=http://localhost:3200 pnpm --filter @mcpctl/mcplocal \\ exec vitest run --config vitest.smoke.config.ts mcptoken → 6/6 pass (revocation case explicitly deferred). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 23:20:36 +01:00
Michal	f68e123821	fix(cli): https support in status + api-client; add demo-mcp-call.py All checks were successful CI/CD / lint (pull_request) Successful in 1m40s Details CI/CD / typecheck (pull_request) Successful in 1m35s Details CI/CD / test (pull_request) Successful in 2m16s Details CI/CD / build (pull_request) Successful in 2m17s Details CI/CD / smoke (pull_request) Successful in 4m37s Details CI/CD / publish (pull_request) Has been skipped Details - status.ts + api-client.ts now dispatch on URL scheme so an https mcpd URL no longer crashes with "Protocol https: not supported". Caught by fulldeploy smoke runs — status.ts had `import http` only and was synchronously throwing against https://mcpctl.ad.itaz.eu. Each http.get call is wrapped so future scheme-mismatch errors also degrade to "unreachable" instead of a stack trace. - .dockerignore no longer excludes src/mcplocal/ (the new Dockerfile.mcplocal needs those files). - scripts/demo-mcp-call.py: standalone, stdlib-only Python demo that makes an MCP request (initialize + tools/list, optional tools/call) using an mcpctl_pat_ bearer. Counterpart to `mcpctl test mcp` for showing external (e.g. vLLM) clients how the bearer flow works. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 22:34:00 +01:00
Michal	2127b41d9f	feat: HTTP-mode mcplocal container + mcpctl test mcp + token-auth preHandler Delivers the final piece of the mcptoken stack: a containerized, network-accessible mcplocal that serves Streamable-HTTP MCP to off-host clients (the vLLM use case), authenticated by project-scoped McpTokens. New binary (same package, new entry): - src/mcplocal/src/serve.ts — HTTP-only entry. Reads MCPLOCAL_MCPD_URL, MCPLOCAL_MCPD_TOKEN, MCPLOCAL_HTTP_HOST/PORT, MCPLOCAL_CACHE_DIR from env. No StdioProxyServer, no --upstream. - src/mcplocal/src/http/token-auth.ts — Fastify preHandler that validates mcpctl_pat_ bearers via mcpd's /api/v1/mcptokens/introspect. 30s positive / 5s negative TTL. Rejects wrong-project with 403. Shared HTTP MCP client: - src/shared/src/mcp-http/ — reusable McpHttpSession with initialize, listTools, callTool, close. Handles http+https, SSE, id correlation, distinct McpProtocolError / McpTransportError. Plus mcpHealthCheck and deriveBaseUrl helpers. New CLI verb `mcpctl test mcp <url>`: - Flags: --token (also $MCPCTL_TOKEN), --tool, --args (JSON), --expect-tools, --timeout, -o text\|json, --no-health. - Exit codes: 0 PASS, 1 TRANSPORT/AUTH FAIL, 2 CONTRACT FAIL. Container + deploy: - deploy/Dockerfile.mcplocal (Node 20 alpine, multi-stage, pnpm workspace, CMD node src/mcplocal/dist/serve.js, VOLUME /var/lib/mcplocal/cache, HEALTHCHECK on :3200/healthz). - scripts/build-mcplocal.sh mirrors build-mcpd.sh. - fulldeploy.sh is now a 4-step pipeline that also builds + rolls out mcplocal (gated on `kubectl get deployment/mcplocal` so the script stays green before the Pulumi stack lands). Audit + cache: - project-mcp-endpoint.ts passes MCPLOCAL_CACHE_DIR into FileCache at both construction sites and, when request.mcpToken is present, calls collector.setSessionMcpToken(id, ...) so audit events carry the tokenName/tokenSha. Tests: - 9 unit cases on `mcpctl test mcp` (happy path, health miss, expect-tools hit/miss, transport throw, tool isError, json report, $MCPCTL_TOKEN env fallback, invalid --args). - Smoke test src/mcplocal/tests/smoke/mcptoken.smoke.test.ts — gated on healthz($MCPGW_URL), skipped cleanly when unreachable. Covers happy path, wrong-project 403, --expect-tools contract failure, and revocation 401 within the negative-cache window. 1773/1773 workspace tests pass. Pulumi resources (Deployment, Service, Ingress, PVC, Secret, NetworkPolicy) still need to land in ../kubernetes-deployment before the smoke gate flips on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 01:21:42 +01:00
Michal	a151b2e756	feat: mcpctl mcptoken verbs + mcpd auth dispatch + audit plumbing Adds the end-to-end CLI surface for McpTokens and the mcpd auth dispatch that recognizes them. mcpd auth middleware: - Dispatch on the `mcpctl_pat_` bearer prefix. McpToken bearers resolve through a new `findMcpToken(hash)` dep, populating `request.mcpToken` and `request.userId = ownerId`. Everything else follows the existing session path. - Returns 401 for revoked / expired / unknown tokens. - Global RBAC hook now threads `mcpTokenSha` into `canAccess` / `canRunOperation` / `getAllowedScope`, and enforces a hard project-scope check: a McpToken principal can only hit `/api/v1/projects/<its-project>/...`. CLI verbs: - `mcpctl create mcptoken <name> -p <proj> [--rbac empty\|clone] [--bind role:view,resource:servers] [--ttl 30d\|never\|ISO] [--description ...] [--force]` — returns the raw token once. - `mcpctl get mcptokens [-p <proj>]` — table with NAME/PROJECT/PREFIX/CREATED/LAST USED/EXPIRES/STATUS. - `mcpctl get mcptoken <name> -p <proj>` and `mcpctl describe mcptoken <name> -p <proj>` — describe surfaces the auto-created RBAC bindings. - `mcpctl delete mcptoken <name> -p <proj>`. - `apply -f` support with `kind: mcptoken`. Tokens are immutable, so apply creates if missing and skips if the name is already active. Audit plumbing: - `AuditEvent` / collector now carry optional `tokenName` / `tokenSha`. `setSessionMcpToken` sits alongside `setSessionUserName`; both feed a per-session principal map used at emit time. - `AuditEventService` query accepts `tokenName` / `tokenSha` filters. - Console `AuditEvent` type carries the new fields so a follow-up can add a TOKEN column. Completions regenerated. 1764/1764 tests pass workspace-wide. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 01:12:43 +01:00
Michal	efcfeeab65	feat(cli)!: migrate `create rbac` bindings to --roleBindings kv syntax BREAKING: `mcpctl create rbac` no longer accepts `--binding` or `--operation`. Use `--roleBindings` instead with key:value pairs: # resource binding --roleBindings role:view,resource:servers --roleBindings role:view,resource:servers,name:my-ha # operation binding (role:run is implied by action:) --roleBindings action:logs The on-disk YAML shape (`roleBindings: [{role, resource, name?}]` or `{role:'run', action}`) is unchanged, so Git backups and existing `apply -f` files continue to work. Only the command-line input format changes. The parser is extracted to src/cli/src/commands/rbac-bindings.ts so the upcoming `mcpctl create mcptoken --bind <kv>` verb can reuse it. Completions, tests, and the new parser unit test all pass (406/406). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 01:03:57 +01:00
Michal	2ddb493bb0	feat(mcpd): McpToken schema + CRUD routes + introspection Adds a new McpToken Prisma model (project-scoped, SHA-256 hashed at rest, optional expiry, revocable) plus backing repository, service, and REST routes. Tokens are a first-class RBAC subject: new 'McpToken' kind is added to the subject enum and the service auto-creates an RbacDefinition with subject McpToken:<sha> when bindings are provided. Creator-permission ceiling: the service rejects any requested binding the creator cannot already satisfy themselves (re-uses rbacService.canAccess / canRunOperation). rbacMode=clone snapshots the creator's full permissions into the token. Routes: POST /api/v1/mcptokens create (returns raw token once) GET /api/v1/mcptokens list (filter by project) GET /api/v1/mcptokens/:id describe (no secret in response) POST /api/v1/mcptokens/:id/revoke soft-delete + remove RbacDef DELETE /api/v1/mcptokens/:id hard-delete GET /api/v1/mcptokens/introspect validate raw bearer (used by mcplocal) Extends AuditEvent with optional tokenName/tokenSha fields (indexed) so token-driven activity can be filtered later. Adds token helpers in @mcpctl/shared: TOKEN_PREFIX='mcpctl_pat_', generateToken, hashToken, isMcpToken, timingSafeEqualHex. Follow-up PRs add the auth-hook dispatch on the prefix, the CLI verbs, and the HTTP-mode mcplocal that calls /introspect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 01:00:04 +01:00