Symptom: HTTP-mode mcplocal accepted the incoming mcpctl_pat_ bearer,
but every /api/v1/mcp/proxy call to mcpd for upstream discovery came
back with "Authentication failed: invalid or expired token" — because
those proxy calls were using the pod's DEFAULT McpdClient token,
which in a container with no ~/.mcpctl/credentials is the empty
string. The discovery GET was correct (explicit authOverride in
forward()), but syncUpstreams() then created McpdUpstream instances
bound to the original mcpdClient — so every tools/list to each
upstream went out with `Authorization: Bearer ` (empty) and mcpd's
auth hook rejected it.
Fix: add McpdClient.withToken(token) and have refreshProjectUpstreams
swap to `mcpdClient.withToken(authToken)` before handing the client to
syncUpstreams. This keeps the "pod has no identity" design: the token
used for downstream /api/v1/mcp/proxy calls is the caller's McpToken,
same as the one used for the initial discovery GET and for introspect.
Tested: project-discovery.test.ts + mcpd-upstream.test.ts pass. Next:
rebuild + roll the mcplocal image and retry LiteLLM probe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symptom: LiteLLM → mcplocal → mcpd proxy calls for project-scoped MCP
tool discovery all 401'd with "Authentication failed: invalid or
expired token", even though the same mcpctl_pat_ bearer works against
/api/v1/mcptokens/introspect and /api/v1/projects/:name/servers. Result:
the new k8s mcplocal pod could accept the bearer and respond to
/projects/:name/mcp (initialize was 200), but every downstream upstream
discovery call through /api/v1/mcp/proxy failed.
Root cause: registerMcpProxyRoutes installs its own route-scoped
createAuthMiddleware with the `authDeps` parameter it receives. In
main.ts that was being constructed with only `findSession` — missing
the `findMcpToken` that the GLOBAL auth hook already had. So a
mcpctl_pat_ bearer got all the way to the proxy route and then was
handed to an old-shape middleware that knew nothing about the prefix.
Fix: extract authDeps (findSession + findMcpToken) to a named const
and reuse it for both the global hook and the proxy route. Comment at
the declaration site warns future additions to keep the two paths in
sync — they have to agree or McpToken bearers silently break on
whichever one drifts.
Verified against the live cluster: LiteLLM's discoverTools path no
longer 401s; mcplocal logs now show successful upstream proxy calls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier plan recommended an MCPLOCAL_MCPD_TOKEN env var so the pod
would have a ServiceAccount session into mcpd. It's unnecessary: the
pod forwards every inbound client bearer (mcpctl_pat_...) verbatim to
mcpd for all downstream calls — both introspect and project discovery.
mcpd's auth middleware dispatches on the prefix and resolves the
McpToken principal directly. No pod secret, no rotation story.
Updates:
- serve.ts header: explicit "identity model" section calling this out
so future readers don't restore the env var thinking it's missing.
- docs/mcptoken-implementation.md: drop the "mount MCPLOCAL_MCPD_TOKEN"
Pulumi guidance and the "dedicated ServiceAccount" follow-up item;
state the correct image URL (internal 10.0.0.194 registry) and the
gated-vs-ungated rule for LLM config mounts.
No runtime code changes — serve.ts never actually required the token;
this just fixes the documentation and the header comment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verifies the HTTP-mode revocation lag ≤ 5s two ways:
1. Unit (tests/http/token-auth.test.ts, 8 cases): Fastify preHandler
with injected fetch stub exercises the positive/negative cache
directly — first call returns ok:true, we flip the stub to
revoked:true, wait past the short positive TTL, next call gets 401
with "revoked". Plus: non-Bearer 401, non-mcpctl_pat_ 401, wrong-
project 403, mcpd-unreachable 401, happy-path caching (1 fetch for N
requests within TTL), ok:false from mcpd 401.
2. End-to-end (smoke, run manually): added MCPLOCAL_TOKEN_POSITIVE_TTL_MS
and MCPLOCAL_TOKEN_NEGATIVE_TTL_MS env vars to serve.ts so the smoke
can shrink the 30s positive default for testing. Confirmed: with
positive TTL = 2s, the mcptoken.smoke.test.ts revocation case passes
against a local serve.js pointed at prod mcpd.
Operators get the same knobs in production — default behavior unchanged
(30s positive, 5s negative).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs found while trying to point MCPGW_URL=http://localhost:3200
(the systemd mcplocal) so we could get real smoke coverage before the
Pulumi stack for mcp.ad.itaz.eu lands:
1. describe.skipIf(!gatewayUp) was evaluated at parse time, before
beforeAll ran, so gatewayUp was always false and the whole suite
skipped. Switched to the vllm-managed.test.ts pattern: runtime
`if (!gatewayUp) return` at the start of each it().
2. The revocation 401 assertion only makes sense against the
containerized serve.ts entry, which has a 5s negative introspection
cache. Against systemd mcplocal the whole project router is cached
for minutes, so a deleted token with a warm session still succeeds.
Added IS_HTTP_MODE detection (hostname not localhost/127/0.0.0.0,
or MCPGW_IS_HTTP_MODE=true) and skip the assertion otherwise — still
revoking the token so cleanup runs identically.
Run against systemd mcplocal locally:
MCPGW_URL=http://localhost:3200 pnpm --filter @mcpctl/mcplocal \\
exec vitest run --config vitest.smoke.config.ts mcptoken
→ 6/6 pass (revocation case explicitly deferred).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- status.ts + api-client.ts now dispatch on URL scheme so an https
mcpd URL no longer crashes with "Protocol https: not supported".
Caught by fulldeploy smoke runs — status.ts had `import http` only
and was synchronously throwing against https://mcpctl.ad.itaz.eu.
Each http.get call is wrapped so future scheme-mismatch errors also
degrade to "unreachable" instead of a stack trace.
- .dockerignore no longer excludes src/mcplocal/ (the new
Dockerfile.mcplocal needs those files).
- scripts/demo-mcp-call.py: standalone, stdlib-only Python demo that
makes an MCP request (initialize + tools/list, optional tools/call)
using an mcpctl_pat_ bearer. Counterpart to `mcpctl test mcp` for
showing external (e.g. vLLM) clients how the bearer flow works.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Delivers the final piece of the mcptoken stack: a containerized,
network-accessible mcplocal that serves Streamable-HTTP MCP to off-host
clients (the vLLM use case), authenticated by project-scoped McpTokens.
New binary (same package, new entry):
- src/mcplocal/src/serve.ts — HTTP-only entry. Reads MCPLOCAL_MCPD_URL,
MCPLOCAL_MCPD_TOKEN, MCPLOCAL_HTTP_HOST/PORT, MCPLOCAL_CACHE_DIR from
env. No StdioProxyServer, no --upstream.
- src/mcplocal/src/http/token-auth.ts — Fastify preHandler that
validates mcpctl_pat_ bearers via mcpd's /api/v1/mcptokens/introspect.
30s positive / 5s negative TTL. Rejects wrong-project with 403.
Shared HTTP MCP client:
- src/shared/src/mcp-http/ — reusable McpHttpSession with initialize,
listTools, callTool, close. Handles http+https, SSE, id correlation,
distinct McpProtocolError / McpTransportError. Plus mcpHealthCheck
and deriveBaseUrl helpers.
New CLI verb `mcpctl test mcp <url>`:
- Flags: --token (also $MCPCTL_TOKEN), --tool, --args (JSON),
--expect-tools, --timeout, -o text|json, --no-health.
- Exit codes: 0 PASS, 1 TRANSPORT/AUTH FAIL, 2 CONTRACT FAIL.
Container + deploy:
- deploy/Dockerfile.mcplocal (Node 20 alpine, multi-stage, pnpm
workspace, CMD node src/mcplocal/dist/serve.js, VOLUME
/var/lib/mcplocal/cache, HEALTHCHECK on :3200/healthz).
- scripts/build-mcplocal.sh mirrors build-mcpd.sh.
- fulldeploy.sh is now a 4-step pipeline that also builds + rolls out
mcplocal (gated on `kubectl get deployment/mcplocal` so the script
stays green before the Pulumi stack lands).
Audit + cache:
- project-mcp-endpoint.ts passes MCPLOCAL_CACHE_DIR into FileCache at
both construction sites and, when request.mcpToken is present, calls
collector.setSessionMcpToken(id, ...) so audit events carry the
tokenName/tokenSha.
Tests:
- 9 unit cases on `mcpctl test mcp` (happy path, health miss,
expect-tools hit/miss, transport throw, tool isError, json report,
$MCPCTL_TOKEN env fallback, invalid --args).
- Smoke test src/mcplocal/tests/smoke/mcptoken.smoke.test.ts —
gated on healthz($MCPGW_URL), skipped cleanly when unreachable.
Covers happy path, wrong-project 403, --expect-tools contract
failure, and revocation 401 within the negative-cache window.
1773/1773 workspace tests pass. Pulumi resources (Deployment, Service,
Ingress, PVC, Secret, NetworkPolicy) still need to land in
../kubernetes-deployment before the smoke gate flips on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the end-to-end CLI surface for McpTokens and the mcpd auth dispatch
that recognizes them.
mcpd auth middleware:
- Dispatch on the `mcpctl_pat_` bearer prefix. McpToken bearers resolve
through a new `findMcpToken(hash)` dep, populating `request.mcpToken`
and `request.userId = ownerId`. Everything else follows the existing
session path.
- Returns 401 for revoked / expired / unknown tokens.
- Global RBAC hook now threads `mcpTokenSha` into `canAccess` /
`canRunOperation` / `getAllowedScope`, and enforces a hard
project-scope check: a McpToken principal can only hit
`/api/v1/projects/<its-project>/...`.
CLI verbs:
- `mcpctl create mcptoken <name> -p <proj> [--rbac empty|clone]
[--bind role:view,resource:servers] [--ttl 30d|never|ISO]
[--description ...] [--force]` — returns the raw token once.
- `mcpctl get mcptokens [-p <proj>]` — table with
NAME/PROJECT/PREFIX/CREATED/LAST USED/EXPIRES/STATUS.
- `mcpctl get mcptoken <name> -p <proj>` and
`mcpctl describe mcptoken <name> -p <proj>` — describe surfaces the
auto-created RBAC bindings.
- `mcpctl delete mcptoken <name> -p <proj>`.
- `apply -f` support with `kind: mcptoken`. Tokens are immutable, so
apply creates if missing and skips if the name is already active.
Audit plumbing:
- `AuditEvent` / collector now carry optional `tokenName` / `tokenSha`.
`setSessionMcpToken` sits alongside `setSessionUserName`; both feed a
per-session principal map used at emit time.
- `AuditEventService` query accepts `tokenName` / `tokenSha` filters.
- Console `AuditEvent` type carries the new fields so a follow-up can
add a TOKEN column.
Completions regenerated. 1764/1764 tests pass workspace-wide.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BREAKING: `mcpctl create rbac` no longer accepts `--binding` or
`--operation`. Use `--roleBindings` instead with key:value pairs:
# resource binding
--roleBindings role:view,resource:servers
--roleBindings role:view,resource:servers,name:my-ha
# operation binding (role:run is implied by action:)
--roleBindings action:logs
The on-disk YAML shape (`roleBindings: [{role, resource, name?}]` or
`{role:'run', action}`) is unchanged, so Git backups and existing
`apply -f` files continue to work. Only the command-line input format
changes.
The parser is extracted to src/cli/src/commands/rbac-bindings.ts so the
upcoming `mcpctl create mcptoken --bind <kv>` verb can reuse it.
Completions, tests, and the new parser unit test all pass (406/406).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new McpToken Prisma model (project-scoped, SHA-256 hashed at rest,
optional expiry, revocable) plus backing repository, service, and REST
routes. Tokens are a first-class RBAC subject: new 'McpToken' kind is
added to the subject enum and the service auto-creates an RbacDefinition
with subject McpToken:<sha> when bindings are provided.
Creator-permission ceiling: the service rejects any requested binding
the creator cannot already satisfy themselves (re-uses
rbacService.canAccess / canRunOperation). rbacMode=clone snapshots the
creator's full permissions into the token.
Routes:
POST /api/v1/mcptokens create (returns raw token once)
GET /api/v1/mcptokens list (filter by project)
GET /api/v1/mcptokens/:id describe (no secret in response)
POST /api/v1/mcptokens/:id/revoke soft-delete + remove RbacDef
DELETE /api/v1/mcptokens/:id hard-delete
GET /api/v1/mcptokens/introspect validate raw bearer (used by mcplocal)
Extends AuditEvent with optional tokenName/tokenSha fields (indexed) so
token-driven activity can be filtered later. Adds token helpers in
@mcpctl/shared: TOKEN_PREFIX='mcpctl_pat_', generateToken, hashToken,
isMcpToken, timingSafeEqualHex.
Follow-up PRs add the auth-hook dispatch on the prefix, the CLI verbs,
and the HTTP-mode mcplocal that calls /introspect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a per-server tools/list cache in McpRouter (positive + negative TTL)
so a slow or dead upstream only stalls the first discovery call, not every
subsequent client request. Invalidated on upstream add/remove.
Health probes now apply a default liveness spec (tools/list via the real
production path) to any RUNNING instance without an explicit healthCheck,
so synthetic and real failures converge on the same signal.
Includes supporting updates in mcpd-client, discovery, upstream/mcpd,
seeder, and fulldeploy/release scripts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 1bd5087 added attachInteractive to the orchestrator interface
but never hooked it up in mcp-proxy-service — sendViaPersistentAttach
was promised in the commit message but missing from the diff. Servers
with a distroless image whose entrypoint IS the MCP server (gitea-mcp)
ended up needing a bogus `command: [node, dist/index.js]` workaround
that silently failed on every exec, leaving clients with empty tool
lists.
Changes:
- PersistentStdioClient: take a StdioMode discriminated union. Exec
mode runs a command via execInteractive; attach mode talks to PID 1
via attachInteractive.
- mcp-proxy-service: dispatch by config — command → exec; packageName
→ exec via runtime runner; dockerImage-only → attach. Error
serialization no longer drops non-Error objects as "[object Object]".
- templates/gitea.yaml: remove the command workaround; the image CMD
runs as PID 1 and mcpd attaches.
- Add unit tests covering both modes and the unsupported-orchestrator
paths.
Also required (separate repo): mcpd's k8s Role needed pods/attach
added alongside pods/exec; updated in kubernetes-deployment/…/mcpctl/server.ts
and kubectl-patched on the live cluster.
Verified end-to-end against mcpctl.ad.itaz.eu:
- gitea (attach): 49 tools listed, real tools/call round-trip.
- aws-docs (exec via packageName): 4 tools, no regression.
- docmost (exec via command): 11 tools, no regression.
- mcpd suite: 634/634 passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instance status now reflects actual container state:
- startOne() sets STARTING (not RUNNING) after container creation
- syncStatus() promotes STARTING→RUNNING when pod is ready
- syncStatus() demotes RUNNING→STARTING if pod restarts (CrashLoop)
- External servers still get RUNNING immediately (no container)
Previously, CrashLooping pods showed as RUNNING in mcpctl get instances.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs fixed:
1. Backup completeness: JSON backup API now includes prompts and
templates. Previously these were silently dropped during
backup/restore, causing data loss on migration.
2. STDIO proxy for docker-image servers: servers with dockerImage
but no packageName/command (like docmost) now use k8s Attach
to connect to the container's PID 1 stdin/stdout instead of
exec. This fixes "has no packageName or command" errors.
Changes:
- backup-service.ts: add BackupPrompt/BackupTemplate types, export them
- restore-service.ts: restore prompts (with project FK) and templates
- mcp-proxy-service.ts: sendViaPersistentAttach for docker-image STDIO
- orchestrator.ts: add attachInteractive to McpOrchestrator interface
- kubernetes-orchestrator.ts: implement attachInteractive via k8s Attach
- k8s-client-official.ts: expose Attach client
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mcpd now runs a periodic reconcileAll() every 30s that:
- Detects crashed/missing containers (syncStatus)
- Cleans up ERROR instances
- Creates replacement pods to match desired replica count
This replaces the old syncStatus-only timer. Servers migrated
from another deployment or recovering from node failures will
automatically get their instances recreated.
6 new tests for reconcileAll covering: missing instances, skip
replicas=0, already-at-count, ERROR cleanup, multi-server,
error isolation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add MCPD_NODE_SELECTOR env var support in manifest generator
for mixed-arch clusters (e.g. arm64+amd64)
- Fix backup restore: resolve system user ID instead of
hardcoded 'system' string
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The restore service hardcoded ownerId as the literal string 'system'
instead of looking up the actual system user ID. This caused FK
constraint violations when restoring projects to a fresh database.
Now resolves the system user by email, falling back to the first
available user.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mcpd can now deploy MCP server instances as Kubernetes pods instead of
Docker containers. Set MCPD_ORCHESTRATOR=kubernetes to enable.
- Add @kubernetes/client-node with thin wrapper (context enforcement
via MCPD_K8S_CONTEXT to prevent multi-cluster mishaps)
- Rewrite KubernetesOrchestrator: pod CRUD, pod IP extraction,
exec via SPDY (one-shot + interactive), log streaming
- Manifest generator: stdin:true for STDIO servers, args (not command)
to preserve runner image entrypoint, security hardening
- Orchestrator selection in main.ts via MCPD_ORCHESTRATOR env var
- 25 unit tests for k8s orchestrator, all 624 tests pass
Tested end-to-end on local k3s:
- mcpd deployed via Pulumi, creates pods in mcpctl-servers namespace
- NetworkPolicy verified: only mcpd can reach MCP server pods
- Python runner (uvx) successfully runs aws-documentation-mcp-server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The act runner (v0.3.0) on NAS can't handle matrix jobs reliably on a
single worker — concurrent matrix entries fail silently. Build both
amd64 and arm64 sequentially in a single job instead.
Merge publish-rpm and publish-deb into a single publish job that
iterates over all RPM/DEB files in dist/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The publish-rpm step was deleting the existing package by version
before uploading, but Gitea RPM registry keys by version (not
version+arch). When building both amd64 and arm64 in a matrix,
the second job would delete the first job's upload.
Remove the delete-before-upload pattern. Gitea supports multiple
architectures under the same version. Handle 409 (already exists)
gracefully instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server instances require Docker/Podman (mcpd starts them as containers).
CI has no container runtime, so instances will never reach RUNNING.
Tests requiring running instances are already excluded.
Replace the 5-minute wait loop with a quick fixture verification step
that confirms servers, projects, and prompts were applied correctly,
and reports instance status for informational purposes only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- publish-rpm and publish-deb now depend on both build and smoke jobs,
so packages are only published after all tests pass
- Reduce "Wait for server instance" from 60x5s (5min) to 10x2s (20s)
since Docker containers can't run in CI anyway
- Add debug output to RPM/DEB packaging steps
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- nfpm.yaml: use ${NFPM_ARCH} (Go's ExpandEnv doesn't support :-default)
- arch-helper.sh: export RPM_ARCH (x86_64/aarch64) alongside NFPM_ARCH
- build-rpm/deb.sh: build TypeScript before running tests (tests need
built @mcpctl/shared), generate Prisma client on fresh checkout
- Fix RPM filename matching to use aarch64 not arm64
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build scripts now check for required tools before building and install
them automatically if missing. Handles both amd64 and arm64 host systems.
- pnpm: installed via corepack or npm
- bun: installed via official install script
- nfpm: downloaded from GitHub for the correct host architecture
- node_modules: runs pnpm install if missing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add cross-architecture build support so the project can be developed on
ARM64 (Fedora aarch64 laptop) while still producing amd64 packages for
production. All build, package, publish, and install scripts are now
architecture-aware via shared arch-helper.sh detection.
- Add scripts/arch-helper.sh for shared architecture detection
- CI builds both amd64 and arm64 in matrix strategy
- nfpm.yaml uses NFPM_ARCH env var instead of hardcoded amd64
- Build scripts support MCPCTL_TARGET_ARCH for cross-compilation
- installlocal.sh auto-detects RPM/DEB and filters by architecture
- release.sh gains --both-arches flag for dual-arch releases
- Package cleanup is arch-scoped (won't clobber other arch's packages)
- build-mcpd.sh supports --platform and --multi-arch flags
- Add pnpm scripts: rpm:build:amd64, deb:build:arm64, release:both
- Conditional rpm/dpkg-deb checks for cross-distro compatibility
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fan-out discovery methods (tools/list, prompts/list, resources/list)
used synthetic request IDs that couldn't be looked up in the
correlation map. This caused upstream_response events to have no
correlationId, making the console unable to find upstream content
for replay ("No content to replay").
Fix: pass correlationId through RouteContext → discovery methods →
onUpstreamCall callback, so the handler can use it directly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Internal API calls still use 10.0.0.194:3012, but all user-facing
install instructions now use the public GITEA_PUBLIC_URL.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Support DEB packaging alongside RPM for Debian trixie (13/stable),
forky (14/testing), Ubuntu noble (24.04 LTS), and plucky (25.04).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Configure mcplocal with anthropic (claude-haiku-3.5) in CI using
the ANTHROPIC_API_KEY secret. Writes ~/.mcpctl/config.json and
~/.mcpctl/secrets before starting mcplocal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run vitest with --no-file-parallelism to prevent concurrent requests
from crashing mcplocal. Also capture mcplocal output to a log file
and dump it on failure for debugging.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These tests create MCP sessions to smoke-data which tries to proxy to
the smoke-aws-docs server container. Without Docker in CI, mcplocal
crashes when it attempts to connect to the non-existent container.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The security tests open an SSE connection to /inspect that crashes
mcplocal, cascading into timeouts for audit and proxy-pipeline tests.
They also need LLM providers not available in CI. These tests document
known vulnerabilities and work locally against production.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use `pnpm --filter @mcpctl/db exec` to run the CI user setup script
so @prisma/client resolves correctly under pnpm's strict layout.
Also remove unused bcrypt dependency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The auth/bootstrap endpoint fails with 409 because mcpd's startup
creates a system user (system@mcpctl.local), making the "no users
exist" check fail. Instead, create the CI user, session token, and
RBAC definition directly in postgres via Prisma.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The curl -sf flag was hiding the actual HTTP error body. Now we capture
and display the full response to diagnose why auth bootstrap fails.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The single-worker Gitea runner consistently hangs when multiple parallel
jobs try to restore the pnpm cache simultaneously. Removing cache: pnpm
from setup-node trades slightly slower installs for reliable execution.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runs in parallel with the build job after lint/typecheck/test pass.
Spins up PostgreSQL via services, bootstraps auth, starts mcpd and
mcplocal from source, applies smoke fixtures (aws-docs server + 100
prompts), and runs the full smoke test suite.
Container management for upstream MCP servers depends on Docker socket
availability in the runner — emits a warning if unavailable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Gitea Act Runner containers lack privileged access needed for
container-in-container builds. Tried: Docker CLI (permission denied),
podman (cannot re-exec), buildah (no /proc/self/uid_map), kaniko
(no standalone binary). Docker builds + deploy continue to work via
bash fulldeploy.sh which runs on the host directly.
CI pipeline now: lint → typecheck → test → build → publish-rpm
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Docker, podman, and buildah all fail in the runner container due to
missing /proc/self/uid_map (no user namespace support). Kaniko is
designed specifically for building Docker images inside containers
without privileged access, Docker daemon, or user namespaces.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Podman fails with "cannot re-exec process" inside runner containers
(no user namespace support). Buildah with --isolation chroot and
--storage-driver vfs can build OCI images without a daemon, without
namespaces, and without privileged mode.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Docker CLI can't connect to the podman socket in the runner container
(permission denied even as root). Switch to podman for building images
locally and skopeo with containers-storage transport for pushing.
Podman builds don't need a daemon socket.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>