Commit Graph

92 Commits

Author SHA1 Message Date
Michal
a84214dad1 fix(cli): status probe accepts reasoning_content for thinking models
Some checks failed
CI/CD / typecheck (pull_request) Successful in 56s
CI/CD / lint (pull_request) Successful in 3m6s
CI/CD / test (pull_request) Successful in 1m9s
CI/CD / build (pull_request) Successful in 2m39s
CI/CD / smoke (pull_request) Failing after 3m58s
CI/CD / publish (pull_request) Has been skipped
Live deploy showed qwen3-thinking failing the probe with "empty
content": at max_tokens=8 the model spent its entire budget on the
reasoning trace and never emitted a final `content` block.

Fix:
- Bump max_tokens to 64. Still caps latency at ~1-2 sec on cheap
  models but gives reasoning models enough headroom.
- If `message.content` is empty but `reasoning_content` is non-empty,
  count it as alive and prefix the preview with "[thinking]" so the
  user knows the model didn't actually answer "hi" but is responsive.
- Replace the prompt with the terser "Reply with just: hi" — closer
  to what a thinking model can short-circuit on.
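
A minimal sketch of the liveness rule described in the list above. The `ProbeMessage` and `ProbeHealth` shapes and the `interpretProbe` name are illustrative assumptions; the real probe code is not shown in this log.

```typescript
// Hypothetical shapes, loosely modeled on an OpenAI-style chat message.
interface ProbeMessage {
  content?: string | null;
  reasoning_content?: string | null;
}

interface ProbeHealth {
  ok: boolean;
  say?: string;
  error?: string;
}

// A reply counts as alive if it has content or, failing that, a reasoning trace.
function interpretProbe(message: ProbeMessage): ProbeHealth {
  const content = (message.content ?? "").trim();
  if (content.length > 0) {
    return { ok: true, say: content };
  }
  const reasoning = (message.reasoning_content ?? "").trim();
  if (reasoning.length > 0) {
    // Responsive, but the token budget went to the reasoning trace.
    return { ok: true, say: `[thinking] ${reasoning}` };
  }
  return { ok: false, error: "empty content" };
}
```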

Tests: existing 25 pass; the failure-path test still asserts on the
"empty content" path because reasoning_content is empty there too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:09:42 +01:00
Michal
e4af16477c feat(cli): live "say hi" probe for server LLMs in mcpctl status
Some checks failed
CI/CD / lint (pull_request) Successful in 55s
CI/CD / test (pull_request) Successful in 1m13s
CI/CD / typecheck (pull_request) Successful in 3m10s
CI/CD / smoke (pull_request) Failing after 1m46s
CI/CD / build (pull_request) Successful in 3m24s
CI/CD / publish (pull_request) Has been skipped
Status was showing the server-side LLM list but not whether each one
actually serves inference. This adds a per-LLM probe that POSTs a
tiny prompt to /api/v1/llms/<name>/infer:

  messages: [{ role: 'user', content: "Say exactly the word 'hi' and nothing else." }]
  max_tokens: 8, temperature: 0

Each registered LLM gets a one-line health line:

  Server LLMs: 2 registered (probing live "say hi"...)
    fast   qwen3-thinking  ✓ "hi" 312ms
              openai → qwen3-thinking  http://litellm.../v1  key:litellm/API_KEY
    heavy  sonnet  ✗ upstream auth failed: 401
              anthropic → claude-sonnet-4-5  provider default  no key

Probes run in parallel so a single slow LLM doesn't gate the others;
each has its own 15-second timeout. JSON/YAML output gains a
`health: { ok, ms, say?, error? }` field per server LLM so dashboards
get the same liveness signal.
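
One way the parallel probing and the `health` field could fit together, as a sketch only. `withTimeout`, `probeAll`, `renderHealth`, and the `probe` callback are hypothetical names; only the field names `ok`, `ms`, `say`, `error` and the 15-second timeout come from the message above.

```typescript
interface LlmHealth {
  ok: boolean;
  ms: number;
  say?: string;
  error?: string;
}

// Reject if the work does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Probe every LLM concurrently; one slow or failing probe never blocks the others.
async function probeAll(
  names: string[],
  probe: (name: string) => Promise<string>, // assumed to resolve with the model's reply
  timeoutMs = 15000,
): Promise<Record<string, LlmHealth>> {
  const entries = await Promise.all(
    names.map(async (name): Promise<[string, LlmHealth]> => {
      const started = Date.now();
      try {
        const say = await withTimeout(probe(name), timeoutMs);
        return [name, { ok: true, ms: Date.now() - started, say }];
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        return [name, { ok: false, ms: Date.now() - started, error }];
      }
    }),
  );
  const results: Record<string, LlmHealth> = {};
  for (const [name, health] of entries) results[name] = health;
  return results;
}

// Render the short status fragment shown next to each LLM.
function renderHealth(h: LlmHealth): string {
  return h.ok
    ? `✓ ${JSON.stringify(h.say ?? "")} ${h.ms}ms`
    : `✗ ${h.error ?? "unknown error"}`;
}
```

Each failure is converted into a `health` value inside the per-LLM task, so `Promise.all` never rejects and every LLM still gets a line.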

Tests: 25/25 (was 24, +1 new for the failure-path render). Workspace
suite: 2006/2006 across 149 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:02:00 +01:00
Michal
0db37e92a4 feat(cli)+fix(mcpd): server-side LLM status + SPA fallback 500
Some checks failed
CI/CD / typecheck (pull_request) Successful in 58s
CI/CD / test (pull_request) Successful in 1m9s
CI/CD / lint (pull_request) Successful in 2m14s
CI/CD / smoke (pull_request) Failing after 1m39s
CI/CD / build (pull_request) Successful in 2m14s
CI/CD / publish (pull_request) Has been skipped
Two related fixes:

1. `mcpctl status` now lists mcpd-managed Llm rows (the ones created via
   `mcpctl create llm`) under a new "Server LLMs:" section, grouped by
   tier with type, model, upstream URL, and key reference. JSON/YAML
   output gains a `serverLlms` array.

   Bearer token (from `mcpctl auth login` / saved credentials) is
   passed through; if mcpd is unreachable or returns non-200 the
   section is silently omitted (the existing mcpd connectivity line
   already conveys that). 6 new tests cover happy path, empty list,
   token plumbing, and JSON shape.

2. SPA fallback at `/ui/<deeplink>` was returning 500 because we
   registered `@fastify/static` with `decorateReply: false` and then
   called `reply.sendFile`. Read index.html once at startup and serve
   it with `reply.send(html)` instead, which also dodges a per-request
   stat call. Drop `decorateReply: false` so future code can use
   `reply.sendFile` if it ever needs to.

Full suite: 2005/2005 across 149 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 11:27:45 +01:00
Michal
9050918a83 feat(cli): personality flag + create/get/edit/delete personalities (Stage 4)
End-to-end CLI surface for the personality overlay:

  mcpctl create personality grumpy --agent reviewer --description "be terse"
  mcpctl create prompt tone --agent reviewer --content "Be very terse."
  mcpctl get personalities
  mcpctl get personalities --agent reviewer
  mcpctl edit personality <id>
  mcpctl delete personality grumpy --agent reviewer
  mcpctl chat reviewer --personality grumpy

Chat banner gains a "Personality:" line that shows either the active
flag value or the agent's `defaultPersonality` (when no flag given),
so the user knows which overlay is in effect before sending a message.

`--personality` is stripped from `/save` (it's a per-turn override,
not a `defaultParams` field — the agent's defaultPersonality lives on
its own column and is set via PUT /agents).
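
The banner and `/save` rules above can be sketched as two small helpers. Both function names and the `AgentInfo` / `Overrides` shapes are assumptions made for illustration, not the actual chat.ts code.

```typescript
interface AgentInfo {
  name: string;
  defaultPersonality?: string | null;
}

type Overrides = Record<string, unknown>;

// The flag wins; otherwise fall back to the agent's stored default, if any.
function effectivePersonality(flag: string | undefined, agent: AgentInfo): string | null {
  return flag ?? agent.defaultPersonality ?? null;
}

// /save persists sampling overrides only; the per-turn personality is dropped.
function overridesForSave(overrides: Overrides): Overrides {
  const rest: Overrides = { ...overrides };
  delete rest["personality"];
  return rest;
}
```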

Backend (small additions to land Stage 4 cleanly):
- `GET /api/v1/personalities[?agent=name]` so `mcpctl get
  personalities` doesn't require an agent filter.
- PersonalityService.listAll() aggregates across agents.

Completions: regenerated fish + bash. `personalities` added as a
canonical resource with `personality` alias; edit-resource list
extended; the per-resource argument completers pick up the new
type automatically.

CLI suite: 430/430. mcpd: 801/801. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 19:32:48 +01:00
Michal
21f406037a feat(chat): print agent + system prompt banner at chat start
Some checks failed
CI/CD / typecheck (pull_request) Successful in 53s
CI/CD / test (pull_request) Successful in 1m5s
CI/CD / lint (pull_request) Successful in 2m29s
CI/CD / smoke (pull_request) Failing after 1m39s
CI/CD / build (pull_request) Successful in 5m30s
CI/CD / publish (pull_request) Has been skipped
When you launch `mcpctl chat <agent>` it's not always obvious which
agent, LLM, project, or system prompt you're actually wired to,
especially when --system / --system-append flags are layered on top
of the agent's defaults. The session would just start at `> ` with
no confirmation of the configuration.

Now both REPL and one-shot modes print a banner to stderr listing:
  - agent name + description
  - LLM + project (if attached)
  - effective system prompt (or --system override) and any
    --system-append addendum, indented for readability
  - active sampling overrides (temperature, top_p, etc.)

Goes through stderr so `mcpctl chat ... -m "hi" 2>/dev/null` keeps
piping clean. Best-effort: a metadata fetch failure logs and lets
the chat proceed rather than blocking.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 18:37:06 +01:00
Michal
ae54210a52 fix(chat): pin live tokens/sec ticker to a bottom-row status bar
The previous ticker used cursor save/restore (\x1b[s / \x1b[u) to draw
a stats line one row below the cursor. Save/restore is unreliable when
content scrolls or wraps — the saved row drifts off the visible area
and the restore lands inside content lines, smearing the ticker into
mid-word positions:

  Here are the available tools you can
  ⏵ 7w · 56.5 w/s · 0.1s | thinking 41 use with Docmost:6s

Replace it with a DECSTBM scroll region: lock the bottom row, scroll
rows 1..N-1 for content, and redraw the locked row in place every 250 ms.
This is how htop / tig / mosh pin their status footers, so content and
status physically can't overlap.

Lifecycle: install once per chat-session (REPL or one-shot), tear down
on close / Ctrl-D / /quit / SIGINT / SIGTERM / uncaughtException. Pipes
and small terminals (<5 rows) get a no-op StatusBar so output stays
clean. Resize re-emits the scroll region with the new height.
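
A sketch of the escape sequences a DECSTBM-based status bar would emit, assuming an xterm-compatible terminal. The function names are hypothetical. The brief cursor save/restore (`ESC 7` / `ESC 8`) around the redraw is an assumption of this sketch: it works here because the content cursor can never scroll into the locked row.

```typescript
const ESC = "\x1b";

// Reserve the bottom row: content scrolls in rows 1..rows-1 only (DECSTBM).
function installScrollRegion(rows: number): string {
  if (rows < 5) return ""; // tiny terminal: act as a no-op
  return `${ESC}[1;${rows - 1}r`;
}

// Redraw the locked bottom row: save cursor, jump to the last row,
// erase the line, write the text, restore cursor.
function drawStatus(rows: number, text: string): string {
  if (rows < 5) return "";
  return `${ESC}7${ESC}[${rows};1H${ESC}[2K${text}${ESC}8`;
}

// Restore full-screen scrolling on teardown.
function resetScrollRegion(): string {
  return `${ESC}[r`;
}
```

On resize, calling `installScrollRegion` again with the new row count re-emits the region, matching the lifecycle described above.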

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:49:26 +01:00
Michal
cc9822d38b feat(chat): live tokens/sec ticker + final stats footer
While streaming, the REPL now shows a live word/sec counter on a status
line one row below the cursor — refreshes every 250ms via ANSI cursor
save+restore so it floats with the content as the response grows.
After each response, a dim stats footer prints on stderr:

  (47w · 12.3 w/s · 3.9s | thinking 234w · 38 w/s · 6.2s)

The ticker is stderr-only and only emits when stderr is a TTY — pipes
to a file stay clean for grepping/redirect. Words are whitespace-
separated tokens (good enough across English/code/Markdown without a
tokenizer dependency; CJK under-counts but the rate is still
directional).
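
The whitespace-based word count and the footer format can be illustrated as follows. `countWords` and `formatStats` are illustrative names; the exact rounding used by the CLI is an assumption.

```typescript
// Words are whitespace-separated tokens; no tokenizer involved.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Format one phase of the footer, e.g. "50w · 12.5 w/s · 4.0s".
function formatStats(words: number, seconds: number): string {
  const rate = seconds > 0 ? words / seconds : 0;
  return `${words}w · ${rate.toFixed(1)} w/s · ${seconds.toFixed(1)}s`;
}
```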

Both phases tracked separately:
  - thinking: reasoning_content from qwen3-thinking / deepseek-reasoner
    / o1, where the model's scratchpad is the long part
  - content: the actual assistant answer

Final stats also added to the --no-stream path: total HTTP duration
and word count, since we don't get per-token timing there.

CLI suite still 430/430.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:15:26 +01:00
Michal
7cfa449465 feat(chat): surface reasoning_content as thinking chunks; fix --no-stream timeout
Reasoning models (qwen3-thinking, deepseek-reasoner, OpenAI o1 family) emit
their scratchpad as `delta.reasoning_content` (or `delta.reasoning`,
or `delta.provider_specific_fields.reasoning_content` when LiteLLM passes
through from vLLM) — separate from `delta.content`. Before this commit
mcpd's parseStreamingChunk only watched `content`, so the model's 30-90s
reasoning phase looked like dead air to the REPL: streaming connection
open, no chunks, no progress. Caught during the agents-feature shakedown
when qwen3-thinking sat silent for 90s on a docmost__list_pages call.

mcpd
====
chat.service.ts
  - parseStreamingChunk extracts a `reasoningDelta` from the chunk body,
    accepting all four spellings (reasoning_content / reasoning /
    provider_specific_fields.{reasoning_content,reasoning}). Future
    providers can add their own field names by extending the
    fallback chain.
  - chatStream yields `{ type: 'thinking', delta }` chunks as reasoning
    arrives, alongside the existing `{ type: 'text', delta }` for content.
  - Reasoning is intentionally NOT persisted to the thread. It's the
    model's scratchpad, not part of the conversation. Subsequent turns
    don't see it.
  - Adds 'thinking' to the ChatStreamChunk.type union.
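
A minimal sketch of the fallback chain described above, assuming an OpenAI-style `delta` object. `StreamDelta`, `extractReasoningDelta`, and `toChunks` are illustrative names, and the chunk union is simplified to the two types discussed here.

```typescript
interface StreamDelta {
  content?: string;
  reasoning_content?: string;
  reasoning?: string;
  provider_specific_fields?: { reasoning_content?: string; reasoning?: string };
}

type ChatStreamChunk =
  | { type: "text"; delta: string }
  | { type: "thinking"; delta: string };

// Try each known spelling in order; the first non-empty string wins.
function extractReasoningDelta(delta: StreamDelta): string | undefined {
  const candidates = [
    delta.reasoning_content,
    delta.reasoning,
    delta.provider_specific_fields?.reasoning_content,
    delta.provider_specific_fields?.reasoning,
  ];
  return candidates.find((c) => typeof c === "string" && c.length > 0);
}

// Reasoning becomes a 'thinking' chunk; content stays a 'text' chunk.
function toChunks(delta: StreamDelta): ChatStreamChunk[] {
  const chunks: ChatStreamChunk[] = [];
  const reasoning = extractReasoningDelta(delta);
  if (reasoning !== undefined) chunks.push({ type: "thinking", delta: reasoning });
  if (delta.content) chunks.push({ type: "text", delta: delta.content });
  return chunks;
}
```

Supporting a new provider spelling would then mean adding one more entry to `candidates`.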

CLI
===
chat.ts
  - streamOnce handles 'thinking' chunks: writes them dim+italic to
    stderr (ANSI 2;3m) so the model's reasoning visually flows like a
    quote block while the final answer streams to stdout. Plain text
    when stderr isn't a TTY (pipe to file → no escape codes leak).
  - chatRequestNonStream replaces the shared ApiClient.post() for the
    --no-stream path. ApiClient defaults to a 10s timeout, way too tight
    for any chat that calls a tool: LLM round + tool dispatch + LLM
    summary easily exceeds 10s. The new helper uses the same 600s timeout
    the streaming path has been using all along.

Tests:
  chat-service.test.ts (+2):
    - reasoning_content deltas surface as `thinking` chunks (not text);
      reasoning is NOT persisted to the assistant turn's content.
    - LiteLLM's provider_specific_fields.reasoning_content shape parses
      identically to the vendor-native shape.

mcpd 777/777, cli 430/430.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:04:01 +01:00
Michal
cc225eb70f feat(llm): probe upstream auth at registration time
mcpd now runs a cheap auth probe whenever an Llm is created (or its
apiKeyRef/url is updated). Catches misconfigured tokens / wrong URLs at
registration with a 422 + structured error message, instead of silently
500-ing on first chat with a generic "fetch failed". Caught in the wild
today: the homelab Pulumi config exposed `MCPCTL_GATEWAY_TOKEN` (which
is mcpctl_pat_-prefixed, intended for LiteLLM→mcplocal direction) where
LiteLLM expects `LITELLM_MASTER_KEY` (sk-prefixed). The probe makes
this immediate.

Probe shape (LlmAdapter.verifyAuth):
  - OpenAI passthrough → GET <url>/v1/models. Cheap, idempotent, gated
    by the same auth as chat/completions.
  - Anthropic → POST /v1/messages with max_tokens:1, "ping". Anthropic
    has no list-models endpoint; this is the cheapest auth-exercising
    call.
  - Returns one of:
      { ok: true }
      { ok: false, reason: "auth", status, body }    — 401/403, fail hard
      { ok: false, reason: "unreachable", error }    — network, warn-only
      { ok: false, reason: "unexpected", status, body } — non-auth 4xx, warn-only
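
The result shape above maps naturally onto a discriminated union. The sketch below shows one possible classification of an upstream response. The helper names are hypothetical, and treating every other non-2xx status as `unexpected` is an assumption (the message only mentions non-auth 4xx).

```typescript
type AuthProbeResult =
  | { ok: true }
  | { ok: false; reason: "auth"; status: number; body: string }
  | { ok: false; reason: "unreachable"; error: string }
  | { ok: false; reason: "unexpected"; status: number; body: string };

// Classify an HTTP response from the upstream.
function classifyResponse(status: number, body: string): AuthProbeResult {
  if (status >= 200 && status < 300) return { ok: true };
  if (status === 401 || status === 403) return { ok: false, reason: "auth", status, body };
  return { ok: false, reason: "unexpected", status, body };
}

// Classify a thrown network error (DNS failure, refused connection, ...).
function classifyNetworkError(err: unknown): AuthProbeResult {
  const error = err instanceof Error ? err.message : String(err);
  return { ok: false, reason: "unreachable", error };
}

// Only a definite auth failure blocks registration; the rest is warn-only.
function shouldBlockRegistration(result: AuthProbeResult): boolean {
  return !result.ok && result.reason === "auth";
}
```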

Behavior:
  - LlmService.create()/update() runs the probe after resolveApiKey.
    Throws LlmAuthVerificationError on `auth`, logs warn for
    unreachable/unexpected, swallows for offline registration.
  - Probe is skipped when there's no apiKeyRef (nothing to verify) or
    when the caller passes skipAuthCheck=true.
  - update() probes only when apiKeyRef OR url changes — pure
    description/tier updates don't trigger upstream calls.
  - Routes catch LlmAuthVerificationError and return 422 with
    `{ error, status }`. The CLI surfaces the message verbatim via
    ApiError.
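
The skip conditions listed under Behavior could be expressed as a single predicate. This is a sketch with an assumed `LlmRow` shape and a hypothetical `shouldProbeOnUpdate` name; the real LlmService logic may differ.

```typescript
interface LlmRow {
  url?: string;
  apiKeyRef?: { name: string; key: string } | null;
  description?: string;
  tier?: string;
}

// Probe only when the update touches something that affects upstream auth.
function shouldProbeOnUpdate(
  current: LlmRow,
  patch: Partial<LlmRow>,
  skipAuthCheck = false,
): boolean {
  if (skipAuthCheck) return false;
  const next: LlmRow = { ...current, ...patch };
  if (!next.apiKeyRef) return false; // no key, nothing to verify
  const urlChanged = next.url !== current.url;
  const refChanged =
    next.apiKeyRef.name !== current.apiKeyRef?.name ||
    next.apiKeyRef.key !== current.apiKeyRef?.key;
  return urlChanged || refChanged;
}
```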

Opt-out:
  - CLI: `mcpctl create llm ... --skip-auth-check` for offline
    registration before the upstream is reachable.
  - HTTP: side-channel body field `_skipAuthCheck: true` (stripped
    before validation, never persisted on the row).

Side fix in same commit (caught while testing): src/cli/src/index.ts
read `program.opts()` BEFORE `program.parse()`, so `--direct` was a
no-op for ApiClient — every command went to mcplocal regardless. Some
commands accidentally still worked because mcplocal forwards plain
`/api/v1/*` to mcpd, but flows that need direct SSE streaming (e.g.
`mcpctl chat`) couldn't reach mcpd. Fixed by peeking at process.argv
directly for the two global flags before Commander's parse runs.
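
Peeking at the raw argument vector before the parser runs might look like the sketch below. `hasGlobalFlag` is a hypothetical helper, and stopping at a literal `--` is an assumption of this sketch rather than something the commit states.

```typescript
// Scan raw argv (skipping the runtime and script paths) for a boolean flag.
function hasGlobalFlag(argv: readonly string[], flag: string): boolean {
  for (const arg of argv.slice(2)) {
    if (arg === "--") return false; // everything after "--" is positional
    if (arg === flag) return true;
  }
  return false;
}
```

A caller would evaluate something like `hasGlobalFlag(process.argv, "--direct")` before constructing the API client, then let the normal parse proceed unchanged.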

Tests:
  - llm-adapters.test.ts (+8): OpenAI 200/401/403/404/network, Anthropic
    200/401/400 (typo'd model = unexpected, NOT auth — registration
    shouldn't block on bad model names that surface at chat time).
  - llm-service.test.ts (+6): create-throws-on-auth-fail (no row
    written), warn-only on unreachable/unexpected, skipAuthCheck
    bypass, no-key skip, update-only-probes-on-auth-affecting-change.

mcpd 775/775, mcplocal 715/715, cli 430/430.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 16:51:55 +01:00
Michal
727e7d628c feat(agents): mcpctl chat REPL + agent CRUD + completions (Stage 5)
This is the moment the user can actually talk to an agent end-to-end:

  mcpctl create llm qwen3-thinking --type openai --model qwen3-thinking \
    --url http://litellm.nvidia-nim.svc.cluster.local:4000/v1 \
    --api-key-ref litellm-key/API_KEY
  mcpctl create agent reviewer --llm qwen3-thinking --project mcpctl-dev \
    --description "I review security design — ask me after each major change."
  mcpctl chat reviewer

Pieces:

* src/cli/src/commands/chat.ts (new) — REPL + one-shot. Streams the SSE
  endpoint and prints text deltas to stdout as they arrive; tool_call /
  tool_result events go to stderr in dim-style brackets so the chat
  output stays clean. LiteLLM-style flags (--temperature / --top-p /
  --top-k / --max-tokens / --seed / --stop / --allow-tool / --extra)
  layer over agent.defaultParams. In-REPL slash-commands: /set KEY VAL,
  /system <text>, /tools (list project's MCP servers), /clear (new
  thread), /save (PATCH agent.defaultParams = current overrides),
  /quit.

* src/cli/src/commands/create.ts — `create agent` mirroring the llm
  pattern. Every yaml-applyable field has a corresponding flag (memory
  rule); --default-temperature / --default-top-p / --default-top-k /
  --default-max-tokens / --default-seed / --default-stop /
  --default-extra / --default-params-file all populate agent.defaultParams.

* src/cli/src/commands/apply.ts — AgentSpecSchema accepts both `llm:
  qwen3-thinking` shorthand and `llm: { name: ... }` long form; runs
  after llms in the apply order so apiKey/llm references resolve. Round-
  trips with `get agent foo -o yaml | apply -f -` (memory rule).

* src/cli/src/commands/get.ts — agentColumns (NAME, LLM, PROJECT,
  DESCRIPTION, ID); RESOURCE_KIND mapping for yaml export.

* src/cli/src/commands/shared.ts — `agent`/`agents`/`thread`/`threads`
  added to RESOURCE_ALIASES.

* src/cli/src/index.ts — wires createChatCommand into the program; passes
  the resolved baseUrl + token so chat can stream SSE without going
  through ApiClient (which only does buffered request/response).

* completions/mcpctl.{fish,bash} regenerated. scripts/generate-completions.ts
  knows about agents (canonical + aliases) and emits a special-case
  `chat)` block that completes the first arg with `mcpctl get agents`
  names. tests/completions.test.ts: +9 new assertions covering agents in
  the resource list, chat in the commands list, --llm flag for create
  agent, agent-name completion for chat, etc.

CLI suite: 430/430 (was 421). Completions --check is clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:02:38 +01:00
Michal
9a808877b5 feat(secrets): track key names so list/describe work for backend-stored secrets
Some checks failed
CI/CD / lint (push) Successful in 53s
CI/CD / test (push) Successful in 1m6s
CI/CD / typecheck (push) Successful in 2m11s
CI/CD / smoke (push) Failing after 1m42s
CI/CD / publish (push) Has been cancelled
CI/CD / build (push) Has been cancelled
Post-migration, every Secret on a non-plaintext backend had an empty `data`
column (values live in the backend; only externalRef on the row). The CLI's
`get secrets` showed `KEYS: -` and `describe secret` showed `(empty)` for
all 9 migrated secrets — useless without --show-values.

Fix: dedicated `keyNames Json` column on Secret that stores the sorted key
list independently from the values. Populated on every write path, lazily
backfilled on first read for rows that pre-date the column.
Schema default `[]` keeps prisma db push self-healing on rolling upgrades.

- src/db/prisma/schema.prisma: add Secret.keyNames Json @default("[]")
- src/mcpd/src/repositories/secret.repository.ts: pipe keyNames through create
  + update
- src/mcpd/src/services/secret.service.ts:
  - create/update populate keyNames = sorted Object.keys(data)
  - getById lazy-backfills empty keyNames (cheap: derives from data for
    plaintext, single backend read for openbao)
- src/mcpd/src/services/secret-migrate.service.ts: migrate writes keyNames
  alongside the new backendId so freshly-migrated rows are populated without
  a follow-up read
- src/cli/src/commands/get.ts: KEYS column reads keyNames first, falls back
  to Object.keys(data) for older rows
- src/cli/src/commands/describe.ts: shows the Data section keys whenever
  keyNames OR data has entries (so backend-stored secrets render their key
  list); --show-values still resolves through the backend
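
A sketch of the key-name derivation and the CLI fallback described in the list above. The `SecretRow` shape, the helper names, and the comma separator are assumptions; the `-` placeholder mirrors the `KEYS: -` output quoted earlier.

```typescript
interface SecretRow {
  data?: Record<string, string> | null;
  keyNames?: string[] | null;
}

// Write path: store the sorted key list independently of the values.
function deriveKeyNames(data: Record<string, string>): string[] {
  return Object.keys(data).sort();
}

// Read path: prefer keyNames, fall back to the keys of data for older rows.
function keysColumn(row: SecretRow): string {
  const names =
    row.keyNames && row.keyNames.length > 0
      ? row.keyNames
      : deriveKeyNames(row.data ?? {});
  return names.length > 0 ? names.join(", ") : "-";
}
```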

After deploy, the 9 already-migrated secrets backfill their keyNames on the
next describe-by-id, with no operator action needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:57:06 +01:00
Michal
b1bccee50d test(describe): mock the ?reveal=true path on --show-values
Some checks failed
CI/CD / lint (push) Successful in 54s
CI/CD / test (push) Successful in 1m7s
CI/CD / typecheck (push) Successful in 2m19s
CI/CD / smoke (push) Failing after 5m9s
CI/CD / publish (push) Has been cancelled
CI/CD / build (push) Has been cancelled
Follow-up to faccbb5: the describe-secret test for --show-values used the
old fetchResource shape, so it broke once the command switched to calling
client.get directly with ?reveal=true.
2026-04-24 00:49:22 +01:00
Michal
faccbb58e7 fix(secrets): describe --show-values resolves through the backend driver
Some checks failed
CI/CD / lint (push) Successful in 55s
CI/CD / test (push) Failing after 1m5s
CI/CD / typecheck (push) Has started running
CI/CD / smoke (push) Has been cancelled
CI/CD / build (push) Has been cancelled
CI/CD / publish (push) Has been cancelled
Post-migration, every Secret on a non-plaintext backend has empty `Secret.data`
(the actual value lives in the backend; only externalRef is on the row).
`describe secret --show-values` was reading the raw row, so the user saw
"Data: (empty)" for every migrated secret.

- Route GET /api/v1/secrets/:id accepts ?reveal=true; when set, resolves the
  value via SecretService.resolveData() so the response carries the actual
  data dispatched through the right driver.
- CLI --show-values flips that query param. Without --show-values the route
  returns the raw row exactly as before (no leak risk).

Caught running the wizard end-to-end on the live cluster after the
ClusterMesh fix on the kubernetes-deployment side made bao reachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:46:54 +01:00
Michal
1c5301289c refactor(wizard): rename --admin-token → --setup-token
Some checks failed
CI/CD / typecheck (push) Has been cancelled
CI/CD / test (push) Has been cancelled
CI/CD / smoke (push) Has been cancelled
CI/CD / build (push) Has been cancelled
CI/CD / publish (push) Has been cancelled
CI/CD / lint (push) Has been cancelled
Any token with policy-write + auth/token admin works; root is a convenient
default but a scoped service account is fine too. The previous naming
misrepresented the permission floor as root-only.

- flag: --admin-token → --setup-token
- wizard field: adminToken → setupToken
- prompt label: "OpenBao admin / root token" → "OpenBao setup token (needs
  policy write + auth/token admin perms; root is fine)"
- file doc + one comment reworded
- tests updated for the new label
- regression test (token-absent-from-stdout) kept unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 17:27:09 +01:00
Michal
dd4246878d feat(openbao): wizard-provisioning + daily token rotation
Some checks failed
CI/CD / typecheck (pull_request) Successful in 55s
CI/CD / test (pull_request) Successful in 1m4s
CI/CD / lint (pull_request) Successful in 2m2s
CI/CD / smoke (pull_request) Failing after 1m36s
CI/CD / build (pull_request) Successful in 4m13s
CI/CD / publish (pull_request) Has been skipped
One-command setup replaces the 6-step manual flow — `mcpctl create
secretbackend bao --type openbao --wizard` takes the OpenBao admin token
once, provisions a narrow policy + token role, mints the first periodic
token, stores it on mcpd, verifies end-to-end, and prints the migration
command. The admin token is NEVER persisted.

The stored credential auto-rotates daily: mcpd mints a successor via the
token role (self-rotation capability is part of the policy it was issued
with), verifies the successor, writes it over the backing Secret, then
revokes the predecessor by accessor. The TTL is 720h (30 days), so a week
of rotation failures still leaves 20+ days of runway.
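
The runway claim is simple arithmetic, spelled out here. `runwayDays` is an illustrative helper; it assumes the last good token was minted right before the failures began.

```typescript
const HOURS_PER_DAY = 24;

// Days of validity left on the last good token after N days of failed rotations.
function runwayDays(ttlHours: number, failedRotationDays: number): number {
  return ttlHours / HOURS_PER_DAY - failedRotationDays;
}
```

With a 720h TTL and seven days of failures, `runwayDays(720, 7)` evaluates to 23.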

Shared:
- New `@mcpctl/shared/vault` — pure HTTP wrappers (verifyHealth,
  ensureKvV2, writePolicy, ensureTokenRole, mintRoleToken, revokeAccessor,
  lookupSelf, testWriteReadDelete) and policy HCL builder.

mcpd:
- `tokenMeta Json @default("{}")` on SecretBackend. Self-healing schema
  migration — empty default lets `prisma db push` add the column cleanly.
- SecretBackendRotator.rotateOne: mint → verify → persist → revoke-old →
  update tokenMeta. Failures surface via `lastRotationError` on the row;
  the old token keeps working.
- SecretBackendRotatorLoop: on startup rotates overdue backends, schedules
  per-backend timers with ±10min jitter. Stops cleanly on shutdown.
- New `POST /api/v1/secretbackends/:id/rotate` (operation
  `rotate-secretbackend` — added to bootstrap-admin's auto-migrated ops
  alongside migrate-secrets, which was previously missing too).
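
The ±10min jitter on the per-backend timers could be computed as below. This is a sketch: `jitteredDelayMs` is a hypothetical name, the uniform distribution is an assumption, and the random source is injectable so the behaviour can be tested deterministically.

```typescript
const TEN_MINUTES_MS = 10 * 60 * 1000;

// Spread rotations so backends do not all rotate at the same instant.
function jitteredDelayMs(baseMs: number, random: () => number = Math.random): number {
  const offset = (random() * 2 - 1) * TEN_MINUTES_MS; // uniform in [-10min, +10min)
  return Math.max(0, Math.round(baseMs + offset));
}
```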

CLI:
- `--wizard` on `create secretbackend` delegates to the interactive flow.
  All prompts can be pre-answered via flags (--url, --admin-token,
  --mount, --path-prefix, --policy-name, --token-role,
  --no-promote-default) for CI.
- `mcpctl rotate secretbackend <name>` — convenience verb; hits the new
  rotate endpoint.
- `describe secretbackend` renders a Token health section (healthy /
  STALE / WARNING / ERROR) with generated/renewal/expiry timestamps and
  last rotation error. Only shown when tokenMeta.rotatable is true — the
  existing k8s-auth + static-token backends don't surface it.

Tests: 15 vault-client unit tests (shared), 8 rotator unit tests (mcpd),
3 wizard flow tests (cli, including a regression test that the admin
token never appears in stdout). Full suite 1885/1885 (+32). Completions
regenerated for the new flags.

Out of scope (explicit): kubernetes-auth wizard, Vault Enterprise
namespaces in the wizard path, rotation for non-wizard static-token
backends. See plan file for details.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 17:20:37 +01:00
Michal
515206685b feat(openbao): kubernetes ServiceAccount auth — no static token in DB
Some checks failed
CI/CD / lint (push) Successful in 52s
CI/CD / test (push) Successful in 1m5s
CI/CD / typecheck (push) Successful in 2m8s
CI/CD / smoke (push) Failing after 3m38s
CI/CD / build (push) Successful in 4m15s
CI/CD / publish (push) Has been skipped
Why: requiring a static OpenBao root token to live (even once, for bootstrap) on
the plaintext backend is the weakest link in the chain. With the bao-side
Kubernetes auth method enabled, mcpd's pod can authenticate using its own
projected SA token, exchange it for a short-lived Vault client token, and
keep the database free of any vault credentials at all.

Driver changes (src/mcpd/src/services/secret-backends/openbao.ts):
- New `OpenBaoConfig.auth = 'token' | 'kubernetes'`. Defaults to 'token' so
  existing rows keep working. Both shapes share url + mount + pathPrefix +
  namespace; auth-specific fields are mutually exclusive in the config schema.
- Kubernetes auth flow: read JWT from /var/run/secrets/.../token, POST to
  /v1/auth/<authMount>/login {role, jwt}, cache the returned client_token
  for `lease_duration - 60s` (grace window), then re-login.
- One-shot 403-retry: if a request comes back 403 (revoked / clock skew),
  purge cache and retry the original request once with a fresh login.
- Reads + writes go through the same getToken() path so token-auth is
  unchanged for existing deployments.
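
The lease-based caching and the one-shot 403 retry from the list above can be sketched with a synchronous stand-in for the login call. In the real driver the login is an HTTP request; `createTokenCache`, `requestWithRetry`, and the `LoginResponse` shape are hypothetical names for illustration.

```typescript
interface LoginResponse {
  clientToken: string;
  leaseDurationSec: number;
}

interface TokenCache {
  get(): string;
  purge(): void;
}

const GRACE_SEC = 60; // re-login this long before the lease actually ends

function createTokenCache(
  login: () => LoginResponse,
  now: () => number = Date.now,
): TokenCache {
  let cached: { token: string; expiresAtMs: number } | null = null;
  return {
    get(): string {
      const hit = cached;
      if (hit !== null && now() < hit.expiresAtMs) return hit.token;
      const res = login();
      const usableSec = Math.max(0, res.leaseDurationSec - GRACE_SEC);
      cached = { token: res.clientToken, expiresAtMs: now() + usableSec * 1000 };
      return res.clientToken;
    },
    purge(): void {
      cached = null;
    },
  };
}

// Retry exactly once with a fresh login if the first attempt returns 403.
function requestWithRetry(cache: TokenCache, send: (token: string) => number): number {
  let status = send(cache.get());
  if (status === 403) {
    cache.purge();
    status = send(cache.get());
  }
  return status;
}
```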

CLI (src/cli/src/commands/create.ts):
- `mcpctl create secretbackend bao --type openbao --auth kubernetes \
     --url https://bao.example:8200 --role mcpctl`
- Optional `--auth-mount` (default 'kubernetes') + `--sa-token-path` (default
  the standard projected-token path) for non-default deployments.
- Token-auth path unchanged: `--auth token --token-secret SECRET/KEY`
  (or omit `--auth` since 'token' is the default).

Validation (factory.ts) gates on the auth strategy: each path enforces its
own required fields and produces a clear error if misconfigured.

Tests: 6 new k8s-auth unit cases (login wire shape, lease-based caching,
custom authMount, 403-on-login, missing-role rejection, missing-tokenSecretRef
rejection). Full suite 1859/1859. Completions regenerated for the new flags.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:23:05 +01:00
Michal
de854b1944 feat(project): Project.llmProvider semantically names an Llm resource
Why: Phases 0-3 built the server-managed Llm registry; this phase pivots the
existing Project.llmProvider column from "local provider hint" to "named Llm
reference" so operators can pick a centralised Llm per project. No schema
change — the column stays a free-form string for backward compat.

- `mcpctl create project --llm <name>` (+ `--llm-model <override>`) sets
  llmProvider/llmModel to a centralised Llm reference, or 'none' to disable.
- `mcpctl describe project` fetches the Llm catalogue alongside prompts and
  flags values that don't resolve with a visible warning. 'none' is treated
  as an explicit disable, not an orphan.
- `apply -f` doc comments updated; --llm-provider still accepted but now
  documented as naming an Llm resource.
- New `resolveProjectLlmReference(mcpdClient, name)` helper in mcplocal's
  discovery: returns `registered`/`disabled`/`unregistered`/`unreachable`.
  The HTTP-mode proxy-model pipeline will consume this when it pivots to
  mcpd's /api/v1/llms/:name/infer proxy.
- project-mcp-endpoint.ts cache-namespace path gets a comment explaining
  the new resolution order — behavior unchanged, just clarified.
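
The four resolver outcomes can be illustrated with a pure function over an already-fetched catalogue. The real `resolveProjectLlmReference` takes an mcpd client; this simplified `classifyLlmReference` stand-in is hypothetical, uses `null` to model an unreachable mcpd, and checks `'none'` first as an assumption about ordering.

```typescript
type LlmRefStatus = "registered" | "disabled" | "unregistered" | "unreachable";

// `catalogue` is the list of registered Llm names, or null if mcpd could not be reached.
function classifyLlmReference(
  name: string,
  catalogue: readonly string[] | null,
): LlmRefStatus {
  if (name === "none") return "disabled"; // explicit disable, not an orphan
  if (catalogue === null) return "unreachable";
  return catalogue.indexOf(name) !== -1 ? "registered" : "unregistered";
}
```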

Tests: 6 resolver unit tests + 3 new describe-warning cases. Full suite
1853/1853 (+9 from Phase 3's 1844). TypeScript clean; completions
regenerated for the new create-project flags.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:28:46 +01:00
Michal
6ff90a8228 feat(mcpd): Llm resource — CRUD + CLI + apply
Why: every client that wants an LLM (the agent, HTTP-mode mcplocal, Claude
Code's STDIO mcplocal) today has to know the provider URL + key, and each
user's ~/.mcpctl/config.json carries them. Centralising the catalogue on the
server is the prerequisite for Phase 2 (mcpd proxies inference so credentials
never leave the cluster).

This phase adds the `Llm` resource and its CRUD surface — no proxy yet, no
client pivot yet. Just enough to register what you have.

Schema:
- New `Llm` model: name/type/model/url/tier/description + {apiKeySecretId,
  apiKeySecretKey} FK pair. Reverse `llms` relation on Secret.
- Provider types: anthropic | openai | deepseek | vllm | ollama | gemini-cli.
- Tiers: fast | heavy.

mcpd:
- LlmRepository + LlmService + Zod validation schema + /api/v1/llms routes.
- API surface exposes `apiKeyRef: {name, key}` — the service translates to/
  from the FK pair so clients never deal in cuids.
- `resolveApiKey(llmName)` reads through SecretService (which itself dispatches
  to the right SecretBackend). That's the hook Phase 2's inference proxy uses.
- RBAC: added `'llms'` to RBAC_RESOURCES + resource alias. Standard
  view/create/edit/delete semantics.
- Wired into main.ts (repo, service, routes).

CLI:
- `mcpctl create llm <name> --type X --model Y --tier fast|heavy --api-key-ref SECRET/KEY [--url ...] [--extra k=v ...]`
- `mcpctl get|describe|delete llm` — standard resource verbs.
- `mcpctl apply -f` with `kind: llm` (single- or multi-doc yaml/json).
  Applied after secrets, before servers — apiKeyRef resolves an existing Secret.
- Shell completions regenerated.
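
Parsing the `SECRET/KEY` form of `--api-key-ref` into the `{name, key}` shape exposed by the API could look like this. `parseApiKeyRef` is an illustrative name, and splitting on the first slash is an assumption.

```typescript
interface ApiKeyRef {
  name: string;
  key: string;
}

// Split "SECRET/KEY" on the first slash; both halves must be non-empty.
function parseApiKeyRef(ref: string): ApiKeyRef {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`expected SECRET/KEY, got "${ref}"`);
  }
  return { name: ref.slice(0, slash), key: ref.slice(slash + 1) };
}
```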

Tests: 11 service unit tests + 9 route tests (happy path, 404s, 409, validation).
Full suite 1812/1812 (+20 from the 1792 Phase 0 baseline). TypeScript clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:28:43 +01:00
Michal
029c3d5f34 feat(mcpd): pluggable SecretBackend abstraction + OpenBao driver + migrate
All checks were successful
CI/CD / typecheck (pull_request) Successful in 51s
CI/CD / lint (pull_request) Successful in 1m47s
CI/CD / test (pull_request) Successful in 1m3s
CI/CD / smoke (pull_request) Successful in 4m34s
CI/CD / build (pull_request) Successful in 3m50s
CI/CD / publish (pull_request) Has been skipped
Why: API keys live in Postgres as plaintext JSON. A DB read exposes every
credential in the system. Before centralising more secrets (LLM keys, etc.)
we want to be able to point at an external KV store and drop DB access to
sensitive rows.

New model:
- `SecretBackend` resource (CRUD + isDefault invariant) owns how a secret is
  stored. `Secret` gains `backendId` FK and `externalRef`. Reads/writes
  dispatch through a driver.
- `plaintext` driver (near-noop, uses existing Secret.data column) is seeded
  as the `default` row at startup. Acts as trust root / bootstrap.
- `openbao` driver (also HashiCorp Vault KV v2 compatible) talks plain HTTP,
  no SDK dependency. Auth via static token pulled from a plaintext-backed
  `Secret` through the injected SecretRefResolver. Caches resolved token.
- `SecretMigrateService` moves secrets one-at-a-time: read → write dest →
  flip row → best-effort source delete. Interrupted runs are idempotent
  (skips secrets already on destination).
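
The per-secret migration flow above could look roughly like this. It is a
minimal sketch, not the actual SecretMigrateService: `Driver`, `SecretRow`,
`migrateOne`, and `memoryDriver` are hypothetical names, and everything is
synchronous for brevity where the real drivers would be async.

```typescript
// Hypothetical shapes for illustration only; the real types may differ.
interface Driver {
  read(name: string): Record<string, string> | undefined;
  write(name: string, data: Record<string, string>): void;
  remove(name: string): void;
}

interface SecretRow {
  name: string;
  backend: string; // name of the backend that currently owns the secret
}

type MigrateResult = "migrated" | "skipped";

function migrateOne(
  row: SecretRow,
  from: string,
  to: string,
  drivers: Record<string, Driver>,
  keepSource = false,
): MigrateResult {
  // Idempotency: a secret already on the destination is skipped, so an
  // interrupted run can simply be started again.
  if (row.backend === to) return "skipped";
  // Secrets owned by some third backend are out of scope for this run.
  if (row.backend !== from) return "skipped";

  const data = drivers[from].read(row.name); // 1. read from source
  if (data === undefined) {
    throw new Error(`secret ${row.name} not found on ${from}`);
  }
  drivers[to].write(row.name, data); // 2. write to destination
  row.backend = to; // 3. flip the row
  if (!keepSource) {
    try {
      drivers[from].remove(row.name); // 4. best-effort source delete
    } catch {
      // A failed delete leaves a stale copy, but the row already points at `to`.
    }
  }
  return "migrated";
}

// In-memory driver, used only to exercise the sketch.
function memoryDriver(): Driver {
  const store = new Map<string, Record<string, string>>();
  return {
    read: (name) => store.get(name),
    write: (name, data) => {
      store.set(name, data);
    },
    remove: (name) => {
      store.delete(name);
    },
  };
}
```

Flipping the row before deleting the source means a crash between the two
steps leaves a readable secret on the destination and at worst a stale copy
on the source.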

CLI surface:
- `mcpctl create|get|describe|delete secretbackend` + `--default` on create.
- `mcpctl migrate secrets --from X --to Y [--names a,b] [--keep-source] [--dry-run]`
- `apply -f` round-trips secretbackends (yaml/json multi-doc + grouped).
- RBAC: `secretbackends` resource + `run:migrate-secrets` operation.
- Fish + bash completions regenerated.

docs/secret-backends.md covers the OpenBao policy, chicken-and-egg auth flow,
and the migration semantics.

Broke the circular dep (OpenBao needs SecretService to resolve its own token,
SecretService needs SecretBackendService) with a deferred-resolver bridge in
mcpd startup. 11 new driver unit tests; existing env-resolver/secret-route/
backup tests updated for the new service signatures. Full suite: 1792/1792.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:29:55 +01:00
Michal
f68e123821 fix(cli): https support in status + api-client; add demo-mcp-call.py
- status.ts + api-client.ts now dispatch on URL scheme so an https
  mcpd URL no longer crashes with "Protocol https: not supported".
  Caught by fulldeploy smoke runs — status.ts had `import http` only
  and was synchronously throwing against https://mcpctl.ad.itaz.eu.
  Each http.get call is wrapped so future scheme-mismatch errors also
  degrade to "unreachable" instead of a stack trace.
- .dockerignore no longer excludes src/mcplocal/ (the new
  Dockerfile.mcplocal needs those files).
- scripts/demo-mcp-call.py: standalone, stdlib-only Python demo that
  makes an MCP request (initialize + tools/list, optional tools/call)
  using an mcpctl_pat_ bearer. Counterpart to `mcpctl test mcp` for
  showing external (e.g. vLLM) clients how the bearer flow works.
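
A sketch of the scheme dispatch described in the first bullet, assuming Node's
built-in `node:http` and `node:https` modules. `schemeOf` and `probe` are
hypothetical helper names, not the actual status.ts code, and treating any
HTTP response as "ok" is an assumption made for illustration.

```typescript
import * as http from "node:http";
import * as https from "node:https";

type Scheme = "http" | "https";

// Returns the supported scheme of a URL, or undefined if the URL is
// unparseable or uses any other scheme.
function schemeOf(rawUrl: string): Scheme | undefined {
  try {
    const { protocol } = new URL(rawUrl);
    if (protocol === "https:") return "https";
    if (protocol === "http:") return "http";
  } catch {
    // Unparseable URL: fall through to undefined.
  }
  return undefined;
}

// Resolves to "unreachable" instead of throwing, whatever goes wrong.
function probe(rawUrl: string, timeoutMs = 3000): Promise<"ok" | "unreachable"> {
  return new Promise((resolve) => {
    const scheme = schemeOf(rawUrl);
    if (scheme === undefined) {
      resolve("unreachable");
      return;
    }
    const onResponse = (res: http.IncomingMessage): void => {
      res.resume(); // drain the body so the socket is released
      resolve("ok"); // assumption: any HTTP response counts as reachable
    };
    try {
      const req =
        scheme === "https" ? https.get(rawUrl, onResponse) : http.get(rawUrl, onResponse);
      req.setTimeout(timeoutMs, () => req.destroy(new Error("timeout")));
      req.on("error", () => resolve("unreachable"));
    } catch {
      // Synchronous throws (for example a protocol mismatch) also degrade.
      resolve("unreachable");
    }
  });
}
```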

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 22:34:00 +01:00
Michal
2127b41d9f feat: HTTP-mode mcplocal container + mcpctl test mcp + token-auth preHandler
Delivers the final piece of the mcptoken stack: a containerized,
network-accessible mcplocal that serves Streamable-HTTP MCP to off-host
clients (the vLLM use case), authenticated by project-scoped McpTokens.

New binary (same package, new entry):
  - src/mcplocal/src/serve.ts — HTTP-only entry. Reads MCPLOCAL_MCPD_URL,
    MCPLOCAL_MCPD_TOKEN, MCPLOCAL_HTTP_HOST/PORT, MCPLOCAL_CACHE_DIR from
    env. No StdioProxyServer, no --upstream.
  - src/mcplocal/src/http/token-auth.ts — Fastify preHandler that
    validates mcpctl_pat_ bearers via mcpd's /api/v1/mcptokens/introspect.
    Introspection results are cached (30s positive / 5s negative TTL).
    A token for the wrong project is rejected with 403.
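
The positive/negative TTL split can be sketched as a small cache keyed by token
hash. This is an illustration only: `Verdict` and `VerdictCache` are
hypothetical names, and the injected clock exists purely to make the sketch
testable.

```typescript
// Hypothetical verdict shape; the real introspection response may differ.
type Verdict = { ok: true; project: string } | { ok: false };

interface Entry {
  verdict: Verdict;
  expiresAt: number;
}

class VerdictCache {
  private readonly entries = new Map<string, Entry>();
  private readonly positiveTtlMs: number;
  private readonly negativeTtlMs: number;
  private readonly now: () => number;

  constructor(positiveTtlMs = 30_000, negativeTtlMs = 5_000, now: () => number = () => Date.now()) {
    this.positiveTtlMs = positiveTtlMs;
    this.negativeTtlMs = negativeTtlMs;
    this.now = now;
  }

  // Returns the cached verdict, or undefined if absent or expired.
  get(tokenHash: string): Verdict | undefined {
    const hit = this.entries.get(tokenHash);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(tokenHash);
      return undefined;
    }
    return hit.verdict;
  }

  // Valid tokens are remembered longer than rejections, so a freshly
  // created token is not locked out for long by an earlier failure.
  set(tokenHash: string, verdict: Verdict): void {
    const ttl = verdict.ok ? this.positiveTtlMs : this.negativeTtlMs;
    this.entries.set(tokenHash, { verdict, expiresAt: this.now() + ttl });
  }
}
```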

Shared HTTP MCP client:
  - src/shared/src/mcp-http/ — reusable McpHttpSession with initialize,
    listTools, callTool, close. Handles http+https, SSE, id correlation,
    distinct McpProtocolError / McpTransportError. Plus mcpHealthCheck
    and deriveBaseUrl helpers.

New CLI verb `mcpctl test mcp <url>`:
  - Flags: --token (also $MCPCTL_TOKEN), --tool, --args (JSON),
    --expect-tools, --timeout, -o text|json, --no-health.
  - Exit codes: 0 PASS, 1 TRANSPORT/AUTH FAIL, 2 CONTRACT FAIL.

Container + deploy:
  - deploy/Dockerfile.mcplocal (Node 20 alpine, multi-stage, pnpm
    workspace, CMD node src/mcplocal/dist/serve.js, VOLUME
    /var/lib/mcplocal/cache, HEALTHCHECK on :3200/healthz).
  - scripts/build-mcplocal.sh mirrors build-mcpd.sh.
  - fulldeploy.sh is now a 4-step pipeline that also builds + rolls out
    mcplocal (gated on `kubectl get deployment/mcplocal` so the script
    stays green before the Pulumi stack lands).

Audit + cache:
  - project-mcp-endpoint.ts passes MCPLOCAL_CACHE_DIR into FileCache at
    both construction sites and, when request.mcpToken is present, calls
    collector.setSessionMcpToken(id, ...) so audit events carry the
    tokenName/tokenSha.

Tests:
  - 9 unit cases on `mcpctl test mcp` (happy path, health miss,
    expect-tools hit/miss, transport throw, tool isError, json report,
    $MCPCTL_TOKEN env fallback, invalid --args).
  - Smoke test src/mcplocal/tests/smoke/mcptoken.smoke.test.ts —
    gated on healthz($MCPGW_URL), skipped cleanly when unreachable.
    Covers happy path, wrong-project 403, --expect-tools contract
    failure, and revocation 401 within the negative-cache window.

1773/1773 workspace tests pass. Pulumi resources (Deployment, Service,
Ingress, PVC, Secret, NetworkPolicy) still need to land in
../kubernetes-deployment before the smoke gate flips on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 01:21:42 +01:00
Michal
a151b2e756 feat: mcpctl mcptoken verbs + mcpd auth dispatch + audit plumbing
Adds the end-to-end CLI surface for McpTokens and the mcpd auth dispatch
that recognizes them.

mcpd auth middleware:
  - Dispatch on the `mcpctl_pat_` bearer prefix. McpToken bearers resolve
    through a new `findMcpToken(hash)` dep, populating `request.mcpToken`
    and `request.userId = ownerId`. Everything else follows the existing
    session path.
  - Returns 401 for revoked / expired / unknown tokens.
  - Global RBAC hook now threads `mcpTokenSha` into `canAccess` /
    `canRunOperation` / `getAllowedScope`, and enforces a hard
    project-scope check: a McpToken principal can only hit
    `/api/v1/projects/<its-project>/...`.

CLI verbs:
  - `mcpctl create mcptoken <name> -p <proj> [--rbac empty|clone]
    [--bind role:view,resource:servers] [--ttl 30d|never|ISO]
    [--description ...] [--force]` — returns the raw token once.
  - `mcpctl get mcptokens [-p <proj>]` — table with
    NAME/PROJECT/PREFIX/CREATED/LAST USED/EXPIRES/STATUS.
  - `mcpctl get mcptoken <name> -p <proj>` and
    `mcpctl describe mcptoken <name> -p <proj>` — describe surfaces the
    auto-created RBAC bindings.
  - `mcpctl delete mcptoken <name> -p <proj>`.
  - `apply -f` support with `kind: mcptoken`. Tokens are immutable, so
    apply creates if missing and skips if the name is already active.

Audit plumbing:
  - `AuditEvent` / collector now carry optional `tokenName` / `tokenSha`.
    `setSessionMcpToken` sits alongside `setSessionUserName`; both feed a
    per-session principal map used at emit time.
  - `AuditEventService` query accepts `tokenName` / `tokenSha` filters.
  - Console `AuditEvent` type carries the new fields so a follow-up can
    add a TOKEN column.

Completions regenerated. 1764/1764 tests pass workspace-wide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 01:12:43 +01:00
Michal
efcfeeab65 feat(cli)!: migrate create rbac bindings to --roleBindings kv syntax
BREAKING: `mcpctl create rbac` no longer accepts `--binding` or
`--operation`. Use `--roleBindings` instead with key:value pairs:

  # resource binding
  --roleBindings role:view,resource:servers
  --roleBindings role:view,resource:servers,name:my-ha

  # operation binding (role:run is implied by action:)
  --roleBindings action:logs

The on-disk YAML shape (`roleBindings: [{role, resource, name?}]` or
`{role:'run', action}`) is unchanged, so Git backups and existing
`apply -f` files continue to work. Only the command-line input format
changes.

The parser is extracted to src/cli/src/commands/rbac-bindings.ts so the
upcoming `mcpctl create mcptoken --bind <kv>` verb can reuse it.
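
A rough sketch of how such a key:value parser might work, based only on the
examples above. `RoleBinding` and `parseRoleBinding` are hypothetical names,
and the validation rules are assumptions; the real rbac-bindings.ts may behave
differently.

```typescript
// Mirrors the YAML shape quoted above: a resource binding or an operation binding.
type RoleBinding =
  | { role: string; resource: string; name?: string }
  | { role: "run"; action: string };

function parseRoleBinding(input: string): RoleBinding {
  const kv = new Map<string, string>();
  for (const pair of input.split(",")) {
    const idx = pair.indexOf(":");
    if (idx <= 0 || idx === pair.length - 1) {
      throw new Error(`expected key:value, got "${pair}"`);
    }
    kv.set(pair.slice(0, idx).trim(), pair.slice(idx + 1).trim());
  }

  const action = kv.get("action");
  if (action !== undefined) {
    // Operation binding: role:run is implied by action:
    return { role: "run", action };
  }

  const role = kv.get("role");
  const resource = kv.get("resource");
  if (role === undefined || resource === undefined) {
    throw new Error("resource bindings need both role: and resource:");
  }
  const name = kv.get("name");
  return name === undefined ? { role, resource } : { role, resource, name };
}
```

For example, `parseRoleBinding("action:logs")` yields
`{ role: "run", action: "logs" }`.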

Completions, tests, and the new parser unit test all pass (406/406).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 01:03:57 +01:00
Michal
3149ea3ae7 fix: MCP proxy resilience — discovery cache, default liveness probes
Adds a per-server tools/list cache in McpRouter (positive + negative TTL)
so a slow or dead upstream only stalls the first discovery call, not every
subsequent client request. Invalidated on upstream add/remove.

Health probes now apply a default liveness spec (tools/list via the real
production path) to any RUNNING instance without an explicit healthCheck,
so synthetic and real failures converge on the same signal.

Includes supporting updates in mcpd-client, discovery, upstream/mcpd,
seeder, and fulldeploy/release scripts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 00:48:57 +01:00
Michal
857f8c72ae fix: MCP proxy resilience — timeouts, parallel discovery, error propagation
- McpdClient: add 30s AbortSignal timeout to all fetch calls (was infinite)
- CLI bridge: return JSON-RPC error on stdout when HTTP fails (was silent)
- Router: parallel tool/resource discovery via Promise.allSettled (was sequential — one slow server blocked all)
- vllm-managed: 60s error cooldown prevents retry-on-every-call when vLLM is broken
- Tests: McpdClient timeout suite (9), parallel discovery, vllm cooldown, bridge error response
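
The parallel-discovery change can be illustrated as follows. This is a sketch
under assumptions: `Upstream`, `DiscoveryResult`, and `discoverAll` are
hypothetical names, and it relies on `AbortSignal.timeout`, which is available
in recent Node versions.

```typescript
// Hypothetical upstream shape for illustration only.
interface Upstream {
  name: string;
  listTools(signal: AbortSignal): Promise<string[]>;
}

interface DiscoveryResult {
  tools: Record<string, string[]>;
  errors: Record<string, string>;
}

async function discoverAll(upstreams: Upstream[], timeoutMs = 30_000): Promise<DiscoveryResult> {
  // All upstreams are queried at once; each call gets its own timeout signal,
  // so one slow server can no longer block the others.
  const settled = await Promise.allSettled(
    upstreams.map((u) => u.listTools(AbortSignal.timeout(timeoutMs))),
  );

  const result: DiscoveryResult = { tools: {}, errors: {} };
  settled.forEach((outcome, i) => {
    const name = upstreams[i].name;
    if (outcome.status === "fulfilled") {
      result.tools[name] = outcome.value;
    } else {
      // Failures are recorded per server instead of failing the whole call.
      result.errors[name] = String(outcome.reason);
    }
  });
  return result;
}
```

Unlike `Promise.all`, `Promise.allSettled` never rejects early, which is what
keeps one broken upstream from hiding the results of the healthy ones.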

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 18:28:03 +01:00
Michal
af4b3fb702 feat: store backup config in DB secret instead of env var
Move backup SSH keys and repo URL from MCPD_BACKUP_REPO env var to a
"backup-ssh" secret in the database. Keys are auto-generated on first
init and stored back into the secret. Also fix an ERR_HTTP_HEADERS_SENT
crash caused by calling reply.send() without `return` in route handlers
when an onSend hook is registered.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 13:53:12 +00:00
Michal
6bce1431ae fix: backup disabled message now explains how to enable
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 13:33:36 +00:00
Michal
98f3a3eda0 refactor: consolidate restore under backup command
Restore now lives under `mcpctl backup restore list|diff|to` instead of a
separate `mcpctl restore` command.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 01:17:03 +00:00
Michal
7818cb2194 feat: Git-based backup system replacing JSON bundle backup/restore
DB is source of truth with git as downstream replica. SSH key generated
on first start, all resource mutations committed as apply-compatible YAML.
Supports manual commit import, conflict resolution (DB wins), disaster
recovery (empty DB restores from git), and timeline branches on restore.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 01:14:28 +00:00
Michal
d773419ccd feat: enhanced MCP inspector with proxymodel switching and provenance view
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:37:01 +00:00
Michal
a2728f280a feat: file cache, pause queue, hot-reload, and cache CLI commands
- Persistent file cache in ~/.mcpctl/cache/proxymodel/ with LRU eviction
- Pause queue for temporarily holding MCP traffic
- Hot-reload watcher for custom stages and proxymodel definitions
- CLI: mcpctl cache list/clear/stats commands
- HTTP endpoints for cache and pause management

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:36:55 +00:00
Michal
0995851810 feat: remove proxyMode — all traffic goes through mcplocal proxy
proxyMode "direct" was a security hole (leaked secrets as plaintext env
vars in .mcp.json) and bypassed all mcplocal features (gating, audit,
RBAC, content pipeline, namespacing). Removed from schema, API, CLI,
and all tests. Old configs with proxyMode are accepted but silently
stripped via Zod .transform() for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:36:36 +00:00
Michal
d9d0a7a374 docs: update README for plugin system, add proxyModel tests
- Rewrite README Content Pipeline section as Plugin System section
  documenting built-in plugins (default, gate, content-pipeline),
  plugin hooks, and the relationship between gating and proxyModel
- Update all README examples to use --proxy-model instead of --gated
- Add unit tests: proxyModel normalization in JSON/YAML output (4 tests),
  Plugin Config section in describe output (2 tests)
- Add smoke tests: yaml/json output shows resolved proxyModel without
  gated field, round-trip compatibility (4 tests)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 01:24:47 +00:00
Michal
f60d40a25b fix: normalize proxyModel in yaml/json output, drop deprecated gated field
Resolves proxyModel from gated boolean when the DB value is empty
(pre-migration projects). The gated field is no longer included in
get -o yaml/json output, making it apply-compatible with the new schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 00:45:31 +00:00
Michal
a22a17f8d3 feat: make proxyModel the primary plugin control field
- proxyModel field now determines both YAML pipeline stages AND plugin
  gating behavior ('default'/'gate' = gated, 'content-pipeline' = not)
- Deprecate --gated/--no-gated CLI flags (backward compat preserved:
  --no-gated maps to --proxy-model content-pipeline)
- Replace GATED column with PLUGIN in `get projects` output
- Update `describe project` to show "Plugin Config" section
- Unify proxymodel discovery: GET /proxymodels now returns both YAML
  pipeline models and TypeScript plugins with type field
- `describe proxymodel gate` shows plugin hooks and extends info
- Update CLI apply schema: gated is now optional (not required)
- Regenerate shell completions
- Tests: proxymodel endpoint (5), smoke tests (8)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 00:32:13 +00:00
Michal
86c5a61eaa feat: add userName tracking to audit events
- Add userName column to AuditEvent schema with index and migration
- Add GET /api/v1/auth/me endpoint returning current user identity
- AuditCollector auto-fills userName from session→user map, resolved
  lazily via /auth/me on first session creation
- Support userName and date range (from/to) filtering on audit events
  and sessions endpoints
- Audit console sidebar groups sessions by project → user
- Add date filter presets (d key: all/today/1h/24h/7d) to console
- Add scrolling and page up/down to sidebar navigation
- Tests: auth-me (4), audit-username collector (4), route filters (2),
  smoke tests (2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 00:18:58 +00:00
Michal
75c44e4ba1 fix: audit console navigation — use arrow keys like main console
- Sidebar open: arrows navigate sessions, Enter selects, Escape closes
- Sidebar closed: arrows navigate timeline, Escape reopens sidebar
- Fix crash on `data.events.reverse()` when API returns non-array
- Fix blinking from useCallback re-creating polling intervals (use useRef)
- Remove 's' key session cycling — use standard arrow+Enter pattern

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:00:59 +00:00
Michal
5d859ca7d8 feat: audit console TUI, system prompt management, and CLI improvements
Audit Console Phase 1: tool_call_trace emission from mcplocal router,
session_bind/rbac_decision event kinds, GET /audit/sessions endpoint,
full Ink TUI with session sidebar, event timeline, and detail view
(mcpctl console --audit).

System prompts: move 6 hardcoded LLM prompts to mcpctl-system project
with extensible ResourceRuleRegistry validation framework, template
variable enforcement ({{maxTokens}}, {{pageCount}}), and delete-resets-
to-default behavior. All consumers fetch via SystemPromptFetcher with
hardcoded fallbacks.

CLI: -p shorthand for --project across get/create/delete/config commands,
console auto-scroll improvements, shell completions regenerated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 23:50:54 +00:00
Michal
03827f11e4 feat: eager vLLM warmup and smart page titles in paginate stage
- Add warmup() to LlmProvider interface for eager subprocess startup
- ManagedVllmProvider.warmup() starts vLLM in background on project load
- ProviderRegistry.warmupAll() triggers all managed providers
- NamedProvider proxies warmup() to inner provider
- paginate stage generates LLM-powered descriptive page titles when
  available, cached by content hash, falls back to generic "Page N"
- project-mcp-endpoint calls warmupAll() on router creation so vLLM
  is loading while the session initializes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:07:39 +00:00
Michal
69867bd47a feat: mcpctl v0.0.1 — first public release
Comprehensive MCP server management with kubectl-style CLI.

Key features in this release:
- Declarative YAML apply/get round-trip with project cloning support
- Gated sessions with prompt intelligence for Claude
- Interactive MCP console with traffic inspector
- Persistent STDIO connections for containerized servers
- RBAC with name-scoped bindings
- Shell completions (fish + bash) auto-generated
- Rate-limit retry with exponential backoff in apply
- Project-scoped prompt management
- Credential scrubbing from git history

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 17:05:05 +00:00
Michal
414a8d3774 fix: stub react-devtools-core for bun compile
Ink statically imports react-devtools-core (only used when DEV=true).
With --external, bun compile leaves a runtime require that fails in the
standalone binary. Instead, provide a no-op stub that bun bundles inline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 00:06:31 +00:00
Michal
a59d2237b9 feat: interactive MCP console (mcpctl console <project>)
Ink-based TUI that shows exactly what an LLM sees through MCP.
Browse tools/resources/prompts, execute them, and see raw JSON-RPC
traffic in a protocol log. Supports gated session flow with
begin_session, raw JSON-RPC input, and session reconnect.

- McpSession class wrapping HTTP transport with typed methods
- 12 React/Ink components (header, protocol-log, menu, tool/resource/prompt views, etc.)
- 21 unit tests for McpSession against a mock MCP server
- Fish + Bash completions with project name argument
- bun compile with --external react-devtools-core

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 23:56:23 +00:00
Michal
ecc9c48597 feat: gated project experience & prompt intelligence
Implements the full gated session flow and prompt intelligence system:

- Prisma schema: add gated, priority, summary, chapters, linkTarget fields
- Session gate: state machine (gated → begin_session → ungated) with LLM-powered
  tool selection based on prompt index
- Tag matcher: intelligent prompt-to-tool matching with project/server/action tags
- LLM selector: tiered provider selection (fast for gating, heavy for complex tasks)
- Link resolver: cross-project MCP resource references (project/server:uri format)
- Prompt summary service: LLM-generated summaries and chapter extraction
- System project bootstrap: ensures default project exists on startup
- Structural link health checks: enrichWithLinkStatus on prompt GET endpoints
- CLI: create prompt --priority/--link, create project --gated/--no-gated,
  describe project shows prompts section, get prompts shows PRI/LINK/STATUS
- Apply/edit: priority, linkTarget, gated fields supported
- Shell completions: fish updated with new flags
- 1,253 tests passing across all packages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 23:22:42 +00:00
Michal
50ffa115ca fix: per-provider health checks in /llm/providers and status display
The /llm/providers endpoint now runs isAvailable() on each provider in
parallel and returns health status per provider. The status command shows
✓/✗ per provider based on actual availability, not just the fast tier.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 02:25:06 +00:00
Michal
d2be0d7198 feat: tiered LLM providers (fast/heavy) with multi-provider config
Adds tier-based LLM routing so fast local models (vLLM, Ollama) handle
structured tasks while cloud models (Gemini, Anthropic) are reserved for
heavy reasoning. Single-provider configs continue to work via fallback.

- Tier type + ProviderRegistry with assignTier/getProvider/fallback chain
- Multi-provider config format: { providers: [{ name, type, tier, ... }] }
- NamedProvider wrapper for multiple instances of same provider type
- Setup wizard: Simple (legacy) / Advanced (fast+heavy tiers) modes
- Status display: tiered view with /llm/providers endpoint
- Call sites use getProvider('fast') instead of getActive()
- Full backward compatibility with existing single-provider configs
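
One way the tier lookup with fallback could work, as a sketch only. The method
names `assignTier` and `getProvider` come from the list above, but their
signatures, the `TierRegistrySketch` class, and the fallback order (requested
tier first, then the other tier) are assumptions.

```typescript
type Tier = "fast" | "heavy";

// Hypothetical minimal provider shape.
interface Provider {
  name: string;
}

class TierRegistrySketch {
  private readonly byTier = new Map<Tier, Provider>();

  assignTier(tier: Tier, provider: Provider): void {
    this.byTier.set(tier, provider);
  }

  // Assumed fallback chain: the requested tier, then the other tier.
  // With a single configured provider, both tiers resolve to it.
  getProvider(tier: Tier): Provider | undefined {
    const other: Tier = tier === "fast" ? "heavy" : "fast";
    return this.byTier.get(tier) ?? this.byTier.get(other);
  }
}
```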

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 02:16:08 +00:00
Michal
637bf3d112 fix: warmup ACP subprocess eagerly to avoid 30s cold-start on status
The pool refactor made ACP client creation lazy, causing the first
/llm/health call to spawn + initialize + prompt Gemini in one request
(30s+). Now warmup() eagerly starts the subprocess on mcplocal boot.
Also fetch models in parallel with LLM health check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 01:37:30 +00:00
Michal
61a07024e9 feat: per-project LLM models, ACP session pool, smart pagination tests
- ACP session pool with per-model subprocesses and 8h idle eviction
- Per-project LLM config: local override → mcpd recommendation → global default
- Model override support in ResponsePaginator
- /llm/models endpoint + available models in mcpctl status
- Remove --llm-provider/--llm-model from create project (use edit/apply)
- 8 new smart pagination integration tests (e2e flow)
- 260 mcplocal tests, 330 CLI tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 01:29:38 +00:00
Michal
de95dd287f feat: completions update, create promptrequest, LLM flag rename, ACP content fix
- Add prompts/promptrequests to shell completions (fish + bash)
- Add approve, setup, prompt, promptrequest commands to completions
- Add `create promptrequest` CLI command (POST /projects/:name/promptrequests)
- Rename --proxy-mode-llm-provider/model to --llm-provider/model
- Fix ACP client: handle single-object content format from real Gemini
- Add tests for single-object content and agent_thought_chunk filtering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 00:21:31 +00:00
Michal
cd12782797 fix: LLM health check via mcplocal instead of spawning gemini directly
Status command now queries mcplocal's /llm/health endpoint instead of
spawning the gemini binary. This uses the persistent ACP connection
(fast) and works for any configured provider, not just gemini-cli.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 00:03:25 +00:00
Michal
ce19427ec6 feat: persistent Gemini ACP provider + status spinner
Replace per-call gemini CLI spawning (~10s cold start each time) with
persistent ACP (Agent Client Protocol) subprocess. First call absorbs
the cold start, subsequent calls are near-instant over JSON-RPC stdio.

- Add AcpClient: manages persistent gemini --experimental-acp subprocess
  with lazy init, auto-restart on crash/timeout, NDJSON framing
- Add GeminiAcpProvider: LlmProvider wrapper with serial queue for
  concurrent calls, same interface as GeminiCliProvider
- Add dispose() to LlmProvider interface + disposeAll() to registry
- Wire provider disposal into mcplocal shutdown handler
- Add status command spinner with progressive output and color-coded
  LLM health check results (green checkmark/red cross)
- 25 new tests (17 ACP client + 8 provider)
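
The serial queue mentioned for GeminiAcpProvider can be sketched as a promise
chain. `SerialQueue` is a hypothetical name and this is only one common way to
serialize concurrent calls; the actual provider may do it differently.

```typescript
class SerialQueue {
  // Tail of the chain; it never rejects, so one failure cannot stall the queue.
  private tail: Promise<unknown> = Promise.resolve();

  // Runs `task` after every previously queued task has settled.
  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(() => task());
    // Swallow the error on the chain only; the caller still sees it via `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

Callers can fire requests concurrently while the subprocess still sees them
one at a time, in submission order.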

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 23:52:04 +00:00